Commit Graph

53064 Commits

Author SHA1 Message Date
Simon Dardis 7563624fcb [mips] Correct clo/clz predicates
Reviewers: smaksimovic, abeserminji, atanasyan

Differential Revision: https://reviews.llvm.org/D46125

llvm-svn: 331754
2018-05-08 09:50:37 +00:00
Jeremy Morse 4f799c027e [X86] Mark all byval parameters as aliased
This is a fix for PR30290: by marking all byval stack slots as being aliased,
the instruction scheduler is more conservative about rescheduling memory
accesses to such stack slots as an LLVM Value* might alias it. This fixes
errors such as in the patched test case, where reads and writes to a data
structure are illegally mixed.

This could be fixed better in the future with better analysis for the
instruction scheduler to know what Values alias what stack slots.

Differential Revision: https://reviews.llvm.org/D45022

llvm-svn: 331749
2018-05-08 09:18:01 +00:00
Alexander Ivchenko c47f799289 [X86][CET] Shadow stack fix for setjmp/longjmp
This patch adds a shadow stack fix when compiling
setjmp/longjmp with the shadow stack enabled. This
allows setjmp/longjmp to work correctly with CET.

Patch by mike.dvoretsky

Differential Revision: https://reviews.llvm.org/D46181

llvm-svn: 331748
2018-05-08 09:04:07 +00:00
Martin Storsjo 4021cee996 [llvm-rc] Don't strictly require quotes around external file names
Regardless of what docs may say, existing resource files in the
wild can use this syntax.

Rename a file used in an existing test, to make it usable for unquoted
paths.

Differential Revision: https://reviews.llvm.org/D46511

llvm-svn: 331747
2018-05-08 08:47:37 +00:00
Gabor Buella 4a02bf945e [x86] Introduce the enclv instruction
Summary:
and use the -msgx flag as a requirement
for the SGX instructions.

Reviewers: craig.topper, zvi

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D46436

llvm-svn: 331742
2018-05-08 07:11:05 +00:00
Bjorn Pettersson 51cebc98f3 [LCSSA] Do not remove used PHI nodes in formLCSSAForInstructions
Summary:
In formLCSSAForInstructions we speculatively add new PHI
nodes, that sometimes ends up without having any uses. It
has been discovered that sometimes an added PHI node can
appear as being unused in one iteration of the Worklist,
although it can end up being used by a PHI node added in
a later iteration. We now check, a second time, that the
PHI node still is unused before we remove it. This avoids
an assert about "Trying to remove a phi with uses." for the
added test case.

Reviewers: davide, mzolotukhin, mattd, dberlin

Reviewed By: mzolotukhin, dberlin

Subscribers: dberlin, mzolotukhin, davide, bjope, uabelho, llvm-commits

Differential Revision: https://reviews.llvm.org/D46422

llvm-svn: 331741
2018-05-08 06:59:47 +00:00
Gabor Buella 2b5e96004b [x86] Introduce the pconfig instruction
Reviewers: craig.topper, zvi

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D46430

llvm-svn: 331739
2018-05-08 06:47:36 +00:00
Roman Tereshin d2421f9445 [MachineVerifier][GlobalISel] Verifying generic extends and truncates
Making sure we don't truncate / extend pointers, don't try to change
vector topology or bitcast vectors to scalars or back, and most
importantly, don't extend to a smaller type or truncate to a large
one.

Reviewers: qcolombet t.p.northover aditya_nandakumar

Reviewed By: qcolombet

Subscribers: rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D46490

llvm-svn: 331718
2018-05-08 02:48:15 +00:00
Roman Tereshin 5e51fac39a [MIRParser][GlobalISel] Parsing vector pointer types (<M x pA>)
MIParser wasn't able to parse LLTs like `<4 x p0>`, fixing that.

Reviewers: qcolombet t.p.northover aditya_nandakumar

Reviewed By: qcolombet

Subscribers: rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D46490

llvm-svn: 331712
2018-05-08 02:02:50 +00:00
Teresa Johnson 59da890c96 [NewPM] Emit inliner NoDefinition missed optimization remark
Summary: Makes this consistent with the old PM.

Reviewers: eraman

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D46526

llvm-svn: 331709
2018-05-08 01:45:46 +00:00
Roman Tereshin 10bc5ad622 Follow Up on [MachineVerifier][GlobalISel] NFC, Improving MO printing and refactoring visitMachineInstrBefore
Fixing accidentally broken CodeGen/X86/verifier-generic-types-1.mir test

llvm-svn: 331695
2018-05-07 23:14:00 +00:00
Roman Tereshin d29fc89222 [MachineVerifier][GlobalISel] Checking that generic instrs have LLTs on all vregs
Every generic machine instruction must have generic virtual registers
only, that is, have a low-level type attached to each operand.

Previously MachineVerifier would catch a type missing on an operand
only if the previous operand for the the same type index exists and
have a type attached to it and it will report it as a type mismatch.
This is incosistent behaviour and a misleading error message.

This commit makes sure MachineVerifier explicitly checks that the
types are there for every operand and if not provides a
straightforward error message.

Reviewers: qcolombet t.p.northover bogner ab

Reviewed By: qcolombet

Subscribers: rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D46455

llvm-svn: 331694
2018-05-07 22:31:47 +00:00
Roman Tereshin f487edae49 [MachineVerifier][GlobalISel] NFC, Improving MO printing and refactoring visitMachineInstrBefore
This is an NFC pre-commit for the following "Checking that generic
instrs have LLTs on all vregs" commit.

This overloads MachineOperand::print to make it possible to print LLTs
with standalone machine operands.

This also overloads MachineVerifier::print(...MachineOperand...) with
an optional LLT using the newly introduced MachineOperand::print
variant; no actual calls added.

This also refactors MachineVerifier::visitMachineInstrBefore in the
parts dealing with all generic instructions (checking Selected
property, LLTs, and phys regs).

llvm-svn: 331693
2018-05-07 22:31:12 +00:00
Alexander Shaposhnikov a5be91f7d2 [tools] Add missing test dependency
Caught by the build bots.

llvm-svn: 331687
2018-05-07 22:00:59 +00:00
Roman Lebedev 9bd6067db6 [DAGCombiner] Masked merge: enhance handling of 'andn' with immediates
Summary:
Split off from D46031.

The previous patch, D46493, completely disabled unfolding in case of immediates.
But we can do better:
{F6120274} {F6120277}

https://rise4fun.com/Alive/xJS

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: andreadb, llvm-commits

Differential Revision: https://reviews.llvm.org/D46494

llvm-svn: 331685
2018-05-07 21:52:22 +00:00
Roman Lebedev cc42d08b1d [DagCombiner] Not all 'andn''s work with immediates.
Summary:
Split off from D46031.

In masked merge case, this degrades IPC by decreasing instruction count.
{F6108777}
The next patch should be able to recover and improve this.

This also affects the transform @spatel have added in D27489 / rL289738,
and the test coverage for X86 was missing.
But after i have added it, and looked at the changes in MCA, i'm somewhat confused.
{F6093591} {F6093592} {F6093593}
I'd say this regression is an improvement, since `IPC` increased in that case?

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: andreadb, llvm-commits, spatel

Differential Revision: https://reviews.llvm.org/D46493

llvm-svn: 331684
2018-05-07 21:52:11 +00:00
Dmitry Mikulin 738bac77c1 Remove explicit setting of the CFI jumptable section name, it does not appear
to be needed: jump table sections are created with .cfi.jumptable suffix. With
this change each jump table is placed in a separate section, which allows the
linker to re-order them.

Differential Revision: https://reviews.llvm.org/D46537

llvm-svn: 331680
2018-05-07 21:30:15 +00:00
Simon Pilgrim 061096d2c2 [llvm-mca][x86] Remove addsubpd from SSE2 tests
llvm-svn: 331678
2018-05-07 21:10:48 +00:00
Alexander Shaposhnikov f53d9abd7e [tools] Adjust the lit config for llvm-strip
Caught by the build bots.

llvm-svn: 331676
2018-05-07 21:07:01 +00:00
Simon Pilgrim 1233e1234a [X86] Split WriteFAdd/WriteFCmp/WriteFMul schedule classes
Split to support single/double for scalar, XMM and YMM/ZMM instructions - removing InstrRW overrides for these instructions.

Fixes Atom ADDSUBPD instruction and reclassifies VFPCLASS as WriteFCmp which is closer in behaviour.

llvm-svn: 331672
2018-05-07 20:52:53 +00:00
Martin Storsjo 577b981748 [llvm-rc] Implement the BITMAP resource type
Differential Revision: https://reviews.llvm.org/D46509

llvm-svn: 331670
2018-05-07 20:27:37 +00:00
Martin Storsjo 9410276cf7 [llvm-rc] Allow optional commas between the string table index and value
This form is even used in one of the examples at
https://msdn.microsoft.com/en-us/library/windows/desktop/aa381050(v=vs.85).aspx.

Differential Revision: https://reviews.llvm.org/D46508

llvm-svn: 331669
2018-05-07 20:27:28 +00:00
Martin Storsjo 28ae894a1d [llvm-rc] Exclude padding from sizes in versioninfo resources
Normally when writing something that requires padding, we first
measure the length of the written payload data, then write
padding if necessary.

For a recursive structure like versioninfo, this means that the
padding is excluded from the size of the inner element, but
included in the size of the enclosing block.

Rc.exe excludes the final padding (but not the padding of earlier
children) from all levels of the hierarchy.

To achieve this, don't pad after each block or value, but only
before starting the next one. We still pad after completing the
toplevel versioninfo resource, so this won't affect other resource
types.

Differential Revision: https://reviews.llvm.org/D46510

llvm-svn: 331668
2018-05-07 20:27:23 +00:00
Aaron Smith 47589e09dd [SelectionDAG] Transfer DbgValues when casts are optimized in SelectionDAG::getNode
Summary:
getNode optimizes (ext (trunc x)) to x and the dbgvalue node on trunc is lost. The fix calls transferDbgValues to add the dbgvalue to x.

Add DebugInfo/AArch64/dbg-value-i16.ll

Patch by Sejong Oh!

Reviewers: aprantl, javed.absar, llvm-commits, vsk

Reviewed By: aprantl, vsk

Subscribers: kristof.beyls, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D46348

llvm-svn: 331665
2018-05-07 20:15:50 +00:00
Alexander Shaposhnikov cca6998504 [tools] Introduce llvm-strip
llvm-strip is supposed to be a drop-in replacement for binutils strip.
To start the ball rolling this diff adds the initial bits for llvm-strip,
more features will be added incrementally over time.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D46407

llvm-svn: 331663
2018-05-07 19:32:09 +00:00
Simon Pilgrim e480ed0b9f [X86][AVX2] Tag VPMOVSX/VPMOVZX ymm instructions as WriteShuffle256
These are more like cross-lane shuffles than regular shuffles - we already do this for AVX512 equivalents.

Differential Revision: https://reviews.llvm.org/D46229

llvm-svn: 331659
2018-05-07 18:25:19 +00:00
Krzysztof Parzyszek 786fc3d079 [Hexagon] Move clamping of extended operands directly to MC code emitter
llvm-svn: 331653
2018-05-07 17:34:23 +00:00
Roman Lebedev 0412c90236 [DAGCombine][NFC] Masked merge unfolding: comment: some tests are non-canonical
As requested in https://reviews.llvm.org/D46494#inline-407282

llvm-svn: 331650
2018-05-07 16:42:47 +00:00
Simon Pilgrim 763bf12085 [X86][Znver1] Remove WriteFMul/WriteFRcp InstRW overrides/aliases.
Fixes x87 schedules to more closely match Agner - AMD doesn't tend to "special case" x87 instructions as much as Intel.

llvm-svn: 331645
2018-05-07 16:34:26 +00:00
Simon Pilgrim ac5d0a31ef [X86] Split WriteFDiv schedule classes to support single/double scalar, XMM and YMM/ZMM instructions.
This removes all InstrRW overrides for these instructions - some x87 overrides remain but most use default (and realistic) values.

llvm-svn: 331643
2018-05-07 16:15:46 +00:00
Mark Searles 4a0f2c5047 [AMDGPU][Waitcnt] Remove the old waitcnt pass
Remove the old waitcnt pass ( si-insert-waits ), which is no longer maintained
and getting crufty

Differential Revision: https://reviews.llvm.org/D46448

llvm-svn: 331641
2018-05-07 14:43:28 +00:00
Petar Jovanovic cc4915701c Add option -verify-cfiinstrs to run verifier in CFIInstrInserter
Instead of enabling it for non NDEBUG builds, use -verify-cfiinstrs to
run verifier in CFIInstrInserter. It defaults to false.

Differential Revision: https://reviews.llvm.org/D46444

llvm-svn: 331635
2018-05-07 14:09:33 +00:00
Tim Renouf 18a1e9d03a [AMDGPU] Don't force WQM for DS op
Summary:
Previously, all DS ops forced WQM in a pixel shader. That was a hack to
allow for graphics frontends using ds_swizzle to implement explicit
derivatives, on SI/CI at least where DPP is not available. But it forced
WQM for _any_ DS op.

With this commit, DS ops no longer force WQM. Both graphics frontends
(Mesa and LLPC) need to change to issue an explicit llvm.amdgcn.wqm
intrinsic call when calculating explicit derivatives.

The required Mesa change is: "amd/common: use llvm.amdgcn.wqm for
explicit derivatives".

Subscribers: qcolombet, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D46051

Change-Id: I9b745b626fa91bbd66456e6cf41ee07eeea42f81
llvm-svn: 331633
2018-05-07 13:21:26 +00:00
Simon Pilgrim f3ae50fca2 [X86] Split WriteFRcp/WriteFRsqrt/WriteFSqrt schedule classes
WriteFRcp/WriteFRsqrt are split to support scalar, XMM and YMM/ZMM instructions.

WriteFSqrt is split into single/double/long-double sizes and scalar, XMM, YMM and ZMM instructions.

This removes all InstrRW overrides for these instructions.

NOTE: There were a couple of typos in the Znver1 model - notably a 1cy throughput for SQRT that is highly unlikely and doesn't tally with Agner.

NOTE: I had to add Agner's numbers for several targets for WriteFSqrt80.
llvm-svn: 331629
2018-05-07 11:50:44 +00:00
Jonas Paulsson ebb1605bf3 [SystemZ] Bugfix for MVCLoop CC clobbering.
MVCLoop clobbers CC (since it emits a compare/branch), but this was not
modelled.

Review: Ulrich Weigand
llvm-svn: 331627
2018-05-07 10:48:43 +00:00
Roman Lebedev ffec58bce6 [InstCombine][NFC] Add tests for one more masked merge pattern.
This pattern came up in D46494.
I'm pretty sure we want to canonicalize it from
	(x | ~m) & (y &  m)
to
	(x &  m) | (y & ~m)

https://rise4fun.com/Alive/TEM

llvm-svn: 331625
2018-05-07 09:42:45 +00:00
Craig Topper cb2abc7977 [X86] Enable reciprocal estimates for v16f32 vectors by using VRCP14PS/VRSQRT14PS
Summary:
The legacy VRCPPS/VRSQRTPS instructions aren't available in 512-bit versions. The new increased precision versions are. So we can use those to implement v16f32 reciprocal estimates.

For KNL CPUs we can probably use VRCP28PS/VRSQRT28PS and avoid the NR step altogether, but I leave that for a future patch.

Reviewers: spatel

Reviewed By: spatel

Subscribers: RKSimon, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D46498

llvm-svn: 331606
2018-05-06 17:48:21 +00:00
Craig Topper b02e3dec4b [X86] Add test cases for reciprocal estimation for v16f32 vectors with AVX512F.
We should be able to use the vrsqrt14ps and vrcp14ps instructions for these cases.

llvm-svn: 331605
2018-05-06 17:45:40 +00:00
Amaury Sechet 0ea2936104 Add test cases for large integer legalization of add and sub. NFC
llvm-svn: 331604
2018-05-06 16:00:23 +00:00
Daniel Sanders acc008cb0c [globalisel] Remove redundant -global-isel option from tests that use -run-pass. NFC
As Roman Tereshin pointed out in https://reviews.llvm.org/D45541, the
-global-isel option is redundant when -run-pass is given. -global-isel sets up
the GlobalISel passes in the pass manager but -run-pass skips that entirely and
configures it's own pipeline.

llvm-svn: 331603
2018-05-05 21:19:59 +00:00
Daniel Sanders f84bc3793e [globalisel] Update GlobalISel emitter to match new representation of extending loads
Summary:
Previously, a extending load was represented at (G_*EXT (G_LOAD x)).
This had a few drawbacks:
* G_LOAD had to be legal for all sizes you could extend from, even if
  registers didn't naturally hold those sizes.
* All sizes you could extend from had to be allocatable just in case the
  extend went missing (e.g. by optimization).
* At minimum, G_*EXT and G_TRUNC had to be legal for these sizes. As we
  improve optimization of extends and truncates, this legality requirement
  would spread without considerable care w.r.t when certain combines were
  permitted.
* The SelectionDAG importer required some ugly and fragile pattern
  rewriting to translate patterns into this style.

This patch changes the representation to:
* (G_[SZ]EXTLOAD x)
* (G_LOAD x) any-extends when MMO.getSize() * 8 < ResultTy.getSizeInBits()
which resolves these issues by allowing targets to work entirely in their
native register sizes, and by having a more direct translation from
SelectionDAG patterns.

Each extending load can be lowered by the legalizer into separate extends
and loads, however a target that supports s1 will need the any-extending
load to extend to at least s8 since LLVM does not represent memory accesses
smaller than 8 bit. The legalizer can widenScalar G_LOAD into an
any-extending load but sign/zero-extending loads need help from something
else like a combiner pass. A follow-up patch that adds combiner helpers for
for this will follow.

The new representation requires that the MMO correctly reflect the memory
access so this has been corrected in a couple tests. I've also moved the
extending loads to their own tests since they are (mostly) separate opcodes
now. Additionally, the re-write appears to have invalidated two tests from
select-with-no-legality-check.mir since the matcher table no longer contains
loads that result in s1's and they aren't legal in AArch64 anymore.

Depends on D45540

Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, javed.absar

Reviewed By: rtereshin

Subscribers: javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D45541

llvm-svn: 331601
2018-05-05 20:53:24 +00:00
Heejin Ahn c86da6bcfe [MIRPraser] Improve error checking for typed immediate operands
Summary:
This improves error checks for typed immediate operands introduced in
D45948 (rL331586), and removes a code block copied by mistake.

Reviewers: rtereshin

Subscribers: dschuff, sbc100, jgravelle-google, llvm-commits

Differential Revision: https://reviews.llvm.org/D46491

llvm-svn: 331600
2018-05-05 20:53:23 +00:00
Roman Lebedev a3b0b59f54 [DAGCombiner] Masked merge: don't touch "not" xor's.
Summary:
Split off form D46031.

It seems we don't want to transform the pattern if the `xor`'s are actually `not`'s.
In vector case, this breaks `andnpd` / `vandnps` patterns.

That being said, we may want to re-visit this `not` handling, maybe in D46073.

Reviewers: spatel, craig.topper, javed.absar

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D46492

llvm-svn: 331595
2018-05-05 15:45:40 +00:00
Piotr Padlewski e9832dfdf3 [CaptureTracking] Handle capturing of launder.invariant.group
Summary:
launder.invariant.group has the same rules of capturing as
bitcast, gep, etc - the original value is not captured
if the returned pointer is not captured.

With this patch, we mark 40% more functions as noalias when compiling with -fstrict-vtable-pointers;
1078 vs 1778  (39.37%)

Reviewers: sanjoy, davide, nlewycky, majnemer, mehdi_amini

Subscribers: JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D32673

llvm-svn: 331587
2018-05-05 10:23:27 +00:00
Heejin Ahn c2ad096845 [MIRParser] Allow register class names in the form of integer/scalar
Summary:
The current code cannot handle register class names like 'i32', which is
a valid register class name in WebAssembly. This patch removes special
handling for integer/scalar/pointer type parsing and treats them as
normal identifiers.

Reviewers: thegameg

Subscribers: jfb, dschuff, sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D45948

llvm-svn: 331586
2018-05-05 07:05:51 +00:00
Teresa Johnson b77ab0966e [LTO] Allow pass remarks with hotness to be set when emitting to stderr
Summary:
Set setDiagnosticsHotnessRequested before the early exit check for a
diagnostic output file, so that pass remarks with hotness works when
emitting pass remarks to stderr (e.g. via -pass-remarks=.).

Also fix the llvm-lto2 diagnistic handler so that it only calls exit(1)
when the diagnistic is an error type. Otherwise the new test invocation
of llvm-lto2 with -pass-remarks causes it to fail. The new code is
consistent with the diagnostic handler elsewhere (e.g. on the
LLVMContext).

Reviewers: pcc, davide

Subscribers: fhahn, mehdi_amini, llvm-commits, inglorion

Differential Revision: https://reviews.llvm.org/D46387

llvm-svn: 331569
2018-05-04 23:59:34 +00:00
Michael Berg 2dcf12ffd4 Mapping SDNode flags to MachineInstr flags
Summary: Providing the glue to map SDNode fast math sub flags to MachineInstr fast math sub flags.

Reviewers: spatel, arsenm, wristow

Reviewed By: spatel

Subscribers: wdng

Differential Revision: https://reviews.llvm.org/D46447

llvm-svn: 331567
2018-05-04 23:41:15 +00:00
Philip Reames 5b39acd111 [LICM] Compute a must execute property for the prefix of the header as we go
Computing this property within the existing walk ensures that the cost is linear with the size of the block. If we did this from within isGuaranteedToExecute, it would be quadratic without some very fancy caching.

This allows us to reliably catch a hoistable instruction within a header which may throw at some point *after* our hoistable instruction. It doesn't do anything for non-header cases, but given how common single block loops are, this seems very worthwhile.

llvm-svn: 331557
2018-05-04 21:35:00 +00:00
Konstantin Zhuravlyov c2c2eb7d01 AMDGPU: Add D16 instructions preserve unused bits feature
- Predicate D16 patterns on this new feature
- Added this new feature to gfx900/2/4

Differential Revision: https://reviews.llvm.org/D46366

llvm-svn: 331551
2018-05-04 20:06:57 +00:00
Geoff Berry 8e4958e760 [MachineLICM] Debug intrinsics shouldn't affect hoist decisions
Summary:
When checking if an instruction stores to a given frame index, check
that the instruction can write to memory before looking at the memory
operands list to avoid e.g. DBG_VALUE instructions that reference a
frame index preventing a load from that index from being hoisted.

Reviewers: dblaikie, MatzeB, qcolombet, reames, javed.absar

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D46284

llvm-svn: 331549
2018-05-04 19:25:09 +00:00
Shoaib Meenai 57fadab1cb [ObjCARC] Account for catchswitch in bitcast insertion
A catchswitch is both a pad and a terminator, meaning it must be the
only non-phi instruction in its basic block. When we're inserting a
bitcast in the incoming basic block for a phi, if that incoming block is
a catchswitch, we should go up the dominator tree to find a valid
insertion point rather than attempting to insert before the catchswitch
(which would result in invalid IR).

Differential Revision: https://reviews.llvm.org/D46412

llvm-svn: 331548
2018-05-04 19:03:11 +00:00
Michael Berg 7acc81b744 Fast Math Flag mapping into SDNode
Summary: Adding support for Fast flags in the SDNode to leverage fast math sub flag usage.

Reviewers: spatel, arsenm, jbhateja, hfinkel, escha, qcolombet, echristo, wristow, javed.absar

Reviewed By: spatel

Subscribers: llvm-commits, rampitec, nhaehnle, tstellar, FarhanaAleen, nemanjai, javed.absar, jbhateja, hfinkel, wdng

Differential Revision: https://reviews.llvm.org/D45710

llvm-svn: 331547
2018-05-04 18:48:20 +00:00
Simon Pilgrim 0e51a125ea [X86] Add WriteEMMS scheduler class
Filled in the missing values from Btver2 SoG or Agner

llvm-svn: 331546
2018-05-04 18:16:13 +00:00
Simon Pilgrim d7ffbc5c7e [X86] Finish splitting WriteVecShift and WriteVecIMul to remove InstRW overrides.
llvm-svn: 331543
2018-05-04 17:47:46 +00:00
Adhemerval Zanella 6d56294c7f [AArch64] Add missing testcase for r331522
llvm-svn: 331541
2018-05-04 17:21:26 +00:00
Peter Collingbourne 9096413f9f obj2yaml: Correctly round-trip default alignment.
Previously we were emitting the "cooked" alignment, which made it hard
to distinguish between that and the default alignment.

Differential Revision: https://reviews.llvm.org/D46418

llvm-svn: 331537
2018-05-04 16:28:41 +00:00
Adrian Prantl 3edc63a579 DwarfCompileUnit: Fix another assertion failure on malformed input
that is not rejected by the Verifier.

Thanks to Björn Pettersson for providing a reproducer!

llvm-svn: 331535
2018-05-04 16:10:43 +00:00
Krzysztof Parzyszek effcc2fb79 [Hexagon] Handle non-immediate constants in HexagonSplitDouble
llvm-svn: 331527
2018-05-04 15:04:48 +00:00
Simon Dardis 65b0492f0d [mips] Correct the predicates of sign extension instructions
And eliminatw the duplication of those instructions for microMIPS32r6.

Reviewers: smaksimovic, abeserminji, atanasyan

Differential Revision: https://reviews.llvm.org/D46117

llvm-svn: 331526
2018-05-04 15:00:54 +00:00
Adhemerval Zanella a57ef17ab6 [AArch64] Custom Lower MULLH{S,U} for v16i8, v8i16, and v4i32
This patch adds a custom lowering for ISD::MULH{S,U} used on divide by
constant optimization (DAGCombiner::BuildSDIV and DAGCombiner::BuildUDIV).

New patterns for smull and umull are added, so AArch64ISD::{S,U}MULL
can be correctly lowered to smull2 and umull2.

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D46009

llvm-svn: 331522
2018-05-04 14:33:55 +00:00
Krzysztof Parzyszek af73d2bdd9 [Hexagon] Skip reserved physical registers when updating liveness
llvm-svn: 331518
2018-05-04 13:59:05 +00:00
Simon Pilgrim be51b20127 [X86] Add SchedWriteFRnd fp rounding scheduler classes
Split off from SchedWriteFAdd for fp rounding/bit-manipulation instructions.

Fixes an issue on btver2 which only had the ymm version using the JSTC pipe instead of JFPA.

llvm-svn: 331515
2018-05-04 12:59:24 +00:00
Jeremy Morse 07e8daa66b [X86] Add test case for PR30290s failing behaviour
Following the advice in review D45022, this currently tests for the broken llc
output where an instruction is mis-scheduled. This test is committed in advance
to improve the eventual fixing patch in D45022, making the bad behaviour that
that patch fixes clearer.

llvm-svn: 331514
2018-05-04 10:05:10 +00:00
Jeremy Morse 71f17bf855 Word wrap a test-file comment to 80 columns
This is a test commit to check whether my account works.

llvm-svn: 331512
2018-05-04 08:58:06 +00:00
Jonas Paulsson 72fe760592 [RegUsageInfoCollector] Bugfix for handling of register aliases.
Don't assume the alias of a defined reg is always already in the set.

As the test case in https://bugs.llvm.org/show_bug.cgi?id=36587 discovered,
it is wrong to assume that all the aliases of the defined register in the
*current function* is already present in the UsedPhysRegsMask.

This patch changes this so that any definition in the current function of a
phys-reg always results in all its aliases inserted into the set of defined
registers.

Review: Quentin Colombet
https://reviews.llvm.org/D45157

llvm-svn: 331509
2018-05-04 07:50:05 +00:00
Craig Topper a3f39ee33d [LoopIdiomRecognize] Add a test case to show incorrect transformation of an infinite loop with side effets into a countable loop using ctlz.
We currently recognize this idiom where x is signed and thus the shift in an ashr.

int cnt = 0;
while (x) {
  x >>= 1; // arithmetic shift right
  ++cnt;
}

and turn it into (bitwidth - ctlz(x)). And if there is anything else in the loop we will create a new loop that runs that many times.

If x is initially negative, the shift result will never be 0 and thus the loop is infinite. If you put something with side effects in the loop, that side effect will now only happen bitwidth times instead of an infinite number of times.

So this transform is only safe for logical shift right (which we don't currently recognize) or if we can prove that x cannot be negative before the loop.

llvm-svn: 331493
2018-05-03 23:50:29 +00:00
Simon Pilgrim 0aed731516 [X86][Znver1] Use SchedAlias to tag microcoded scheduler classes
Avoids extra entries in the class tables.

Found a typo that missed the MMX_PHSUBSW instruction.

llvm-svn: 331488
2018-05-03 22:12:23 +00:00
Sanjay Patel e7b6654711 [InstCombine] refine select-of-constants to bitwise ops
Add logic for the special case when a cmp+select can clearly be
reduced to just a bitwise logic instruction, and remove an 
over-reaching chunk of general purpose bit magic. The primary goal 
is to remove cases where we are not improving the IR instruction 
count when doing these select transforms, and in all cases here that 
is true.

In the motivating 3-way compare tests, there are further improvements
because we can combine/propagate select values (not sure if that
belongs in instcombine, but it's there for now).

DAGCombiner has folds to turn some of these selects into bit magic,
so there should be no difference in the end result in those cases.
Not all constant combinations are handled there yet, however, so it
is possible that some targets will see more cmov/csel codegen with
this change in IR canonicalization. 

Ideally, we'll go further to *not* turn selects into multiple 
logic/math ops in instcombine, and we'll canonicalize to selects.
But we should make sure that this step does not result in regressions
first (and if it does, we should fix those in the backend).

The general direction for this change was discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2016-September/105373.html
http://lists.llvm.org/pipermail/llvm-dev/2017-July/114885.html

Alive proofs for the new bit magic:
https://rise4fun.com/Alive/XG7

Differential Revision: https://reviews.llvm.org/D46086

llvm-svn: 331486
2018-05-03 21:58:44 +00:00
Teresa Johnson 85cc298c1a [ThinLTO] Add support for optimization remarks to thinBackend
Summary:
Support was added to the regular LTO backend, but not thinBackend.
This patch adds that support.

Reviewers: pcc, davide

Subscribers: mehdi_amini, inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D46376

llvm-svn: 331481
2018-05-03 20:24:12 +00:00
Sanjay Patel 52151885e4 [PowerPC] add more FMF debug output; NFC
We can't see all of the problems currently unless
we look at debug output when the global 'unsafe' is
on. It's a mess. This is another attempt to make
sure that D45710 is not making changes unintentionally.

llvm-svn: 331476
2018-05-03 18:49:35 +00:00
Simon Pilgrim f2d2cedab4 [X86] Split WriteVecShift/WriteVarVecShift into MMX, XMM and YMM/ZMM scheduler classes
This took a bit of extra work as on Intel targets the old (V)PSLLDrr/(V)PSLLDrm style instructions act differently - I ended up creating WriteVecShiftImm classes for XMM/YMM/ZMM vector shift by immediate and retaining WriteVecShift as the default (used only by MMX) plus WriteVecShiftX/WriteVecShiftY. X86SchedWriteWidths hides most of this thank goodness.

llvm-svn: 331472
2018-05-03 17:56:43 +00:00
Sanjay Patel e7532d2940 [PowerPC] add tests for FMF propagation; NFC
I'm choosing PPC out of convenience because it does
all of the transforms of interest in these tests by
default. There are multiple FMF problems shown in the 
current checks. D45710 is proposing to fix part of 
that.

llvm-svn: 331471
2018-05-03 17:41:37 +00:00
Bjorn Pettersson 304877e5ec Reapply "[SelectionDAG] Selection of DBG_VALUE using a PHI node result (pt 2)"
Summary:
This reverts SVN r331441 (reapplies r331337), together with a fix
in to handle an already existing fragment expression in the
dbg.value that must be fragmented due to a split PHI node.

This should solve the problem seen in PR37321, which was the
reason for the revert of r331337.

The situation in PR37321 is that we have a PHI node like this

   %u.sroa = phi i80 [ %u.sroa.x, %if.x ],
                     [ %u.sroa.y, %if.y ],
                     [ %u.sroa.z, %if.z ]

and a dbg.value like this

  call void @llvm.dbg.value(metadata i80 %u.sroa,
                            metadata !13,
                            metadata !DIExpression(DW_OP_LLVM_fragment, 0, 80))

The phi node is split into three 32-bit PHI nodes

  %30:gr32 = PHI %11:gr32, %bb.4, %14:gr32, %bb.5, %27:gr32, %bb.8
  %31:gr32 = PHI %12:gr32, %bb.4, %15:gr32, %bb.5, %28:gr32, %bb.8
  %32:gr32 = PHI %13:gr32, %bb.4, %16:gr32, %bb.5, %29:gr32, %bb.8

but since the original value only is 80 bits we need to adjust the size
of the last fragment expression, and with this patch we get

  DBG_VALUE debug-use %30:gr32, debug-use $noreg, !"u", !DIExpression(DW_OP_LLVM_fragment, 0, 32)
  DBG_VALUE debug-use %31:gr32, debug-use $noreg, !"u", !DIExpression(DW_OP_LLVM_fragment, 32, 32)
  DBG_VALUE debug-use %32:gr32, debug-use $noreg, !"u", !DIExpression(DW_OP_LLVM_fragment, 64, 16)

Reviewers: vsk, aprantl, mstorsjo

Reviewed By: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D46384

llvm-svn: 331464
2018-05-03 17:04:16 +00:00
Roman Lebedev 63ac19365a [CodeGen][X86][NFC] Copy two selectcc tests from AArch64.
These tests are for DAGCombiner::foldSelectCCToShiftAnd().
Right now, they were only tested for AArch64,
but given the upcoming X86 changes to the hasAndNot(),
the test coverage needs to be added.

These tests originated from D27489 / rL289738

llvm-svn: 331454
2018-05-03 13:33:07 +00:00
Simon Pilgrim f7dd6069a5 [X86] Split WriteVecALU/WritePHAdd into XMM and YMM/ZMM scheduler classes
llvm-svn: 331453
2018-05-03 13:27:10 +00:00
Tim Northover 28e0a6f7dd ARM: don't try to over-align large vectors as arguments.
By default LLVM thinks very large vectors get aligned to their size when
passed across functions. Unfortunately no-one told the ARM backend so it
doesn't trigger stack realignment and so accesses can cause the usual
misalignment issues (e.g. a data abort).

This changes the ABI alignment to the stack alignment, which in practice
(and as a bonus) also coincides with the alignment "natural" vectors get.

llvm-svn: 331451
2018-05-03 12:54:25 +00:00
Piotr Padlewski c77ab8ef2f perform DSE through launder.invariant.group
Summary:
Alias Analysis knows that llvm.launder.invariant.group
returns pointer that mustalias argument, but this information
wasn't used, therefor we didn't DSE through launder.invariant.group

Reviewers: chandlerc, dberlin, bogner, hfinkel, efriedma

Reviewed By: dberlin

Subscribers: amharc, llvm-commits, nlewycky, rsmith

Differential Revision: https://reviews.llvm.org/D31581

llvm-svn: 331449
2018-05-03 11:03:53 +00:00
Piotr Padlewski 5dde809404 Rename invariant.group.barrier to launder.invariant.group
Summary:
This is one of the initial commit of "RFC: Devirtualization v2" proposal:
https://docs.google.com/document/d/16GVtCpzK8sIHNc2qZz6RN8amICNBtvjWUod2SujZVEo/edit?usp=sharing

Reviewers: rsmith, amharc, kuhar, sanjoy

Subscribers: arsenm, nhaehnle, javed.absar, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D45111

llvm-svn: 331448
2018-05-03 11:03:01 +00:00
Simon Pilgrim 93c878c76b [X86] Split WriteVecIMul/WriteVecPMULLD/WriteMPSAD/WritePSADBW into XMM and YMM/ZMM scheduler classes
Also retagged VDBPSADBW instructions as SchedWritePSADBW instead of SchedWriteVecIMul which matches the behaviour on SkylakeServer (the only thing that supports it...)

llvm-svn: 331445
2018-05-03 10:31:20 +00:00
Benjamin Kramer 281919f53f [WebAssembly] MC: Don't litter test directory.
llvm-svn: 331442
2018-05-03 08:25:14 +00:00
Martin Storsjo 67fdea490d Revert "[SelectionDAG] Selection of DBG_VALUE using a PHI node result (pt 2)"
This reverts SVN r331337, see PR37321 for details on the regression
it introduced.

llvm-svn: 331441
2018-05-03 07:09:33 +00:00
Craig Topper 856fd68690 [LoopIdiomRecognize] When looking for 'x & (x -1)' for popcnt, make sure the left hand side of the 'and' matches the left hand side of the 'subtract'
llvm-svn: 331437
2018-05-03 05:48:49 +00:00
Craig Topper a0cba89f86 [LoopIdiomRecognize] Add a test case showing that we transform to ctpop without fully checking the 'x & (x-1)' part.
The code fails to check that the same value is used twice. We only make sure the left hand side of the and is part of the loop recurrence. The 'x' in the subtract can be any value.

llvm-svn: 331436
2018-05-03 05:48:48 +00:00
Max Kazantsev 58fce7e54b Re-enable "[SCEV] Make computeExitLimit more simple and more powerful"
This patch was temporarily reverted because it has exposed bug 37229 on
PowerPC platform. The bug is unrelated to the patch and was just a general
bug in the optimization done for PowerPC platform only. The bug was fixed
by the patch rL331410.

This patch returns the disabled commit since the bug was fixed.

llvm-svn: 331427
2018-05-03 02:37:55 +00:00
Michael Berg 7d1b25d053 MachineInst support mapping SDNode fast math flags for support in Back End code generation
Summary:
Machine Instruction flags for fast math support and MIR print support


Reviewers: spatel, arsenm

Reviewed By: arsenm

Subscribers: wdng

Differential Revision: https://reviews.llvm.org/D45781

llvm-svn: 331417
2018-05-03 00:07:56 +00:00
Nemanja Ivanovic 01e2e79abf [PowerPC] Implement isMaskAndCmp0FoldingBeneficial
Sinking the and closer to a compare against zero is beneficial on PPC as it
allows us to emit record-form instructions. In the future, we may expand this
to a larger set of operations that feed compares against zero since PPC has
lots of record-form instructions.

Differential revision: https://reviews.llvm.org/D46060

llvm-svn: 331416
2018-05-02 23:55:23 +00:00
Sam Clegg 4d57fbd02a [WebAssembly] MC: Create and use first class section symbols
Differential Revision: https://reviews.llvm.org/D46335

llvm-svn: 331413
2018-05-02 23:11:38 +00:00
Nemanja Ivanovic 2139e99e47 [PowerPC] No CTR loop if the candidate exiting block is in a different loop
The CTR loops pass will insert the decrementing branch instruction in an exiting
block for the loop being transformed. However if that block is part of another
loop as well (whether a nested loop or with irreducible CFG), it is not valid
to use that exiting block. In fact, if the loop hass irreducible CFG, we don't
bother analyzing it and we just bail on the transformation. In practice, this
doesn't lead to a noticeable reduction in the number of loops transformed by
this pass.

Fixes https://bugs.llvm.org/show_bug.cgi?id=37229

Differential Revision: https://reviews.llvm.org/D46162

llvm-svn: 331410
2018-05-02 22:56:04 +00:00
Chandler Carruth 71c3a3fac5 [GCOV] Emit the writeout function as nested loops of global data.
Summary:
Prior to this change, LLVM would in some cases emit *massive* writeout
functions with many 10s of 1000s of function calls in straight-line
code. This is a very wasteful way to represent what are fundamentally
loops and creates a number of scalability issues. Among other things,
register allocating these calls is extremely expensive. While D46127 makes this
less severe, we'll still run into scaling issues with this eventually. If not
in the compile time, just from the code size.

Now the pass builds up global data structures modeling the inputs to
these functions, and simply loops over the data structures calling the
relevant functions with those values. This ensures that the code size is
a fixed and only data size grows with larger amounts of coverage data.

A trivial change to IRBuilder is included to make it easier to build
the constants that make up the global data.

Reviewers: wmi, echristo

Subscribers: sanjoy, mcrosier, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D46357

llvm-svn: 331407
2018-05-02 22:24:39 +00:00
Martin Storsjo e3b437935f [llvm-rc] Default to writing the output next to the input, if no output is specified
This matches what rc.exe does if no output is specified.

Differential Revision: https://reviews.llvm.org/D46239

llvm-svn: 331403
2018-05-02 21:15:24 +00:00
Martin Storsjo ca16978967 [llvm-cvtres] Allow parameters preceded by '-' in addition to '/'
The real cvtres.exe also allows parameters in either form.

Differential Revision: https://reviews.llvm.org/D46358

llvm-svn: 331402
2018-05-02 21:15:13 +00:00
Paul Semel 41695f8e73 [llvm-objcopy] Add --discard-all (-x) option
llvm-svn: 331400
2018-05-02 20:19:22 +00:00
Paul Semel 2c0510f040 [llvm-objcopy] Add --weaken option
llvm-svn: 331397
2018-05-02 20:14:49 +00:00
Roman Tereshin 2df4c22915 [GlobalISel][InstructionSelect] Refactoring out a getMatchTable virtual method + other small NFC's
The main goal is to share getMatchTable between the Instruction
Selector and the Testgen.

The commit also contains some NFC only loosely related to refactoring
out the getMatchTable, but strongly related to the initial Testgen
patch (see https://reviews.llvm.org/D43962)

Reviewers: dsanders, aemerson

Reviewed By: dsanders

Subscribers: rovka, kristof.beyls, llvm-commits, dsanders

Differential Revision: https://reviews.llvm.org/D46096

llvm-svn: 331395
2018-05-02 20:07:15 +00:00
Martin Storsjo d1d046aa32 [llvm-rc] Add rudimentary support for codepages
Only support UTF-8 (since LLVM contains UTF-8 parsing support
already, and the code even does that already) and Windows-1252
(where most code points has the same value in unicode). Keep the
existing default as only allowing ASCII input.

Using the option type JoinedOrSeparate, since the real rc.exe
handles options in this form, even if llvm-rc uses Separate for
other similar existing options.

Rename the struct SearchParams to WriterParams since it's now used
for more than just include paths.

Add a missing getResourceTypeName method to the BundleResource class,
to fix error printing from within STRINGTABLE resources (used in
tests).

Differential Revision: https://reviews.llvm.org/D46238

llvm-svn: 331391
2018-05-02 19:43:44 +00:00
Simon Pilgrim 350c22c587 [X86][SNB] Fix scheduling of MMX integer multiply instructions.
The entries were being bound to the wrong class.

llvm-svn: 331388
2018-05-02 19:26:14 +00:00
Simon Pilgrim 6732f6ea51 [X86] Split WriteShuffle/WriteVarShuffle + WriteBlend/WriteVarBlend into XMM and YMM/ZMM scheduler classes
llvm-svn: 331386
2018-05-02 18:48:23 +00:00
Martin Storsjo d0b5034b8a [COFF, ARM64] Hook up a few remaining relocations
Differential Revision: https://reviews.llvm.org/D46355

llvm-svn: 331384
2018-05-02 18:24:37 +00:00
Farhana Aleen 07e612340f [AMDGPU] A trivial fix for a buildbot failure caused by "commit 224a839fcbbead221f872cd32a1dd0c308d37299".
Author: FarhanaAleen
llvm-svn: 331383
2018-05-02 18:16:39 +00:00
Daniel Sanders 8d0d1aa229 [reassociate] Fix excessive revisits when processing long chains of reassociatable instructions.
Summary:
Some of our internal testing detected a major compile time regression which I've
tracked down to:
    r278938 - Revert "Reassociate: Reprocess RedoInsts after each inst".
It appears that processing long chains of reassociatable instructions causes
non-linear (potentially exponential) growth in the number of times an
instruction is revisited. For example, the included test revisits instructions
220 times in a 20-instruction test.

It appears that r278938 reversed the order instructions were visited and that
this is preventing scheduled revisits from being cancelled as a result of
visiting the instructions naturally during normal processing. However, simply
reversing the order also harmed the generated code. Upon closer inspection, it
was discovered that revisits occurred in the opposite order to the first pass
(Thanks to escha for spotting that).

This patch makes the revisit order consistent with the first pass which allows
more revisits to be cancelled. This does appear to have a small impact on the
generated code in few cases but it significantly reduces compile-time.

After this patch, our internal test that was most affected by the regression
dropped from ~2 million revisits to ~4k resulting in Reassociate having 0.46%
of the runtime it had before (99.54% improvement).

Here's the summaries reported by lnt for the LLVM test-suite with --benchmarking-only:
| metric         | geomean before patch | geomean after patch | delta   |
| -----          | -----                | -----               | -----   |
| compile time   | 0.1956               | 0.1261              | -35.54% |
| execution time | 0.3240               | 0.3237              | -       |
| code size      | 7365.4459            | 7365.6079           | -       |

The results have a few wins and losses on compile-time, mostly in the +/- 2.5% range. There was one outlier though:
| Performance Regressions - compile_time | Δ | Previous | Current |
| MultiSource/Benchmarks/ASC_Sequoia/CrystalMk/CrystalMk | 9.82% | 2.0473 | 2.2483 |

Reviewers: javed.absar, dberlin

Reviewed By: dberlin

Subscribers: kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D45734

llvm-svn: 331381
2018-05-02 17:59:16 +00:00
Simon Pilgrim 819f218f07 [X86] Cleanup WriteFShuffle/WriteFVarShuffle (+256 variants) scheduler classes with more common default values
llvm-svn: 331380
2018-05-02 17:58:50 +00:00
Farhana Aleen 150cb6d91a Revert "[AMDGPU] performAddCombine should run after DAG is legalized."
This reverts commit 6b97d2995566b4dddd6bf0d75579ff44501d4494.

llvm-svn: 331371
2018-05-02 16:48:52 +00:00
Farhana Aleen 2f4100f56e [AMDGPU] performAddCombine should run after DAG is legalized.
Summary: performAddCombine should run after DAG is legalized; Otherwise generic optimization
         in the DAGCombiner can optimize an addcarry+trunc into an addcarry instruction with
         illegal types.

Author: FarhanaAleen

Reviewed By: rampitec

Subscribers: llvm-commits, AMDGPU

Differential Revision: https://reviews.llvm.org/D46337

llvm-svn: 331368
2018-05-02 16:24:10 +00:00
Clement Courbet a1a3095d88 [X86] Fix scheduling info for (V?)SQRTPDm on silvermont.
https://reviews.llvm.org/D46356

llvm-svn: 331356
2018-05-02 13:46:14 +00:00
Sander de Smalen 659a48cd38 [AArch64][SVE] Asm: Support for LDR/STR fill and spill instructions.
Reviewers: fhahn, rengolin, samparker, SjoerdMeijer, javed.absar

Reviewed By: samparker

Differential Revision: https://reviews.llvm.org/D46270

llvm-svn: 331352
2018-05-02 13:32:39 +00:00
Simon Tatham 6a02604ee4 [TableGen] Don't quote variable name when printing !foreach.
An input !foreach expression such as !foreach(a, lst, !add(a, 1))
would be re-emitted by llvm-tblgen -print-records with the first
argument in quotes, giving !foreach("a", lst, !add(a, 1)), which isn't
valid TableGen input syntax.

Reviewers: nhaehnle

Reviewed By: nhaehnle

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D46352

llvm-svn: 331351
2018-05-02 13:17:26 +00:00
Sander de Smalen 57da042e32 [AArch64][SVE] Asm: Support for scatter ST1 store instructions.
Reviewers: fhahn, rengolin, samparker, SjoerdMeijer, javed.absar

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D46248

llvm-svn: 331349
2018-05-02 13:00:30 +00:00
Simon Dardis 694fde215e Revert "[mips] Correct the predicates of sign extension instructions"
I accidently committed this patch after asking for a review, but it has not
been reviewed yet.

This reverts r331346.

llvm-svn: 331348
2018-05-02 12:35:29 +00:00
Simon Dardis 7a36495bf7 [mips] Correct the predicates of sign extension instructions
And eliminate the duplication of those instructions for microMIPS32r6.

llvm-svn: 331346
2018-05-02 12:25:33 +00:00
Sander de Smalen 414d2358a4 [AArch64][SVE] Asm: Support for non-temporal, contiguous LDNT1/STNT1 load/store instructions.
Reviewers: fhahn, rengolin, samparker, SjoerdMeijer, javed.absar

Reviewed By: samparker

Differential Revision: https://reviews.llvm.org/D46269

llvm-svn: 331343
2018-05-02 11:48:49 +00:00
Simon Dardis 6cfc9ba5e3 [mips] Correct the predicates for shifts.
Reviewers: smaksimovic, abeserminji, atanasyan

Differential Revision: https://reviews.llvm.org/D46123

llvm-svn: 331341
2018-05-02 09:55:49 +00:00
Simon Pilgrim e93fd5f1e4 [X86] Cleanup WriteFAdd/WriteFCmp scheduler classes with more common default values
Intel models were targeting x87 instead of packed sse.

Also fixes XOP's VFRCZ to use WriteFAdd/WriteFAddY.

llvm-svn: 331340
2018-05-02 09:18:49 +00:00
Sander de Smalen c1e44bdfc7 [AArch64][SVE] Asm: Support for LD1RQ load-and-replicate quad-word vector instructions.
Reviewers: fhahn, rengolin, samparker, SjoerdMeijer, javed.absar

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D46250

llvm-svn: 331339
2018-05-02 08:49:08 +00:00
Piotr Padlewski f801205e48 Mark invariant.group.barrier as inaccessiblememonly
It turned out that readonly argmemonly is not enough.

  store 42, %p
  %b = barrier(%p)
  store 43, %b
the first store is dead, but because barrier was marked as
reading argument memory, it was considered alive. With
inaccessiblememonly it doesn't read the argument, but
it also can't be CSEd.

based on: https://reviews.llvm.org/D32006

llvm-svn: 331338
2018-05-02 08:22:07 +00:00
Bjorn Pettersson 1c5a05f32c [SelectionDAG] Selection of DBG_VALUE using a PHI node result (pt 2)
Summary:
This is a follow up to rL331182. A PHI node can be split up into
several MIR PHI nodes when being selected. When there is a
dbg.value intrinsic that uses the result of such a PHI node we
need to select several DBG_VALUE instructions, with fragment
expressions, in order to do a correct selection.

Reviewers: rnk, aprantl, vsk

Reviewed By: vsk

Subscribers: mattd, llvm-commits, JDevlieghere, aprantl, gbedwell, rnk

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D46329

llvm-svn: 331337
2018-05-02 06:56:38 +00:00
Farhana Aleen e2dfe8a853 [AMDGPU] Support horizontal vectorization.
Author: FarhanaAleen

Reviewed By: rampitec, arsenm

Subscribers: llvm-commits, AMDGPU

Differential Revision: https://reviews.llvm.org/D46213

llvm-svn: 331313
2018-05-01 21:41:12 +00:00
Sanjay Patel d2025a2e31 [AggressiveInstCombine] convert a chain of 'or-shift' bits into masked compare
and (or (lshr X, C), ...), 1 --> (X & C') != 0

I initially thought about implementing the minimal pattern in instcombine as mentioned here:
https://bugs.llvm.org/show_bug.cgi?id=37098#c6

...but we need to do better to catch the more general sequence from the motivating test 
(more than 2 bits in the compare). And a test-suite run with statistics showed that this 
pattern only happened 2 times currently. It would potentially happen more often if 
reassociation worked better (D45842), but it's probably still not too frequent?

This is small enough that I didn't see a need to create a whole new class/file within 
AggressiveInstCombine. There are likely other relatively small matchers like what was 
discussed in D44266 that would slide under foldUnusualPatterns() (name suggestions welcome). 
We could potentially also consolidate matchers for ctpop, bswap, etc under here.

Differential Revision: https://reviews.llvm.org/D45986

llvm-svn: 331311
2018-05-01 21:02:09 +00:00
Sanjay Patel f70671582d [AggressiveInstCombine] add more bitfield test patterns; NFC
Add another baseline for D45986 and a pattern that won't be
matched with that patch.

llvm-svn: 331309
2018-05-01 20:55:03 +00:00
Sanjay Patel 2b36e95d45 [PhaseOrdering] add tests for bittest patterns from bitfields; NFC
As mentioned in D45986, there's a potential ordering dependency 
between instcombine and aggressive-instcombine for detecting these,
so I'm adding a few tests to confirm that the expected folds occur
using -O3 (because aggressive-instcombine only runs at -O3 currently).

llvm-svn: 331308
2018-05-01 20:53:44 +00:00
Eli Friedman 763d161eda [AArch64] Add more tests for 64-bit immediate lowering.
This adds a some more tests, and adds some notes to tests which are using
a suboptimal lowering.

The constants with suboptimal lowerings seem to be relatively rare in
practice, but it might be a fun project to work on improvements.

llvm-svn: 331304
2018-05-01 20:00:14 +00:00
Vedant Kumar e23173b677 [DAGCombiner] Fix SDLoc in a (zext (zextload x)) combine (4/N)
The logic for this combine is almost identical to the logic for a
(sext (sextload x)) combine.

This commit factors out the logic so it can be shared by both combines,
and corrects the SDLoc assigned in the zext version of the combine.

Prior to this patch, for the given test case, we would apply the
location associated with the udiv instruction to instructions which
perform the load.

Part of: llvm.org/PR37262

llvm-svn: 331303
2018-05-01 19:51:15 +00:00
Vedant Kumar d7117ed0f9 [DAGCombiner] Fix SDLoc in a (sext (sextload x)) combine (3/N)
Prior to this patch, for the given test case, we would apply the
location associated with the sdiv instruction to instructions which
perform the load.

Part of: llvm.org/PR37262.

Differential Revision: https://reviews.llvm.org/D46222

llvm-svn: 331302
2018-05-01 19:51:15 +00:00
Vedant Kumar cc7b2a55c2 [DAGCombiner] Change the SDLoc on split extloads (2/N)
In DAGCombiner, we try to simplify this pattern:

  ([s|z]ext (load ...))

Conceptually, a new extload which is created while splitting the load
should have the same debug location as the load.

Making this change affects the IROrder of the new load, causing some
test case churn.

In practice, the new location is never different from the location of
the [s|z]ext, at least not during check-llvm or a stage2 build.

Part of: llvm.org/PR37262

Differential Revision: https://reviews.llvm.org/D46156

llvm-svn: 331301
2018-05-01 19:29:15 +00:00
Vedant Kumar ee4bfcaa5a [DAGCombiner] Set the right SDLoc on a newly-created zextload (1/N)
Setting the right SDLoc on a newly-created zextload fixes a line table
bug which resulted in non-linear stepping behavior.

Several backend tests contained CHECK lines which relied on the IROrder
inherited from the wrong SDLoc. This patch breaks that dependence where
feasbile and regenerates test cases where not.

In some cases, changing a node's IROrder may alter register allocation
and spill behavior. This can affect performance. I have chosen not to
prevent this by applying a "known good" IROrder to SDLocs, as this may
hide a more general bug in the scheduler, or cause regressions on other
test inputs.

rdar://33755881, Part of: llvm.org/PR37262

Differential Revision: https://reviews.llvm.org/D45995

llvm-svn: 331300
2018-05-01 19:26:15 +00:00
Konstantin Zhuravlyov 1501af4846 AMDGPU: Remove remnants of gfx901 (it was deprecated some time ago)
llvm-svn: 331298
2018-05-01 18:47:48 +00:00
Simon Pilgrim 21caf0124f [X86] Split WriteFMul/WriteFDiv into XMM and YMM/ZMM scheduler classes
llvm-svn: 331293
2018-05-01 18:22:53 +00:00
David Blaikie aa537da89f llvm-symbolizer: Handle function definitions nested within other functions
LLVM always puts function definition DIEs at the top level, but under
some circumstances GCC does not (at least in this case with member
functions of a function-local type).

To ensure that doesn't appear as though the local type's member function
is unduly inlined within the outer function - ensure the inline
discovery DIE parent walk stops at the first DW_TAG_subprogram.

llvm-svn: 331291
2018-05-01 18:08:45 +00:00
Wei Mi fa862c45bc Use no-op opt run to eliminate the difference in bb pred comment, per chandler's suggestion. It is better than using sed on portability.
llvm-svn: 331286
2018-05-01 17:19:25 +00:00
Konstantin Zhuravlyov ca65870bd0 AMDGPU: Add missing gfx904 tests
llvm-svn: 331284
2018-05-01 17:05:44 +00:00
Daniel Neilson 1091ca4640 [LV] Move test/Transforms/LoopVectorize/pr23997.ll
Summary:
This fixes a build break with r331269.

test/Transforms/LoopVectorize/pr23997.ll

should be in:

test/Transforms/LoopVectorize/X86/pr23997.ll

llvm-svn: 331281
2018-05-01 16:40:45 +00:00
Wei Mi 011b3a5c18 Fix the sed command in test which doesn't work well on BSD.
llvm-svn: 331280
2018-05-01 16:37:27 +00:00
Sam Clegg 7381216710 [WebAssembly] llvm-readobj: display symbols names in relocations
Differential Revision: https://reviews.llvm.org/D46296

llvm-svn: 331279
2018-05-01 16:35:16 +00:00
Simon Pilgrim 5269167f5b [X86] Split WriteFAdd into XMM and YMM/ZMM scheduler classes
Removes more WriteFAdd InstRW overrides

llvm-svn: 331276
2018-05-01 16:13:42 +00:00
Matthew Simpson 661e6a02bd [SLP] Add additional test for transposable binary operations with reuse
llvm-svn: 331274
2018-05-01 15:59:26 +00:00
Sanjay Patel 5727011fd5 [DAG] add test to show FMF mismatch between IR and DAG; NFC
D45710 proposes to change this, but we have no test coverage 
for the first step in this process.

llvm-svn: 331271
2018-05-01 15:43:36 +00:00
Daniel Neilson 9e4bbe801a [LV] Preserve inbounds on created GEPs
Summary:
This is a fix for PR23997.

The loop vectorizer is not preserving the inbounds property of GEPs that it creates.
This is inhibiting some optimizations. This patch preserves the inbounds property in
the case where a load/store is being fed by an inbounds GEP.

Reviewers: mkuper, javed.absar, hsaito

Reviewed By: hsaito

Subscribers: dcaballe, hsaito, llvm-commits

Differential Revision: https://reviews.llvm.org/D46191

llvm-svn: 331269
2018-05-01 15:35:08 +00:00
Wei Mi eec5ba9fae Fix the issue that ComputeValueKnownInPredecessors only handles the case when
phi is on lhs of a comparison op.

For the following testcase,
L1:

  %t0 = add i32 %m, 7
  %t3 = icmp eq i32* %t2, null
  br i1 %t3, label %L3, label %L2

L2:

  %t4 = load i32, i32* %t2, align 4
  br label %L3

L3:

  %t5 = phi i32 [ %t0, %L1 ], [ %t4, %L2 ]
  %t6 = icmp eq i32 %t0, %t5
  br i1 %t6, label %L4, label %L5

We know if we go through the path L1 --> L3, %t6 should always be true. However
currently, if the rhs of the eq comparison is phi, JumpThreading fails to
evaluate %t6 to true. And we know that Instcombine cannot guarantee always
canonicalizing phi to the left hand side of the comparison operation according
to the operand priority comparison mechanism in instcombine. The patch handles
the case when rhs of the comparison op is a phi.

Differential Revision: https://reviews.llvm.org/D46275

llvm-svn: 331266
2018-05-01 14:47:24 +00:00
Omer Paparo Bivas 54596e1c1a [InstCombine] new testcases for OverflowingBinaryOperators and PossiblyExactOperators transformations; NFC
instcombine should transform the relevant cases if the OverflowingBinaryOperator/PossiblyExactOperator can be proven to be safe.

Change-Id: I7aec62a31a894e465e00eb06aed80c3ea0c9dd45
llvm-svn: 331265
2018-05-01 14:27:10 +00:00
Simon Pilgrim dd8eae128b [X86] Split WriteFShuffle into XMM and YMM/ZMM scheduler classes
Removes more WriteFShuffle InstRW overrides

llvm-svn: 331264
2018-05-01 14:25:01 +00:00
Sander de Smalen 788dc70c78 [AArch64][SVE] Asm: Support for contiguous ST1 (scalar+scalar) store instructions.
Reviewers: fhahn, rengolin, samparker, SjoerdMeijer, javed.absar

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D46121

llvm-svn: 331260
2018-05-01 13:36:03 +00:00
Simon Dardis 3d562fb975 Reland r331175: "[mips] Fix the predicates of jump and branch and link instructions"
The previous version of this patch restricted the 'jal' instruction to MIPS and
microMIPSr3. microMIPS32r6 does not have this instruction and instead uses jal
as an alias for balc.

Original commit message:
> Reviewers: smaksimovic, atanasyan, abeserminji
>
> Differential Revision: https://reviews.llvm.org/D46114
>

llvm-svn: 331259
2018-05-01 13:06:49 +00:00
Simon Pilgrim 57f2b185ac [X86] Split WriteVecLogic into XMM and YMM/ZMM scheduler classes
This removes all the WriteVecLogic InstRW overrides.

llvm-svn: 331258
2018-05-01 12:39:17 +00:00
Omer Paparo Bivas 82ef8e19ef [InstCombine] Adjusting bswap pattern matching to hold for And/Shift mixed case
Differential Revision: https://reviews.llvm.org/D45731

Change-Id: I85d4226504e954933c41598327c91b2d08192a9d
llvm-svn: 331257
2018-05-01 12:25:46 +00:00
Andrea Di Biagio d4c58400c5 [X86] Correct spill slot size.
This patch fixes a bug introduced by revision 330778 (originally reviewed at:
https://reviews.llvm.org/D44782), where function isFrameLoadOpcode returned
the wrong number of bytes read for opcodes VMOVSSrm and VMOVSDrm.

This corrects that mistake, and extends the regression test to catch cases where
the dead stores should be removed.

Patch by Jeremy Morse.

Differential Revision: https://reviews.llvm.org/D46256

llvm-svn: 331252
2018-05-01 10:29:38 +00:00
Gabor Buella c8ded04e85 [X86] movdiri and movdir64b instructions
Reviewers: spatel, craig.topper, RKSimon

Reviewed By: craig.topper, RKSimon

Differential Revision: https://reviews.llvm.org/D45983

llvm-svn: 331248
2018-05-01 10:01:16 +00:00
Craig Topper 33dc01d105 [X86] Remove 'opaque ptr' from the intel syntax parser and printer.
Previously for instructions like fxsave we would print "opaque ptr" as part of the memory operand. Now we print nothing.

We also no longer accept "opaque ptr" in the parser. We still accept any size to be specified for these instructions, but we may want to consider only parsing when no explicit size is specified. This what gas does.

llvm-svn: 331243
2018-05-01 04:42:00 +00:00
Eric Christopher 8024425801 Temporarily revert "[DEBUG] Initial adaptation of NVPTX target for debug info emission."
This appears to have some issues associated with the file directive output
causing multiple global symbols with the name "file" to be emitted into a
startup section. I'm investigating more specific causes and working with the
original author.

This reverts commit r330271.

Also Revert "[DEBUGINFO, NVPTX] Add the test for the debug info of the local"

This reverts commit r330592 and the follow up of 330779 as the testcase is dependent upon r330271.

llvm-svn: 331237
2018-05-01 00:10:13 +00:00
Sanjay Patel 1934cb1c49 [InstCombine] fix test to restore intent
This test had values that differed in only in capitalization,
and that causes problems for the auto-generating check line
script. So I changed that in rL331226, but I accidentally
forgot to change a subsequent use of a param.

llvm-svn: 331228
2018-04-30 21:28:18 +00:00
Sanjay Patel dab8515370 [InstCombine] add tests, update checks; NFC
llvm-svn: 331226
2018-04-30 21:03:36 +00:00
Krzysztof Parzyszek 1cf329c933 [LivePhysRegs] Remove registers clobbered by regmasks from the live set
Dead defs were being removed from the live set (in stepForward), but
registers clobbered by regmasks weren't (more specifically, they were
actually removed by removeRegsInMask, but then they were added back in).

llvm-svn: 331219
2018-04-30 19:38:47 +00:00
Nirav Dave 6c0665e221 [MC] Change AsmParser to leverage Assembler during evaluation
Teach AsmParser to check with Assembler for when evaluating constant
expressions.  This improves the handing of preprocessor expressions
that must be resolved at parse time. This idiom can be found as
assembling-time assertion checks in source-level assemblers. Note that
this relies on the MCStreamer to keep sufficient tabs on Section /
Fragment information which the MCAsmStreamer does not. As a result the
textual output may fail where the equivalent object generation would
pass. This can most easily be resolved by folding the MCAsmStreamer
and MCObjectStreamer together which is planned for in a separate
patch.

Currently, this feature is only enabled for assembly input, keeping IR
compilation consistent between assembly and object generation.

Reviewers: echristo, rnk, probinson, espindola, peter.smith

Reviewed By: peter.smith

Subscribers: eraman, peter.smith, arichardson, jyknight, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D45164

llvm-svn: 331218
2018-04-30 19:22:40 +00:00
Matt Arsenault 0084adc516 AMDGPU: Add Vega12 and Vega20
Changes by
  Matt Arsenault
  Konstantin Zhuravlyov

llvm-svn: 331215
2018-04-30 19:08:16 +00:00
Roman Tereshin 46f838f370 [MIR] Reset unique MBB numbering in MachineFunction::reset()
No need to waste space nor number MBBs differently if MF gets recreated.

Reviewers: qcolombet, stoklund, t.p.northover, bogner, javed.absar

Reviewed By: qcolombet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D46078

llvm-svn: 331213
2018-04-30 18:58:57 +00:00
Sanjay Patel 1babf5ff32 [DAGCombiner] rename function attribute for disabling ftrunc transform
This is the matching name change for the Clang patch at:
D46236
rL331209

Differential Revision: https://reviews.llvm.org/D46237 

llvm-svn: 331210
2018-04-30 18:20:33 +00:00
Roman Lebedev aa4faec114 [InstCombine] Unfold masked merge with constant mask
Summary:
As discussed in D45733, we want to do this in InstCombine.

https://rise4fun.com/Alive/LGk

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: chandlerc, xbolva00, llvm-commits

Differential Revision: https://reviews.llvm.org/D45867

llvm-svn: 331205
2018-04-30 17:59:33 +00:00
Roman Lebedev b7ae4ef82b [InstCombine][NFC] Add tests for unfolding masked merge with constant mask
Summary: As discussed in D45733, we want to do this in InstCombine.

Differential Revision: https://reviews.llvm.org/D45866

llvm-svn: 331204
2018-04-30 17:59:26 +00:00
Ulrich Weigand c3ec80fea1 [SystemZ] Handle SADDO et.al. and ADD/SUBCARRY
This provides an optimized implementation of SADDO/SSUBO/UADDO/USUBO
as well as ADDCARRY/SUBCARRY on top of the new CC implementation.

In particular, multi-word arithmetic now uses UADDO/ADDCARRY instead
of the old ADDC/ADDE logic, which means we no longer need to use
"glue" links for those instructions.  This also allows making full
use of the memory-based instructions like ALSI, which couldn't be
recognized due to limitations in the DAG matcher previously.

Also, the llvm.sadd.with.overflow et.al. intrinsincs now expand to
directly using the ADD instructions and checking for a CC 3 result.

llvm-svn: 331203
2018-04-30 17:54:28 +00:00
Ulrich Weigand b32f3656d2 [SystemZ] Do not use glue to represent condition code dependencies
Currently, an instruction setting the condition code is linked to
the instruction using the condition code via a "glue" link in the
SelectionDAG.  This has a number of drawbacks; in particular, it
means the same CC cannot be used by multiple users.  It also makes
it more difficult to efficiently implement SADDO et. al.

This patch changes the back-end to represent CC dependencies as
normal values during SelectionDAG matching, along the lines of
how this is handled in the X86 back-end already.

In addition to the core mechanics of updating all relevant patterns,
this requires a number of additional changes:

- We now need to be able to spill/restore a CC value into a GPR
  if necessary.  This means providing a copyPhysReg implementation
  for moves involving CC, and defining getCrossCopyRegClass.

- Since we still prefer to avoid such spills, we provide an override
  for IsProfitableToFold to avoid creating a merged LOAD / ICMP if
  this would result in multiple users of the CC.

- combineCCMask no longer requires a single CC user, and no longer
  need to be careful about preventing invalid glue/chain cycles.

- emitSelect needs to be more careful in marking CC live-in to
  the basic block it generates.  Also, we can now optimize the
  case of multiple subsequent selects with the same condition
  just like X86 does.

llvm-svn: 331202
2018-04-30 17:52:32 +00:00
Daniel Sanders 2de9d4ad5d Fix infinite loop after r331115
There are two separate fixes here:
* The lowering code for non-extending loads should report UnableToLegalize instead of emitting the same instruction.
* The target should not be requesting lowering of non-extending loads.

llvm-svn: 331201
2018-04-30 17:20:01 +00:00
Jonas Devlieghere 4bbcb5ab04 [DebugInfo] Prevent infinite recursion for malformed DWARF
This prevents infinite recursion in DWARFDie::findRecursively for
malformed DWARF where a DIE references itself.

This fixes PR36257.

Differential revision: https://reviews.llvm.org/D43092

llvm-svn: 331200
2018-04-30 17:02:41 +00:00
Davide Italiano bd3bf1660b [SLPVectorizer] Debug info shouldn't impact spill cost computation.
<rdar://problem/39794738>

(Also, PR32761).

Differential Revision:  https://reviews.llvm.org/D46199

llvm-svn: 331199
2018-04-30 16:57:33 +00:00
Andrea Di Biagio e047d3529b [llvm-mca] Correctly handle zero-latency stores that consume pipeline resources.
This fixes PR37293.

We can have scheduling classes with no write latency entries, that still consume
processor resources. We don't want to treat those instructions as zero-latency
instructions; they still have to be issued to the underlying pipelines, so they
still consume resource cycles.

This is likely to be a regression which I have accidentally introduced at
revision 330807. Now, if an instruction has a non-empty set of write processor
resources, we conservatively treat it as a normal (i.e. non zero-latency)
instruction.

llvm-svn: 331193
2018-04-30 15:55:04 +00:00
Ulrich Weigand fb56686cd3 [SystemZ] Improve handling of Select pseudo-instructions
If we have LOCR instructions, select them directly from SelectionDAG
instead of first going through a pseudo instruction and then using
the custom inserter to emit the LOCR.

Provide Select pseudo-instructions for VR32/VR64 if we have vector
instructions, to avoid having to go through the first 16 FPRs
unnecessarily.

If we do not have LOCFHR, prefer using LOCR followed by a move
over a conditional branch.

llvm-svn: 331191
2018-04-30 15:49:27 +00:00
Bjorn Pettersson 9a8483a4b2 [BranchFolding] Salvage DBG_VALUE instructions from empty blocks
Summary:
This patch will introduce copying of DBG_VALUE instructions
from an otherwise empty basic block to predecessor/successor
blocks in case the empty block is eliminated/bypassed. It
is currently only done in one identified situation in the
BranchFolding pass, before optimizing on empty block.
It can be seen as a light variant of the propagation done
by the LiveDebugValues pass, which unfortunately is executed
after the BranchFolding pass.

We only propagate (copy) DBG_VALUE instructions in a limited
number of situations:
 a) If the empty BB is the only predecessor of a successor
    we can copy the DBG_VALUE instruction to the beginning of
    the successor (because the DBG_VALUE instruction is always
    part of the flow between the blocks).
 b) If the empty BB is the only successor of a predecessor
    we can copy the DBG_VALUE instruction to the end of the
    predecessor (because the DBG_VALUE instruction is always
    part of the flow between the blocks). In this case we add
    the DBG_VALUE just before the first terminator (assuming
    that the terminators do not impact the DBG_VALUE).

A future solution, to handle more situations, could perhaps
be to run the LiveDebugValues pass before branch folding?

This fix is related to PR37234. It is expected to resolve
the problem seen, when applied together with the fix in
SelectionDAG from here: https://reviews.llvm.org/D46129

Reviewers: #debug-info, aprantl, rnk

Reviewed By: #debug-info, aprantl

Subscribers: ormris, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D46184

llvm-svn: 331183
2018-04-30 14:37:46 +00:00
Bjorn Pettersson abafca619b [SelectionDAG] Improve selection of DBG_VALUE using a PHI node result
Summary:
When building the selection DAG at ISel all PHI nodes are
selected and lowered to Machine Instruction PHI nodes before
we start to create any SDNodes. So there are no SDNodes for
values produced by the PHI nodes.

In the past when selecting a dbg.value intrinsic that uses
the value produced by a PHI node we have been handling such
dbg.value intrinsics as "dangling debug info". I.e. we have
not created a SDDbgValue node directly, because there is
no existing SDNode for the PHI result, instead we deferred
the creationg of a SDDbgValue until we found the first use
of the PHI result.

The old solution had a couple of flaws. The position of the
selected DBG_VALUE instruction would end up quite late in a
basic block, and for example not directly after the PHI node
as in the LLVM IR input. And in case there were no use at all
in the basic block the dbg.value could be dropped completely.

This patch introduces a new VREG kind of SDDbgValue nodes.
It is similar to a SDNODE kind of node, but it refers directly
to a virtual register and not a SDNode. When we do selection
for a dbg.value that is using the result of a PHI node we
can do a lookup of the virtual register directly (as it already
is determined for the PHI node) and create a SDDbgValue node
immediately instead of delaying the selection until we find a
use.

This should fix a problem with losing debug info at ISel
as seen in PR37234 (https://bugs.llvm.org/show_bug.cgi?id=37234).
It does not resolve PR37234 completely, because the debug info
is dropped later on in the BranchFolder (see D46184).

Reviewers: #debug-info, aprantl

Reviewed By: #debug-info, aprantl

Subscribers: rnk, gbedwell, aprantl, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D46129

llvm-svn: 331182
2018-04-30 14:37:39 +00:00
Simon Dardis 5a512d63c9 Revert "[mips] Fix the predicates of jump and branch and link instructions"
That commit broke one of the LLD builders, reverting while I investigate.

This patch reverts r331175.

llvm-svn: 331178
2018-04-30 14:03:35 +00:00
Simon Dardis cc95a9c557 [mips] Fix the predicates of jump and branch and link instructions
Reviewers: smaksimovic, atanasyan, abeserminji

Differential Revision: https://reviews.llvm.org/D46114

llvm-svn: 331175
2018-04-30 13:37:42 +00:00
Andrea Di Biagio 77bd1c748a [llvm-mca] Regenerate test Atom/resources-sse3.s. NFC
Before this change, it wrongly specified -mcpu=slm instead of -mcpu=atom.

llvm-svn: 331170
2018-04-30 12:13:04 +00:00
Andrea Di Biagio e9384eb13b [llvm-mca] Support for in-order CPU for -instruction-tables testing.
Added Intel Atom tests to verify that the tool correctly generates instruction
tables even if the CPU is in-order.

Fixes PR37282.

llvm-svn: 331169
2018-04-30 12:05:34 +00:00
Simon Dardis 57c2095d1b [mips] Fix microMIPS loads and stores.
Previously these instructions were unselectable and instead were generated
through the instruction mapping tables.

Reviewers: atanasyan, smaksimovic, abeserminji

Differential Revision: https://reviews.llvm.org/D46055

llvm-svn: 331165
2018-04-30 09:44:44 +00:00
Sander de Smalen 5861c263e0 [AArch64][SVE] Asm: Improve diagnostics for gather loads.
This patch extends the 'isSVEVectorRegWithShiftExtend' function to 
improve diagnostics for SVE's gather load (scalar + vector) addressing 
modes. Instead of always suggesting the 'unscaled' addressing mode, 
the use of DiagnosticPredicate enables a more specific error message
in the context where the scaling is incorrect. For example:

  ld1h z0.d, p0/z, [x0, z0.d, lsl #2]
                                   ^ 
           shift amount should be '1'

Instead of suggesting the packed, unscaled addressing mode:
  expected 'z[0..31].d, (uxtw|sxtw)'

the assembler now suggests using the proper scaling:
  expected 'z[0..31].d, (lsl|uxtw|sxtw) #1'

Reviewers: fhahn, rengolin, samparker, SjoerdMeijer, javed.absar

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D46124

llvm-svn: 331162
2018-04-30 07:24:38 +00:00
Craig Topper 5a4c88005c [X86] Add a Requires<[In64BitMode]> to FARJMP64
Otherwise we can try to assemble it in 32-bit mode and throw an assert in the encoder.

llvm-svn: 331161
2018-04-30 06:21:24 +00:00
Craig Topper 64e7a16fe4 [X86] Remove support for accepting 'fnstsw %eax' and 'fnstsw %al'.
I assume this was done because gas accepted it at one point, but current versions of gas don't.

llvm-svn: 331154
2018-04-30 01:53:12 +00:00
Craig Topper 5731558979 [X86] Make 64-bit sysret/sysexit not ambiguous in Intel assembly syntax.
This also makes it default to the 32-bit non REX.W version in 64-bit mode. This seems to be more consistent with gas.

llvm-svn: 331149
2018-04-29 22:55:54 +00:00
Sander de Smalen 50ded90072 [AArch64][SVE] Asm: Support for gather LD1/LDFF1 (vector + imm) load instructions.
Reviewers: fhahn, rengolin, samparker, SjoerdMeijer, javed.absar

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D46120

llvm-svn: 331145
2018-04-29 17:33:38 +00:00
Simon Pilgrim 8962c344f9 [llvm-mca][X86] Add BT resource tests to all models
llvm-svn: 331144
2018-04-29 15:45:31 +00:00
Simon Pilgrim 2d569361fc [llvm-mca][X86] Add add/adc + sub/sbb resource tests to all models
llvm-svn: 331140
2018-04-29 11:03:25 +00:00
Craig Topper 18c4c8efaf [X86] Add suffixes to the LGDT/LIDT/SGDT/SIDT mnemonics in Intel syntax. Add aliases based on 16/32-bit mode to choose the default.
This allows the instruction selection to follow mode in Intel syntax. And allows a suffix to be used to change size.

This matches gas behavior from what I could tell.

llvm-svn: 331138
2018-04-29 06:24:09 +00:00
Craig Topper ebd3e4a69c [X86] Remove SLDT64m instruction.
It doesn't really exist. The instruction always writes 16-bits of memory. Putting a REX.w on it won't change anything.

While I was touching the encoding tests to remove it, I added some other missing register form test cases.

llvm-svn: 331135
2018-04-29 04:50:53 +00:00
Robert Widmann aec494f3c4 [LLVM-C] Add DIBuilder bindings to create import declarations
Summary: Add bindings to create import declarations for modules, functions, types, and other entities.  This wraps the conveniences available in the existing DIBuilder API, but these seem C++-specific.

Reviewers: whitequark, harlanhaskins, deadalnix

Reviewed By: whitequark

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D46167

llvm-svn: 331123
2018-04-28 22:32:07 +00:00
Daniel Sanders 5eb9f581b6 [globalisel][legalizerinfo] Introduce dedicated extending loads and add lowerings for them
Summary:
Previously, a extending load was represented at (G_*EXT (G_LOAD x)).
This had a few drawbacks:
* G_LOAD had to be legal for all sizes you could extend from, even if
  registers didn't naturally hold those sizes.
* All sizes you could extend from had to be allocatable just in case the
  extend went missing (e.g. by optimization).
* At minimum, G_*EXT and G_TRUNC had to be legal for these sizes. As we
  improve optimization of extends and truncates, this legality requirement
  would spread without considerable care w.r.t when certain combines were
  permitted.
* The SelectionDAG importer required some ugly and fragile pattern
  rewriting to translate patterns into this style.

This patch begins changing the representation to:
* (G_[SZ]EXTLOAD x)
* (G_LOAD x) any-extends when MMO.getSize() * 8 < ResultTy.getSizeInBits()
which resolves these issues by allowing targets to work entirely in their
native register sizes, and by having a more direct translation from
SelectionDAG patterns.

This patch introduces the new generic instructions and new variation on
G_LOAD and adds lowering for them to convert back to the existing
representations.

Depends on D45466

Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, aemerson, javed.absar

Reviewed By: aemerson

Subscribers: aemerson, kristof.beyls, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D45540

llvm-svn: 331115
2018-04-28 18:14:50 +00:00
Roman Lebedev 136867931a [InstCombine] Canonicalize variable mask in masked merge
Summary:
Masked merge has a pattern of: `((x ^ y) & M) ^ y`.
But, there is no difference between `((x ^ y) & M) ^ y` and `((x ^ y) & ~M) ^ x`,
We should canonicalize the pattern to non-inverted mask.

https://rise4fun.com/Alive/Yol

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45664

llvm-svn: 331112
2018-04-28 15:45:07 +00:00
Roman Lebedev 6b1e66b188 [InstCombine][NFC] Add tests for variable mask canonicalization in masked merge
Summary:
Masked merge has a pattern of: `((x ^ y) & M) ^ y`.
But, there is no difference between `((x ^ y) & M) ^ y` and `((x ^ y) & ~M) ^ x`,
We should canonicalize the pattern to non-inverted mask.

Differential Revision: https://reviews.llvm.org/D45663

llvm-svn: 331111
2018-04-28 15:45:00 +00:00
Simon Pilgrim 318e9d39ab [llvm-mca][X86] Add double shift resource tests to all relevant models
llvm-svn: 331109
2018-04-28 15:18:49 +00:00
Simon Pilgrim 4d0187c893 [llvm-mca][X86] Add shift/rotate resource tests to all relevant models
I intend to add further instruction tests to the resources-x86_64.s test file as required, but this initial commit is to help remove a load of unnecessary InstRW overrides in a future patch

llvm-svn: 331108
2018-04-28 14:56:18 +00:00
Craig Topper ef3866a859 [X86] Remove REX.W from 64-bit mode BND instructions.
As far as I can tell from the docs, the instructions are automatically 64-bit in 64-bit mode. We don't need REX.W.

llvm-svn: 331102
2018-04-28 06:02:40 +00:00
Craig Topper 8a6532ae84 [X86] Rename BNDMOV instructions and hide redundant instruction encoding from the assembler.
Favor the 0x1a encoding for register/register move to match gas.

The instructions used RM and MR in their name along with rr/rm/mr at the end. To make more consistent with other instructions remove the RM/MR and use rr/rm/mr/rr_REV.

Hide the _REV encoding from the assembler but leave it for the disassembler.

llvm-svn: 331101
2018-04-28 06:02:39 +00:00
Max Kazantsev 303572f3df [NFC] Add some tests that demonstrate unrecognized three-way comparison patterns
llvm-svn: 331100
2018-04-28 04:38:21 +00:00
Jessica Paquette 0b6724917a [MachineOutliner] Add defs to calls + don't track liveness on outlined functions
This commit makes it so that if you outline a def of some register, then the
call instruction created by the outliner actually reflects that the register
is defined by the call. It also makes it so that outlined functions don't
have the TracksLiveness property.

Outlined calls shouldn't break liveness assumptions that someone might make.

This also un-XFAILs the noredzone test, and updates the calls test.

llvm-svn: 331095
2018-04-27 23:36:35 +00:00
Philip Reames 502d4481d4 [LoopGuardWidening] Make PostDomTree optional
The effect of doing so is not disrupting the LoopPassManager when mixing this pass with other loop passes.  This should help locality of access substaintially and avoids the cost of computing PostDom.

The assumption here is that the full GuardWidening (which does use PostDom) is run as a canonicalization before loop opts and that this version is just catching cases exposed by other loop passes.  (i.e. LoopPredication, IndVarSimplify, LoopUnswitch, etc..)

llvm-svn: 331094
2018-04-27 23:15:56 +00:00
Heejin Ahn d20d0648ed [DAGCombiner] Fix a case of 1 in non-splat vector pow2 divisor
Summary:
D42479 (rL329525) enabled SDIV combine for pow2 non-splat vector
dividers. But when there is a 1 in a vector, the instruction sequence to
be generated involves shifting a value by the number of its bit widths,
which is undefined
(c64f4dbfe3/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (L6000-L6006)).

Especially, in architectures that do not support vector instructions,
each of element in a vector will be computed separately using scalar
operations, and then the resulting value will be undef for '1' values
in a vector.

(All 1's vector is fine; only vectors mixed with 1 and others will be
affected.)

Reviewers: RKSimon, jgravelle-google

Subscribers: jfb, dschuff, sbc100, jgravelle-google, llvm-commits

Differential Revision: https://reviews.llvm.org/D46161

llvm-svn: 331092
2018-04-27 22:23:11 +00:00
Craig Topper d656410293 [X86] Make the STTNI flag intrinsics use the flags from pcmpestrm/pcmpistrm if the mask instrinsics are also used in the same basic block.
Summary:
Previously the flag intrinsics always used the index instructions even if a mask instruction also exists.

To fix fix this I've created a single ISD node type that returns index, mask, and flags. The SelectionDAG CSE process will merge all flavors of intrinsics with the same inputs to a s ingle node. Then during isel we just have to look at which results are used to know what instruction to generate. If both mask and index are used we'll need to emit two instructions. But for all other cases we can emit a single instruction.

Since I had to do manual isel anyway, I've removed the pseudo instructions and custom inserter code that was working around tablegen limitations with multiple implicit defs.

I've also renamed the recently added sse42.ll test case to sttni.ll since it focuses on that subset of the sse4.2 instructions.

Reviewers: chandlerc, RKSimon, spatel

Reviewed By: chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D46202

llvm-svn: 331091
2018-04-27 22:15:33 +00:00
Adrian Prantl 4b542c6e64 Fix a bug that prevents global variables from having a DW_OP_deref.
For local variables the first DW_OP_deref is consumed by turning the
location kind into a memeory location, but that only makes sense for
values that are in a register to begin with, which cannot happen for
global variables that are attached to a symbol.

rdar://problem/39741860

This reapplies r330970 after fixing an uncovered bug in r331086 and
working around the situation caused by it.

llvm-svn: 331090
2018-04-27 22:05:31 +00:00
Adrian Prantl 210a29de7b Fix a bug in GlobalOpt's handling of DIExpressions.
This patch adds support for fragment expressions
TryToShrinkGlobalToBoolean() which were previously just dropped.

Thanks to Reid Kleckner for providing me a reproducer!

llvm-svn: 331086
2018-04-27 21:41:36 +00:00
Roman Lebedev 6959b8e76f [PatternMatch] Stabilize the matching order of commutative matchers
Summary:
Currently, we
1. match `LHS` matcher to the `first` operand of binary operator,
2. and then match `RHS` matcher to the `second` operand of binary operator.
If that does not match, we swap the `LHS` and `RHS` matchers:
1. match `RHS` matcher to the `first` operand of binary operator,
2. and then match `LHS` matcher to the `second` operand of binary operator.

This works ok.
But it complicates writing of commutative matchers, where one would like to match
(`m_Value()`) the value on one side, and use (`m_Specific()`) it on the other side.

This is additionally complicated by the fact that `m_Specific()` stores the `Value *`,
not `Value **`, so it won't work at all out of the box.

The last problem is trivially solved by adding a new `m_c_Specific()` that stores the
`Value **`, not `Value *`. I'm choosing to add a new matcher, not change the existing
one because i guess all the current users are ok with existing behavior,
and this additional pointer indirection may have performance drawbacks.
Also, i'm storing pointer, not reference, because for some mysterious-to-me reason
it did not work with the reference.

The first one appears trivial, too.
Currently, we
1. match `LHS` matcher to the `first` operand of binary operator,
2. and then match `RHS` matcher to the `second` operand of binary operator.
If that does not match, we swap the ~~`LHS` and `RHS` matchers~~ **operands**:
1. match ~~`RHS`~~ **`LHS`** matcher to the ~~`first`~~ **`second`** operand of binary operator,
2. and then match ~~`LHS`~~ **`RHS`** matcher to the ~~`second`~ **`first`** operand of binary operator.

Surprisingly, `$ ninja check-llvm` still passes with this.
But i expect the bots will disagree..

The motivational unittest is included.
I'd like to use this in D45664.

Reviewers: spatel, craig.topper, arsenm, RKSimon

Reviewed By: craig.topper

Subscribers: xbolva00, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D45828

llvm-svn: 331085
2018-04-27 21:23:20 +00:00
Sanjay Patel 2677038cc0 [Reassociate] add a test with debug info; NFC
As suggested in D45842 
(although still not sure if we're going to advance that),
we must invalidate references to instructions that have
been recycled (operands were changed, so result is different).

llvm-svn: 331083
2018-04-27 21:14:15 +00:00
Philip Reames e4ec473b3f [MustExecute/LICM] Special case first instruction in throwing header
We currently have a hard to solve analysis problem around the order of instructions within a potentially throwing block.  We can't cheaply determine whether a given instruction is before the first potential throw in the block.  While we're working on that in the background, special case the first instruction within the header.

why this particular special case?  Well, headers are guaranteed to execute if the loop does, and it turns out we tend to produce this form in practice.

In a follow on patch, I tend to extend LICM with an alternate approach which works for any instruction in the header before the first throw, but this is the best I can come up with other users of the analysis (such as store promotion.)

Note: I can't show the difference in the analysis result since we're ORing in the expensive instruction walk used by SCEV.  Using the full walk is not suitable for a general solution.
llvm-svn: 331079
2018-04-27 20:44:01 +00:00
Vlad Tsyrklevich 201a1086cf ELFObjectWriter: Allow one unique symver per symbol
Summary:
Only allow a single unique .symver alias per symbol. This matches the
behavior of gas. I noticed that we ignored multiple mismatched symver
directives looking at https://reviews.llvm.org/D45798

Reviewers: pcc, tejohnson, espindola

Reviewed By: pcc

Subscribers: emaste, arichardson, llvm-commits, kcc

Differential Revision: https://reviews.llvm.org/D45845

llvm-svn: 331078
2018-04-27 20:32:34 +00:00
Jun Bum Lim 9e3e14b5f9 [PostRASink] extend the live-in check for all aliased registers
Extend the live-in check for all aliased registers so that we can
allow sinking Copy instructions when only implicit def is in successor's
live-in.

llvm-svn: 331072
2018-04-27 19:59:20 +00:00
Daniel Sanders 27fe8a5011 [globalisel][legalizerinfo] Add support for legalization based on the MachineMemOperand
Summary:
Currently only the memory size is supported but others can be added as
needed.

narrowScalar for G_LOAD and G_STORE now correctly update the
MachineMemOperand and will refuse to legalize atomics since those need more
careful expansions to maintain atomicity.

Reviewers: ab, aditya_nandakumar, bogner, rtereshin, aemerson, javed.absar

Reviewed By: aemerson

Subscribers: aemerson, rovka, kristof.beyls, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D45466

llvm-svn: 331071
2018-04-27 19:48:53 +00:00