Commit Graph

160090 Commits

Author SHA1 Message Date
Jonas Devlieghere ad2f95d92d [test][dsymutil] Fix tests for Windows bots.
The UNSUPPORTED directive was not honored by the bot, presumably because
of the FIXME above it. This moves the comment down and removes the
remaining update check from basic-linking-x86.test.

This should un-break: llvm-clang-x86_64-expensive-checks-win/builds/7798/

llvm-svn: 324598
2018-02-08 11:58:16 +00:00
Alexander Ivchenko 836eac3e8d Add missed PostDominatorTree analysis dependency to GVN hoist pass.
Summary:
GVN hoist pass is using PostDominatorTree analysis, therefore the analysis
should be listed in the pass initialization as a dependency.

Reviewed By: sebpop

Differential Revision: https://reviews.llvm.org/D43007

Author: ashlykov <arkady.shlykov@intel.com>
llvm-svn: 324597
2018-02-08 11:45:36 +00:00
Gadi Haber 25dc3d27ea [X86][MC]: Adding test coverage of MC encoding for several small extensions.<NFC>
NFC.
 Adding MC regressions tests to cover several small x86 extensions as follows:
 CLWB, CLZERO, F16C, INVPCID, PKU, POPCNT, RTM, SGX, SHA, SVM, VMFUNC, VTX

This patch is part of a larger task to cover MC encoding of all X86 isa sets started in revision: https://reviews.llvm.org/D39952

Reviewers: RKSimon, craig.topper, zvi, AndreiGrischenko
Differential Revision: https://reviews.llvm.org/D41388

Change-Id: I254508cd17faca00b780be0fc2abf6c71b61faab
llvm-svn: 324595
2018-02-08 11:16:02 +00:00
Jonas Devlieghere d4034d24da Re-land [dsymutil] Upstream update feature
This commit attempts to re-land the r324480 which was reverted in
r324493 because it broke the Windows bots. For now I disabled the two
update tests on Windows until I'm able to debug this.

Differential revision: https://reviews.llvm.org/D42880

llvm-svn: 324592
2018-02-08 10:48:54 +00:00
Serguei Katkov c8016e7a65 [Loop Predication] Teach LP about reverse loops with uge and sge latch conditions
Add support of uge and sge latch condition to Loop Prediction for
reverse loops.

Reviewers: apilipenko, mkazantsev, sanjoy, anna
Reviewed By: anna
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42837

llvm-svn: 324589
2018-02-08 10:34:08 +00:00
Clement Courbet 1b8c08b633 [X86] Fix compilation of r324580.
@ctopper Can you check that the fix is correct ?

llvm-svn: 324586
2018-02-08 09:41:50 +00:00
Stefan Maksimovic 8989940557 Revert accidental changes that snuck in r324584
llvm-svn: 324585
2018-02-08 09:31:48 +00:00
Stefan Maksimovic b3e7ed3b94 [mips] Define certain instructions in microMIPS32r3
Instructions affected:
mthc1, mfhc1, add.d, sub.d, mul.d, div.d,
mov.d, neg.d, cvt.w.d, cvt.d.s, cvt.d.w, cvt.s.d

These instructions are now defined for
microMIPS32r3 + microMIPS32r6 in MicroMipsInstrFPU.td
since they shared their encoding with those already defined
in microMIPS32r6InstrInfo.td and have been therefore
removed from the latter file.

Some instructions present in MicroMipsInstrFPU.td which
did not have both AFGR64 and FGR64 variants defined have
been altered to do so.

Differential revision: https://reviews.llvm.org/D42738

llvm-svn: 324584
2018-02-08 09:25:17 +00:00
Dylan McKay 820553fdb1 [AVR] Fix the testsuite after '%' changed to '$' in MIR
llvm-svn: 324583
2018-02-08 09:17:11 +00:00
Clement Courbet 39911e2ee6 [TargetSchedule] Expose sub-units of a ProcResGroup in MCProcResourceDesc.
Summary:
Right now using a ProcResource automatically counts as usage of all
super ProcResGroups. All this is done during codegen, so there is no
way for schedulers to get this information at runtime.

This adds the information of which individual ProcRes units are
contained in a ProcResGroup in MCProcResourceDesc.

Reviewers: gchatelet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43023

llvm-svn: 324582
2018-02-08 08:46:48 +00:00
Sjoerd Meijer 5ea465ded7 [AArch64] Don't materialize 0 with "fmov h0, .." when FullFP16 is not supported
We were generating "fmov h0, wzr" instructions when FullFP16 is not enabled.
I've not added any tests, because the problem was visible in:
test/CodeGen/AArch64/arm64-zero-cycle-zeroing.ll,
which I had to change: I don't think Cyclone has FullFP16 enabled
by default, so it shouldn't be using this v8.2a instruction.

I've also removed these rdar tags, please shout if there are any objections.

Differential Revision: https://reviews.llvm.org/D43020

llvm-svn: 324581
2018-02-08 08:39:05 +00:00
Craig Topper 8d0c8c9be1 [X86] Support folding in a k-register OR when creating KORTEST from scalar compare of a bitcast from vXi1.
This should allow us to remove the kortest intrinsic from IR and use compare+bitcast+or in IR instead.

llvm-svn: 324580
2018-02-08 08:29:43 +00:00
Craig Topper 93505707b6 [X86] Allow KORTEST instruction to be used for testing if a mask is all ones
The KTEST instruction sets the C flag if the result of anding both operands together is all 1s. We can use this to lower (icmp eq/ne (bitcast (vXi1 X), -1)

Differential Revision: https://reviews.llvm.org/D42772

llvm-svn: 324577
2018-02-08 07:54:16 +00:00
Craig Topper f5465f98d2 [X86] Don't emit KTEST instructions unless only the Z flag is being used
Summary:
KTEST has weird flag behavior. The Z flag is set for all bits in the AND of the k-registers being 0, and the C flag is set for all bits being 1. All other flags are cleared.

We currently emit this instruction in EmitTEST and don't check the condition code. This can lead to strange things like using the S flag after a KTEST for a signed compare.

The domain reassignment pass can also transform TEST instructions into KTEST and is not protected against the flag usage either. For now I've disabled this part of the domain reassignment pass. I tried to comment out the checks in the mir test so that we could recover them later, but I couldn't figure out how to get that to work.

This patch moves the KTEST handling into LowerSETCC and now creates a ktest+x86setcc. I've chosen this approach because I'd like to add support for the C flag for all ones in a followup patch. To do that requires that I can rewrite the condition code going in the x86setcc to be different than the original SETCC condition code.

This fixes PR36182. I'll file a PR to fix domain reassignment once this goes in. Should this be merged to 6.0?

Reviewers: spatel, guyblank, RKSimon, zvi

Reviewed By: guyblank

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42770

llvm-svn: 324576
2018-02-08 07:45:55 +00:00
George Rimar d3704f67ad Recommit r324455 "[ThinLTO] - Simplify code in ThinLTOBitcodeWriter."
With fix: reimplemented.

Original commit message:
Recently introduced convertToDeclaration is very similar
to code used in filterModule function.
Patch reuses it to reduce duplication.

Differential revision: https://reviews.llvm.org/D42971

llvm-svn: 324574
2018-02-08 07:23:24 +00:00
Serguei Katkov 66182d6c38 [SimplifyCFG] Re-apply Relax restriction for folding unconditional branches
The commit rL308422 introduces a restriction for folding unconditional
branches. Specifically if empty block with unconditional branch leads to
header of the loop then elimination of this basic block is prohibited.
However it seems this condition is redundantly strict.
If elimination of this basic block does not introduce more back edges
then we can eliminate this block.

The patch implements this relax of restriction.

The test profile/Linux/counter_promo_nest.c in compiler-rt project
is updated to meet this change.

Reviewers: efriedma, mcrosier, pacxx, hsung, davidxl	
Reviewed By: pacxx
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42691

llvm-svn: 324572
2018-02-08 07:16:29 +00:00
Martell Malone 0b75eee4d2 CMAKE: apply -O3 for mingw clang
Differential Revision: https://reviews.llvm.org/D41596

llvm-svn: 324570
2018-02-08 07:13:17 +00:00
Craig Topper c19aed963e [DAGCombiner] Fix a couple mistakes from r324311 by really passing the original load to ExtendSetCCUses.
We're passing the binary op that uses the load instead of the load.

Noticed by inspection. Not sure how to test this because this just prevents the introduction of an extend that will later be truncated and will probably be combined out.

llvm-svn: 324568
2018-02-08 06:27:18 +00:00
Craig Topper 9b9d527427 [DAGCombiner] Don't create truncate nodes in (aext (zextload x)) -> (zextload x) and similar folds. NFCI
The truncate is being used to replace other users of of the load, but we checked that the load only has one use so there are no other uses to replace.

llvm-svn: 324567
2018-02-08 06:04:18 +00:00
Peter Collingbourne 559ff1fe03 ARM: Remove dead code. NFCI.
llvm-svn: 324565
2018-02-08 05:28:39 +00:00
Francis Visoiu Mistrih da89d1812a [CodeGen] Print MachineBasicBlock labels using MIR syntax in -debug output
Instead of:

%bb.1: derived from LLVM BB %for.body

print:

bb.1.for.body:

Also use MIR syntax for MBB attributes like "align", "landing-pad", etc.

llvm-svn: 324563
2018-02-08 05:02:00 +00:00
Craig Topper cbfe41ac2f [DAGCombiner] Avoid creating truncate nodes in (zext (and (load)))->(and (zextload)) fold until we know for sure we're going to need it. NFCI
The truncate is only needed if the load has additional users. It used to get passed to extendSetCCUses so was created early, but that's no longer the case.

llvm-svn: 324562
2018-02-08 04:38:04 +00:00
Craig Topper bf4ed42606 [DAGCombiner] Rename variable to be slightly better. NFC
We were calling a load LN0 but it came from N0.getOperand(0) so its really more like LN00 if we follow the name used in other places.

llvm-svn: 324561
2018-02-08 04:38:02 +00:00
Yonghong Song f2075aef68 bpf: Improve expanding logic in LowerSELECT_CC
LowerSELECT_CC is not generating optimal Select_Ri pattern at the moment. It
is not guaranteed to place ConstantNode at RHS which would miss matching
Select_Ri.

A new testcase added into the existing select_ri.ll, also there is an
existing case in cmp.ll which would be improved to use Select_Ri after this
patch, it is adjusted accordingly.

Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
llvm-svn: 324560
2018-02-08 04:37:49 +00:00
Peter Collingbourne bae5918d99 gold-plugin: Do not set codegen opt level based on LTO opt level.
The LTO opt level should not affect the codegen opt level, and indeed
it does not affect it in lld. Ideally the codegen opt level should
be controlled by an IR-level attribute based on the compile-time opt
level, but that hasn't been implemented yet.

Differential Revision: https://reviews.llvm.org/D43040

llvm-svn: 324557
2018-02-08 02:41:22 +00:00
Matt Arsenault b02cebf552 AMDGPU: Fix incorrect reordering when inline asm defines LDS address
Defs of operands outside of the instruction's explicit defs need
to be checked.

llvm-svn: 324554
2018-02-08 01:56:14 +00:00
Rafael Espindola 362fccf131 Fix PR36268.
The issue is that clang was first creating a extern_weak hidden GV and
then changing the linkage to external.

Once we know it is not extern_weak we know it must be dso_local.

This patch refactors the code that sets the implicit dso_local to a
helper private function that is used every time we change the linkage
or visibility.

I will commit a patch to clang in a minute.

llvm-svn: 324551
2018-02-08 01:16:05 +00:00
Matt Arsenault c908e3f77a AMDGPU: Don't crash when trying to fold implicit operands
llvm-svn: 324550
2018-02-08 01:12:46 +00:00
Justin Lebar 321b443ef6 [NVPTX] When dying due to a bad address space value, print out the value.
llvm-svn: 324549
2018-02-08 00:50:04 +00:00
Stanislav Mekhanoshin db39b4b0b4 [AMDGPU] Fixed wait count reuse
The code reusing existing wait counts is incorrect since it keeps
adding new operands to an old instruction instead of replacing
the immediate. It was also effectively switched off by the condition
that wait count is not an AMDGPU::S_WAITCNT.

Also switched to BuildMI instead of creating instructions directly.

Differential Revision: https://reviews.llvm.org/D42997

llvm-svn: 324547
2018-02-08 00:18:35 +00:00
Chandler Carruth 0be0cfa65b [x86] Fix nasty bug in the x86 backend that is essentially impossible to
hit from IR but creates a minefield for MI passes.

The x86 backend has fairly powerful logic to try and fold loads that
feed register operands to instructions into a memory operand on the
instruction. This is almost always a good thing, but there are specific
relocated loads that are only allowed to appear in specific
instructions. Notably, R_X86_64_GOTTPOFF is only allowed in `movq` and
`addq`. This patch blocks folding of memory operands using this
relocation unless the target is in fact `addq`.

The particular relocation indicates why we simply don't hit this under
normal circumstances. This relocation is only used for TLS, and it gets
used in very specific ways in conjunction with %fs-relative addressing.
The result is that loads using this relocation are essentially never
eligible for folding into an instruction's memory operands. Unless, of
course, you have an MI pass that inserts usage of such a load. I have
exactly such an MI pass and was greeted by truly mysterious miscompiles
where the linker replaced my instruction with a completely garbage byte
sequence. Go team.

This is the only such relocation I'm aware of in x86, but there may be
others that need to be similarly restricted.

Fixes PR36165.

Differential Revision: https://reviews.llvm.org/D42732

llvm-svn: 324546
2018-02-07 23:59:14 +00:00
Mircea Trofin 06ac8cfbd1 Verify profile data confirms large loop trip counts.
Summary:
Loops with inequality comparers, such as:

   // unsigned bound
   for (unsigned i = 1; i < bound; ++i) {...}

have getSmallConstantMaxTripCount report a large maximum static
trip count - in this case, 0xffff fffe. However, profiling info
may show that the trip count is much smaller, and thus
counter-recommend vectorization.

This change:
- flips loop-vectorize-with-block-frequency on by default.
- validates profiled loop frequency data supports vectorization,
  when static info appears to not counter-recommend it. Absence
  of profile data means we rely on static data, just as we've
  done so far.

Reviewers: twoh, mkuper, davidxl, tejohnson, Ayal

Reviewed By: davidxl

Subscribers: bkramer, llvm-commits

Differential Revision: https://reviews.llvm.org/D42946

llvm-svn: 324543
2018-02-07 23:29:52 +00:00
Craig Topper 37765ff326 [X86] Prune some unreachable 'return SDValue()' paths from LowerSIGN_EXTEND/LowerZERO_EXTEND/LowerANY_EXTEND.
We were doing a lot of whitelisting of what we handle in these routines, but setOperationAction constrains what we can get here. So just add some asserts and prune the unreachable paths.

llvm-svn: 324538
2018-02-07 22:45:38 +00:00
Craig Topper 1db5ebc016 [X86] Remove dead code from EmitTest that looked for an i1 type which should have already been type legalized away. NFC
llvm-svn: 324536
2018-02-07 22:19:26 +00:00
Craig Topper 8baa9c77e3 [X86] When doing callee save/restore for k-registers make sure we don't use KMOVQ on non-BWI targets
If we are saving/restoring k-registers, the default behavior of getMinimalRegisterClass will find the VK64 class with a spill size of 64 bits. This will cause the KMOVQ opcode to be used for save/restore. If we don't have have BWI instructions we need to constrain the class returned to give us VK16 with a 16-bit spill size. We can do this by passing the either v16i1 or v64i1 into getMinimalRegisterClass.

Also add asserts to make sure BWI is enabled anytime we use KMOVD/KMOVQ. These are what caught this bug.

Fixes PR36256

Differential Revision: https://reviews.llvm.org/D42989

llvm-svn: 324533
2018-02-07 21:41:50 +00:00
Craig Topper ce26819f9e [X86] Auto-generate complete checks. NFC
llvm-svn: 324530
2018-02-07 21:29:30 +00:00
Momchil Velikov 74906a467c Revert "[DebugInfo] Improvements to representation of enumeration types (PR36168)"
Revert commit r324489, it broke LLDB tests.

llvm-svn: 324511
2018-02-07 20:28:47 +00:00
Alexey Bataev cd8d6de381 [SLP] Add a tests for PR36280, NFC.
llvm-svn: 324510
2018-02-07 20:11:37 +00:00
Zachary Turner 876dc7124d Generate PDB files for profiling even in Release build.
This patch enables PDB generation for Release build, which has
slightly different optimize option with RelWithDebInfo on windows.

This helps to know slow part of Release build when profiling.

Patch by Takuto Ikuta
Differential Revision: https://reviews.llvm.org/D42632

llvm-svn: 324504
2018-02-07 19:37:52 +00:00
Craig Topper d18430018d [X86] Regenerate test using update_mir_test_checks.py. NFC
llvm-svn: 324497
2018-02-07 18:32:15 +00:00
Rafael Espindola f4e3f3e31c Revert "AMDGPU: Add 32-bit constant address space"
This reverts commit r324487.

It broke clang tests.

llvm-svn: 324494
2018-02-07 18:09:35 +00:00
Jonas Devlieghere 36df7631b4 Revert dsymutil -update commits
Revert "[dsymutil][test] Check the updated dSYM instead of companion file."
Revert "[dsymutil] Upstream update feature."

llvm-svn: 324493
2018-02-07 17:35:27 +00:00
Nirav Dave efed656873 [SelectionDAG] More Aggressibly prune nodes in AddChains. NFCI.
Travel all chains paths to first non-tokenfactor node can be
exponential work. Add simple redundency check to avoid this.
Fixes PR36264.

llvm-svn: 324491
2018-02-07 17:12:34 +00:00
Momchil Velikov c502027efd [DebugInfo] Improvements to representation of enumeration types (PR36168)
This patch is the LLVM part of fixing the issues, described in
https://bugs.llvm.org/show_bug.cgi?id=36168

* The representation of enumerator values in the debug info metadata now
  contains a boolean flag isUnsigned, which determines how the bits of
  the value are interpreted.
* The DW_TAG_enumeration type DIE now always (for DWARF version >= 3)
  includes a DW_AT_type attribute, which refers to the underlying
  integer type, as suggested in DWARFv4 (5.7 Enumeration Type Entries).
* The debug info metadata for enumeration type contains (in flags)
  indication whether this is a C++11 "fixed enum".
* For C++11 enumeration with a fixed underlying type, the DIE also
  includes the DW_AT_enum_class attribute (for DWARF version >= 4).
* Encoding of enumerator constants uses DW_FORM_sdata for signed values
  and DW_FORM_udata for unsigned values, as suggested by DWARFv4 (7.5.4
  Attribute Encodings).

The changes should be backwards compatible:

* the isUnsigned attribute is optional and defaults to false.
* if the underlying type for the enumeration is not available, the
  enumerator values are considered signed.
* the FixedEnum flag defaults to clear.
* the bitcode format for DIEnumerator stores the unsigned flag bit #1 of
  the first record element, so the format does not change and the zero
  previously stored there is consistent with the false default for
  IsUnsigned.

Differential Revision: https://reviews.llvm.org/D42734

llvm-svn: 324489
2018-02-07 16:46:33 +00:00
Marek Olsak 871c30e540 AMDGPU: Add 32-bit constant address space
Note: This is a candidate for LLVM 6.0, because it was planned to be
      in that release but was delayed due to a long review period.

Merge conflict in release_60 - resolution:
    Add "-p6:32:32" into the second (non-amdgiz) string.

Only scalar loads support 32-bit pointers. An address in a VGPR will
fail to compile. That's OK because the results of loads will only be used
in places where VGPRs are forbidden.

Updated AMDGPUAliasAnalysis and used SReg_64_XEXEC.
The tests cover all uses cases we need for Mesa.

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D41651

llvm-svn: 324487
2018-02-07 16:01:00 +00:00
Marek Olsak b2cc77985b AMDGPU: Remove the s_buffer workaround for GFX9 chips
Summary:
I checked the AMD closed source compiler and the workaround is only
needed when x3 is emulated as x4, which we don't do in LLVM.

SMEM x3 opcodes don't exist, and instead there is a possibility to use x4
with the last component being unused. If the last component is out of
buffer bounds and falls on the next 4K page, the hw hangs.

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D42756

llvm-svn: 324486
2018-02-07 16:00:40 +00:00
Simon Pilgrim b4e789e8f6 [X86][AVX] Add PACKSSDW/PACKUSDW support for truncation of clamped values
SSE and shorter vector sizes will have to wait until we can add support for general SMIN/SMAX matching.

llvm-svn: 324485
2018-02-07 15:48:44 +00:00
Jonas Devlieghere b419108203 [dsymutil][test] Check the updated dSYM instead of companion file.
This patch has llvm-dwarfdump check the whole dSYM, rather than the
hard-coded path to the Mach-O companion file. This might be what's
causing the Windows bot to fail.

llvm-svn: 324483
2018-02-07 15:18:21 +00:00
Clement Courbet 9c22d8018c [SLPVectorizer][NFC] Make a loop more readable.
llvm-svn: 324482
2018-02-07 14:26:43 +00:00
Jonas Devlieghere a4b9417b52 [dsymutil] Upstream update feature.
Now that dsymutil can generate accelerator tables, we can upstream the
update logic that, as the name implies, updates the accelerator tables
in an existing dSYM bundle. In combination with `-minimize` this can be
used to remove redundant .debug_(inlines|pubtypes|pubnames).

Differential revision: https://reviews.llvm.org/D42880

llvm-svn: 324480
2018-02-07 13:51:29 +00:00
Simon Pilgrim c90d79f80a [X86] Regenerate atomic i32 tests
llvm-svn: 324479
2018-02-07 13:28:23 +00:00
Benjamin Kramer 6ddafa565b [Orc] Pacify -pedantic.
llvm-svn: 324478
2018-02-07 12:55:01 +00:00
Simon Atanasyan 70498f81de [mips] Support 'y' operand code to print exact log2 of the operand
llvm-svn: 324477
2018-02-07 12:36:39 +00:00
Simon Atanasyan 737bec38d0 [mips] Handle 'M' and 'L' operand codes for memory operands
Both operand codes now work the same way in case of register or memory
operands. It print high-order or low-order word in a double-word
register or memory location.

llvm-svn: 324476
2018-02-07 12:36:33 +00:00
Pavel Labath 524bd9cd24 [BinaryFormat] Remove dangling declaration of DiscriminantString
The implementation of the function was deleted in r324426. This also
removes the declaration.

llvm-svn: 324474
2018-02-07 11:19:29 +00:00
Max Kazantsev b299ade2c5 Re-enable "[SCEV] Make isLoopEntryGuardedByCond a bit smarter"
The failures happened because of assert which was overconfident about
SCEV's proving capabilities and is generally not valid.

Differential Revision: https://reviews.llvm.org/D42835

llvm-svn: 324473
2018-02-07 11:16:29 +00:00
Clement Courbet 10003e31f4 [MergeICmps] Re-commit rL324317 "Enable the MergeICmps Pass by default."
With fixes from rL324341.

Original commit message:

[MergeICmps] Enable the MergeICmps Pass by default.

Summary: Now that PR33325 is fixed, this should always improve the generated code.

Reviewers: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42793

llvm-svn: 324465
2018-02-07 09:58:55 +00:00
Serguei Katkov 69246ca787 Revert [SCEV] Make isLoopEntryGuardedByCond a bit smarter
Revert rL324453 commit which causes buildbot failures.

Differential Revision: https://reviews.llvm.org/D42835

llvm-svn: 324462
2018-02-07 09:10:08 +00:00
George Rimar 545652491d Revert r324455 "[ThinLTO] - Simplify code in ThinLTOBitcodeWriter."
It broke BB:
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/23721

llvm-svn: 324458
2018-02-07 08:46:36 +00:00
Sjoerd Meijer 8c0739347c [ARM] FP16 mov imm pattern
This is a follow up of r324321, adding a match pattern for mov with a FP16
immediate (also fixing operand vfp_f16imm that wasn't even compiling).

Differential Revision: https://reviews.llvm.org/D42973

llvm-svn: 324456
2018-02-07 08:37:17 +00:00
George Rimar 5f133dc99b [ThinLTO] - Simplify code in ThinLTOBitcodeWriter.
Recently introduced convertToDeclaration is very similar
to code used in filterModule function.
Patch reuses it to reduce duplication.

Differential revision: https://reviews.llvm.org/D42971

llvm-svn: 324455
2018-02-07 08:32:35 +00:00
Max Kazantsev dd5ee6f5d9 [SCEV] Make isLoopEntryGuardedByCond a bit smarter
Sometimes `isLoopEntryGuardedByCond` cannot prove predicate `a > b` directly.
But it is a common situation when `a >= b` is known from ranges and `a != b` is
known from a dominating condition. Thia patch teaches SCEV to sum these facts
together and prove strict comparison via non-strict one.

Differential Revision: https://reviews.llvm.org/D42835

llvm-svn: 324453
2018-02-07 07:56:26 +00:00
Michael Zolotukhin cae66ba5f8 The xfailed test from r324448 passed on one of the bots: remove it entirely for now.
llvm-svn: 324451
2018-02-07 06:54:11 +00:00
Serguei Katkov ebc9031b88 [LoopPrediction] Introduce utility function getLatchPredicateForGuard. NFC.
Factor out getting the predicate for latch condition in a guard to
utility function getLatchPredicateForGuard.

llvm-svn: 324450
2018-02-07 06:53:37 +00:00
Chandler Carruth 282ae1632a [x86/retpoline] Make the external thunk names exactly match the names
that happened to end up in GCC.

This is really unfortunate, as the names don't have much rhyme or reason
to them. Originally in the discussions it seemed fine to rely on aliases
to map different names to whatever external thunk code developers wished
to use but there are practical problems with that in the kernel it turns
out. And since we're discovering this practical problems late and since
GCC has already shipped a release with one set of names, we are forced,
yet again, to blindly match what is there.

Somewhat rushing this patch out for the Linux kernel folks to test and
so we can get it patched into our releases.

Differential Revision: https://reviews.llvm.org/D42998

llvm-svn: 324449
2018-02-07 06:16:24 +00:00
Michael Zolotukhin 1713dd5b8d Xfail the test added in r324445 until the underlying issue in LoopSink is fixed.
llvm-svn: 324448
2018-02-07 06:11:50 +00:00
Eugene Leviant 25347ea895 [LegalizeDAG] Truncate condition operand of ISD::SELECT
Differential revision: https://reviews.llvm.org/D42737

llvm-svn: 324447
2018-02-07 05:38:29 +00:00
Tom Stellard 33445765dd AMDGPU/GlobalISel: Mark 32-bit G_FPTOUI as legal
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D42152

llvm-svn: 324446
2018-02-07 04:47:59 +00:00
Michael Zolotukhin e82e83fcce Follow-up for r324429: "[LCSSAVerification] Run verification only when asserts are enabled."
Before r324429 we essentially didn't have a verification of LCSSA, so
no wonder that it has been broken: currently loop-sink breaks it (the
attached test illustrates the failure).

It was detected during a stage2 RA build, so to unbreak it I'm disabling
the check for now.

llvm-svn: 324445
2018-02-07 04:24:44 +00:00
Teresa Johnson f368101567 [ThinLTO] Serialize WithGlobalValueDeadStripping index flag for distributed backends
Summary:
A recent fix to drop dead symbols (r323633) did not work for ThinLTO
distributed backends because we lose the WithGlobalValueDeadStripping
set on the index during the thin link. This patch adds a new flags
record to the bitcode format for the index, and serializes this flag
for the combined index (it would always be 0 for the per-module index
generated by the compile step, so no need to serialize the new flags
record there until/unless we add another flag that applies to the
per-module indexes).

Generally this flag should always be set for the distributed backends,
which are necessarily performed after the thin link. However, if we were
to simply set this flag on the index applied to the distributed backends
(invoked via clang), we would lose the ability to disable dead stripping
via -compute-dead=false for debugging purposes.

Reviewers: grimar, pcc

Subscribers: mehdi_amini, inglorion, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D42799

llvm-svn: 324444
2018-02-07 04:05:59 +00:00
Volkan Keles 5838f7c013 GlobalISel: Always check operand types when executing match table
Summary:
Some of the commands tries to get the register without checking
if the specified operands is a register and causing crash. All commands
should check the type of the operand first and reject if the type is
not expected.

Reviewers: dsanders, qcolombet

Reviewed By: qcolombet

Subscribers: qcolombet, rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D42984

llvm-svn: 324442
2018-02-07 02:44:51 +00:00
Mark Searles 24c92eeb83 [AMDGPU] Suppress redundant waitcnt instrs.
1. Run the memory legalizer prior to the waitcnt pass; keep the policy that the waitcnt pass does not remove any waitcnts within the incoming IR.

2. The waitcnt pass doesn't (yet) track waitcnts that exist prior to the waitcnt pass (it just skips over them); because the waitcnt pass is ignorant of them, it may insert a redundant waitcnt. To avoid this, check the prev instr. If it and the to-be-inserted waitcnt are the same, suppress the insertion. We keep the existing waitcnt under the assumption that whomever, e.g., the memory legalizer, inserted it knows what they were doing.

3. Follow-on work: teach the waitcnt pass to record the pre-existing waitcnts for better waitcnt production.

Differential Revision: https://reviews.llvm.org/D42854

llvm-svn: 324440
2018-02-07 02:21:21 +00:00
Craig Topper 6d9e090a64 [Mips][AMDGPU] Update test cases to not use vector lt/gt compares that can be simplified to an equality/inequality or to always true/false.
For example 'ugt X, 0' can be simplified to 'ne X, 0'. Or 'uge X, 0' is always true.

We already simplify this for scalars in SimplifySetCC, but we don't currently for vectors in SimplifySetCC. D42948 proposes to change that.

llvm-svn: 324436
2018-02-07 00:51:37 +00:00
Matt Arsenault a18b3bcf51 AMDGPU: Select BFI patterns with 64-bit ints
llvm-svn: 324431
2018-02-07 00:21:34 +00:00
Michael Zolotukhin ec8b0ecaab [LCSSAVerification] Run verification only when asserts are enabled.
llvm-svn: 324429
2018-02-07 00:13:08 +00:00
Craig Topper 58ecffd857 [DAGCombiner][AMDGPU][X86] Turn cttz/ctlz into cttz_zero_undef/ctlz_zero_undef if we can prove the input is never zero
X86 currently has a late DAG combine after cttz/ctlz are turned into BSR+BSF+CMOV to detect this and remove the CMOV. But we should be able to do this much earlier and avoid creating the cmov all together.

For the changed AMDGPU test case it appears that previously the i8 cttz was type legalized to i16 which introduced an OR with 256 in order to limit the result to 8 on the widened type. At this point the result is known to never be zero, but nothing checked that. Then operation legalization is told to promote all i16 cttz to i32. This introduces an extend and a truncate and another OR with 65536 to limit the result to 16. With the DAG combiner change we are able to prevent the creation of the second OR since the opcode will have been changed to cttz_zero_undef after the first OR. I the lack of the OR caused the instruction to change to v_ffbl_b32_sdwa

Differential Revision: https://reviews.llvm.org/D42985

llvm-svn: 324427
2018-02-06 23:54:37 +00:00
Adrian Prantl 8c59921ca3 Add DWARF for discriminated unions
n Rust, an enum that carries data in the variants is, essentially, a
discriminated union. Furthermore, the Rust compiler will perform
space optimizations on such enums in some situations. Previously,
DWARF for these constructs was emitted using a hack (a magic field
name); but this approach stopped working when more space optimizations
were added in https://github.com/rust-lang/rust/pull/45225.

This patch changes LLVM to allow discriminated unions to be
represented in DWARF. It adds createDiscriminatedUnionType and
createDiscriminatedMemberType to DIBuilder and then arranges for this
to be emitted using DWARF's DW_TAG_variant_part and DW_TAG_variant.

Note that DWARF requires that a discriminated union be represented as
a structure with a variant part. However, as Rust only needs to emit
pure discriminated unions, this is what I chose to expose on
DIBuilder.

Patch by Tom Tromey!

Differential Revision: https://reviews.llvm.org/D42082

llvm-svn: 324426
2018-02-06 23:45:59 +00:00
Eli Friedman cd07a3e2f9 Place undefined globals in .bss instead of .data
Following up on the discussion from
http://lists.llvm.org/pipermail/llvm-dev/2017-April/112305.html, undef
values are now placed in the .bss as well as null values. This prevents
undef global values taking up potentially huge amounts of space in the
.data section.

The following two lines now both generate equivalent .bss data:

@vals1 = internal unnamed_addr global [20000000 x i32] zeroinitializer, align 4
@vals2 = internal unnamed_addr global [20000000 x i32] undef, align 4 ; previously unaccounted for

This is primarily motivated by the corresponding issue in the Rust
compiler (https://github.com/rust-lang/rust/issues/41315).

Differential Revision: https://reviews.llvm.org/D41705

Patch by varkor!

llvm-svn: 324424
2018-02-06 23:22:14 +00:00
Eli Friedman 98f8bba283 [LivePhysRegs] Fix handling of return instructions.
See D42509 for the original version of this.

Basically, there are two significant changes to behavior here:

- addLiveOuts always adds all pristine registers (even if a block has
no successors).
- addLiveOuts and addLiveOutsNoPristines always add all callee-saved
registers for return blocks (including conditional return blocks).

I cleaned up the functions a bit to make it clear these properties hold.

Differential Revision: https://reviews.llvm.org/D42655

llvm-svn: 324422
2018-02-06 23:00:17 +00:00
Evandro Menezes cb7959fd78 [AArch64] Adjust the cost model for Exynos M3
Fix the modeling of long division and SIMD conversion from integer and
horizontal minimum and maximum.

llvm-svn: 324417
2018-02-06 22:35:47 +00:00
Andrew Kaylor c41499865b Add SelectionDAGDumper support for strict FP nodes
Patch by Kevin P. Neal

llvm-svn: 324416
2018-02-06 22:28:15 +00:00
Lang Hames c998ea3a7e Add OrcJIT dependency for Kaleidoscope Chapter 9.
This should fix the error at
http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-expensive/10421

llvm-svn: 324413
2018-02-06 22:22:10 +00:00
Adrian Prantl c929f7ad42 Fix a crash when emitting DIEs for variable-length arrays
VLAs may refer to a previous DIE to express the DW_AT_count of their
type. Clang generates an artificial "vla_expr" variable for this. If
this DIE hasn't been created yet LLVM asserts. This patch fixes this
by sorting the local variables so that dependencies come before they
are needed. It also replaces the linear scan in DWARFFile with a
std::map, which can be faster.

Differential Revision: https://reviews.llvm.org/D42940

llvm-svn: 324412
2018-02-06 22:17:45 +00:00
Lang Hames 0ccef9ee16 [ORC] Use explicit constructor calls to fix a builder error at
http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/17627

llvm-svn: 324411
2018-02-06 22:17:09 +00:00
Lang Hames e33798f53d [ORC] Remove some unused lambda captures.
llvm-svn: 324410
2018-02-06 21:52:46 +00:00
Craig Topper dfea544c84 [X86] Add test cases that exercise the BSR/BSF optimization combineCMov.
combineCmov tries to remove compares against BSR/BSF if we can prove the input to the BSR/BSF are never zero.

As far as I can tell most of the time codegenprepare despeculates ctlz/cttz and gives us a cttz_zero_undef/ctlz_zero_undef which don't use a cmov.

So the only way I found to trigger this code is to show codegenprepare an illegal type which it won't despeculate.

I think we should be turning ctlz/cttz into ctlz_zero_undef/cttz_zero_undef for these cases before we ever get to operation legalization where the cmov is created. But wanted to add these tests so we don't regress.

llvm-svn: 324409
2018-02-06 21:47:04 +00:00
Sanjay Patel 0cdc273ada [x86] add tests to show demanded bits shortcoming; NFC
llvm-svn: 324408
2018-02-06 21:43:57 +00:00
Lang Hames f3fb98365d [docs] Add out-of-date warnings to the BuildingAJIT tutorial text.
The text will be updated once the ORC API churn dies down.

llvm-svn: 324406
2018-02-06 21:25:20 +00:00
Lang Hames 4b546c9145 [ORC] Start migrating ORC layers to use the new ORC Core.h APIs.
In particular this patch switches RTDyldObjectLinkingLayer to use
orc::SymbolResolver and threads the requried changse (ExecutionSession
references and VModuleKeys) through the existing layer APIs.

The purpose of the new resolver interface is to improve query performance and
better support parallelism, both in JIT'd code and within the compiler itself.

The most visibile change is switch of the <Layer>::addModule signatures from:

Expected<Handle> addModule(std::shared_ptr<ModuleType> Mod,
                           std::shared_ptr<JITSymbolResolver> Resolver)

to:

Expected<Handle> addModule(VModuleKey K, std::shared_ptr<ModuleType> Mod);

Typical usage of addModule will now look like:

auto K = ES.allocateVModuleKey();
Resolvers[K] = createSymbolResolver(...);
Layer.addModule(K, std::move(Mod));

See the BuildingAJIT tutorial code for example usage.

llvm-svn: 324405
2018-02-06 21:25:11 +00:00
Sanjay Patel ffe72034a6 [AArch64] add test to show sub-optimal isel; NFC
llvm-svn: 324404
2018-02-06 21:25:02 +00:00
Sanjay Patel 5fa9b38894 [x86] add test to show missed BMI isel; NFC
llvm-svn: 324403
2018-02-06 21:18:53 +00:00
Daniel Neilson 83cdf6827e [DSE] Upgrade uses of MemoryIntrinic::getAlignment() to new API. (NFC)
Summary:
This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the
DeadStoreElimination pass to cease using the old getAlignment() API of MemoryIntrinsic
in favour of getting dest specific alignments through the new API.

Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278,
rL324384, rL324395 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.

Reference
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

llvm-svn: 324402
2018-02-06 21:18:33 +00:00
Sanjay Patel 87ce2fd82d [TargetLowering] use local variable to reduce duplication; NFCI
llvm-svn: 324401
2018-02-06 21:09:42 +00:00
Sanjay Patel e96a9014ab [TargetLowering] use local variables to reduce duplication; NFCI
llvm-svn: 324397
2018-02-06 20:49:28 +00:00
Daniel Neilson 5fdf08f7c9 [InferAddressSpaces] Update uses of IRBuilder memory intrinsic creation to new API
Summary:
This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the
InferAddressSpaces pass to cease using:
1) The old getAlignment() API of MemoryIntrinsic in favour of getting source & dest specific
alignments through the new API.
2) The old IRBuilder CreateMemCpy/CreateMemMove single-alignment APIs in favour of the new
API that allows setting source and destination alignments independently.

Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278,
rL324384 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.

Reference
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

llvm-svn: 324395
2018-02-06 20:33:36 +00:00
Paul Robinson 1f90029a0d [DWARFv5] Emit .debug_line_str (in a non-DWO file).
This should enable the linker to do string-pooling of path names.

Differential Revision: https://reviews.llvm.org/D42707

llvm-svn: 324393
2018-02-06 20:29:21 +00:00
Krzysztof Parzyszek 8abaf8954a [Hexagon] Extract HVX lowering and selection into HVX-specific files, NFC
llvm-svn: 324392
2018-02-06 20:22:20 +00:00
Krzysztof Parzyszek 97a5095db6 [Hexagon] Lower concat of more than 2 vectors into build_vector
llvm-svn: 324391
2018-02-06 20:18:58 +00:00
Alexey Bataev 1e593fe73e [SLP] Update test checks, NFC.
llvm-svn: 324387
2018-02-06 20:00:05 +00:00
Daniel Neilson a894201313 [InlineFunction] Update deprecated use of IRBuilder CreateMemCpy (NFC)
Summary:
This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the
InlineFunction pass to ceause using the old IRBuilder CreateMemCpy single-alignment API
in favour of the new API that allows setting source and destination alignments independently.

Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.

Reference
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

llvm-svn: 324384
2018-02-06 19:14:31 +00:00
Stanislav Mekhanoshin ce2d428a98 [AMDGPU] removed dead code handling rmw in memory legalizer
It was always using cmpxchg path and in rmw and cmpxchg instructions
are not distinguishable in the BE.

Differential Revision: https://reviews.llvm.org/D42976

llvm-svn: 324383
2018-02-06 19:11:56 +00:00
Krzysztof Parzyszek be253e797b [Hexagon] Don't form new-value jumps from floating-point instructions
Additionally, verify that the register defined by the producer is a
32-bit register.

llvm-svn: 324381
2018-02-06 19:08:41 +00:00
Simon Pilgrim 9f2ae7e2d1 [InstCombine][ValueTracking] Match non-uniform constant power-of-two vectors
Generalize existing constant matching to work with non-uniform constant vectors as well.

Differential Revision: https://reviews.llvm.org/D42818

llvm-svn: 324369
2018-02-06 18:39:23 +00:00
Craig Topper 2f6412c389 [X86] Auto-generate checks. NFC
llvm-svn: 324367
2018-02-06 18:18:49 +00:00
Sjoerd Meijer d2718ba95e [ARM] f16 conversions
This is a follow up of r324321, adding f16 <-> f32 and f16 <-> f64 conversion
match patterns.

Differential Revision: https://reviews.llvm.org/D42954

llvm-svn: 324360
2018-02-06 16:28:43 +00:00
Nirav Dave 27721e8617 [DAG, X86] Improve Dependency analysis when doing multi-node
Instruction Selection

Cleanup cycle/validity checks in ISel (IsLegalToFold,
HandleMergeInputChains) and X86 (isFusableLoadOpStore). Now do a full
search for cycles / dependencies pruning the search when topological
property of NodeId allows.

As part of this propogate the NodeId-based cutoffs to narrow
hasPreprocessorHelper searches.

Reviewers: craig.topper, bogner

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D41293

llvm-svn: 324359
2018-02-06 16:14:29 +00:00
Simon Pilgrim e11c64162c Regenerate vector-urem test. NFCI.
llvm-svn: 324357
2018-02-06 16:10:12 +00:00
Marek Olsak 7d92b7e23a AMDGPU: Fix S_BUFFER_LOAD_DWORD_SGPR moveToVALU
Author: Bas Nieuwenhuizen

https://reviews.llvm.org/D42881

llvm-svn: 324353
2018-02-06 15:17:55 +00:00
Krzysztof Parzyszek 1d52a850b3 [Hexagon] Remove leftover assert
llvm-svn: 324352
2018-02-06 15:15:13 +00:00
Krzysztof Parzyszek 88f11003a0 [Hexagon] Split HVX operations on vector pairs
Vector pairs are legal types, but not every operation can work on pairs.
For those operations that are legal for single vectors, generate a concat
of their results on pair halves.

llvm-svn: 324350
2018-02-06 14:24:57 +00:00
Krzysztof Parzyszek 7b52cf1d7f [Hexagon] Add helper functions to identify single/pair vector types, NFC
llvm-svn: 324349
2018-02-06 14:21:31 +00:00
Krzysztof Parzyszek 69f1d7e370 [Hexagon] Handle lowering of SETCC via setCondCodeAction
It was expanded directly into instructions earlier. That was to avoid
loads from a constant pool for a vector negation: "xor x, splat(i1 -1)".
Implement ISD opcodes QTRUE and QFALSE to denote logical vectors of
all true and all false values, and handle setcc with negations through
selection patterns.

llvm-svn: 324348
2018-02-06 14:16:52 +00:00
Simon Pilgrim ae00a71f55 [X86][SSE] Add PACKUS support for truncation of clamped values
Followup to D42544 that matches PACKUSWB cases for non-AVX512, SSE and PACKUSDW cases will have to wait until we can add support for general SMIN/SMAX matching.

llvm-svn: 324347
2018-02-06 14:07:46 +00:00
Tim Renouf 807ecc3d66 [AMDGPU] do not generate .AMDGPU.config for amdpal os type
Summary:
Now we generate PAL metadata for the amdpal os type, there is no need to
generate the .AMDGPU.config section.

Reviewers: arsenm, nhaehnle, dstuttard

Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D37760

Change-Id: I303c5fad66656ce97293da60621afac6595b4c18
llvm-svn: 324346
2018-02-06 13:39:38 +00:00
Sander de Smalen 81fcf865be [AArch64][SVE] Asm: Add AND_ZI instructions and aliases
Summary: Adds support for the SVE AND instruction with vector and logical-immediate operands, and their corresponding aliases.

Reviewers: fhahn, rengolin, samparker, echristo, aadg, kristof.beyls

Reviewed By: fhahn

Subscribers: aemerson, javed.absar, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D42295

llvm-svn: 324343
2018-02-06 13:13:21 +00:00
Clement Courbet a7a1746865 [MergeICmps] Handle chains with several complex BCE basic blocks.
- Fix condition for detecting that a complex basic block was the first in
   the chain.
 - Add tests.

This was caught by buildbots when submitting rL324319.

llvm-svn: 324341
2018-02-06 12:25:33 +00:00
Simon Pilgrim 90a237bf83 [X86][SSE] Add PACKSS support for truncation of clamped values
Followup to D42544 that matches PACKSSWB cases for non-AVX512, SSE and PACKSSDW cases will have to wait until we can add support for general SMIN/SMAX matching.

llvm-svn: 324339
2018-02-06 12:16:10 +00:00
Hiroshi Inoue ad48d2fe61 [PowerPC] fix up in rL324229, NFC
This patch fixes up my previous commit (add initialization of local variables).

llvm-svn: 324336
2018-02-06 11:34:16 +00:00
Petar Jovanovic 714f241304 [DeadArgumentElim] Set pointer to DISubprogram before calling RAUW. NFC
It is better to update pointer of the DISuprogram before we call RAUW for
still live arguments of the function, because with the change reviewed in
D42541 in RAUW we compare DISubprograms rather than functions itself.

Patch by Djordje Todorovic.

Differential Revision: https://reviews.llvm.org/D42794

llvm-svn: 324335
2018-02-06 11:11:28 +00:00
Alexander Ivchenko 6805004cb1 Fix unused variable warning in release mode. NFC.
llvm-svn: 324330
2018-02-06 09:53:02 +00:00
Oliver Stannard 6df8f43c4d [AArch64] Fix spelling of ICH_ELRSR_EL2 system register
This register was mis-spelled as ICH_ELSR_EL2, but has the correct encoding for
ICH_ELRSR_EL2.

llvm-svn: 324325
2018-02-06 09:39:04 +00:00
Oliver Stannard ee0ac39305 [ARM][AArch64] Add CSDB speculation barrier instruction
This adds the CSDB instruction, which is a new barrier instruction
described by the whitepaper at [1].

This is in encoding space which was previously executed as a NOP, so it is
available for all targets that have the relevant NOP encoding space. This
matches the binutils behaviour for these instructions [2][3].

[1] https://developer.arm.com/support/security-update
[2] https://sourceware.org/ml/binutils/2018-01/msg00116.html
[3] https://sourceware.org/ml/binutils/2018-01/msg00120.html

llvm-svn: 324324
2018-02-06 09:24:47 +00:00
Clement Courbet c2109c8af6 [MergeICmps][NFC] Add more assertions.
llvm-svn: 324323
2018-02-06 09:14:00 +00:00
Sjoerd Meijer 89ea2648bb [ARM] Armv8.2-A FP16 code generation (part 3/3)
This adds most of the FP16 codegen support, but these areas need further work:

- FP16 literals and immediates are not properly supported yet (e.g. literal
  pool needs work),
- Instructions that are generated from intrinsics (e.g. vabs) haven't been
  added.

This will be addressed in follow-up patches.

Differential Revision: https://reviews.llvm.org/D42849

llvm-svn: 324321
2018-02-06 08:43:56 +00:00
Clement Courbet 333be329c4 Revert "[MergeICmps] Enable the MergeICmps Pass by default."
Breaks clang-ppc64be-linux-multistage buildbot.

This reverts commit 515bab711f308c2e8299c49dd8c84ea6a2e0b60e.

llvm-svn: 324319
2018-02-06 08:40:18 +00:00
Clement Courbet 7d09780fa2 [MergeICmps] Enable the MergeICmps Pass by default.
Summary: Now that PR33325 is fixed, this should always improve the generated code.

Reviewers: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42793

llvm-svn: 324317
2018-02-06 07:20:33 +00:00
Hiroshi Inoue ba3585eaf2 [ThinLTO] fix test failure without x86 backend
This patch moves ThinLTOBitcodeWriter/module-asm.ll test case into x86 directory to avoid a test failure when x86 backend is not enabled.

llvm-svn: 324316
2018-02-06 07:03:09 +00:00
Craig Topper 94235556aa [X86] Modify a few tests to not use icmps that are provably false.
These used things like unsigned less than zero, which is always false because there is no unsigned number less than zero.

I plan to teach DAG combine to optimize these so need to stop using them.

llvm-svn: 324315
2018-02-06 06:44:05 +00:00
Konstantin Zhuravlyov 8818d13ed2 AMDGPU/MemoryModel: Fix monotonic atomic loads
Those should have glc bit set for system and agent synchronization scopes

llvm-svn: 324314
2018-02-06 04:06:04 +00:00
Peter Collingbourne 29c6f4833c ThinLTOBitcodeWriter: Do not include module-level inline asm in the merged module.
If the inline asm provides the definition of a symbol, this can result
in duplicate symbol errors.

Differential Revision: https://reviews.llvm.org/D42944

llvm-svn: 324313
2018-02-06 03:29:18 +00:00
Craig Topper ee1f34eb9a [DAGCombiner] Pass the original load to ExtendSetCCUses not the turncate.
Summary:
This method is trying to use the truncate node to find which SETCC operand should be replaced directly with the extended load.

This used to work correctly because all uses of the original load were replaced by the truncate before this function was called. So this was used to effectively bypass the truncate and find the load under it.

All but one of the callers now call this before the truncate has replaced the laod so the setcc doesn't yet use the truncate. To account for this we should pass the original load instead.

I changed the order of that one caller to make this work there too.

I don't have a test case because this is probably hidden by later DAG combines causing the extend and truncate to cancel out. I assume this way is a little more efficient and matches what was originally intended.

Reviewers: RKSimon, spatel, niravd

Reviewed By: niravd

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42878

llvm-svn: 324311
2018-02-06 03:23:27 +00:00
Derek Schuff dc51fb4919 [WebAssembly] Fix test expectations after r324274
Wasm uses the expand action for several FP compare ops, and that behavior
changed.

llvm-svn: 324305
2018-02-06 01:21:17 +00:00
Reid Kleckner acb31b92ee Update test expectations after reverting PLT change
llvm-svn: 324304
2018-02-06 00:56:06 +00:00
Ahmed Charles 646ab87bb4 [RISCV] Add support for %pcrel_lo.
llvm-svn: 324303
2018-02-06 00:55:23 +00:00
Reid Kleckner 697d1bc236 Revert "Don't assume a null GV is local for ELF and MachO."
This reverts r323297.

It breaks building grub.

llvm-svn: 324301
2018-02-06 00:47:14 +00:00
Teresa Johnson 791c98e4c8 [ThinLTO] Remove dead and dropped symbol declarations when possible
Summary:
Removing the dropped symbols will prevent indirect call promotion in the
ThinLTO Backend from adding a new reference to a symbol, which can
result in linker unsats. This can happen when we compile with a sample
profile collected from one binary by used for another, which may have
profiled targets that aren't used in the new binary.

Note that until dropDeadSymbols handles variables and aliases (in
progress), we may not be able to remove the declaration and can still
have an issue.

Reviewers: grimar, davidxl

Subscribers: mehdi_amini, inglorion, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D42816

llvm-svn: 324299
2018-02-06 00:43:39 +00:00
Paul Robinson 7b98be2c19 Fix regex from r324279 more better.
llvm-svn: 324298
2018-02-06 00:43:26 +00:00
Craig Topper 9198efceb8 [X86] Auto-generate complete checks. NFC
llvm-svn: 324295
2018-02-05 23:57:03 +00:00
Craig Topper 9c6c7c5e9b [X86] Relax restrictions on what setcc condition codes can be folded with a sext when AVX512 is enabled.
We now allow all signed comparisons and not equal. The complement that needs to be added for this is no worse than the extend. And the vector output forms of pcmpeq/pcmpgt have better latency than the k-register version on SKX.

llvm-svn: 324294
2018-02-05 23:57:01 +00:00
Peter Collingbourne 3fe815d125 LTO: Also include dso-local bit for calls in ThinLTO cache key.
Differential Revision: https://reviews.llvm.org/D42934

llvm-svn: 324291
2018-02-05 23:46:32 +00:00
Sanjay Patel d7c702b451 [LoopStrengthReduce, x86] don't add cost for a cmp that will be macro-fused (PR35681)
In the motivating case from PR35681 and represented by the macro-fuse-cmp test:
https://bugs.llvm.org/show_bug.cgi?id=35681
...there's a 37 -> 31 byte size win for the loop because we eliminate the big base 
address offsets.

SPEC2017 on Ryzen shows no significant perf difference.

Differential Revision: https://reviews.llvm.org/D42607

llvm-svn: 324289
2018-02-05 23:43:05 +00:00
Francis Visoiu Mistrih 3c748e55d5 [PEI] Fix failing test caused by r324283
X86FrameLowering sets stack size to 0 if redzone is enabled.

llvm-svn: 324285
2018-02-05 23:06:47 +00:00
Francis Visoiu Mistrih 1c55aefd1e [PEI][NFC] Move StackSize opt-remark code next to -warn-stack code
This allows us to make sure we're always having the same sizes in both
remarks and warnings.

llvm-svn: 324283
2018-02-05 22:46:54 +00:00
Paul Robinson ea27528b0a Fix Windows bots for test from r324270.
llvm-svn: 324279
2018-02-05 22:30:00 +00:00
Daniel Neilson 3c23f6668b [LowerMemIntrinsics] Update uses of deprecated MemIntrinsic::getAlignment API (NFC)
Summary:
This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the
LowerMemIntrinsics pass to cease using the old getAlignment() API of MemoryIntrinsic in
favour of getting source & dest specific alignments through the new API.

Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.

Reference
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

llvm-svn: 324278
2018-02-05 22:23:58 +00:00
Sanjay Patel 49aafec2e6 [InstCombine] don't try to evaluate instructions with >1 use (revert r324014)
This example causes a compile-time explosion:

define i16 @foo(i16 %in) {
  %x = zext i16 %in to i32
  %a1 = mul i32 %x, %x
  %a2 = mul i32 %a1, %a1
  %a3 = mul i32 %a2, %a2
  %a4 = mul i32 %a3, %a3
  %a5 = mul i32 %a4, %a4
  %a6 = mul i32 %a5, %a5
  %a7 = mul i32 %a6, %a6
  %a8 = mul i32 %a7, %a7
  %a9 = mul i32 %a8, %a8
  %a10 = mul i32 %a9, %a9
  %a11 = mul i32 %a10, %a10
  %a12 = mul i32 %a11, %a11
  %a13 = mul i32 %a12, %a12
  %a14 = mul i32 %a13, %a13
  %a15 = mul i32 %a14, %a14
  %a16 = mul i32 %a15, %a15
  %a17 = mul i32 %a16, %a16
  %a18 = mul i32 %a17, %a17
  %a19 = mul i32 %a18, %a18
  %a20 = mul i32 %a19, %a19
  %a21 = mul i32 %a20, %a20
  %a22 = mul i32 %a21, %a21
  %a23 = mul i32 %a22, %a22
  %a24 = mul i32 %a23, %a23
  %T = trunc i32 %a24 to i16
  ret i16 %T
}

 

llvm-svn: 324276
2018-02-05 21:50:32 +00:00
Krzysztof Parzyszek fee3f419ae [SDAG] Legalize all CondCodes by inverting them and/or swapping operands
Differential Revision: https://reviews.llvm.org/D42788

llvm-svn: 324274
2018-02-05 21:27:16 +00:00
Daniel Neilson 8acd8b036c [SimplifyLibCalls] Update from deprecated IRBuilder API for creating memory intrinsics (NFC)
Summary:
This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the
SimplifyLibCalls pass to cease using the old IRBuilder createMemCpy/createMemMove
single-alignment APIs in favour of the new API that allows setting source and destination
alignments independently.

Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886, rL323891, r3L24148 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.

Reference
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

llvm-svn: 324273
2018-02-05 21:23:22 +00:00
Paul Robinson 0a22709f06 [DWARF] Regularize dumping strings from line tables.
The major visible difference here is that in line-table dumps,
directory and file names are wrapped in double-quotes; previously,
directory names got single quotes and file names were not quoted at
all.

The improvement in this patch is that when a DWARF v5 line table
header has indirect strings, in a verbose dump these will all have
their section[offset] printed as well as the name itself.  This
matches the format used for dumping strings in the .debug_info
section.

Differential Revision: https://reviews.llvm.org/D42802

llvm-svn: 324270
2018-02-05 20:43:15 +00:00
Sanjay Patel 1c84dd9a8f [InstCombine] add test corresponding to r324252 (PR36225); NFC
As PR36225 shows, we definitely don't want to enable the
canEvaluate* logic with phis. 

There's still a question of whether we should just revert 
r324014 completely because it exposes a compile-time sinkhole
(although that problem might exist independently).

llvm-svn: 324266
2018-02-05 19:59:52 +00:00
Daniel Neilson 01fb57e7a0 Add release note on change to memcpy/memmove/memset builtin signatures
Summary:
The signatures for the builtins @llvm.memcpy, @llvm.memmove, and @llvm.memset
where changed in rL322965. The number of arguments has decreased from five to
four with the removal of the alignment argument. Alignment is now conveyed
by supplying the align parameter attribute on the destination and/or source of
the cpy/move/set.

llvm-svn: 324265
2018-02-05 19:39:38 +00:00
Nirav Dave eedb663221 [X86] Teach DAG unfoldMemoryOperand to reconvert CMPs to tests
Summary:
Copy MI-level cmp->test conversion to SelectionDAG-level memory unfold.
This fixes a regression from upcoming D41293 change.

Reviewers: craig.topper, RKSimon

Reviewed By: craig.topper

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D42808

llvm-svn: 324261
2018-02-05 18:58:58 +00:00
Craig Topper 9a06f24704 [X86] Artificially lower the complexity of the scalar ANDN patterns so that AND with immediate will match first.
This allows the immediate to folded into the and instead of being forced to move into a register. This can sometimes result in shorter encodings since the and can sign extend an immediate.

This also allows us to match an and to a movzx after a not.

This can cause an extra move if the input to the separate NOT has an additional user which requires a copy before the NOT.

llvm-svn: 324260
2018-02-05 18:31:04 +00:00
Sanjay Patel e9a153f414 [InstCombine] add unsigned saturation subtraction canonicalizations
This is the instcombine part of unsigned saturation canonicalization.
Backend patches already commited: 
https://reviews.llvm.org/D37510
https://reviews.llvm.org/D37534

It converts unsigned saturated subtraction patterns to forms recognized 
by the backend:
(a > b) ? a - b : 0 -> ((a > b) ? a : b) - b)
(b < a) ? a - b : 0 -> ((a > b) ? a : b) - b)
(b > a) ? 0 : a - b -> ((a > b) ? a : b) - b)
(a < b) ? 0 : a - b -> ((a > b) ? a : b) - b)
((a > b) ? b - a : 0) -> - ((a > b) ? a : b) - b)
((b < a) ? b - a : 0) -> - ((a > b) ? a : b) - b)
((b > a) ? 0 : b - a) -> - ((a > b) ? a : b) - b)
((a < b) ? 0 : b - a) -> - ((a > b) ? a : b) - b)

Patch by Yulia Koval!

Differential Revision: https://reviews.llvm.org/D41480

llvm-svn: 324255
2018-02-05 17:53:29 +00:00
Peter Collingbourne b4edfb9af9 LTO: Include dso-local bit in ThinLTO cache key.
Differential Revision: https://reviews.llvm.org/D42713

llvm-svn: 324253
2018-02-05 17:17:51 +00:00
Sanjay Patel 2329fcd293 [InstCombine] only allow narrow/wide evaluation of values with >1 use if that user is a binop
There was a logic hole in D42739 / rL324014 because we're not accounting for select and phi
instructions that might have repeated operands. This is likely a source of an infinite loop.
I haven't manufactured a test case to prove that, but it should be safe to speculatively limit
this transform to binops while we try to create that test.

llvm-svn: 324252
2018-02-05 17:16:50 +00:00
Krzysztof Parzyszek e3ef6e0706 [Hexagon] Memoize instruction positions in BitTracker
llvm-svn: 324250
2018-02-05 17:12:07 +00:00
Craig Topper 57e0643160 [X86] Teach X86DAGToDAGISel::shrinkAndImmediate to preserve upper 32 zeroes of a 64 bit mask.
If the upper 32 bits of a 64 bit mask are all zeros, we have special isel patterns to use a 32-bit and instead of a 64-bit and by relying on the impliciting zeroing of 32 bit ops.

This patch teachs shrinkAndImmediate not to break that optimization.

Differential Revision: https://reviews.llvm.org/D42899

llvm-svn: 324249
2018-02-05 16:54:07 +00:00
Hans Wennborg 22db17cf43 Revert r323472 "[Debug] Add dbg.value intrinsics for PHIs created during LCSSA."
This broke the Chromium build; see PR36238.

> This patch is an enhancement to propagate dbg.value information when
> Phis are created on behalf of LCSSA.  I noticed a case where a value
> carried across a loop was reported as <optimized out>.
>
> Specifically this case:
>
>   int bar(int x, int y) {
>     return x + y;
>   }
>
>   int foo(int size) {
>     int val = 0;
>     for (int i = 0; i < size; ++i) {
>       val = bar(val, i);  // Both val and i are correct
>     }
>     return val; // <optimized out>
>   }
>
> In the above case, after all of the interesting computation completes
> our value is reported as "optimized out." This change will add a
> dbg.value to correct this.
>
> This patch also moves the dbg.value insertion routine from
> LoopRotation.cpp into Local.cpp, so that we can share it in both places
> (LoopRotation and LCSSA).
>
> Patch by Matt Davis!
>
> Differential Revision: https://reviews.llvm.org/D42551

llvm-svn: 324247
2018-02-05 16:10:42 +00:00
Benjamin Kramer 45aa89eb7f BitTracker.h needs a full definition of MachineInstr, so include the defining file.
Patch by Dean Sturtevant!

Differential Revision: https://reviews.llvm.org/D42907

llvm-svn: 324245
2018-02-05 15:56:24 +00:00
Krzysztof Parzyszek ef20447fa0 [Hexagon] Forgot about HexagonISD::VZERO in selecting const vectors
llvm-svn: 324244
2018-02-05 15:52:54 +00:00
Krzysztof Parzyszek 67079be139 [Hexagon] Don't use garbage mask in HvxSelector::shuffp2
The function shuffp2 was breaking up a wide shuffle into a pair of
narrower ones, except that the narrower shuffle masks were actually
uninitialized.

llvm-svn: 324243
2018-02-05 15:46:41 +00:00
Teresa Johnson 5a95c47730 [ThinLTO] Convert dead alias to declarations
Summary:
This complements the fixes in r323633 and r324075 which drop the
definitions of dead functions and variables, respectively.

Fixes PR36208.

Reviewers: grimar, rafael

Subscribers: mehdi_amini, llvm-commits, inglorion

Differential Revision: https://reviews.llvm.org/D42856

llvm-svn: 324242
2018-02-05 15:44:27 +00:00
Krzysztof Parzyszek 02947b7112 [Hexagon] Use V6_vmpyih for halfword multiplication
Unlike V6_vmpyhv, it produces the result in the exact form that is
expected without the need for a shuffle.

llvm-svn: 324241
2018-02-05 15:40:06 +00:00
Dmitry Preobrazhensky 0a1ff464e1 [AMDGPU][MC] Corrected dst/data size for MIMG opcodes with d16 modifier
See bug 36154: https://bugs.llvm.org/show_bug.cgi?id=36154

Differential Revision: https://reviews.llvm.org/D42847

Reviewers: cfang, artem.tamazov, arsenm
llvm-svn: 324237
2018-02-05 14:18:53 +00:00
Igor Laevsky 8bf95e250e [llvm-opt-fuzzer] Fix build after rL324225
llvm-svn: 324232
2018-02-05 12:47:40 +00:00
Dmitry Preobrazhensky e3271aee44 [AMDGPU][MC] Added validation of d16 and r128 modifiers of MIMG opcodes
See bugs 36094, 36095:
  https://bugs.llvm.org/show_bug.cgi?id=36094
  https://bugs.llvm.org/show_bug.cgi?id=36095

Differential Revision: https://reviews.llvm.org/D42692

Reviewers: vpykhtin, artem.tamazov, arsenm
llvm-svn: 324231
2018-02-05 12:45:43 +00:00
Hiroshi Inoue c5ab1ab797 [PowerPC] Check hot loop exit edge in PPCCTRLoops
PPCCTRLoops transform loops using mtctr/bdnz instructions if loop trip count is known and big enough to compensate for the cost of mtctr.
But if there is a loop exit edge which is known to be frequently taken (by builtin_expect or by PGO), we should not transform the loop to avoid the cost of mtctr instruction. Here is an example of a loop with hot exit edge:

for (unsigned i = 0; i < TripCount; i++) {
  // do something
  if (__builtin_expect(check(), 1))
    break;
  // do something
}

Differential Revision: https://reviews.llvm.org/D42637

llvm-svn: 324229
2018-02-05 12:25:29 +00:00
Clement Courbet eb4f5d2890 [CodeGenSchedule][NFC] Always emit ProcResourceUnits.
Summary:
Right now only the ProcResourceUnits that are directly referenced by
instructions are emitted. This change emits all of them, so that
analysis passes can use the information.
This has no functional impact. It typically adds a few entries (e.g. 4
for X86/haswell) to the generated ProcRes table.

Reviewers: gchatelet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42903

llvm-svn: 324228
2018-02-05 12:23:51 +00:00
Igor Laevsky 14c979da32 [llvm-opt-fuzzer] Avoid adding incorrect inputs to the fuzzer corpus
Differential Revision: https://reviews.llvm.org/D42414

llvm-svn: 324225
2018-02-05 11:05:47 +00:00
James Henderson 10392cdbf7 Fix more print format specifiers in debug_rnglists dumping
See also r324096.

I have made the assumption that DWARF64 is not an issue for the time
being with these fixes.

llvm-svn: 324223
2018-02-05 10:47:13 +00:00
Serguei Katkov 276b32bb14 Revert [SimplifyCFG] Relax restriction for folding unconditional branches
The patch causes the failure of the test
compiler-rt/test/profile/Linux/counter_promo_nest.c

To unblock buildbot, revert the patch while investigation is in progress.

Differential Revision: https://reviews.llvm.org/D42691

llvm-svn: 324214
2018-02-05 09:05:43 +00:00
Craig Topper 5a2bd99a9e [X86] Add isel patterns for selecting masked SUBV_BROADCAST with bitcasts. Remove combineBitcastForMaskedOp.
Add test cases for the merge masked versions to make sure we have all those covered.

llvm-svn: 324210
2018-02-05 08:37:37 +00:00
Max Kazantsev f7667483c1 [NFC] Add tests for PR35743
llvm-svn: 324209
2018-02-05 08:09:49 +00:00
Serguei Katkov 6e93980e82 [SimplifyCFG] Relax restriction for folding unconditional branches
The commit rL308422 introduces a restriction for folding unconditional
branches. Specifically if empty block with unconditional branch leads to
header of the loop then elimination of this basic block is prohibited.
However it seems this condition is redundantly strict.
If elimination of this basic block does not introduce more back edges
then we can eliminate this block.

The patch implements this relax of restriction.

Reviewers: efriedma, mcrosier, pacxx, hsung, davidxl	
Reviewed By: pacxx
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42691

llvm-svn: 324208
2018-02-05 07:56:43 +00:00
Craig Topper 6ff5eb5dd5 [X86] Remove unused lambda. NFC
llvm-svn: 324206
2018-02-05 06:56:33 +00:00
Craig Topper 25ceba7f30 [X86] Remove X86ISD::SHUF128 from combineBitcastForMaskedOp. Use isel patterns instead.
We always created X86ISD::SHUF128 with a 64-bit element type so we can use isel patterns to detect a bitconvert to 32-bit to handle masking.

The test changes are because we also match the bitconvert even if there is no masking. This leads to unnecessary isel pattern, but it requires more multiclass hackery in tablegen to get rid of it.

llvm-svn: 324205
2018-02-05 06:00:23 +00:00
Serguei Katkov ec7029c286 Re-apply [SCEV] Fix isLoopEntryGuardedByCond usage
ScalarEvolution::isKnownPredicate invokes isLoopEntryGuardedByCond without check
that SCEV is available at entry point of the loop. It is incorrect and fixed by patch.

To bugs additionally fixed:
assert is moved after the check whether loop is not a nullptr.
Usage of isLoopEntryGuardedByCond in ScalarEvolution::isImpliedCondOperandsViaNoOverflow
is guarded by isAvailableAtLoopEntry.

Reviewers: sanjoy, mkazantsev, anna, dorit, reames
Reviewed By: mkazantsev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42417

llvm-svn: 324204
2018-02-05 05:49:47 +00:00
Craig Topper 0398ccd0c9 [X86] Auto-generate full checks. NFC
llvm-svn: 324202
2018-02-04 23:48:51 +00:00
Zvi Rackover 2401d20285 X86 Tests: Add shuffle that can be improved by widening elements. NFC
To be improved by D42044

llvm-svn: 324200
2018-02-04 19:31:14 +00:00
Florian Hahn 642637aab4 [PartialInliner] Update test (NFC).
llvm-svn: 324199
2018-02-04 18:40:24 +00:00
Florian Hahn 8f804fc07d [InlineFunction] Set arg attrs even if there only are VarArg attrs.
When using the partial inliner, we might have attributes for forwarded
varargs, but the CodeExtractor does not create an empty argument
attribute set for regular arguments in that case, because it does not know
of the additional arguments. So in case we have attributes for VarArgs, we
also have to make sure we create (empty) attributes for all regular arguments.

This fixes PR36210.

llvm-svn: 324197
2018-02-04 18:27:47 +00:00
Sander de Smalen 5b691a10c0 [TableGen][AsmMatcherEmitter] Fix tied-constraint checking for InstAliases
Summary:
This is a bit of a reimplementation the work done in
https://reviews.llvm.org/D41446, since that patch only really works for
tied operands of instructions, not aliases.

Instead of checking the constraints based on the matched instruction's opcode,
this patch uses the match-info's convert function to check the operand
constraints for that specific instruction/alias.
This is based on the matched operands for the instruction, not the
resulting opcode of the MCInst.

This patch adds the following enum/table to the *GenAsmMatcher.inc file:
  enum {
    Tie0_1_1,
    Tie0_1_2,
    Tie0_1_5,
    ...
  };

  const char TiedAsmOperandTable[][3] = {
    /* Tie0_1_1 */ { 0, 1, 1 },
    /* Tie0_1_2 */ { 0, 1, 2 },
    /* Tie0_1_5 */ { 0, 1, 5 },
    ...
  };

And it is referenced directly in the ConversionTable, like this:
static const uint8_t ConversionTable[CVT_NUM_SIGNATURES][13] = {
  ...
  { CVT_95_addRegOperands, 1,
    CVT_95_addRegOperands, 2,
    CVT_Tied, Tie0_1_5,
    CVT_95_addRegOperands, 6, CVT_Done },
  ...


The Tie0_1_5 (and corresponding table) encodes that:
* Result operand 0 is the operand to copy (which is e.g. done when
  building up the operands to the MCInst in convertToMCInst())
* Asm operands 1 and 5 should be the same operands (which is checked
  in checkAsmTiedOperandConstraints()).

Reviewers: olista01, rengolin, fhahn, craig.topper, echristo, apazos, dsanders

Reviewed By: olista01

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42293

llvm-svn: 324196
2018-02-04 16:24:17 +00:00
Chad Rosier a097bc69df [LV] Use Demanded Bits and ValueTracking for reduction type-shrinking
The type-shrinking logic in reduction detection, although narrow in scope, is
also rather ad-hoc, which has led to bugs (e.g., PR35734). This patch modifies
the approach to rely on the demanded bits and value tracking analyses, if
available. We currently perform type-shrinking separately for reductions and
other instructions in the loop. Long-term, we should probably think about
computing minimal bit widths in a more complete way for the loops we want to
vectorize.

PR35734
Differential Revision: https://reviews.llvm.org/D42309

llvm-svn: 324195
2018-02-04 15:42:24 +00:00
Craig Topper 8d511a65af [X86] Add DAG combine to turn (bitcast (and/or/xor (bitcast X), Y)) -> (and/or/xor X, (bitcast Y)) when casting between GPRs and mask operations.
This reduces the number of transitions between k-registers and GPRs, reducing the number of instructions.

There's still some room for improvement to remove more transitions, but this is a good start.

llvm-svn: 324184
2018-02-04 01:43:48 +00:00
Craig Topper 17d99f1df4 [X86] Remove unused function argument. NFC
llvm-svn: 324183
2018-02-04 01:43:44 +00:00
Craig Topper fc5bd023dd [DAGCombiner] When folding fold (sext/zext (and/or/xor (sextload/zextload x), cst)) -> (and/or/xor (sextload/zextload x), (sext/zext cst)) make sure we check the legality of the full extended load.
Summary:
If the load is already an extended load we should be using the memory VT for the legality check, not just the VT of the current extension.

I don't have a test case, just noticed it while investigating some load extension improvements.

Reviewers: RKSimon, spatel, niravd

Reviewed By: niravd

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42783

llvm-svn: 324181
2018-02-03 23:00:31 +00:00
Simon Pilgrim 7aec5063a5 [MIPS] Regenerate vector tests with update script
Hopefully help make this a lot more maintainable

llvm-svn: 324180
2018-02-03 22:11:22 +00:00
Simon Pilgrim 4df6499f10 [SelectionDAG] Don't use simple VT in generic shuffle code
Better to assume that any value type may be commuted, not just MVTs.

No test case right now, but discovered while investigating possible shuffle combines.

llvm-svn: 324179
2018-02-03 21:34:42 +00:00
Simon Pilgrim 8fb1dd8a1d [X86][SSE] Don't chain shuffles together in schedule tests
This is necessary to prevent the shuffles from being combined/simplified in an upcoming patch.

llvm-svn: 324178
2018-02-03 21:20:19 +00:00
Craig Topper 071ad9c6e0 [X86] Remove and autoupgrade kand/kandn/kor/kxor/kxnor/knot intrinsics.
Clang already stopped using these a couple months ago.

The test cases aren't great as there is nothing forcing the operations to stay in k-registers so some of them moved back to scalar ops due to the bitcasts being moved around.

llvm-svn: 324177
2018-02-03 20:18:25 +00:00
David Green 9688ed61fe Remove unneeded -debug argument from new test
llvm-svn: 324176
2018-02-03 17:33:50 +00:00
Lang Hames 371412b1ec [ORC] Rename NullResolver to NullLegacyResolver.
This resolver conforms to the LegacyJITSymbolResolver interface, and will be
replaced with a null-returning resolver conforming to the newer
orc::SymbolResolver interface in the near future. This patch renames the class
to avoid a clash.

llvm-svn: 324175
2018-02-03 16:52:48 +00:00
David Green 7174023f57 [InstCombine] Allow common type conversions to i8/i16/i32
This, in instcombine, allows conversions to i8/i16/i32 (very
common cases) even if the resulting type is not legal according
to the data layout. This can often open up extra combine
opportunities.

Differential Revision: https://reviews.llvm.org/D42424

llvm-svn: 324174
2018-02-03 16:51:03 +00:00
Alex Bradbury 7c11527b03 [RISCV] Update two RISCV codegen tests after rL323991
From the discussion in D41835 it looks possible the change will be backed out, 
but for now let's fix the RISCV tests.

llvm-svn: 324172
2018-02-03 13:02:30 +00:00
Simon Pilgrim 3d5ee7aa41 Fix MSVC signed/unsigned comparison warning. NFCI.
llvm-svn: 324171
2018-02-03 12:38:56 +00:00
Richard Smith 60707939de Fix incorrect usage of std::is_assignable.
We want to check that we can assign to an lvalue here, not a prvalue.

llvm-svn: 324152
2018-02-02 22:29:54 +00:00
Daniel Neilson 38af2eed51 [InstCombine] Use getDestAlignment in SimplifyMemSet (NFC)
Summary:
Small NFC change to change the name of the function used getting and setting
the alignment of a memset.

llvm-svn: 324148
2018-02-02 22:03:03 +00:00
Craig Topper fae8788cfa [X86] Prefer to create a ISD::SETCC over X86ISD::PCMPEQ in combineVectorSizedSetCCEquality.
This is running pre-legalize, we should try to use target independent nodes. This will give the best opportunity for target independent optimizations.

llvm-svn: 324147
2018-02-02 21:59:46 +00:00
Sanjay Patel a767ee5af0 [InstCombine] make sure tests are providing coverage for the stated pattern; NFC
Without extra instructions and uses, swapMayExposeCSEOpportunities() would change
the icmp (as seen in the check lines), so we were not actually testing patterns 
that should be handled by D41480.

llvm-svn: 324143
2018-02-02 21:40:54 +00:00
Craig Topper 10aa254ecd [X86] Pass SDLoc by const reference in a few more places in X86ISelLowering.cpp. NFC
llvm-svn: 324135
2018-02-02 20:32:00 +00:00
Craig Topper e7e147f52c [X86] Add avx512 command line to ptest.ll to demonstrate that 512-bit vectors are not handled by LowerVectorAllZeroTest.
llvm-svn: 324130
2018-02-02 20:12:45 +00:00
Craig Topper bd2f6e9570 Partially revert r324124 [X86] Add tests for missed opportunities to use ptest for all ones comparison.
Turns out I misunderstood the flag behavior of PTEST because I read the documentation for KORTEST which is different than PTEST/KTEST and made a bad assumption.

Keep the test rename though cause that's useful.

llvm-svn: 324129
2018-02-02 20:12:44 +00:00
Aditya Nandakumar 58eb183128 [GISel][NFC]: Move RegisterBankInfo::getSizeInBits into TargetRegisterInfo.
llvm-svn: 324125
2018-02-02 19:42:07 +00:00
Craig Topper 9c936f88b1 [X86] Add tests for missed opportunities to use ptest for all ones comparison.
Also rename the test from pr12312.ll to ptest.ll so its more recognizable.

llvm-svn: 324124
2018-02-02 19:34:10 +00:00
Alex Denisov a07169e3cb Fix typo
llvm-svn: 324123
2018-02-02 19:20:37 +00:00
Sanjay Patel 1ea8697cdf [InstCombine] simplify logic for swapMayExposeCSEOpportunities; NFCI
llvm-svn: 324122
2018-02-02 19:08:12 +00:00
Sanjay Patel 4ccae1cb2b [InstCombine] fix typos, formatting; NFC
llvm-svn: 324118
2018-02-02 18:39:05 +00:00
Amara Emerson 3838ed0370 [AArch64][GlobalISel] Use getRegClassForTypeOnBank() in selectCopy.
Differential Revision: https://reviews.llvm.org/D42832

llvm-svn: 324110
2018-02-02 18:03:30 +00:00
Sanjay Patel 5b8cb26bcc [InstCombine] add baseline tests for unsigned saturated sub (D41480); NFC
llvm-svn: 324109
2018-02-02 17:43:16 +00:00
Craig Topper e538fc74d4 [X86] Remove checks for FeatureAVX512 from the X86 assembly parser. Remove mcpu/mattr from assembly test command lines.
Summary:
We should always be able to accept AVX512 registers and instructions in llvm-mc. The only subtarget mode that should be checked is 16-bit vs 32-bit vs 64-bit mode.

I've also removed all the mattr/mcpu lines from test RUN lines to be consistent with this. Most were due to AVX512, but a few were for other features.

Fixes PR36202

Reviewers: RKSimon, echristo, bkramer

Reviewed By: echristo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42824

llvm-svn: 324106
2018-02-02 17:02:58 +00:00
Fangrui Song 3823fc46f5 Make utils/UpdateTestChecks/common.py Python 2/3 compatible and fix print statements.
Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42674

llvm-svn: 324104
2018-02-02 16:41:07 +00:00
Yaxun Liu 2a22c5deff [AMDGPU] Switch to the new addr space mapping by default
This requires corresponding clang change.

Differential Revision: https://reviews.llvm.org/D40955

llvm-svn: 324101
2018-02-02 16:07:16 +00:00
Clement Courbet a43e9653bb Add llc tests for comparison chains.
See https://reviews.llvm.org/D42793#996098 for context.

llvm-svn: 324099
2018-02-02 15:54:17 +00:00
James Henderson 2465633234 Fix type sizes that were causing incorrect string formatting
llvm-svn: 324096
2018-02-02 15:09:31 +00:00
George Rimar 0410afd1a3 [ThinLTO] - Add comment. NFC.
Was requested during review of D42798.

llvm-svn: 324095
2018-02-02 15:06:09 +00:00
Simon Pilgrim 1cb9bc6b6c [X86][SSE] Force double domain for SHUFPD stack folding tests
llvm-svn: 324094
2018-02-02 14:55:20 +00:00
Ivan A. Kosarev ab68bbe515 [Analysis] Support aggregate access types in TBAA
This patch implements analysis for new-format TBAA access tags
with aggregate types as their final access types.

Differential Revision: https://reviews.llvm.org/D41501

llvm-svn: 324092
2018-02-02 14:09:22 +00:00
James Henderson c2dfd502a2 Add missing new files from r324077
Differential Revision: https://reviews.llvm.org/D42481

llvm-svn: 324078
2018-02-02 12:45:57 +00:00
James Henderson 3fcc74500a [DWARF v5] Add limited support for dumping .debug_rnglists
This change adds support to llvm-dwarfdump for dumping DWARF5
.debug_rnglists sections in regular ELF files.

It is not complete, in that several DW_RLE_* encodings are currently
not supported, but does dump the headert and the basic ranges for
DW_RLE_start_length and DW_RLE_start_end encodings.

Obvious next steps are to add verbose dumping that dumps the raw
encodings, rather than the interpreted contents, to add -verify support
of the section (e.g. to show that the correct number of offsets are
specified), add dumping of .debug_rnglists.dwo, and to add support for
other encodings.

Reviewed by: dblaikie, JDevlieghere

Differential Revision: https://reviews.llvm.org/D42481

llvm-svn: 324077
2018-02-02 12:35:52 +00:00
George Rimar f5de271a92 [LTO] - Simplify. NFC.
llvm-svn: 324076
2018-02-02 12:21:26 +00:00
George Rimar 76c5fae2a0 [ThinLTO] - Fix for "ThinLTO inlines variables that should be discarded".
This fixes PR36187.

Patch teaches ThinLTO to drop non-prevailing variables, 
just like we recently did for functions (in r323633).

Differential revision: https://reviews.llvm.org/D42798

llvm-svn: 324075
2018-02-02 12:17:33 +00:00
Sjoerd Meijer 986d64ad73 [ARM] fixed some tabs/whitespaces in test. NFC.
llvm-svn: 324074
2018-02-02 11:51:06 +00:00
Mikael Holmen b69e5b7393 [GlobalOpt] Include padding in debug fragments
Summary:
When creating the debug fragments for a SRA'd variable, use the types'
allocation sizes. This fixes issues where the pass would emit too small
fragments, placed at the wrong offset, for padded types.

An example of this is long double on x86. The type is represented using
x86_fp80, which is 10 bytes, but the value is aligned to 12/16 bytes.
The padding is included in the type's DW_AT_byte_size attribute;
therefore, the fragments should also include that. Newer GCC releases
(I tested 7.2.0) emit 12/16-byte pieces for long double. Earlier
releases, e.g. GCC 5.5.0, behaved as LLVM did, i.e. by emitting a
10-byte piece, followed by an empty 2/6-byte piece for the padding.

Failing to cover all `DW_AT_byte_size' bytes of a value with non-empty
pieces results in the value being printed as <optimized out> by GDB.

Patch by: David Stenberg

Reviewers: aprantl, JDevlieghere

Reviewed By: aprantl, JDevlieghere

Subscribers: llvm-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D42807

llvm-svn: 324066
2018-02-02 10:34:13 +00:00
Jonas Paulsson 422dfbf7cc [SelectionDAG] Consider endianness in scalarizeVectorStore().
When handling vectors with non byte-sized elements, reverse the order of the
elements in the built integer if the target is Big-Endian.

SystemZ tests updated.

Review: Eli Friedman, Ulrich Weigand.
https://reviews.llvm.org/D42786

llvm-svn: 324063
2018-02-02 08:48:02 +00:00
Jonas Paulsson ad089fe46e [SelectionDAG] Add an assert in getNode() for EXTRACT_VECTOR_ELT.
When getNode() is called to create an EXTRACT_VECTOR_ELT, assert that
the result VT is at least as wide as the vector element type.

Review: Eli Friedman
llvm-svn: 324061
2018-02-02 08:21:53 +00:00
Jonas Paulsson 0e50b6ed80 [SystemZ] Update test case (NFC)
test/CodeGen/SystemZ/vec-trunc-to-i1.ll was marked as a temporary
FAIL when it was previously updated when it needed one more COPY.
This was however wrong, since the loop body had been reduced
significantly, and it was actually an improvement.

Review: Ulrich Weigand.
llvm-svn: 324060
2018-02-02 07:52:02 +00:00
Shiva Chen 53489ada12 [RISCV] Add ELFObjectFileBase::getRISCVFeatures let llvm-objdump could get RISCV target feature
llvm-objdump could get C feature by ELF::EF_RISCV_RVC e_flag,
so then we don't have to add -mattr=+c on the command line.

Differential Revision: https://reviews.llvm.org/D42629

llvm-svn: 324058
2018-02-02 06:01:02 +00:00
Craig Topper 76c5ce5184 [X86] Legalize (v64i1 (bitcast (i64 X))) on 32-bit targets by extracting 32-bit halves from i32, bitcasting each to v32i1, and concatenating.
This prevents the scalarization that would otherwise occur.

llvm-svn: 324057
2018-02-02 05:59:33 +00:00
Craig Topper 5570e03b21 [X86] Legalize (i64 (bitcast (v64i1 X))) on 32-bit targets by extracting to v32i1 and bitcasting to i32.
This saves a trip through memory and seems to open up other combining opportunities.

llvm-svn: 324056
2018-02-02 05:59:31 +00:00
Shiva Chen b22c1d29bc [RISCV] Fix c.addi and c.addi16sp immediate constraints which should be non-zero
Differential Revision: https://reviews.llvm.org/D42782

llvm-svn: 324055
2018-02-02 02:43:23 +00:00
Shiva Chen bbf4c5c25e [RISCV] Define getSetCCResultType for setting vector setCC type
To avoid trigger "No default SetCC type for vectors!" Assertion

Differential Revision: https://reviews.llvm.org/D42675

llvm-svn: 324054
2018-02-02 02:43:18 +00:00
Amara Emerson 572f6cecf1 [AArch64][GlobalISel] Fix old use of % sigil in test.
My rebase had missed the new $ sigil we're using.

llvm-svn: 324051
2018-02-02 02:14:42 +00:00
Amara Emerson 98af4664e0 Fix debug spelling in ResetMachineFunction pass.
llvm-svn: 324048
2018-02-02 01:49:59 +00:00
Amara Emerson 58aea52bc4 [GlobalISel] Constrain the dest reg of IMPLICT_DEF.
This fixes a crash where the user is a COPY, which deliberately does not
constrain its source operands, resulting in a vreg without a reg class escaping
selection.

Differential Revision: https://reviews.llvm.org/D42697

llvm-svn: 324047
2018-02-02 01:44:43 +00:00
David Blaikie d8a6f93aae Remove non-modular header containing static utility functions
The one place that uses these functions isn't particularly
long/complicated, so it's easier to just have these inline at that
location than trying to split it out into a true header. (in part also
because of the use of the DEBUG macros, which make this not really a
standalone header even if the static functions were made inline instead)

llvm-svn: 324044
2018-02-02 00:33:50 +00:00
David Blaikie 42bcdffd24 Add missing includes
llvm-svn: 324040
2018-02-02 00:11:09 +00:00
Matthias Braun ca0abaebfb SplitKit: Fix liveness recomputation in some remat cases.
Example situation:
```
BB0:
  %0 = ...
  use %0
  ; ...
  condjump BB1
  jmp BB2

BB1:
  %0 = ...   ; rematerialized def from above (from earlier split step)
  jmp BB2

BB2:
  ; ...
  use %0
```

%0 will have a live interval with 3 value numbers (for the BB0, BB1 and
BB2 parts). Now SplitKit tries and succeeds in rematerializing the value
number in BB2 (This only works because it is a secondary split so
SplitKit is can trace this back to a single original def).

We need to recompute all live ranges affected by a value number that we
rematerialize. The case that we missed before is that when the value
that is rematerialized is at a join (Phi VNI) then we also have to
recompute liveness for the predecessor VNIs.

rdar://35699130

Differential Revision: https://reviews.llvm.org/D42667

llvm-svn: 324039
2018-02-02 00:08:19 +00:00
Vlad Tsyrklevich 3e0e7cd922 Fix broken builds due to mismatched min/max types
llvm-svn: 324038
2018-02-02 00:07:14 +00:00
Vlad Tsyrklevich b2c3ea7603 [cfi-verify] Add blame context printing, and improved print format.
Summary:
This update now allows users to specify `--blame-context` and `--blame-context-all` to print source file blame information for the source of the blame.

Also updates the inline printing to correctly identify the top of the inlining stack for blame information.

Patch by Mitch Phillips!

Reviewers: vlad.tsyrklevich

Subscribers: llvm-commits, kcc, pcc

Differential Revision: https://reviews.llvm.org/D40111

llvm-svn: 324035
2018-02-01 23:45:18 +00:00
Craig Topper 2d67d1e2a8 [X86] Separate the call to LowerVectorAllZeroTest from EmitTest. NFCI
Every instruction that has the word TEST in its name seems to have been buried into EmitTest. But that code is largely concerned with trying to reuse the flags from instructions that update flags in a pretty normal way.

PTEST/TESTP/KTEST do not update flags in a normal way. They only update Z and C and the C flag update is non-standard. Rather than try to bend EmitTest's already complex logic to accomodate this, just move the call up to LowerSETCC and replicate the few pre-checks that are needed.

While there add a FIXME for using the C flag for checking for all 1s which we definitely couldn't do from EmitTEST.

llvm-svn: 324029
2018-02-01 23:21:20 +00:00
Amara Emerson 4d19655a56 [GlobalISel][Legalizer] Relax a legalization loop detecting assert.
Legalizing vectors may keep the element type the same but change the number of
elements, the assert didn't take this into account.

llvm-svn: 324028
2018-02-01 23:10:57 +00:00
Simon Pilgrim d1379c6df1 Fix check-prefixes typo and line endings.
llvm-svn: 324024
2018-02-01 22:32:41 +00:00
Simon Pilgrim 808a0e1589 [X86][SSE] Add SSE41 to variable permute tests
llvm-svn: 324017
2018-02-01 22:05:44 +00:00
Simon Pilgrim 26bf800625 [X86][XOP] Add XOP to variable permute tests
llvm-svn: 324015
2018-02-01 21:57:37 +00:00
Sanjay Patel 3343fcef86 [InstCombine] allow multi-use values in canEvaluate* if all uses are in 1 inst
This is the enhancement suggested in D42536 to fix a shortcoming in 
regular InstCombine's canEvaluate* functionality.
When we have multiple uses of a value, but they're all in one instruction, we can 
allow that expression to be narrowed or widened for the same cost as a single-use 
value.

AFAICT, this can only matter for multiply: sub/and/or/xor/select would be simplified 
away if the operands are the same value; add becomes shl; shifts with a variable shift 
amount aren't handled.

Differential Revision: https://reviews.llvm.org/D42739

llvm-svn: 324014
2018-02-01 21:55:53 +00:00
Nemanja Ivanovic 77e34f15c9 [PowerPC] Tell VSX swap removal that scalar conversions are lane-sensitive
This is a rather non-controversial change. We were missing these instructions
from the list of instructions that are lane-sensitive. These two put the result
into lane 0 (BE) or 3 (LE) regardless of the input. This patch fixes PR36068.

llvm-svn: 324005
2018-02-01 21:09:04 +00:00
David Blaikie 47ff8f457f Coding Standards: Document library layering requirements & header isolation.
(I suppose these two pieces could be separated - but seemed related
enough)

As discussed on llvm-dev, this documents the general expectation of how
library layering should be handled. There are a few existing cases where
these constraints are not met, but as with most style guide things -
this is forward looking and provides guidance when cleaning up existing
code, it doesn't immediately require that all previous code be cleaned
up to match. (see: naming conventions, etc)

Differential Revision: https://reviews.llvm.org/D42771

llvm-svn: 324004
2018-02-01 21:03:35 +00:00
Craig Topper a5944aade1 [DAGCombiner] When folding (insert_subvector undef, (bitcast (extract_subvector N1, Idx)), Idx) -> (bitcast N1) make sure that N1 has the same total size as the original output
We were only checking the element count, but not the total width. This could cause illegal bitcasts to be created if for example the output was 512-bits, but N1 is 256 bits, and the extraction size was 128-bits.

Fixes PR36199

Differential Revision: https://reviews.llvm.org/D42809

llvm-svn: 324002
2018-02-01 20:48:50 +00:00
Amara Emerson cbc02c71a4 [GlobalISel] Fix assert failure when legalizing non-power-2 loads.
Until we support extending loads properly we're going to fall back for these.
We already handle stores in the same way, so this is just being consistent.

llvm-svn: 324001
2018-02-01 20:47:03 +00:00