Commit Graph

113980 Commits

Author SHA1 Message Date
Simon Pilgrim 51ff15f472 [X86][SSE] Simplify combineVectorTruncationWithPACKUS. NFCI.
Move code only used by combineVectorTruncationWithPACKUS out of combineVectorTruncation.

llvm-svn: 334201
2018-06-07 14:53:32 +00:00
Hiroshi Inoue 01ef4c2c64 [PowerPC] avoid unprofitable Repl32 flag in BitPermutationSelector
BitPermutationSelector sets Repl32 flag for bit groups which can be (potentially) benefit from 32-bit rotate-and-mask instructions with bit replication, i.e. rlwinm/rlwimi copies lower 32 bits into upper 32 bits on 64-bit PowerPC before rotation.
However, enforcing 32-bit instruction sometimes results in redundant generated code.
For example, the following simple code is compiled into rotldi + rlwimi while it can be compiled into only rldimi instruction if Repl32 flag is not set on the bit group for (a & 0xFFFFFFFF).

uint64_t func(uint64_t a, uint64_t b) {
	return (a & 0xFFFFFFFF) | (b << 32) ;
}

To avoid such problem, this patch checks the potential benefit of Repl32 flag before setting it. If a bit group does not require rotation (i.e. RLAmt == 0) and won't be merged into another group, we do not benefit from Repl32 flag on this group.

Differential Revision: https://reviews.llvm.org/D47867

llvm-svn: 334195
2018-06-07 13:21:14 +00:00
Petar Jovanovic 241f286bd7 [Mips] Silencing warnings in instruction info (NFC)
isORCopyInst and isReadOrWriteToDSPReg functions were producing warning
that some statements my fall through.

Patch by Nikola Prica.

Differential Revision: https://reviews.llvm.org/D47876

llvm-svn: 334194
2018-06-07 13:06:06 +00:00
Simon Pilgrim 09953d8412 [X86][SSE] Simplify combineVectorTruncationWithPACKSS to reduce code duplication
Simplify combineVectorTruncationWithPACKSS to just a SIGN_EXTEND_INREG followed by using the existing truncateVectorWithPACK instead of duplicating code.

llvm-svn: 334193
2018-06-07 13:01:42 +00:00
Hiroshi Inoue b557846083 [PowerPC] fix trivial typos in comment, NFC
llvm-svn: 334191
2018-06-07 12:49:12 +00:00
Matt Arsenault f1c868ef08 AMDGPU: Fix not including v2f64 in SReg_128
Fixes assertion with calls returning v2f64.

llvm-svn: 334189
2018-06-07 12:16:31 +00:00
Florian Hahn 0d6b01761c [Mem2Reg] Avoid replacing load with itself in promoteSingleBlockAlloca.
We do the same thing in rewriteSingleStoreAlloca.

Fixes PR37632.

Reviewers: chandlerc, davide, efriedma

Reviewed By: davide

Differential Revision: https://reviews.llvm.org/D47825

llvm-svn: 334187
2018-06-07 11:09:05 +00:00
Matt Arsenault 697300bd4f AMDGPU: Use scalar operations for f16 fabs/fneg patterns
Fixes unnecessary differences between subtargets.

llvm-svn: 334184
2018-06-07 10:15:20 +00:00
Matt Arsenault 90083d3088 AMDGPU: Try a lot harder to emit scalar loads
This has two main components. First, widen
widen short constant loads in DAG when they have
the correct alignment. This is already done a bit in
AMDGPUCodeGenPrepare, since that has access to
DivergenceAnalysis. This can't help kernarg loads
created in the DAG. Start to use DAG divergence analysis
to help this case.

The second part is to avoid kernel argument lowering
breaking the alignment of short vector elements because
calling convention lowering wants to split everything
into legal register types.

When loading a split type, load the nearest 4-byte aligned
segment and shift to get the desired bits. This extra
load of the earlier argument piece ends up merging,
and the bit extract hopefully folds out.

There are a number of improvements and regressions with
this, but I think as-is this is a better compromise between
several of the worst parts of SelectionDAG.

Particularly when i16 is legal, this produces worse code
for i8 and i16 element vector kernel arguments. This is
partially due to the very weak load merging the DAG does.
It only looks for fairly specific combines between pairs
of loads which no longer appear. In particular this
causes v4i16 loads to be split into 2 components when
previously the two halves were merged.

Worse, because of the newly introduced shifts, there
is a lot more unnecessary vector packing and unpacking code
emitted. At least some of this is due to reporting
false for isTypeDesirableForOp for i16 as a workaround for
the lack of divergence information in the DAG. The cases
where this happens it doesn't actually matter, but the
relevant code in SimplifyDemandedBits doens't have the context
to know to ignore this.

The use of the  scalar cache is probably more important
than the mess of mostly scalar instructions doing this packing
and unpacking. Future work can fix this, possibly by making better
use of the new DAG divergence information for controlling promotion
decisions, or adding another version of shift + trunc + shift
combines that doesn't only know about the used types.

llvm-svn: 334180
2018-06-07 09:54:49 +00:00
Clement Courbet 4281b1d3b5 [X86][NFC] Fix harmless typo in BtVer2 model.
See D46356 for context.

llvm-svn: 334178
2018-06-07 09:26:33 +00:00
Tomasz Krupa f8c7637027 [X86] Block UndefRegUpdate
Summary: Prevent folding of operations with memory loads when one of the sources has undefined register update.

Reviewers: craig.topper

Subscribers: llvm-commits, mike.dvoretsky, ashlykov

Differential Revision: https://reviews.llvm.org/D47621

llvm-svn: 334175
2018-06-07 08:48:45 +00:00
Max Kazantsev b4b2ccea6d [NFC] Use variable instead of accessing pair many times
llvm-svn: 334173
2018-06-07 08:47:19 +00:00
Tomasz Krupa 145825162a Test commit access.
Added a bunch of periods after comments.

llvm-svn: 334171
2018-06-07 08:20:28 +00:00
Clement Courbet 9212ef0a0a [X86][NFC] Fix harmless typos in BDW/ZnVer1 sched models.
See D46356 for context.

llvm-svn: 334164
2018-06-07 07:37:49 +00:00
Karl-Johan Karlsson abb11f805f [BranchFolding] Fix live-in's when hoisting code
Summary:
When the branch folder hoist code into a predecessor it adjust live-in's
in the blocks it hoist code from. However it fail to handle hoisted code
that contain a defed register that originally is live-in in the block
through a super register.

This is fixed by replacing the live-in handling code with calls to
utility functions in LivePhysRegs.

Reviewers: kparzysz, gberry, MatzeB, uweigand, aprantl

Reviewed By: kparzysz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47529

llvm-svn: 334163
2018-06-07 07:20:33 +00:00
Jonas Paulsson e80d405760 [SystemZ] Build Load And Test from scratch in convertToLoadAndTest.
This is needed to get CC operand in right place, as expected by the
SchedModel.

Review: Ulrich Weigand
https://reviews.llvm.org/D47820

llvm-svn: 334161
2018-06-07 05:59:07 +00:00
Michael Zolotukhin 31800864dc SpeculativeExecution Pass: Set PreserveCFG to avoid unnecessary analyses invalidation.
The pass doesn't touch CFG in any way, only moves instructions between
blocks.

llvm-svn: 334150
2018-06-07 00:19:29 +00:00
Stanislav Mekhanoshin df61be70b2 [AMDGPU] Improve reciprocal handling
When denormals are supported we are producing a full division for
1.0f / x. That still can be replaced by the faster version:

    bool c = fabs(x) > 0x1.0p+96f;
    float s = c ? 0x1.0p-32f : 1.0f;
    x *= s;
    return s * v_rcp_f32(x)

in case if requested accuracy is 2.5ulp or less. The same version
is used if denormals are not supported for non 1.0 numerators, where
just v_rcp_f32 is then used for 1.0 numerator.

The optimization of 1/x is extended to the case -1/x, which is the
same except for the resulting sign bit.

OpenCL conformance passed with both enabled and disabled denorms.

Differential Revision: https://reviews.llvm.org/D47805

llvm-svn: 334142
2018-06-06 22:22:32 +00:00
Teresa Johnson 4ffc3e7834 [ThinLTO] Rename index IsAnalysis flag to HaveGVs (NFC)
With the upcoming patch to add summary parsing support, IsAnalysis would
be true in contexts where we are not performing module summary analysis.
Rename to the more specific and approprate HaveGVs, which is essentially
what this flag is indicating.

llvm-svn: 334140
2018-06-06 22:22:01 +00:00
Sanjay Patel 3cd1aa88f9 [InstCombine] fold another shifty abs pattern to cmp+sel (PR36036)
The bug report:
https://bugs.llvm.org/show_bug.cgi?id=36036

...requests a DAG change for this, but an IR canonicalization
probably handles most cases. If we still want to match this
pattern in the backend, there's a proposal for that too:
D47831

Alive proofs including nsw/nuw cases that were first noted in:
D46988

https://rise4fun.com/Alive/Kmp

This patch is largely copied from the existing code that was
initially added with:
D40984
...but I didn't see much gain from trying to share code.

llvm-svn: 334137
2018-06-06 21:58:12 +00:00
Matt Arsenault e9524f1fb3 AMDGPU: Custom lower v2f16 fneg/fabs with illegal f16
Fixes terrible code on targets without f16 support. The
legalization creates a mess that is difficult to recover
from. Also should avoid randomly breaking these tests
multiple times in sequence in future commits.

Some regressions in cases where it happens to be better
to pull the source modifier after the conversion.

llvm-svn: 334132
2018-06-06 21:28:11 +00:00
Roman Lebedev cbf8446359 [InstCombine] PR37603: low bit mask canonicalization
Summary:
This is [[ https://bugs.llvm.org/show_bug.cgi?id=37603 | PR37603 ]].

https://godbolt.org/g/VCMNpS
https://rise4fun.com/Alive/idM

When doing bit manipulations, it is quite common to calculate some bit mask,
and apply it to some value via `and`.

The typical C code looks like:
```
int mask_signed_add(int nbits) {
    return (1 << nbits) - 1;
}
```
which is translated into (with `-O3`)
```
define dso_local i32 @mask_signed_add(int)(i32) local_unnamed_addr #0 {
  %2 = shl i32 1, %0
  %3 = add nsw i32 %2, -1
  ret i32 %3
}
```

But there is a second, less readable variant:
```
int mask_signed_xor(int nbits) {
    return ~(-(1 << nbits));
}
```
which is translated into (with `-O3`)
```
define dso_local i32 @mask_signed_xor(int)(i32) local_unnamed_addr #0 {
  %2 = shl i32 -1, %0
  %3 = xor i32 %2, -1
  ret i32 %3
}
```

Since we created such a mask, it is quite likely that we will use it in `and` next.
And then we may get rid of `not` op by folding into `andn`.

But now that i have actually looked:
https://godbolt.org/g/VTUDmU
_some_ backend changes will be needed too.
We clearly loose `bzhi` recognition.

Reviewers: spatel, craig.topper, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47428

llvm-svn: 334127
2018-06-06 19:38:27 +00:00
Roman Lebedev 488d28d4e5 [X86] Emit BZHI when mask is ~(-1 << nbits))
Summary:
In D47428, i propose to choose the `~(-(1 << nbits))` as the canonical form of low-bit-mask formation.
As it is seen from these tests, there is a reason for that.

AArch64 currently better handles `~(-(1 << nbits))`, but not the more traditional `(1 << nbits) - 1` (sic!).
The other way around for X86.
It would be much better to canonicalize.

This patch is completely monkey-typing.
I don't really understand how this works :)
I have based it on `// x & (-1 >> (32 - y))` pattern.

Also, when we only have `BMI`, i wonder if we could use `BEXTR` with `start=0` ?

Related links:
https://bugs.llvm.org/show_bug.cgi?id=36419
https://bugs.llvm.org/show_bug.cgi?id=37603
https://bugs.llvm.org/show_bug.cgi?id=37610
https://rise4fun.com/Alive/idM

Reviewers: craig.topper, spatel, RKSimon, javed.absar

Reviewed By: craig.topper

Subscribers: kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D47453

llvm-svn: 334125
2018-06-06 19:38:16 +00:00
Krzysztof Parzyszek c1e712baa5 [Hexagon] Implement vector-pair zero as V6_vsubw_dv
llvm-svn: 334123
2018-06-06 19:34:40 +00:00
Craig Topper ef813a5226 [X86] Properly disassemble gather/scatter instructions where xmm4/ymm4/zmm4 are used as the index.
These encodings correspond to the cases in the normal encoding scheme where there is no index and our modrm reading code initially decodes it as such. The VSIB handling code tried to compensate for this, but failed to add the base needed to make later code do the right thing.

Fixes PR37712.

llvm-svn: 334121
2018-06-06 19:15:15 +00:00
Craig Topper d04cc8e640 [X86] Rename vy512mem->vy512xmem and vz256xmem->vz256mem.
The index size is represented by the letter after the 'v'. The number represents the memory size. If an 'x' appears after the number its means the index register can be from VR128X/VR256X instead of VR128/VR256.

As vy512mem uses a VR256X index it should have an x.
And vz256mem uses a VR512 index so it shouldn't have an x.

I admit these names kind of suck and are confusing.

llvm-svn: 334120
2018-06-06 19:15:12 +00:00
Simon Pilgrim aef5bdbea1 [X86][BtVer2] Add support for all vector instructions that should match the dependency-breaking 'zero-idiom'
As detailed on Agner's Microarchitecture doc (21.8 AMD Bobcat and Jaguar pipeline - Dependency-breaking instructions), all these instructions are dependency breaking and zero the destination register.

llvm-svn: 334119
2018-06-06 19:06:09 +00:00
Evandro Menezes b2c8244715 [AArch64, ARM] Add support for Samsung Exynos M4
Create a separate feature set for Exynos M4 and add test cases.

llvm-svn: 334115
2018-06-06 18:56:00 +00:00
Michael Berg cc1c4b6912 guard fsqrt with fmf sub flags
Summary:
This change uses fmf subflags to guard optimizations as well as unsafe. These changes originated from D46483.
It contains only context for fsqrt.


Reviewers: spatel, hfinkel, arsenm

Reviewed By: spatel

Subscribers: hfinkel, wdng, andrew.w.kaylor, wristow, efriedma, nemanjai

Differential Revision: https://reviews.llvm.org/D47749

llvm-svn: 334113
2018-06-06 18:47:55 +00:00
Krzysztof Parzyszek 0da1fe3770 [Hexagon] Split CTPOP of vector pairs
llvm-svn: 334109
2018-06-06 18:03:29 +00:00
Petar Jovanovic 8cb6a521be Change TII isCopyInstr way of returning arguments(NFC)
Make TII isCopyInstr() return MachineOperands through pointer to pointer
instead via reference.

Patch by Nikola Prica.

Differential Revision: https://reviews.llvm.org/D47364

llvm-svn: 334105
2018-06-06 16:36:30 +00:00
David Green 25312b2b6c [GlobalMerge] Set the alignment on merged global structs
If no alignment is set, the abi/preferred alignment of structs will be
used which may be higher than required. This can lead to extra padding
and in the end an increase in data size.

Differential Revision: https://reviews.llvm.org/D47633

llvm-svn: 334099
2018-06-06 14:48:32 +00:00
Tim Northover 9b80060d7b InstCombine: ignore debug instructions during fence combine
We should never get different CodeGen based on whether the code is being
compiled in debug mode so we must skip over @llvm.dbg.value (and similar)
calls.

Should fix at least the worst part of PR37690.

llvm-svn: 334090
2018-06-06 12:46:02 +00:00
Simon Pilgrim f06ff16049 Fix MSVC '*/' found outside of comment warning. NFCI.
llvm-svn: 334086
2018-06-06 11:10:11 +00:00
Ilya Biryukov 3c9c10649b Fix compilation of WebAssembly and RISCV after r334078
llvm-svn: 334085
2018-06-06 10:57:50 +00:00
Simon Pilgrim 3d14158891 [X86][BMI][TBM] Only demand bottom 16-bits of the BEXTR control op (PR34042)
Only the bottom 16-bits of BEXTR's control op are required (0:8 INDEX, 15:8 LENGTH).

Differential Revision: https://reviews.llvm.org/D47690

llvm-svn: 334083
2018-06-06 10:52:10 +00:00
Peter Smith 57f661bd7d [MC] Pass MCSubtargetInfo to fixupNeedsRelaxation and applyFixup
On targets like Arm some relaxations may only be performed when certain
architectural features are available. As functions can be compiled with
differing levels of architectural support we must make a judgement on
whether we can relax based on the MCSubtargetInfo for the function. This
change passes through the MCSubtargetInfo for the function to
fixupNeedsRelaxation so that the decision on whether to relax can be made
per function. In this patch, only the ARM backend makes use of this
information. We must also pass the MCSubtargetInfo to applyFixup because
some fixups skip error checking on the assumption that relaxation has
occurred, to prevent code-generation errors applyFixup must see the same
MCSubtargetInfo as fixupNeedsRelaxation.

Differential Revision: https://reviews.llvm.org/D44928

llvm-svn: 334078
2018-06-06 09:40:06 +00:00
Petar Jovanovic 326ec32403 [MIPS GlobalISel] Add lowerCall
Add minimal support to lower function calls.
Support only functions with arguments/return that go through registers
and have type i32.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D45627

llvm-svn: 334071
2018-06-06 07:24:52 +00:00
Petr Hosek fc9b29bd61 [Support] Use zx_cache_flush on Fuchsia to flush instruction cache
Fuchsia doesn't use __clear_cache, instead it provide zx_cache_flush
system call. Use it to flush instruction cache.

Differential Revision: https://reviews.llvm.org/D47753

llvm-svn: 334068
2018-06-06 06:26:18 +00:00
Sanjay Patel 59313be8d3 [CodeGen] assume max/default throughput for unspecified instructions
This is a fix for the problem arising in D47374 (PR37678):
https://bugs.llvm.org/show_bug.cgi?id=37678

We may not have throughput info because it's not specified in the model 
or it's not available with variant scheduling, so assume that those
instructions can execute/complete at max-issue-width.

Differential Revision: https://reviews.llvm.org/D47723

llvm-svn: 334055
2018-06-05 23:34:45 +00:00
Amaury Sechet a79b6b3ef0 [Mips] Remove uneeded variants of ADDC/ADDE lowering
Summary: As it turns out, the lowering for the Mips16* family of target is the exact same thing as what the ops expands to, so the code handling them can be removed and the ops only enabled for the MipsSE* family of targets.

Reviewers: smaksimovic, atanasyan, abeserminji

Subscribers: sdardis, arichardson, llvm-commits

Differential Revision: https://reviews.llvm.org/D47703

llvm-svn: 334052
2018-06-05 22:13:56 +00:00
Guozhi Wei c4c6b548c5 [CodeGenPrepare] Move Extension Instructions Through Logical And Shift Instructions
CodeGenPrepare pass move extension instructions close to load instructions in different BB, so they can be combined later. But the extension instructions can't move through logical and shift instructions in current implementation. This patch enables this enhancement, so we can eliminate more extension instructions.

Differential Revision: https://reviews.llvm.org/D45537

This is re-commit of r331783, which was reverted by r333305. The performance regression was caused by some unlucky alignment, not a code generation problem.

llvm-svn: 334049
2018-06-05 21:03:52 +00:00
Zachary Turner 8ac1c38a72 [FileSystem] Remove OpenFlags param from several functions.
There was only one place in the entire codebase where a non
default value was being passed, and that place was already hidden
in an implementation file.  So we can delete the extra parameter
and all existing clients continue to work as they always have,
while making the interface a bit simpler.

Differential Revision: https://reviews.llvm.org/D47789

llvm-svn: 334046
2018-06-05 19:58:26 +00:00
Matt Arsenault 57e541e87e AMDGPU: Preserve metadata when widening loads
Preserves the low bound of the !range. I don't think
it's legal to do anything with the top half since it's
theoretically reading garbage.

llvm-svn: 334045
2018-06-05 19:52:56 +00:00
Matt Arsenault 9224c00d2b AMDGPU: Use more custom insert/extract_vector_elt lowering
Apply to i8 vectors.

llvm-svn: 334044
2018-06-05 19:52:46 +00:00
Krzysztof Parzyszek b984ffcc71 [Hexagon] Add pattern to generate 64-bit neg instruction
llvm-svn: 334043
2018-06-05 19:52:39 +00:00
Krzysztof Parzyszek d8b093efef [Hexagon] Add more patterns for generating abs/absp instructions
llvm-svn: 334038
2018-06-05 19:00:50 +00:00
Michael Berg 96925fe0df guard fneg with fmf sub flags
Summary: This change uses fmf subflags to guard optimizations as well as unsafe. These changes originated from D46483.

Reviewers: spatel, hfinkel

Reviewed By: spatel

Subscribers: nemanjai

Differential Revision: https://reviews.llvm.org/D47389

llvm-svn: 334037
2018-06-05 18:49:47 +00:00
Simon Dardis 0d95ff03f2 [mips] Fix the predicates for arithmetic operations
Reviewers: smaksimovic, atanasyan, abeserminji

Differential Revision: https://reviews.llvm.org/D47635

llvm-svn: 334031
2018-06-05 17:53:22 +00:00
Simon Pilgrim f2f043acbb [X86][SSE] Use multiplication scale factors for v8i16 SHL on pre-AVX2 targets.
Similar to v4i32 SHL, convert v8i16 shift amounts to scale factors instead to improve performance and reduce instruction count. We were already doing this for constant shifts, this adds variable shift support.

Reduces the serial nature of the codegen, which relies on chains of plendvb/pand+pandn+por shifts.

This is a step towards adding support for vXi16 vector rotates.

Differential Revision: https://reviews.llvm.org/D47546

llvm-svn: 334023
2018-06-05 15:17:39 +00:00
Nirav Dave 05b589101e [MC][X86] Allow assembler variable assignment to register name.
Summary:
Allow extended parsing of variable assembler assignment syntax and modify X86 to permit
VAR = register assignment. As we emit these as .set directives when possible, we inline
such expressions in output assembly.

Fixes PR37425.

Reviewers: rnk, void, echristo

Reviewed By: rnk

Subscribers: nickdesaulniers, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D47545

llvm-svn: 334022
2018-06-05 15:13:39 +00:00
Matt Arsenault 191bc71541 DAG: Stop dropping invariant/dereferencable
When legalizing illegal FP load results, this was
for some reason dropping the invariant and dereferencable
memory flags. There doesn't seem to be any reason for this,
and the equivalent isn't done for integer loads.

Fixes an issue in a future AMDGPU commit where some identical
loads fail to merge because one of the loads ends up
dropping the flags.

llvm-svn: 334020
2018-06-05 14:52:24 +00:00
John Brawn e4ff0bd401 [InstCombine] Correct the cmp operand type used when canonicalizing abs/nabs
When adjusting a cmp in order to canonicalize an abs/nabs select pattern we need
to use the type of the existing operand when creating a new operand not the
type of a select operand, as the two may be different.

This fixes PR37686.

llvm-svn: 334019
2018-06-05 14:10:55 +00:00
Gabor Buella 1181f94ae4 [X86] NFC Fix typo introduced in r328016 HSI->HDI
llvm-svn: 334016
2018-06-05 12:55:12 +00:00
Krzysztof Parzyszek aafb8c204c [Hexagon] Minor cleanups in isel lowering
llvm-svn: 334015
2018-06-05 12:49:19 +00:00
Hiroshi Inoue 955655f558 [PowerPC] reduce rotate in BitPermutationSelector
BitPermutationSelector builds the output value by repeating rotate-and-mask instructions with input registers.
Here, we may avoid one rotate instruction if we start building from an input register that does not require rotation.

For example of the test case bitfieldinsert.ll, it first rotates left r4 by 8 bits and then inserts some bits from r5 without rotation.
This can be executed by one rlwimi instruction, which rotates r4 by 8 bits and inserts its bits into r5.

This patch adds a check for rotation amounts in the comparator used in sorting to process the input without rotation first.

Differential Revision: https://reviews.llvm.org/D47765

llvm-svn: 334011
2018-06-05 11:58:01 +00:00
Simon Pilgrim fef9b6eea6 [X86][SSE] Add target shuffle support to X86TargetLowering::computeKnownBitsForTargetNode
Ideally we'd use resolveTargetShuffleInputs to handle faux shuffles as well but:
(a) that code path doesn't handle general/pre-legalized ops/types very well.
(b) I'm concerned about the compute time as they recurse to calls to computeKnownBits/ComputeNumSignBits which would need depth limiting somehow.

llvm-svn: 334007
2018-06-05 10:52:29 +00:00
Gabor Buella 349ffcee87 [X86] NFC Refactor some code in InstPrinters
Summary:
Bringing some come duplicated in the AT&T and the Intel printers
into a common parent class.

Reviewers: craig.topper

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D47682

llvm-svn: 334005
2018-06-05 10:41:39 +00:00
Peter Smith ef945b2240 [MC][ARM] Add range checking for Thumb2 resolved fixups.
When the branch target of a Thumb2 unconditional or conditonal branch is
resolved at assembly time, no range checking is performed on the result
leading to incorrect immediates. This change adds a range check:
+- 16 Megabytes for unconditional branches, +- 1 Megabyte for the
conditional branch.

Differential Revision: https://reviews.llvm.org/D46306

llvm-svn: 333997
2018-06-05 10:00:56 +00:00
Simon Pilgrim 7bbe7a2920 [X86][SSE] Add basic PACKUS support to X86TargetLowering::computeKnownBitsForTargetNode
Helps improve analysis of saturation ops

llvm-svn: 333995
2018-06-05 09:45:03 +00:00
Peter Smith 0aafe0cee5 [MC][ARM] Correct Thumb BL instruction range
The Thumb BL range is + or - either 16 Megabytes or 4 Megabytes depending
on whether the CPU supports Thumb2 or the v8-m baseline ops. The existing
check for BL range is incorrectly set at +- 32 Megabytes. This change
corrects the higher range and uses the lower range if the featurebits
don't have the necessary support for it.

Differential Revision: https://reviews.llvm.org/D46305

llvm-svn: 333991
2018-06-05 09:32:28 +00:00
Alexander Ivchenko 964b27fa21 [X86][CET] Shadow stack fix for setjmp/longjmp
This is the new version of D46181, allowing setjmp/longjmp
to work correctly with the Intel CET shadow stack by storing
SSP on setjmp and fixing it on longjmp. The patch has been
updated to use the cf-protection-return module flag instead
of HasSHSTK, and the bug that caused D46181 to be reverted
has been fixed with the test expanded to track that fix.

patch by mike.dvoretsky

Differential Revision: https://reviews.llvm.org/D47311

llvm-svn: 333990
2018-06-05 09:22:30 +00:00
Craig Topper f17b33d6c6 [X86] Make all instructions that operate on MMX types, but were added after the initial MMX support via one of the SSE features flags make them require the MMX feature as well.
Passing -mattr=-mmx needs to disable these instructions since the MMX register class won't have been set up. But we don't want -mattr=-mmx to disable SSE so we have to do it separately.

llvm-svn: 333984
2018-06-05 06:20:06 +00:00
Nirav Dave e5eb99668c [RegAllocGreedy] Use simpler map class for EvicteeInfo. NFCI.
RegAlloc keeps a insertion-time ordered map of evictee information,
but we only use membership. Replace MapVector with contextually
equivalent DenseMap which is smaller and faster.

llvm-svn: 333981
2018-06-05 03:16:28 +00:00
Francis Visoiu Mistrih 2c0ef67327 Use MF instead of Fn for MachineFunction references. NFC
llvm-svn: 333973
2018-06-05 00:27:28 +00:00
Francis Visoiu Mistrih ca69b3bf6d [ShrinkWrap] Add optimization remarks to the shrink-wrapping pass
Start by emitting remarks for very basic unsupported cases such as
irreducible CFGs and EHFunclets. The end goal is to be able to cover all
the cases where we give up with an explanation.

llvm-svn: 333972
2018-06-05 00:27:24 +00:00
Amara Emerson d496cc8ffb [MIRParser] Add parser support for 'true' and 'false' i1s.
We already output true and false in the printer, but the parser isn't able to
read it.

Differential Revision: https://reviews.llvm.org/D47424

llvm-svn: 333970
2018-06-05 00:17:13 +00:00
Reid Kleckner adcaddb6da Fix -Wcovered-switch-default warning and clang-format it
llvm-svn: 333967
2018-06-04 23:47:29 +00:00
David Blaikie 10d25ffe7d Move Compiler.h from Demangle back to Support
Code review feedback from r328123 prefers copying the few feature test
macros used by Demangle into there, rather than sinking the header into
an odd corner like Demangle.

llvm-svn: 333965
2018-06-04 22:53:38 +00:00
Derek Schuff 72f19241d6 Simplified WebAssemblyAsmBackend by removing explicit ELF variant.
The ELF version was broken (does not deal with wasm specific fixups),
and now is slightly less broken. It will be removed in its entirety
in the future which this change makes slightly easier (just remove
the IsELF bool).

Differential Revision: https://reviews.llvm.org/D47745

Patch by Wouter van Oortmerssen

llvm-svn: 333964
2018-06-04 22:53:36 +00:00
Sanjay Patel dcb8d304c3 [InstCombine] refine UB-handling in shuffle-binop transform
As noted in rL333782, we can be both better for optimization and
safer with this transform:
BinOp (shuffle V1, Mask), C --> shuffle (BinOp V1, NewC), Mask

The only potentially unsafe-to-speculate binops are integer div/rem.
All other binops are always safe (although I don't see a way to
assert that in code here).

For opcodes like shifts that can produce poison, it can't matter
here because we know the lanes with undef are dropped by the
subsequent shuffle.

Differential Revision: https://reviews.llvm.org/D47686

llvm-svn: 333962
2018-06-04 22:26:45 +00:00
David Blaikie 31b98d2e99 Move Analysis/Utils/Local.h back to Transforms
Review feedback from r328165. Split out just the one function from the
file that's used by Analysis. (As chandlerc pointed out, the original
change only moved the header and not the implementation anyway - which
was fine for the one function that was used (since it's a
template/inlined in the header) but not in general)

llvm-svn: 333954
2018-06-04 21:23:21 +00:00
Jessica Paquette aa087327ce [MachineOutliner] NFC - Move intermediate data structures to MachineOutliner.h
This is setting up to fix bug 37573 cleanly.

This moves data structures that are technically both used in some way by the
target and the general-purpose outlining algorithm into MachineOutliner.h. In
particular, the `Candidate` class is of importance.

Before, the outliner passed the locations of `Candidates` to the target, which
would then make some decisions about the prospective outlined function. This
change allows us to just pass `Candidates` along to the target. This will allow
the target to discard `Candidates` that would be considered unsafe before cost
calculation. Thus, we will be able to remove the unsafe candidates described in
the bug without resorting to torching the entire prospective function.

Also, as a side-effect, it makes the outliner a bit cleaner.

https://bugs.llvm.org/show_bug.cgi?id=37573

llvm-svn: 333952
2018-06-04 21:14:16 +00:00
Alexander Ivchenko 2f038c4094 [X86][ELF][CET] Adding the .note.gnu.property ELF section in X86
In preparation for the proposed linker ABI changes
(https://github.com/hjl-tools/linux-abi/wiki/linux-abi-draft.pdf,
https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-cet.pdf),
this patch enables emission of the .note.gnu.property section to
ELF object files when building CET-enabled modules.

patch by mike.dvoretsky

Differential Revision: https://reviews.llvm.org/D47145

llvm-svn: 333951
2018-06-04 21:07:35 +00:00
Scott Linder ba81d7f1eb [CodeGen] Always update divergence in SelectionDAG::UpdateNodeOperands
Some overloads failed to update divergence.

Differential Revision: https://reviews.llvm.org/D47148

llvm-svn: 333947
2018-06-04 20:19:45 +00:00
Zachary Turner 63db25ba0d [Support] Add functions that operate on native file handles on Windows.
Windows' CRT has a limit of 512 open file descriptors, and fds which are
generated by converting a HANDLE via _get_osfhandle count towards this
limit as well.

Regardless, often you find yourself marshalling back and forth between
native HANDLE objects and fds anyway. If we know from the getgo that
we're going to need to work directly with the handle, we can cut out the
marshalling layer while also not contributing to filling up the CRT's
very limited handle table.

On Unix these functions just delegate directly to the existing set of
functions since an fd *is* the native file type. It would be nice, very
long term, if we could convert most uses of fds to file_t.

Differential Revision: https://reviews.llvm.org/D47688

llvm-svn: 333945
2018-06-04 19:38:11 +00:00
Amaury Sechet da661e9236 [DAGcombine] Teach the combiner about -a = ~a + 1
Summary: This include variant for add, uaddo and addcarry. usubo and subcarry require the carry to be flipped to preserve semantic, but we chose to do the transform anyway in that case as to push the transform down the carry chain.

Reviewers: efriedma, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D46505

llvm-svn: 333943
2018-06-04 19:23:22 +00:00
Amaury Sechet 93a7d2aa3c Get rid of SETCCE
Summary: It has been deprecated in favor of SETCCCARRY for a year now and isn't used by any in tree backend.

Reviewers: efriedma, craig.topper, dblaikie, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47685

llvm-svn: 333939
2018-06-04 18:36:22 +00:00
Dmitry Mikulin 4539487650 In thin and full LTO + CFI, direct function calls may go through jump table
entries to reach the target. Since these calls don't require type checks,
we can short-circuit them to their real targets, except in cases when they
can be pre-empted.

Differential Revision: https://reviews.llvm.org/D46326

llvm-svn: 333937
2018-06-04 18:18:12 +00:00
Craig Topper 1f956e2c5f [X86] Don't pass ParitySrc array into isAddSubOrSubAddMask. Instead use a bool output parameter to get the real piece of info we care about. NFC
The ParitySrc array is more of an implementation detail. A single bool to get the final parity is sufficient.

llvm-svn: 333935
2018-06-04 17:58:45 +00:00
Stanislav Mekhanoshin 838c07c531 [AMDGPU] Small refactoring in the scheduler
After last changes some code can be simplified.

Differential Revision: https://reviews.llvm.org/D47661

llvm-svn: 333934
2018-06-04 17:57:40 +00:00
Stanislav Mekhanoshin 28624f94d5 [AMDGPU] Factored out common part of GCNRPTracker::reset()
Differential Revision: https://reviews.llvm.org/D47664

llvm-svn: 333931
2018-06-04 17:21:54 +00:00
Sam Clegg 675a51750a [MachO] Add out-of-bounds check to MachOObjectFile.cpp
This is a followup to rL333496.

Differential Revision: https://reviews.llvm.org/D47544

llvm-svn: 333929
2018-06-04 17:01:20 +00:00
Sam Clegg 537afe6f0e [WebAssembly] Fix .td files after rL333900
Differential Revision: https://reviews.llvm.org/D47727

llvm-svn: 333928
2018-06-04 16:59:26 +00:00
John Brawn c5a6392be3 [ValueTracking] Match select abs pattern when there's an sext involved
When checking a select to see if it matches an abs, allow the true/false values
to be a sign-extension of the comparison value instead of requiring that they're
directly the comparison value, as all the comparison cares about is the sign of
the value.

This fixes a regression due to r333702, where we were no longer generating ctlz
due to isKnownNonNegative failing to match such a pattern.

Differential Revision: https://reviews.llvm.org/D47631

llvm-svn: 333927
2018-06-04 16:53:57 +00:00
Mark Searles f0b93f1e9e [AMDGPU][Waitcnt] Fix handling of flat instrs
On GFX9 and earlier, flat memory ops may decrement VMCNT out-of-order as well as LGKMCNT out-of-order.

Differential Revision: https://reviews.llvm.org/D46616

llvm-svn: 333926
2018-06-04 16:51:59 +00:00
Simon Pilgrim 7c000d4267 [X86] Only accept const SelectionDAG to resolveTargetShuffleInputs/getFauxShuffleMask
These methods should only be using SelectionDAG for analysis (known/sign bits etc), not node creation.

llvm-svn: 333925
2018-06-04 16:48:13 +00:00
Benjamin Kramer f663eba561 [NVPTX] Delete dead code from the AsmPrinter.
llvm-svn: 333924
2018-06-04 16:12:33 +00:00
Andrea Di Biagio 39e5a5695f [RFC][patch 3/3] Add support for variant scheduling classes in llvm-mca.
This patch is the last of a sequence of three patches related to LLVM-dev RFC
"MC support for variant scheduling classes".
http://lists.llvm.org/pipermail/llvm-dev/2018-May/123181.html

This fixes PR36672.

The main goal of this patch is to teach llvm-mca how to solve variant scheduling
classes.  This patch does that, plus it adds new variant scheduling classes to
the BtVer2 scheduling model to identify so-called zero-idioms (i.e. so-called
dependency breaking instructions that are known to generate zero, and that are
optimized out in hardware at register renaming stage).

Without the BtVer2 change, this patch would not have had any meaningful tests.
This patch is effectively the union of two changes:
 1) a change that teaches llvm-mca how to resolve variant scheduling classes.
 2) a change to the BtVer2 scheduling model that allows us to special-case
    packed XOR zero-idioms (this partially fixes PR36671).

Differential Revision: https://reviews.llvm.org/D47374 

llvm-svn: 333909
2018-06-04 15:43:09 +00:00
Krzysztof Parzyszek 623eb54361 [SelectionDAG] Add missing closing parentheses in comments, NFC
llvm-svn: 333907
2018-06-04 14:54:53 +00:00
Nicolai Haehnle 59198ed040 AMDGPU: Make various NamedOperands upper case
Summary:
Avoid name clashes with the corresponding bit fields in the instruction
encoding.

Change-Id: Id1644e703e976e78f7af93788d9f44cb48c3251f

Reviewers: arsenm, rampitec, kzhuravl

Subscribers: wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D47433

llvm-svn: 333905
2018-06-04 14:45:20 +00:00
Nicolai Haehnle 01d261f18d TableGen: Streamline the semantics of NAME
Summary:
The new rules are straightforward. The main rules to keep in mind
are:

1. NAME is an implicit template argument of class and multiclass,
   and will be substituted by the name of the instantiating def/defm.

2. The name of a def/defm in a multiclass must contain a reference
   to NAME. If such a reference is not present, it is automatically
   prepended.

And for some additional subtleties, consider these:

3. defm with no name generates a unique name but has no special
   behavior otherwise.

4. def with no name generates an anonymous record, whose name is
   unique but undefined. In particular, the name won't contain a
   reference to NAME.

Keeping rules 1&2 in mind should allow a predictable behavior of
name resolution that is simple to follow.

The old "rules" were rather surprising: sometimes (but not always),
NAME would correspond to the name of the toplevel defm. They were
also plain bonkers when you pushed them to their limits, as the old
version of the TableGen test case shows.

Having NAME correspond to the name of the toplevel defm introduces
"spooky action at a distance" and breaks composability:
refactoring the upper layers of a hierarchy of nested multiclass
instantiations can cause unexpected breakage by changing the value
of NAME at a lower level of the hierarchy. The new rules don't
suffer from this problem.

Some existing .td files have to be adjusted because they ended up
depending on the details of the old implementation.

Change-Id: I694095231565b30f563e6fd0417b41ee01a12589

Reviewers: tra, simon_tatham, craig.topper, MartinO, arsenm, javed.absar

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D47430

llvm-svn: 333900
2018-06-04 14:26:05 +00:00
Simon Dardis fb4dde1142 [mips] Restore the availablity of trap for microMIPS
Reviewers: smaksimovic, atanasyan, abeserminji

Differential Revision: https://reviews.llvm.org/D47584

llvm-svn: 333895
2018-06-04 12:50:32 +00:00
Luke Geeson 43e4367961 [AArch64] Audit on rL333634 to fix FP16 Disasm BitPatterns
llvm-svn: 333879
2018-06-04 09:41:32 +00:00
Sander de Smalen d0a6f6a502 [AArch64][SVE] Fix range for DUP immediates (16bit elts)
For immediates used in DUP instructions that have the range
-128 to 127, or a multiple of 256 in the range -32768 to 32512,
one could argue that when the result element size is 16bits (.h),
the value can be considered both signed and unsigned.

Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D47619

llvm-svn: 333873
2018-06-04 07:24:23 +00:00
Sander de Smalen fd54a781f6 [AArch64][SVE] Asm: Print indexed element 0 as FPR.
Print the first indexed element as a FP register, for example:

  mov z0.d, z1.d[0]

Is now printed as:

  mov z0.d, d1

Next to printing, this patch also adds aliases to parse 'mov z0.d, d1'.

Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D47571

llvm-svn: 333872
2018-06-04 07:07:35 +00:00
Sander de Smalen c33d668ab7 [AArch64][SVE] Asm: Support for indexed DUP instructions.
Unpredicated copy of indexed SVE element to SVE vector,
along with MOV-aliases.

For example:

  dup     z0.h, z1.h[0]

duplicates the first 16-bit element from z1 to all elements in
the result vector z0.

Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D47570

llvm-svn: 333871
2018-06-04 06:40:55 +00:00
Sander de Smalen 367a53b059 [AArch64][SVE] Asm: Support for FCPY immediate instructions.
Predicated copy of floating-point immediate value to SVE vector,
along with MOV-aliases.

Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar

Reviewed By: javed.absar

Differential Revision: https://reviews.llvm.org/D47518

llvm-svn: 333869
2018-06-04 05:58:06 +00:00
Sander de Smalen 512d57f1a5 [AArch64][SVE] Asm: Support for CPY immediate instructions
Predicated copy of possibly shifted immediate value into SVE
vector, along with MOV-aliases.

Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D47517

llvm-svn: 333868
2018-06-04 05:40:46 +00:00
Serguei Katkov d894fb4288 [InstCombine] Fix div handling
When we optimize select basing on fact that div by 0 is undef
we should not traverse the instruction which are not guaranteed to
transfer execution to next instruction. Guard intrinsic is an example.

Reviewers: spatel, craig.topper
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D47576

llvm-svn: 333864
2018-06-04 02:52:36 +00:00
Vedant Kumar adbd27a599 [Debugify] Don't apply DI before the bitcode writer pass
Applying synthetic debug info before the bitcode writer pass has no
testing-related purpose. This commit prevents that from happening.

It also adds tests which check that IR produced with/without
-debugify-each enabled is identical after stripping. This makes it
possible to check that individual passes (or full pipelines) are
invariant to debug info.

llvm-svn: 333861
2018-06-04 00:11:49 +00:00
Craig Topper 9923eac358 [X86] Remove and autoupgrade masked avx512vnni intrinsics using the unmasked intrinsics and select instructions.
llvm-svn: 333857
2018-06-03 23:24:17 +00:00
Lang Hames d6155ff002 [ORC] Add a constructor to create an IRMaterializationUnit from a module and
pre-existing SymbolFlags and SymbolToDefinition maps.

This constructor is useful when delegating work from an existing
IRMaterialiaztionUnit to a new one, as it avoids the cost of re-computing these
maps.

llvm-svn: 333852
2018-06-03 19:22:48 +00:00
Sanjay Patel 3bd957b7ae [InstCombine] improve sub with bool folds
There's a patchwork of existing transforms trying to handle
these cases, but as seen in the changed test, we weren't
catching them all.

llvm-svn: 333845
2018-06-03 16:35:26 +00:00
Amaury Sechet 99909e9308 Remove SETCCE use from Lanai's backend
Summary: This creates a small perf regression, but after talking with Jacques Pienaar, he was good with it to get things moving toward removng SETCCE.

Reviewers: jpienaar, bryant

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47626

llvm-svn: 333838
2018-06-03 12:56:24 +00:00
Ivan A. Kosarev 60a991ed1a [NEON] Support VLD1xN intrinsics in AArch32 mode (LLVM part)
We currently support them only in AArch64. The NEON Reference,
however, says they are 'ARMv7, ARMv8' intrinsics.

Differential Revision: https://reviews.llvm.org/D47120

llvm-svn: 333825
2018-06-02 16:40:03 +00:00
Ivan A. Kosarev 73c5337a64 Revert r333819 "[NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)"
The LLVM part was committed instead of the Clang part.

Differential Revision: https://reviews.llvm.org/D47121

llvm-svn: 333824
2018-06-02 16:38:38 +00:00
Michael J. Spencer ae6eeaea92 [MC] Add assembler support for .cg_profile.
Object FIle Representation
At codegen time this is emitted into the ELF file a pair of symbol indices and a weight. In assembly it looks like:

.cg_profile a, b, 32
.cg_profile freq, a, 11
.cg_profile freq, b, 20

When writing an ELF file these are put into a SHT_LLVM_CALL_GRAPH_PROFILE (0x6fff4c02) section as (uint32_t, uint32_t, uint64_t) tuples as (from symbol index, to symbol index, weight).

Differential Revision: https://reviews.llvm.org/D44965

llvm-svn: 333823
2018-06-02 16:33:01 +00:00
Craig Topper 93d8fbd8f2 [X86] Add tied source operand to AVX5124FMAPS and AVX5124VNNIW instructions.
This doesn't affect the assembly or disassembly, but is more accurate.

llvm-svn: 333822
2018-06-02 16:30:39 +00:00
Craig Topper 27234f1d8f [X86] Fix warning message for AVX5124FMAPS and AVX5124VNNIW instructions in the assembly parser.
The caret was positioned on the wrong operand. It's too hard to get right so just put the caret at the beginning of the instruction.

llvm-svn: 333821
2018-06-02 16:30:36 +00:00
Sanjay Patel bbc6d60677 [InstCombine] call simplify before trying vector folds
As noted in the review thread for rL333782, we could have
made a bug harder to hit if we were simplifying instructions
before trying other folds. 

The shuffle transform in question isn't ever a simplification;
it's just a canonicalization. So I've renamed that to make that 
clearer.

This is NFCI at this point, but I've regenerated the test file 
to show the cosmetic value naming difference of using 
instcombine's RAUW vs. the builder.

Possible follow-ups:
1. Move reassociation folds after simplifies too.
2. Refactor common code; we shouldn't have so much repetition.

llvm-svn: 333820
2018-06-02 16:27:44 +00:00
Ivan A. Kosarev 51f19b9ee1 [NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)
We currently support them only in AArch64. The NEON Reference,
however, says they are 'ARMv7, ARMv8' intrinsics.

Differential Revision: https://reviews.llvm.org/D47121

llvm-svn: 333819
2018-06-02 16:26:42 +00:00
Fangrui Song 8ca769d204 [Support] Remove unused raw_ostream::handle whose anchor role was superseded by anchor()
llvm-svn: 333817
2018-06-02 06:00:35 +00:00
Craig Topper 1534929623 [X86] Add encoding information for the AVX5124FMAPS and AVX5124VNNIW instructions so they can be assembled and disassembled.
These instructions are unusual in that they operate on 4 consecutive registers so supporting them in codegen will be more difficult than normal.

Includes an assembler check to warn if the source register is not the first register of a 4 register group.

llvm-svn: 333812
2018-06-02 02:15:10 +00:00
Chandler Carruth 9281503e8f [PM/LoopUnswitch] Fix how the cloned loops are handled when updating analyses.
Summary:
I noticed this issue because we didn't put the primary cloned loop into
the `NonChildClonedLoops` vector and so never iterated on it. Once
I fixed that, it made it clear why I had to do a really complicated and
unnecesasry dance when updating the loops to remain in canonical form --
I was unwittingly working around the fact that the primary cloned loop
wasn't in the expected list of cloned loops. Doh!

Now that we include it in this vector, we don't need to return it and we
can consolidate the update logic as we correctly have a single place
where it can be handled.

I've just added a test for the iteration order aspect as every time
I changed the update logic partially or incorrectly here, an existing
test failed and caught it so that seems well covered (which is also
evidenced by the extensive working around of this missing update).

Reviewers: asbirlea, sanjoy

Subscribers: mcrosier, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D47647

llvm-svn: 333811
2018-06-02 01:29:01 +00:00
Roman Tereshin cf88ffaaf9 [DebugInfo] Refactoring DIType::setFlags to DIType::cloneWithFlags, NFC
and using the latter in DIBuilder::createArtificialType and
DIBuilder::createObjectPointerType methods as well as introducing
mirroring DISubprogram::cloneWithFlags and
DIBuilder::createArtificialSubprogram methods.

The primary goal here is to add createArtificialSubprogram to support
a pass downstream while keeping the method consistent with the
existing ones and making sure we don't encourage changing already
created DI-nodes.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D47615

llvm-svn: 333806
2018-06-01 23:15:09 +00:00
Craig Topper 3828ce7eab [X86] Do something sensible when an expand load intrinsic is passed a 0 mask.
Previously we just returned undef, but really we should be returning the pass thru input. We also need to make sure we preserve the chain output that the original intrinsic node had to maintain connectivity in the DAG. So we should just return the incoming chain as the output chain.

llvm-svn: 333804
2018-06-01 22:59:07 +00:00
Vedant Kumar 7224c08141 Add a debug dump for DbgValueHistoryMap
This makes it easier to inspect the results of
DbgValueHistoryCalculator.

Differential Revision: https://reviews.llvm.org/D47663

llvm-svn: 333801
2018-06-01 22:33:15 +00:00
Craig Topper aa747412b1 [X86] Add isel patterns to use vexpand with zero masking when the passthru value is a zero vector.
llvm-svn: 333800
2018-06-01 22:28:28 +00:00
Zachary Turner b44d7a0da1 Move some function declarations out of WindowsSupport.h
The idea behind WindowsSupport.h is that it's in the source directory so
that windows.h'isms don't leak out into the larger LLVM project. To that
end, any symbol that references a symbol from windows.h must be in this
private header, and not in a public header.

However, we had some useful utility functions in WindowsSupport.h which
have no dependency on the Windows API, but still only make sense on
Windows. Those functions should be usable outside of Support since there
is no risk of causing a windows.h leak. Although this introduces some
preprocessor logic in some header files, It's not too egregious and it's
better than the alternative of duplicating a ton of code.

Differential Revision: https://reviews.llvm.org/D47662

llvm-svn: 333798
2018-06-01 22:23:46 +00:00
Karl-Johan Karlsson 6d52e5c3e4 [ConstantFold] Disallow folding vector geps into bitcasts
Summary:
Getelementptr returns a vector of pointers, instead of a single address,
when one or more of its arguments is a vector. In such case it is not
possible to simplify the expression by inserting a bitcast of operand(0)
into the destination type, as it will create a bitcast between different
sizes.

Reviewers: majnemer, mkuper, mssimpso, spatel

Reviewed By: spatel

Subscribers: lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D46379

llvm-svn: 333783
2018-06-01 19:34:35 +00:00
Sanjay Patel 66f7e19f6a [InstCombine] fix vector shuffle transform to replace undef elements (PR37648)
This bug:
https://bugs.llvm.org/show_bug.cgi?id=37648
...was created with the enhancement to this transform with rL332479.

The urem test shows the disaster potential: any undef divisor lane makes
the whole op undef.

The test diffs show that vector demanded elements turns some of the potential, 
but not all, unused binop operands back into undef already.

llvm-svn: 333782
2018-06-01 19:23:18 +00:00
Simon Atanasyan e80c3ce9cc [mips] Support 64-bit offsets for lb/sb/ld/sd/lld ... instructions
The `MipsAsmParser::loadImmediate` can load immediates of various sizes
into a register. Idea of this change is to use `loadImmediate` in the
`MipsAsmParser::expandMemInst` method to load offset into a register and
then call required load/store instruction.

The patch removes separate `expandLoadInst` and `expandStoreInst`
methods and does everything in the `expandMemInst` method to escape code
duplication.

Differential Revision: https://reviews.llvm.org/D47316

llvm-svn: 333774
2018-06-01 16:37:53 +00:00
Simon Atanasyan 3a44bcf95a [mips] Extend list of relocations supported by the `.reloc` directive
Supporting GOT and TLS related relocations by the `.reloc` directive is
useful for purpose of testing various tools like a linker, for example.

llvm-svn: 333773
2018-06-01 16:37:42 +00:00
Krzysztof Parzyszek bc68385dad [Hexagon] Avoid UB when shifting unsigned integer left by 32
llvm-svn: 333771
2018-06-01 15:39:10 +00:00
Vlad Tsyrklevich 6867ab7c90 [ThinLTOBitcodeWriter] Emit summaries for regular LTO modules
Summary:
Emit summaries for bitcode modules that are only destined for the
regular LTO portion of the build so they can participate in
summary-based dead stripping.

This change reduces the size of a nacl_helper build with cfi-icall
enabled by 7%, removing the majority of the overhead due to enabling
cfi-icall. The cfi-icall size increase was caused by compiling in lots
of unused code and cfi-icall generating jumptable references to unused
symbols that could no longer be removed by -Wl,-gc-sections. Increasing
the visibility of summary-based dead stripping prevented jumptable
entries being created for unused symbols from the regular LTO portion
of the build.

Reviewers: pcc

Reviewed By: pcc

Subscribers: dschuff, mehdi_amini, inglorion, eraman, llvm-commits, kcc

Differential Revision: https://reviews.llvm.org/D47594

llvm-svn: 333768
2018-06-01 15:20:47 +00:00
Nirav Dave fc9a700f94 [DAG] Avoid checking for consecutive stores in store merge. NFCI.
llvm-svn: 333766
2018-06-01 15:05:55 +00:00
Nirav Dave 39ece11ae5 [DAG] Simplify Expression. NFC.
llvm-svn: 333765
2018-06-01 15:05:30 +00:00
Nirav Dave 0fc27acaa2 [DAG] Remove untriggerable check. NFCI.
Candidate check precludes this check.

llvm-svn: 333764
2018-06-01 15:05:05 +00:00
Nirav Dave a74921a696 [DAG] Prune store merge legal store check to stop invalid size. NFCI.
Do not consider store sizes large than the maximum legal store size.

llvm-svn: 333763
2018-06-01 15:04:40 +00:00
Krzysztof Parzyszek aec2c0c9b6 [Hexagon] Select HVX code for vector CTPOP, CTLZ, and CTTZ
llvm-svn: 333760
2018-06-01 14:52:58 +00:00
Hiroshi Inoue 9796b47df1 [NFC] Zero initialize local variables
This patch makes local variables zero initialized to avoid broken values in debug output.

llvm-svn: 333754
2018-06-01 14:23:15 +00:00
Krzysztof Parzyszek 0b6187c1a9 [SelectionDAG] Expand UADDO/USUBO into ADD/SUBCARRY if legal for target
Additionally, implement handling of ADD/SUBCARRY on Hexagon, utilizing
the UADDO/USUBO expansion.

Differential Revision: https://reviews.llvm.org/D47559

llvm-svn: 333751
2018-06-01 14:00:32 +00:00
Amaury Sechet 8467411dad Set ADDE/ADDC/SUBE/SUBC to expand by default
Summary:
They've been deprecated in favor of UADDO/ADDCARRY or USUBO/SUBCARRY for a while.

Target that uses these opcodes are changed in order to ensure their behavior doesn't change.

Reviewers: efriedma, craig.topper, dblaikie, bkramer

Subscribers: jholewinski, arsenm, jyknight, sdardis, nemanjai, nhaehnle, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D47422

llvm-svn: 333748
2018-06-01 13:21:33 +00:00
Amara Emerson 5a3bb68e12 [AArch64][GlobalISel] Zero-extend s1 values when returning.
Before we were relying on the any extend of the s1 to s32, but
for AAPCS we need to zero-extend it to at least s8.

Fixes PR36719

Differential Revision: https://reviews.llvm.org/D47425

llvm-svn: 333747
2018-06-01 13:20:32 +00:00
Florian Hahn 8a17f1f43e Revert r333740: IPSCCP] Use PredicateInfo to propagate facts from cmp.
This is breaking the clang-with-thin-lto-ubuntu bot.

llvm-svn: 333745
2018-06-01 12:58:43 +00:00
Sander de Smalen f95ea047e5 [AArch64][SVE] Asm: Support for FDUP_ZI (copy fp immediate) instruction.
Unpredicated copy of floating-point immediate value into SVE vector,
along with MOV-aliases.

Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D47482

llvm-svn: 333744
2018-06-01 12:54:46 +00:00
Simon Dardis 351aa594f6 [mips] Guard more aliases correctly.
Also, duplicate an alias for microMIPS.

llvm-svn: 333741
2018-06-01 10:57:13 +00:00
Florian Hahn f4df554f32 Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions.
This patch updates IPSCCP to use PredicateInfo to propagate
facts to true branches predicated by EQ and to false branches
predicated by NE.

As a follow up, we should be able to extend it to also propagate additional
facts about nonnull.

Reviewers: davide, mssimpso, dberlin, efriedma

Reviewed By: davide, dberlin

Differential Revision: https://reviews.llvm.org/D45330

llvm-svn: 333740
2018-06-01 10:48:54 +00:00
Simon Dardis 54217598b6 [mips] Guard 'nop' properly and add mips16's nop instruction
Reviewers: smaksimovic, atanasyan, abeserminji

Differential Revision: https://reviews.llvm.org/D47583

llvm-svn: 333739
2018-06-01 10:46:00 +00:00
Pavel Labath d6ca063907 DWARFAcceleratorTable: Add an iterator-based api for accessing names in the index
Summary:
Back when we were introducing the DWARF v5 name index, there was a
short discussion whether we shouldn't have a nicer api for iterating
over the index. At that time, I did not find it necessary since the
iteration over names was done only from within the index itself (and I
figured the internal implementation can deal with a slightly rough
interface).

However, now I ran into a use for this kind of API in LLDB (for finding
all names matching a regular expression), so it looked like a nice
opportunity to introduce one. To make the API more useful, I've made the
NameTableEntry class a bit smarter: it now stores the string section
reference (so it can return its name) and its position in the name index
(mainly useful for dumping/logging).

I also convert the internal users to use the new API, which also gives
test coverage for the added code.

Reviewers: JDevlieghere, aprantl, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47590

llvm-svn: 333738
2018-06-01 10:33:11 +00:00
Simon Dardis ee67dcb837 [mips] Select the correct instruction for computing frameindexes
Reviewers: smaksimovic, atanasyan, abeserminji

Differential Revision: https://reviews.llvm.org/D47582

llvm-svn: 333736
2018-06-01 10:07:10 +00:00
Gabor Buella 27c96d3d20 NFC Avoid a warning in WasmEHPrepare.cpp
```
../lib/CodeGen/WasmEHPrepare.cpp:166:30: warning: extra ‘;’ [-Wpedantic]
                 false, false);
                              ^
```

llvm-svn: 333732
2018-06-01 07:47:46 +00:00
Sander de Smalen 97ca6b9e09 [AArch64][SVE] Asm: Support for DUPM (masked immediate) instruction.
Unpredicated copy of repeating immediate pattern to SVE vector, along
with MOV-aliases.

Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D47328

llvm-svn: 333731
2018-06-01 07:25:46 +00:00
Craig Topper c3cf55b935 [X86][Disassembler] Make it an error to set EVEX.R' to 0 when modrm.reg encodes a GPR.
This is different than the behavior of EVEX.X extending modrm.rm to 5 bits.

llvm-svn: 333728
2018-06-01 06:11:29 +00:00
Craig Topper 0838c4d6bc [X86][Disassembler] Ignore EVEX.X extension of modrm.rm to 5-bits when modrm.rm encodes a k-register.
llvm-svn: 333727
2018-06-01 05:36:08 +00:00
Craig Topper 74a61b02e0 [X86][Disassembler] Clamp index to 4-bits when decoding GPR registers.
A 5-bit value can occur when EVEX.X is 0 due to it being used to extend modrm.rm to encode XMM16-31. But if modrm.rm instead encodes a GPR, the Intel documentation says EVEX.X should be ignored so just mask it to 4 bits once we know its a GPR.

llvm-svn: 333725
2018-06-01 05:12:44 +00:00
Craig Topper 5b1dd01e57 [X86][Disassembler] Make sure EVEX.X is not used to extend base registers of memory operations.
This was an accidental side effect of EVEX.X being used to encode XMM16-XMM31 using modrm.rm with modrm.mod==0x3.

I think there are still more bugs related to this.

llvm-svn: 333722
2018-06-01 04:29:34 +00:00
Craig Topper c6b2c2bb70 [X86][Disassembler] Use a local variable instead of using a field in the instruction object. NFC
llvm-svn: 333721
2018-06-01 04:29:30 +00:00
Tom Stellard e43778895c AMDGPU/R600: Move intrinsics to IntrinsicsAMDGPU.td
Reviewers: arsenm, nhaehnle, jvesely

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D47487

llvm-svn: 333720
2018-06-01 02:19:46 +00:00