Commit Graph

50266 Commits

Author SHA1 Message Date
Zvi Rackover cf0999887a X86 Tests: Add zext cases in (trunc (subvector)) test. NFC
Cases were missing as observed in D41927

llvm-svn: 322297
2018-01-11 17:50:34 +00:00
Simon Pilgrim 8de035670e [X86][SSE] Drop old insertps stack folding test
Broken test from old attempt for folding tables - we don't peek through extract_subvector spills at all (which is why it doesn't fold), and we already have foldMemoryOperandCustom to handle insertps immediate correction anyway.

llvm-svn: 322292
2018-01-11 16:57:58 +00:00
Benjamin Kramer 738e6e7cb0 [InstCombine] Apply the fix from r322284 for sin / cos -> tan too
llvm-svn: 322285
2018-01-11 15:33:21 +00:00
Benjamin Kramer 44993ede60 [InstCombine] For cos/sin -> tan copy attributes from cos instead of the
parent function

Ideally we should merge the attributes from the functions somehow, but
this is obviously an improvement over taking random attributes from the
caller which will trip up the verifier if they're nonsensical for an
unary intrinsic call.

llvm-svn: 322284
2018-01-11 15:19:02 +00:00
Sanjay Patel e63d8dda5a [ValueTracking] recognize min/max-of-min/max with notted ops (PR35875)
This was originally planned as the fix for:
https://bugs.llvm.org/show_bug.cgi?id=35834
...but simpler transforms handled that case, so I implemented a 
lesser solution. It turns out we need to handle the case with 'not'
ops too because the real code example that we are trying to solve:
https://bugs.llvm.org/show_bug.cgi?id=35875
...has extra uses of the intermediate values, so we can't rely on 
smaller canonicalizations to get us to the goal.

As with rL321672, I've tried to show every possibility in the
codegen tests because that's the simplest way to prove we're doing
the right thing in the wide variety of permutations of this pattern.

We can also show an InstCombine win because we added a fold for
this case in:
rL321998 / D41603

An Alive proof for one variant of the pattern to show that the 
InstCombine and codegen results are correct:
https://rise4fun.com/Alive/vd1

Name: min3_nots
  %nx = xor i8 %x, -1
  %ny = xor i8 %y, -1
  %nz = xor i8 %z, -1
  %cmpxz = icmp slt i8 %nx, %nz
  %minxz = select i1 %cmpxz, i8 %nx, i8 %nz
  %cmpyz = icmp slt i8 %ny, %nz
  %minyz = select i1 %cmpyz, i8 %ny, i8 %nz
  %cmpyx = icmp slt i8 %y, %x
  %r = select i1 %cmpyx, i8 %minxz, i8 %minyz
=>
  %cmpxyz = icmp slt i8 %minxz, %ny
  %r = select i1 %cmpxyz, i8 %minxz, i8 %ny

Name: min3_nots_alt
  %nx = xor i8 %x, -1
  %ny = xor i8 %y, -1
  %nz = xor i8 %z, -1
  %cmpxz = icmp slt i8 %nx, %nz
  %minxz = select i1 %cmpxz, i8 %nx, i8 %nz
  %cmpyz = icmp slt i8 %ny, %nz
  %minyz = select i1 %cmpyz, i8 %ny, i8 %nz
  %cmpyx = icmp slt i8 %y, %x
  %r = select i1 %cmpyx, i8 %minxz, i8 %minyz
=>
  %xz = icmp sgt i8 %x, %z
  %maxxz = select i1 %xz, i8 %x, i8 %z
  %xyz = icmp sgt i8 %maxxz, %y
  %maxxyz = select i1 %xyz, i8 %maxxz, i8 %y
  %r = xor i8 %maxxyz, -1

llvm-svn: 322283
2018-01-11 15:13:47 +00:00
Sanjay Patel e0df4650f8 [InstCombine] add min3-with-nots test (PR35875); NFC
llvm-svn: 322281
2018-01-11 14:53:45 +00:00
Simon Pilgrim 6e6da3f449 [X86][SSE] Add ISD::VECTOR_SHUFFLE to faux shuffle decoding
Primarily, this allows us to use the aggressive extraction mechanisms in combineExtractWithShuffle earlier and make use of UNDEF elements that may be lost during lowering.

llvm-svn: 322279
2018-01-11 14:25:18 +00:00
Zvi Rackover 3ee66d9cd1 X86: Fix LowerBUILD_VECTORAsVariablePermute for case Src is smaller than Indices
Summary:
As RKSimon suggested in pr35820, in the case that Src is smaller in
bit-size than Indices, need to widen Src to avoid type mismatch.

Fixes pr35820

Reviewers: RKSimon, craig.topper

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41865

llvm-svn: 322272
2018-01-11 12:26:52 +00:00
Alex Bradbury 0715d35ed5 [RISCV] Reserve an emergency spill slot for the register scavenger when necessary
Although the register scavenger can often find a spare register, an emergency 
spill slot is needed to guarantee success. Reserve this slot in cases where 
the function is known to have a large stack (meaning the scavenger may be 
needed when forming stack addresses).

llvm-svn: 322269
2018-01-11 11:17:19 +00:00
Andrew V. Tischenko d037b1446b Implementation of X86Operand::print.
Differential Revision: https://reviews.llvm.org/D41610

llvm-svn: 322267
2018-01-11 10:31:01 +00:00
Stefan Maksimovic 5481c2176e [Mips] Handle one byte unsupported relocations
Fail gracefully instead of crashing upon encountering
this type of relocation.

Differential revision: https://reviews.llvm.org/D41857

llvm-svn: 322266
2018-01-11 10:07:47 +00:00
Sander de Smalen ba5fd775ad [AArch64][SVE] Asm: Negative tests for predicated ADD/SUB register constraints
Summary: Patch [3/3] in a series to add operand constraint checks for SVE's predicated ADD/SUB.

Reviewers: rengolin, mcrosier, evandro, fhahn, echristo

Reviewed By: rengolin, fhahn

Subscribers: aemerson, javed.absar, tschuett, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D41447

llvm-svn: 322265
2018-01-11 10:02:27 +00:00
Aaron Smith a73fa2a0ed [CodeView] Fix the type for a variadic argument
Summary:
- MSVC uses the none type for a variadic argument in CodeView
- Add a unit test

Reviewers: zturner, llvm-commits

Reviewed By: zturner

Differential Revision: https://reviews.llvm.org/D41931

llvm-svn: 322257
2018-01-11 06:42:11 +00:00
Dmitry Venikov e5fbf591a7 [InstCombine] Missed optimization in math expression: sin(x) / cos(x) => tan(x)
Summary: This patch enables folding sin(x) / cos(x) -> tan(x), cos(x) / sin(x) -> 1 / tan(x) under -ffast-math flag

Reviewers: hfinkel, spatel

Reviewed By: spatel

Subscribers: andrew.w.kaylor, efriedma, scanon, llvm-commits

Differential Revision: https://reviews.llvm.org/D41286

llvm-svn: 322255
2018-01-11 06:33:00 +00:00
Craig Topper 0b59034b15 [X86] Optimize v2i32/v2f32 scatters.
If the index is v2i64 we can use the scatter instruction that has v4i32/v4f32 data register, v2i64 index, and v2i1 mask. Similar was already done for gather.

Implement custom widening for v2i32 data to remove the code that reverses type legalization during lowering.

llvm-svn: 322254
2018-01-11 06:31:28 +00:00
Sanjay Patel f16fe0f205 [AArch64] add tests for notted variants of min/max; NFC
Like rL321668 / rL321672, the planned optimizer change to 
fix these will be in ValueTracking, but we can test the
changes cleanly here with AArch64 codegen.

llvm-svn: 322238
2018-01-10 23:31:42 +00:00
Matthias Braun e3a8db7ba1 Revert "AArch64: Fix emergency spillslot being out of reach for large callframes"
Revert for now as the testcase is hitting a pre-existing verifier error
that manifest as a failure when expensive checks are enabled (or
-verify-machineinstrs) is used.

This reverts commit r322200.

llvm-svn: 322231
2018-01-10 22:36:28 +00:00
Alexey Bataev 90e29b81d6 [SLP] Add/update tests for SLP vectorizer, NFC.
llvm-svn: 322225
2018-01-10 21:29:18 +00:00
Alex Bradbury 315cd3ace4 [RISCV] Implement support for the BranchRelaxation pass
Branch relaxation is needed to support branch displacements that overflow the
instruction's immediate field.

Differential Revision: https://reviews.llvm.org/D40830

llvm-svn: 322224
2018-01-10 21:05:07 +00:00
Matthias Braun 725ad0eee0 TargetLoweringBase: The ios simulator has no bzero function.
Make sure I really get back to the beahvior before my rewrite in r321035
which turned out not to be completely NFC as I changed the behavior for
the ios simulator environment.

llvm-svn: 322223
2018-01-10 20:49:57 +00:00
Alex Bradbury e027c93ac2 [RISCV] Implement branch analysis
This is a prerequisite for the branch relaxation pass, and allows a number of
optimisation passes (e.g. BranchFolding and MachineBlockPlacement) to work.

Differential Revision: https://reviews.llvm.org/D40808

llvm-svn: 322222
2018-01-10 20:47:00 +00:00
Alex Bradbury 70f137b6bf [RISCV] Add support for llvm.{frameaddress,returnaddress} intrinsics
llvm-svn: 322218
2018-01-10 20:12:00 +00:00
Alex Bradbury 9330e64485 [RISCV] Add basic support for inline asm constraints
llvm-svn: 322217
2018-01-10 20:05:09 +00:00
Alex Bradbury 9fea4881d0 [RISCV] Support stack frames and offsets up to 32-bits
Differential Revision: https://reviews.llvm.org/D40807

llvm-svn: 322216
2018-01-10 19:53:46 +00:00
Alex Bradbury c85be0de56 [RISCV] Support for varargs
Includes support for expanding va_copy. Also adds support for using 'aligned'
registers when necessary for vararg calls, and ensure the frame pointer always
points to the bottom of the vararg spill region. This is necessary to ensure
that the saved return address and stack pointer are always available at fixed
known offsets of the frame pointer.

Differential Revision: https://reviews.llvm.org/D40805

llvm-svn: 322215
2018-01-10 19:41:03 +00:00
Craig Topper af4eb17223 [SelectionDAG][X86] Explicitly store the scale in the gather/scatter ISD nodes
Currently we infer the scale at isel time by analyzing whether the base is a constant 0 or not. If it is we assume scale is 1, else we take it from the element size of the pass thru or stored value. This seems a little weird and I think it makes more sense to make it explicit in the DAG rather than doing tricky things in the backend.

Most of this patch is just making sure we copy the scale around everywhere.

Differential Revision: https://reviews.llvm.org/D40055

llvm-svn: 322210
2018-01-10 19:16:05 +00:00
Jessica Paquette c191f1097c [MachineOutliner] Outline ADRPs
ADRP instructions weren't being outlined because they're PC-relative and thus
fail the LR checks. This patch adds a special case for ADRPs to
getOutliningType to make sure that ADRPs can be outlined and updates the MIR
test.

llvm-svn: 322207
2018-01-10 18:49:57 +00:00
Sanjay Patel d04026ea43 [InstCombine] add test to show missed bswap; NFC
D41353 / D41233 are proposing to alter the shl/and canonicalization,
but I think that would just move an existing pattern-matching hole
to a different place.

llvm-svn: 322206
2018-01-10 18:47:21 +00:00
Matthias Braun b42ffa1283 AArch64: Fix emergency spillslot being out of reach for large callframes
Large callframes (calls with several hundreds or thousands or
parameters) could lead to situations in which the emergency spillslot is
out of range to be addressed relative to the stack pointer.
This commit forces the use of a frame pointer in the presence of large
callframes.

This commit does several things:
- Compute max callframe size at the end of instruction selection.
- Add mirFileLoaded target callback. Use it to compute the max callframe size
  after loading a .mir file when the size wasn't specified in the file.
- Let TargetFrameLowering::hasFP() return true if there exists a
  callframe > 255 bytes.
- Always place the emergency spillslot close to FP if we have a frame
  pointer.
- Note that `useFPForScavengingIndex()` would previously return false
  when a base pointer was available leading to the emergency spillslot
  getting allocated late (that's the whole effect of this callback).
  Which made no sense to me so I took this case out: Even though the
  emergency spillslot is technically not referenced by FP in this case
  we still want it allocated early.

Differential Revision: https://reviews.llvm.org/D40876

llvm-svn: 322200
2018-01-10 18:16:24 +00:00
Simon Pilgrim f74e3f45dc [X86][MMX] Add test for PR35869
llvm-svn: 322197
2018-01-10 17:05:03 +00:00
Zvi Rackover a27442f4f4 X86 Tests: Add isel tests for truncate-extract_vector-extend. NFC.
To be improved in a future patch

llvm-svn: 322192
2018-01-10 14:56:15 +00:00
Dmitry Preobrazhensky 3afbd825a3 [AMDGPU][MC][GFX8][GFX9] Added XNACK_MASK support
See bug 35764: https://bugs.llvm.org/show_bug.cgi?id=35764

Differential Revision: https://reviews.llvm.org/D41614

Reviewers: vpykhtin, artem.tamazov, arsenm
llvm-svn: 322189
2018-01-10 14:22:19 +00:00
Simon Pilgrim a0c59cce0e [X86][SSE] Add some basic FABS combine tests
llvm-svn: 322182
2018-01-10 13:28:34 +00:00
Bjorn Pettersson 3851496e6e Avoid inlining if there is byval arguments with non-alloca address space
Summary:
After teaching InlineCost more about address spaces ()
another fault was detected in the inliner. If an argument has
the byval attribute the parameter might be copied to an alloca.
That part seems to work fine even if the argument has a different
address space than the alloca address space. However, if the
address spaces differ, then the inlined function still might
refer to the parameter using the original address space (the
inliner does not handle that situation very well).

This patch avoids the problem by simply disallowing inlining
when there are byval arguments with address space that differs
from the alloca address space.

I'm not really sure how to transform the code if we want to
get inlining for this situation. I assume that it never has
been working, and that the fixes in r321809 just exposed an
old problem.

Fault found by skatkov (Serguei Katkov). It is mentioned in
follow up comments to https://reviews.llvm.org/D40455.

Reviewers: skatkov

Reviewed By: skatkov

Subscribers: uabelho, eraman, llvm-commits, haicheng

Differential Revision: https://reviews.llvm.org/D41898

llvm-svn: 322181
2018-01-10 13:01:18 +00:00
Simon Pilgrim a330a407c4 [X86][SSE] Add v2f64 u2 shuffle test
Adds missing coverage for SHUFPD undef argument lowering, and also shows a missed opportunity to remove a unnecessary move compared to 02 shuffle mask.

llvm-svn: 322175
2018-01-10 12:23:39 +00:00
Sander de Smalen a7ec090eaa [AArch64][SVE] Asm: Add support for (mov|dup) of scalar
Summary: This patch adds support for 'dup' (Scalar -> SVE) and its corresponding 'mov' alias.

Reviewers: fhahn, rengolin, evandro, echristo

Reviewed By: fhahn

Subscribers: aemerson, javed.absar, tschuett, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D41822

llvm-svn: 322172
2018-01-10 11:32:47 +00:00
Diana Picus e3591f3a17 [ARM GlobalISel] Add inst selector tests for G_FNEG s32 and s64
G_FNEG is already handled by the TableGen'erated code. Just add a few
tests to make sure everything works as expected.

llvm-svn: 322170
2018-01-10 11:13:36 +00:00
Diana Picus 0ed7513c83 [ARM GlobalISel] Map G_FNEG to the FPR bank
llvm-svn: 322169
2018-01-10 11:13:31 +00:00
Diana Picus f949a0abac [ARM GlobalISel] Legalize G_FNEG for s32 and s64
For hard float, it is legal.

For soft float, we need to lower to 0 - x first, and then we can use the
libcall for G_FSUB. This is undoing some of the canonicalization
performed by the IRTranslator (which introduces G_FNEG when it sees a
0 - x). Ideally, that canonicalization would be performed by a
pre-legalizer pass that would allow targets to opt out of this behaviour
rather than dance around it in the legalizer.

llvm-svn: 322168
2018-01-10 10:45:34 +00:00
Jonas Paulsson 1a76f3a2c2 Temporarily revert
"[SystemZ]  Check for legality before doing LOAD AND TEST transformations."

, due to test failures.

llvm-svn: 322165
2018-01-10 10:05:55 +00:00
Diana Picus 8f14886630 [ARM GlobalISel] Legalize s32/s64 G_FCONSTANT
Legal for hard float.
Change to G_CONSTANT for soft float (but preserve the binary
representation).

llvm-svn: 322164
2018-01-10 10:01:49 +00:00
Jonas Paulsson 9222b91e24 [SelectionDAGBuilder] Chain prefetches less aggressively.
Prefetches used to always be chained between any previous and following
memory accesses. The problem with this was that later optimizations, such as
folding of a load into the user instruction, got disrupted.

This patch relaxes the chaining of prefetches in order to remedy this.

Reveiw: Hal Finkel
https://reviews.llvm.org/D38886

llvm-svn: 322163
2018-01-10 09:33:00 +00:00
Diana Picus 734a5e8912 [ARM GlobalISel] Legalize G_CONSTANT for scalars > 32 bits
Make G_CONSTANT narrow for any scalars larger than 32 bits.

llvm-svn: 322162
2018-01-10 09:32:01 +00:00
Jonas Paulsson d9dde1ac56 [SystemZ] Check for legality before doing LOAD AND TEST transformations.
Since a load and test instruction treat its operands as signed, it can only
replace a logical compare for EQ/NE uses.

Review: Ulrich Weigand
https://bugs.llvm.org/show_bug.cgi?id=35662

llvm-svn: 322161
2018-01-10 09:18:17 +00:00
Puyan Lotfi fe6c9cbb24 [MIR] Repurposing '$' sigil used by external symbols. Replacing with '&'.
Planning to add support for named vregs. This puts is in a conundrum since
physregs are named as well. To rectify this we need to use a sigil other than
'%' for physregs in MIR. We've settled on using '$' for physregs but first we
must repurpose it from external symbols using it, which is what this commit is
all about. We think '&' will have familiar semantics for C/C++ users.

llvm-svn: 322146
2018-01-10 00:56:48 +00:00
Vlad Tsyrklevich cdec22ef9a LowerTypeTests: Add limited support for aliases
Summary:
LowerTypeTests moves some function definitions from individual object
files to the merged module, leaving a stub to be called in the merged
module's jump table. If an alias was pointing to such a function
definition LowerTypeTests would fail because the alias would be left
without a definition to point to.

This change 1) emits information about aliases to the ThinLTO summary,
2) replaces aliases pointing to function definitions that are moved to
the merged module with function declarations, and 3) re-emits those
aliases in the merged module pointing to the correct function
definitions.

The patch does not correctly fix all possible mis-uses of aliases in
LowerTypeTests. For example, it does not handle aliases with a different
type from the pointed to function.

The addition of alias data increases the size of Chrome build artifacts
by less than 1%.

Reviewers: pcc

Reviewed By: pcc

Subscribers: mehdi_amini, eraman, mgrang, llvm-commits, eugenis, kcc

Differential Revision: https://reviews.llvm.org/D41741

llvm-svn: 322139
2018-01-10 00:00:51 +00:00
Michael Zolotukhin 1f562176e9 [LoopRotate] Detect loops with indirect branches better (we're giving up on them).
llvm-svn: 322137
2018-01-09 23:54:35 +00:00
Adrian McCarthy db2736ddd8 Reland "Emit Function IDs table for Control Flow Guard"
Adds option /guard:cf to clang-cl and -cfguard to cc1 to emit function IDs
of functions that have their address taken into a section named .gfids$y for
compatibility with Microsoft's Control Flow Guard feature.

The original patch didn't have the lit.local.cfg file that restricts the new
test to x86, thus the new test was failing on the non-x86 bots.

Differential Revision: https://reviews.llvm.org/D40531

The reverts r322008, which was a revert of r322005.

This reverts commit a05b89f9aca70597dc79fe97bc49b50b51f525ba.

llvm-svn: 322136
2018-01-09 23:49:30 +00:00
Sam Clegg ea7caceedc [WebAssembly] Add COMDAT support
This adds COMDAT support to the Wasm object-file format.
Spec: https://github.com/WebAssembly/tool-conventions/pull/31

Corresponding LLD change:
https://bugs.llvm.org/show_bug.cgi?id=35533, and D40845

Patch by Nicholas Wilson

Differential Revision: https://reviews.llvm.org/D40844

llvm-svn: 322135
2018-01-09 23:43:14 +00:00
Paul Robinson 29f5f987f1 [DWARFv5] MC support for MD5 file checksums
Extend .file directive syntax to allow specifying an MD5 checksum for
the source file.  Emit the checksums in DWARF v5 line tables.

llvm-svn: 322134
2018-01-09 23:31:48 +00:00
Jake Ehrlich 99482fda95 temp
llvm-svn: 322132
2018-01-09 23:00:25 +00:00
Rafael Espindola d707c37072 Use a MCExpr for the size of MCFillFragment.
This allows the size to be found during ralaxation. This fixes
pr35858.

llvm-svn: 322131
2018-01-09 22:48:37 +00:00
Sam Clegg 270ed1b39c [WebAssembly] MC: Use zero for provisional value of undefined symbols
This is more in line with what happens in the final
executable when symbols are undefined (i.e. weak
references).

Differential Revision: https://reviews.llvm.org/D41840

llvm-svn: 322130
2018-01-09 22:44:02 +00:00
Rafael Espindola 94a72b9918 Add a test.
Currently we don't have any tests for this error case.

llvm-svn: 322129
2018-01-09 22:30:54 +00:00
Chris Bieneman abdea268c1 [IPSCCP] Remove calls without side effects
Summary:
When performing constant propagation for call instructions we have historically replaced all uses of the return from a call, but not removed the call itself. This is required for correctness if the calls have side effects, however the compiler should be able to safely remove calls that don't have side effects.

This allows the compiler to completely fold away calls to functions that have no side effects if the inputs are constant and the output can be determined at compile time.

Reviewers: davide, sanjoy, bruno, dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38856

llvm-svn: 322125
2018-01-09 21:58:46 +00:00
Stefan Pintilie 1712700842 [PowerPC] Manually schedule the prologue and epilogue
This patch makes the following changes to the schedule of instructions in the
prologue and epilogue.

The stack pointer update is moved down in the prologue so that the callee saves
do not have to wait for the update to happen.
Saving the lr is moved down in the prologue to hide the latency of the mflr.
The stack pointer is moved up in the epilogue so that restoring of the lr can
happen sooner.
The mtlr is moved up in the epilogue so that it is away form the blr at the end
of the epilogue. The latency of the mtlr can now be hidden by the loads of the
callee saved registers.

This commit is almost identical to this one: r322036 except that two warnings
that broke build bots have been fixed.

The revision number is D41737 as before.

llvm-svn: 322124
2018-01-09 21:57:49 +00:00
Sam Clegg e53af7f6df [WebAssembly] Explicitly specify function/global index space in YAML
These indexes are useful because they are not always zero based and
functions and globals are referenced elsewhere by their index.

This matches what we already do for the type index space.

Differential Revision: https://reviews.llvm.org/D41877

llvm-svn: 322121
2018-01-09 21:38:53 +00:00
Tim Renouf d68fa1be57 [SelectionDAG] Fixed f16-from-vector promotion problem
Summary:
In the case of an fp_extend of v1f16 to v1f32 where the v1f16 is the
result of a bitcast from i16, avoid creating an illegal fp16_to_fp where
the input is not a vector and the result is a v1f32.

V2: The fix is now to avoid vector scalarization creating a v1->scalar
bitcast.

Reviewers: srhines, t.p.northover

Subscribers: nhaehnle, llvm-commits, dstuttard, t-tye, yaxunl, wdng, kzhuravl, arsenm

Differential Revision: https://reviews.llvm.org/D41126

llvm-svn: 322120
2018-01-09 21:36:25 +00:00
Tim Renouf 6eaad1e539 [AMDGPU] Fixed incorrect uniform branch condition
Summary:
I had a case where multiple nested uniform ifs resulted in code that did
v_cmp comparisons, combining the results with s_and_b64, s_or_b64 and
s_xor_b64 and using the resulting mask in s_cbranch_vccnz, without first
ensuring that bits for inactive lanes were clear.

There was already code for inserting an "s_and_b64 vcc, exec, vcc" to
clear bits for inactive lanes in the case that the branch is instruction
selected as s_cbranch_scc1 and is then changed to s_cbranch_vccnz in
SIFixSGPRCopies. I have added the same code into SILowerControlFlow for
the case that the branch is instruction selected as s_cbranch_vccnz.

This de-optimizes the code in some cases where the s_and is not needed,
because vcc is the result of a v_cmp, or multiple v_cmp instructions
combined by s_and/s_or. We should add a pass to re-optimize those cases.

Reviewers: arsenm, kzhuravl

Subscribers: wdng, yaxunl, t-tye, llvm-commits, dstuttard, timcorringham, nhaehnle

Differential Revision: https://reviews.llvm.org/D41292

llvm-svn: 322119
2018-01-09 21:34:43 +00:00
Daniel Berlin 56cca7437c NewGVN: Fix PR/33367, which was causing us to delete non-copy intrinsics accidentally in some rare cases
llvm-svn: 322115
2018-01-09 20:12:42 +00:00
Hubert Tong 55662a8e9f Profiling tests: Endianess XFAIL for powerpc- (32-bit)
Add powerpc- (32-bit) as XFAIL for tests that are documented either in-
line or via commit messages as expected to fail on big-endian systems.

Tests not documented in-line are documented in commit messages as
follows:
r211172 - test/tools/llvm-cov/llvm-cov.test
r247920 - test/Transforms/SampleProfile/gcc-simple.ll

llvm-svn: 322114
2018-01-09 20:09:23 +00:00
Easwaran Raman bdf20261d8 Add a pass to generate synthetic function entry counts.
Summary:
This pass synthesizes function entry counts by traversing the callgraph
and using the relative block frequencies of the callsites. The intended
use of these counts is in inlining to determine hot/cold callsites in
the absence of profile information.

The pass is split into two files with the code that propagates the
counts in a callgraph in a Utils file. I plan to add support for
propagation in the thinlto link phase and the propagation code will be
shared and hence this split. I did not add support to the old PM since
hot callsite determination in inlining is not possible in old PM
(although we could use hot callee heuristic with synthetic counts in the
old PM it is not worth the effort tuning it)

Reviewers: davidxl, silvas

Subscribers: mgorny, mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D41604

llvm-svn: 322110
2018-01-09 19:39:35 +00:00
Alexey Bataev 771ec9f399 [COST]Fix PR35865: Fix cost model evaluation for shuffle on X86.
Summary:
If the vector type is transformed to non-vector single type, the compile
may crash trying to get vector information about non-vector type.

Reviewers: RKSimon, spatel, mkuper, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41862

llvm-svn: 322106
2018-01-09 19:08:22 +00:00
Sanjay Patel 6fb1357c35 [InstCombine] weaken assertions for icmp folds (PR35846)
Because of potential UB (known bits conflicts with an llvm.assume),
we have to check rather than assert here because InstSimplify doesn't
kill the compare:
https://bugs.llvm.org/show_bug.cgi?id=35846

llvm-svn: 322104
2018-01-09 18:56:03 +00:00
Teresa Johnson ba22da0da3 Fix crash when linking metadata with ODR type uniquing
Summary:
With DebugTypeODRUniquing enabled, during IR linking debug metadata
in the destination module may be reached from the source module.
This means that ConstantAsMetadata nodes (e.g. on DITemplateValueParameter)
may contain a value the destination module. When trying to map such
metadata nodes, we will attempt to map a GV already in the dest module.
linkGlobalValueProto will end up with a source GV that is the same as
the dest GV as well as the new GV. Trying to access the TypeMap for the
source GV type, which is actually a dest GV type, hits an assertion
since it appears that we have mapped into the source module (because the
type is the value not a key into the map).

Detect that we don't need to access the TypeMap in this case, since
there is no need to create a bitcast from the new GV to the source GV
type as they GV are the same.

Fixes PR35722.

Reviewers: mehdi_amini, pcc

Subscribers: probinson, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D41624

llvm-svn: 322103
2018-01-09 18:32:53 +00:00
Max Moroz 975eacada5 [lit] Implement "-r" option for builtin "diff" command + a test using that.
Summary:
That would allow to recursively compare directories in tests using
"diff -r" on Windows in a similar way as it can be done on Linux or Mac.

Reviewers: zturner, morehouse, vsk

Reviewed By: zturner

Subscribers: kcc, llvm-commits

Differential Revision: https://reviews.llvm.org/D41776

llvm-svn: 322102
2018-01-09 18:23:34 +00:00
Craig Topper c4d2dd80b6 [X86] Add a DAG combine to combine (sext (setcc)) with VLX
Normally target independent DAG combine would do this combine based on getSetCCResultType, but with VLX getSetCCResultType returns a vXi1 type preventing the DAG combining from kicking in.

But doing this combine can allow us to remove the explicit sign extend that would otherwise be emitted.

This patch adds a target specific DAG combine to combine the sext+setcc when the result type is the same size as the input to the setcc. I've restricted this to FP compares and things that can be represented with PCMPEQ and PCMPGT since we don't have full integer compare support on the older ISAs.

Differential Revision: https://reviews.llvm.org/D41850

llvm-svn: 322101
2018-01-09 18:14:22 +00:00
Francis Visoiu Mistrih 7d9bef8f5c [CodeGen] Don't print "pred:" and "opt:" in -debug output
In -debug output we print "pred:" whenever a MachineOperand is a
predicate operand in the instruction descriptor, and "opt:" whenever a
MachineOperand is an optional def in the instruction descriptor.

Differential Revision: https://reviews.llvm.org/D41870

llvm-svn: 322096
2018-01-09 17:31:07 +00:00
Sander de Smalen 906a5deace Recommit r322073: [AArch64][SVE] Asm: Add predicated ADD/SUB instructions
Fixed issue that was found on sanitizer-x86_64-linux-fast.
I changed the result type of 'Parser.getTok().getString().lower()'
in AArch64AsmParser::tryParseSVEPredicateVector() from 'StringRef' to
'auto', since StringRef::lower() returns a std::string.

llvm-svn: 322092
2018-01-09 17:01:27 +00:00
Zvi Rackover 72b0bb1405 X86 Tests: Update more isel tests with FastVariableShuffle feature
Summary:
Added the FastVariableShuffle feature to cases that resembled processors
for which this fearure is on.
For AVX2 there are processors with and w/o this fearue enable.
For AVX512 only KNL does enable this feature so cases which only have
+avx512f were left without the FastVariableShuffle enabled.

Reviewers: RKSimon, craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41851

llvm-svn: 322090
2018-01-09 16:26:06 +00:00
Zvi Rackover b11e84c1d8 X86 Tests: Add common check prefix to test-case. NFC.
As suggested in D41851

llvm-svn: 322089
2018-01-09 16:14:15 +00:00
Francis Visoiu Mistrih 72cc21eefe [CodeGen] Print frame-setup/destroy flags in -debug output like we do in MIR
Currently the MachineInstr::print function prints the
frame-setup/frame-destroy differently than it does in MIR.

Instead of:

  %x21 = LDR %sp, -16; flags: FrameDestroy

print:

  %x21 = frame-destroy LDR %sp, -16

llvm-svn: 322088
2018-01-09 16:11:51 +00:00
Sanjay Patel 37e28e40cb [SelectionDAG] lower math intrinsics to finite version of libcalls when possible (PR35672)
Ingredients in this patch:
1. Add HANDLE_LIBCALL defs for finite mathlib functions that correspond to LLVM intrinsics.
2. Plumbing to send TargetLibraryInfo down to SelectionDAGLegalize.
3. Relaxed math and library checking in SelectionDAGLegalize::ConvertNodeToLibcall() to choose finite libcalls.

There was a bug about determining the availability of the finite calls that should be fixed with:
rL322010

Not in this patch:
This doesn't resolve the question/bug of clang creating the intrinsic IR in the first place.
There's likely follow-up work needed to support the long double variants better.
There's room for improvement to reduce the code duplication.
Create finite calls that don't originate from a corresponding intrinsic or DAG node?

Differential Revision: https://reviews.llvm.org/D41338

llvm-svn: 322087
2018-01-09 15:41:00 +00:00
Francis Visoiu Mistrih 2b3bd30637 [CodeGen] Don't print register classes in -debug output
Since register classes and banks are already printed with the register
definition, don't print it at the end of every instruction anymore.

This follows MIR in this regard and is another step to the unification
of the two formats.

llvm-svn: 322086
2018-01-09 15:39:44 +00:00
Nirav Dave 30304a3bd7 [DAG] Elide overlapping stores
Relanding after fixing handling of pre-indexed memory operations in
BaseIndexOffset analysis (r322003).

Extend overlapping store elision to handle overwrites of stores by
larger stores.

Reviewers: craig.topper, rnk, t.p.northover

Subscribers: javed.absar, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40969

llvm-svn: 322085
2018-01-09 15:23:12 +00:00
Petar Jovanovic 1d26c7e4ff [EarlyCSE] Salvage debug info during DCE
EarlyCSE did not try to salvage debug info during erasing of instructions.
This change fixes it.

Patch by Djordje Todorovic.

Differential Revision: https://reviews.llvm.org/D41496

llvm-svn: 322083
2018-01-09 15:08:37 +00:00
Simon Pilgrim 5d909be91b [InstCombine] Check for out of range ashr values using APInt before calling getZExtValue
Reduced from oss-fuzz #5032 test case

llvm-svn: 322078
2018-01-09 14:23:46 +00:00
Sander de Smalen 6595603187 Reverted r322073 because of AddressSanitizer failure on
sanitizer-x86_64-linux-fast builder.

llvm-svn: 322077
2018-01-09 13:51:09 +00:00
Simon Pilgrim 9cf3e765d8 [X86][AVX] Add v2i64/v2f64 load tests
Ensure these use insertions, not masked load ops

llvm-svn: 322076
2018-01-09 13:35:18 +00:00
Sander de Smalen 1f97363e5f [AArch64][SVE] Asm: Add predicated ADD/SUB instructions
Summary:
Add the predicated ADD/SUB instructions and corresponding tests.

Patch [3/3] in a series to add predicated ADD/SUB instructions for SVE.

Reviewers: rengolin, mcrosier, evandro, fhahn, echristo

Reviewed By: fhahn

Subscribers: aemerson, javed.absar, tschuett, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D41443

llvm-svn: 322073
2018-01-09 12:43:46 +00:00
Simon Pilgrim 94357afd26 [InstCombine] Add pow2 mul -> shl tests for vectors with uniform/non-uniform constants
llvm-svn: 322072
2018-01-09 11:55:27 +00:00
Francis Visoiu Mistrih dbf2c48fc7 [MIR] Add support for the frame-destroy MachineInstr flag
We are printing / parsing the `frame-setup` MachineInstr flag but not
the `frame-destroy` one.

Differential Revision: https://reviews.llvm.org/D41509

llvm-svn: 322071
2018-01-09 11:33:22 +00:00
Nikolai Bozhenov eededdade9 [Nios2] Arithmetic instructions for R1 and R2 ISA.
Summary:
This commit enables some of the arithmetic instructions for Nios2 ISA (for both
R1 and R2 revisions), implements facilities required to emit those instructions
and provides LIT tests for added instructions.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D41236

Author: belickim <mateusz.belicki@intel.com>
llvm-svn: 322069
2018-01-09 11:15:08 +00:00
Oren Ben Simhon 1c6308ecd5 Instrument Control Flow For Indirect Branch Tracking
CET (Control-Flow Enforcement Technology) introduces a new mechanism called IBT (Indirect Branch Tracking).
According to IBT, each Indirect branch should land on dedicated ENDBR instruction (End Branch).
The new pass adds ENDBR instructions for every indirect jmp/call (including jumps using jump tables / switches).
For more information, please see the following:
https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf

Differential Revision: https://reviews.llvm.org/D40482

Change-Id: Icb754489faf483a95248f96982a4e8b1009eb709
llvm-svn: 322062
2018-01-09 08:51:18 +00:00
Craig Topper def1c30c66 [X86] Allow more cmpps/pd immediate encodings to be commuted during isel.
The code that checks the immediate wasn't masking to the lower 3-bits like the code in X86InstrInfo.cpp that's used by the peephole pass does.

llvm-svn: 322060
2018-01-09 07:09:34 +00:00
Serguei Katkov 6a7a4c6a55 [SCEV] Do not cache S -> V if S is not equivalent of V
SCEV tracks the correspondence of created SCEV to original instruction.
However during creation of SCEV it is possible that nuw/nsw/exact flags are
lost.

As a result during expansion of the SCEV the instruction with nuw/nsw/exact
will be used where it was expected and we produce poison incorreclty.

Reviewers: sanjoy, mkazantsev, sebpop, jbhateja
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41578

llvm-svn: 322058
2018-01-09 06:47:14 +00:00
Serguei Katkov 4d1dd6b53a [CGP] Fix Complex addressing mode for offset
If the offset is differ in two addressing mode we can continue only if
ScaleReg is not set due to we will use it as merge of different offsets.

It should fix PR35799 and PR35805.

Reviewers: john.brawn, reames
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41227

llvm-svn: 322056
2018-01-09 04:37:06 +00:00
Sean Fertile 33a17762bb [PowerPC] Can not assume an intrinsic argument is a simple type.
The CTRLoop pass performs checks on the argument of certain libcalls/intrinsics,
and assumes the arguments must be of a simple type. This isn't always the case
though. For example if we unroll and vectorize a loop we may end up with vectors
larger then the largest legal type, along with intrinsics that operate on those
wider types. This happened in the ffmpeg build, where we unrolled a loop and
ended up with a sqrt intrinsic that operated on V16f64, triggering an assertion.

Differential Revision: https://reviews.llvm.org/D41758

llvm-svn: 322055
2018-01-09 03:03:41 +00:00
Stefan Pintilie 7e10987b12 Revert "[PowerPC] Manually schedule the prologue and epilogue"
[PowerPC] This reverts commit r322036.

Failing build bots. Revert the commit now.

llvm-svn: 322051
2018-01-09 01:06:21 +00:00
Craig Topper cc342d465e [X86] Remove llvm.x86.avx512.cvt*2mask.* intrinsics and autoupgrade to (icmp slt X, 0)
I had to drop fast-isel-abort from a test because we can't fast isel some of the mask stuff. When we used intrinsics we implicitly fell back to SelectionDAG for the intrinsic call without triggering the abort error. But with native IR that doesn't happen the same way.

llvm-svn: 322050
2018-01-09 00:50:47 +00:00
Jessica Paquette 3291e7353e [MachineOutliner] AArch64: Handle instrs that use SP and will never need fixups
This commit does two things. Firstly, it adds a collection of flags which can
be passed along to the target to encode information about the MBB that an
instruction lives in to the outliner.

Second, it adds some of those flags to the AArch64 outliner in order to add
more stack instructions to the list of legal instructions that are handled
by the outliner. The two flags added check if

- There are calls in the MachineBasicBlock containing the instruction
- The link register is available in the entire block

If the link register is available and there are no calls, then a stack
instruction can always be outlined without fixups, regardless of what it is,
since in this case, the outliner will never modify the stack to create a
call or outlined frame.

The motivation for doing this was checking which instructions are most often
missed by the outliner. Instructions like, say

%sp<def> = ADDXri %sp, 32, 0; flags: FrameDestroy

are very common, but cannot be outlined in the case that the outliner might
modify the stack. This commit allows us to outline instructions like this.
  

llvm-svn: 322048
2018-01-09 00:26:18 +00:00
Stefan Pintilie 55bfdd040a [PowerPC] Manually schedule the prologue and epilogue
This patch makes the following changes to the schedule of instructions in the
prologue and epilogue.

The stack pointer update is moved down in the prologue so that the callee saves
do not have to wait for the update to happen.
Saving the lr is moved down in the prologue to hide the latency of the mflr.
The stack pointer is moved up in the epilogue so that restoring of the lr can
happen sooner.
The mtlr is moved up in the epilogue so that it is away form the blr at the end
of the epilogue. The latency of the mtlr can now be hidden by the loads of the
callee saved registers.

Differential Revision: https://reviews.llvm.org/D41737

llvm-svn: 322036
2018-01-08 22:23:10 +00:00
Petar Jovanovic 9f279a4e11 Add lit.local.cfg in test/DebugInfo/MIR/Mips/
Add test/DebugInfo/MIR/Mips/lit.local.cfg so no tests are run if Mips is
not a supported target.
This should resolve buildbot failures seen after r322015.

llvm-svn: 322020
2018-01-08 19:44:03 +00:00
Sanjay Patel 7dfe96ad16 [ValueTracking] remove overzealous assert
The test is derived from a failing fuzz test:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=5008

Credit to @rksimon for pointing out the problem.

llvm-svn: 322016
2018-01-08 18:31:13 +00:00
Petar Jovanovic e9500ba745 [LiveDebugValues] Change condition for block termination recognition
The last iterator of MBB should be recognized as MBB.end() not as
MBB.instr_end() which could return bundled instruction that is not iterable
with basic iterator.

Patch by Nikola Prica.

Differential Revision: https://reviews.llvm.org/D41626

llvm-svn: 322015
2018-01-08 18:21:15 +00:00
Sanjay Patel 52149f0305 [TargetLibraryInfo] fix finite mathlib function availability
This patch was part of:
https://reviews.llvm.org/D41338
...but we can expose the bug in IR via constant propagation
as shown in the test. Unless the triple includes 'linux', we
should not fold these because the functions don't exist on
other platforms (yet?).

llvm-svn: 322010
2018-01-08 17:38:09 +00:00
Adrian McCarthy ce63a925cc Revert "Emit Function IDs table for Control Flow Guard"
The new test fails on the Hexagon bot.  Reverting while I investigate.

This reverts https://reviews.llvm.org/rL322005

This reverts commit b7e0026b4385180c378edc658ec91a39566f2942.

llvm-svn: 322008
2018-01-08 17:12:01 +00:00
Davide Italiano 9a60d2c157 [CVP] Replace incoming values from unreachable blocks with undef.
This is an attempt of fixing PR35807.
Due to the non-standard definition of dominance in LLVM, where uses in
unreachable blocks are dominated by anything, you can have, in an
unreachable block:

  %patatino = OP1 %patatino, CONSTANT

When `SimplifyInstruction` receives a PHI where an incoming value is of
the aforementioned form, in some cases, loops indefinitely.

What I propose here instead is keeping track of the incoming values
from unreachable blocks, and replacing them with undef. It fixes this
case, and it seems to be good regardless (even if we can't prove that
the value is constant, as it's coming from an unreachable block, we
can ignore it).

Differential Revision:  https://reviews.llvm.org/D41812

llvm-svn: 322006
2018-01-08 16:34:06 +00:00
Adrian McCarthy cf6e6c82c1 Emit Function IDs table for Control Flow Guard
Adds option /guard:cf to clang-cl and -cfguard to cc1 to emit function IDs
of functions that have their address taken into a section named .gfids$y for
compatibility with Microsoft's Control Flow Guard feature.

Differential Revision: https://reviews.llvm.org/D40531

llvm-svn: 322005
2018-01-08 16:33:42 +00:00
Aleksandar Beserminji f02ad15ff1 [mips] Improve diagnostics for instruction mapping
This patch improves diagnostic for case when mapped instruction 
does not contain a field listed under RowFields.

Differential Revision: https://reviews.llvm.org/D41778

llvm-svn: 322004
2018-01-08 16:25:40 +00:00
Sanjay Patel 31b4b76f99 [InstCombine] fold min/max tree with common operand (PR35717)
There is precedence for factorization transforms in instcombine for FP ops with fast-math. 
We also have similar logic in foldSPFofSPF().

It would take more work to add this to reassociate because that's specialized for binops, 
and min/max are not binops (or even single instructions). Also, I don't have evidence that 
larger min/max trees than this exist in real code, but if we find that's true, we might
want to reorganize where/how we do this optimization.

In the motivating example from https://bugs.llvm.org/show_bug.cgi?id=35717 , we have:

int test(int xc, int xm, int xy) {
  int xk;
  if (xc < xm)
    xk = xc < xy ? xc : xy;
  else
    xk = xm < xy ? xm : xy;
  return xk;
}

This patch solves that problem because we recognize more min/max patterns after rL321672

https://rise4fun.com/Alive/Qjne
https://rise4fun.com/Alive/3yg

Differential Revision: https://reviews.llvm.org/D41603

llvm-svn: 321998
2018-01-08 15:05:34 +00:00
Momchil Velikov ac7c5c1d92 [ARM] Fix PR35379 - incorrect unwind information when compiling with -Oz
The patch makes the unwind information not mention registers, which were pushed
solely for the purpose of saving stack adjustment instructions.

Differential revision: https://reviews.llvm.org/D41300
Fixes https://bugs.llvm.org/show_bug.cgi?id=35379

llvm-svn: 321996
2018-01-08 14:47:19 +00:00
Alexey Bataev 5b9a77d4ea [SLP] Fix PR35777: Incorrect handling of aggregate values.
Summary:
Fixes the bug with incorrect handling of InsertValue|InsertElement
instrucions in SLP vectorizer. Currently, we may use incorrect
ExtractElement instructions as the operands of the original
InsertValue|InsertElement instructions.

Reviewers: mkuper, hfinkel, RKSimon, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41767

llvm-svn: 321994
2018-01-08 14:43:06 +00:00
Alexey Bataev 118a0a2c38 [SLP] Fix PR35628: Count external uses on extra reduction arguments.
Summary:
If the vectorized value is marked as extra reduction argument, its users
are not considered as external users. Patch fixes this.

Reviewers: mkuper, hfinkel, RKSimon, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41786

llvm-svn: 321993
2018-01-08 14:33:11 +00:00
Sam Parker 3800f0f11d [DAGCombine] Fix for PR35761
I had falsely assumed that constant operands would be operand(1) of
the bin ops that may need their constant operand to be masked.

Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=35761

Differential Revision: https://reviews.llvm.org/D41667

llvm-svn: 321991
2018-01-08 13:21:24 +00:00
Momchil Velikov d17dabca31 [ARM] Fix PR35481
This patch allows `r7` to be used, regardless of its use as a frame pointer, as
a temporary register when popping `lr`, and also falls back to using a high
temporary register if, for some reason, we weren't able to find a suitable low
one.

Differential revision: https://reviews.llvm.org/D40961
Fixes https://bugs.llvm.org/show_bug.cgi?id=35481

llvm-svn: 321989
2018-01-08 11:32:37 +00:00
Sam Parker 51164c409d [X86] Renamed CodeGen test
llvm-svn: 321988
2018-01-08 10:56:44 +00:00
Craig Topper f090e8a89a [X86] Replace CVT2MASK ISD opcode with PCMPGTM compared to zero.
CVT2MASK is just checking the sign bit which can be represented with a comparison with zero.

llvm-svn: 321985
2018-01-08 06:53:54 +00:00
Craig Topper a2018e799a [X86] Add patterns to allow 512-bit BWI compare instructions to be used for 128/256-bit compares when VLX is not available.
llvm-svn: 321984
2018-01-08 06:53:52 +00:00
Petr Hosek 66aea6eb98 Don't try to run MCJIT/OrcJIT EH tests when C++ library is statically linked
These tests assumes availability of external symbols provided by the
C++ library, but those won't be available in case when the C++ library
is statically linked because lli itself doesn't need these.

This uses llvm-readobj -needed-libs to check if C++ library is linked as
shared library and exposes that information as a feature to lit.

Differential Revision: https://reviews.llvm.org/D41272

llvm-svn: 321981
2018-01-08 02:48:41 +00:00
Petr Hosek b3f802265e [llvm-readobj] Support -needed-libs option for Mach-O files
This implements the -needed-libs option in Mach-O dumper.

Differential Revision: https://reviews.llvm.org/D41527

llvm-svn: 321980
2018-01-08 02:23:10 +00:00
Craig Topper 03d8e516cf [X86] Add VSHUFF32X4 and similar instructions to load folding tables.
llvm-svn: 321978
2018-01-07 23:30:20 +00:00
Davide Italiano 4c39758a38 [SLPVectorizer] Reintroduce std::stable_sort(properlyDominates()).
The approach was never discussed, I wasn't able to reproduce this
non-determinism, and the original author went AWOL.
After a discussion on the ML, Philip suggested to revert this.

llvm-svn: 321974
2018-01-07 22:06:24 +00:00
Zvi Rackover 93b8bd4955 X86 Tests: Add Tests for PMADDWD selection. NFC.
Support for ISel to be added.

llvm-svn: 321970
2018-01-07 20:21:10 +00:00
Simon Pilgrim 998180dad3 [DAG] Fix for Bug PR34620 - Allow SimplifyDemandedBits to look through bitcasts
Allow SimplifyDemandedBits to use TargetLoweringOpt::computeKnownBits to look through bitcasts. This can help simplifying in some cases where bitcasts of constants generated during or after legalization can't be folded away, and thus didn't get picked up by SimplifyDemandedBits. This fixes PR34620, where a redundant pand created during legalization from lowering and lshr <16xi8> wasn't being simplified due to the presence of a bitcasted build_vector as an operand.

Committed on the behalf of @sameconrad (Sam Conrad)

Differential Revision: https://reviews.llvm.org/D41643

llvm-svn: 321969
2018-01-07 19:09:40 +00:00
Craig Topper d58c165545 [X86] Make v2i1 and v4i1 legal types without VLX
Summary:
There are few oddities that occur due to v1i1, v8i1, v16i1 being legal without v2i1 and v4i1 being legal when we don't have VLX. Particularly during legalization of v2i32/v4i32/v2i64/v4i64 masked gather/scatter/load/store. We end up promoting the mask argument to these during type legalization and then have to widen the promoted type to v8iX/v16iX and truncate it to get the element size back down to v8i1/v16i1 to use a 512-bit operation. Since need to fill the upper bits of the mask we have to fill with 0s at the promoted type.

It would be better if we could just have the v2i1/v4i1 types as legal so they don't undergo any promotion. Then we can just widen with 0s directly in a k register. There are no real v4i1/v2i1 instructions anyway. Everything is done on a larger register anyway.

This also fixes an issue that we couldn't implement a masked vextractf32x4 from zmm to xmm properly.

We now have to support widening more compares to 512-bit to get a mask result out so new tablegen patterns got added.

I had to hack the legalizer for widening the operand of a setcc a bit so it didn't try create a setcc returning v4i32, extract from it, then try to promote it using a sign extend to v2i1. Now we create the setcc with v4i1 if the original setcc's result type is v2i1. Then extract that and don't sign extend it at all.

There's definitely room for improvement with some follow up patches.

Reviewers: RKSimon, zvi, guyblank

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41560

llvm-svn: 321967
2018-01-07 18:20:37 +00:00
Florian Hahn 55be37e7d4 [CodeExtractor] Use subset of function attributes for extracted function.
In addition to target-dependent attributes, we can also preserve a
white-listed subset of target independent function attributes. The white-list
excludes problematic attributes, most prominently:

* attributes related to memory accesses, as alloca instructions
  could be moved in/out of the extracted block

* control-flow dependent attributes, like no_return or thunk, as the
  relerelevant instructions might or might not get extracted.

Thanks @efriedma and @aemerson for providing a set of attributes that cannot be
propagated.


Reviewers: efriedma, davidxl, davide, silvas

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D41334

llvm-svn: 321961
2018-01-07 11:22:25 +00:00
Craig Topper a21f551109 [X86] Add the 16 and 8-bit CRC32 instructions to the load folding tables.
llvm-svn: 321958
2018-01-07 06:48:20 +00:00
Craig Topper 89293a2a94 [X86] Add 128 and 256-bit VPOPCNTD/Q instructions to load folding tables.
llvm-svn: 321953
2018-01-07 06:24:27 +00:00
Craig Topper 11aede13db [X86] Add EVEX vcvtph2ps to the load folding tables.
llvm-svn: 321951
2018-01-07 06:24:24 +00:00
Craig Topper 40cc8338f7 [X86] Remove cvtps2ph xmm->xmm from store folding tables. Add the evex versions of cvtps2ph to the store folding tables.
The memory form of the xmm->xmm version only writes 64-bits. If we use it in the folding tables and its get used for a stack spill, only half the slot will be written. Then a reload may read all 128-bits which will pull in garbage. But without the spill the upper bits of the register would have been zero. By not folding we would preserve the zeros.

llvm-svn: 321950
2018-01-07 06:24:23 +00:00
Craig Topper 61d8a60e23 [X86] Remove memory forms of EVEX encoded vcvttss2si/vcvttsd2si from asm matcher table.
This is also needed to fix PR35837.

llvm-svn: 321946
2018-01-06 21:27:25 +00:00
Craig Topper 0f4ccb7806 [X86] Add load folding pattern to EVEX vcvttss2si/vcvtsd2si.
llvm-svn: 321945
2018-01-06 21:02:26 +00:00
Florian Hahn a82eef2363 [InlineFunction] Preserve calling convention when forwarding VarArgs.
Reviewers: efriedma, rnk, davide

Reviewed By: rnk, davide

Differential Revision: https://reviews.llvm.org/D41556

llvm-svn: 321943
2018-01-06 20:56:27 +00:00
Florian Hahn de10e6e064 [InlineFunction] Preserve attributes when forwarding VarArgs.
Reviewers: rnk, efriedma

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D41555

llvm-svn: 321942
2018-01-06 20:46:00 +00:00
Florian Hahn 80788d8088 [InlineFunction] Inline vararg functions that do not access varargs.
If the varargs are not accessed by a function, we can inline the
function.

Reviewers: dblaikie, chandlerc, davide, efriedma, rnk, hfinkel

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D41335

llvm-svn: 321940
2018-01-06 19:45:40 +00:00
Craig Topper a49c354a08 [X86] Remove memory forms of EVEX encoded vcvtsd2si/vcvtss2si from the assembler matcher table
We should always prefer the VEX encoded version of these instructions. There is no advantage to the EVEX version.

Fixes PR35837.

llvm-svn: 321939
2018-01-06 19:20:33 +00:00
Sanjay Patel 26a6fcde83 [InstCombine] relax use constraint for min/max (~a, ~b) --> ~min/max(a, b)
In the minimal case, this won't remove instructions, but it still improves
uses of existing values.

In the motivating example from PR35834, it does remove instructions, and
sets that case up to be optimized by something like D41603:
https://reviews.llvm.org/D41603

llvm-svn: 321936
2018-01-06 17:34:22 +00:00
Sanjay Patel f7e775291e [InstCombine] add more tests for max(~a, ~b) and PR35834; NFC
llvm-svn: 321935
2018-01-06 17:14:46 +00:00
Sanjay Patel 5a48aef3f0 [x86, MemCmpExpansion] allow 2 pairs of loads per block (PR33325)
This is the last step needed to fix PR33325:
https://bugs.llvm.org/show_bug.cgi?id=33325

We're trading branch and compares for loads and logic ops. 
This makes the code smaller and hopefully faster in most cases.

The 24-byte test shows an interesting construct: we load the trailing scalar 
elements into vector registers and generate the same pcmpeq+movmsk code that 
we expected for a pair of full vector elements (see the 32- and 64-byte tests).

Differential Revision: https://reviews.llvm.org/D41714

llvm-svn: 321934
2018-01-06 16:16:04 +00:00
Craig Topper 36d8da3358 [X86] When parsing rounding mode operands, provide a proper end location so we don't crash when trying to print an error message using it.
llvm-svn: 321930
2018-01-06 06:41:07 +00:00
Craig Topper 8c2ea74e74 [X86] Call lowerShuffleAsRepeatedMaskAndLanePermute from lowerV4I64VectorShuffle.
llvm-svn: 321929
2018-01-06 06:08:04 +00:00
Craig Topper af1d257571 [X86] Run dos2unix on a test file. NFC
llvm-svn: 321928
2018-01-06 06:08:02 +00:00
Vedant Kumar 1f6f5f1df9 [Debugify] Handled unsized types
llvm-svn: 321918
2018-01-06 00:37:01 +00:00
Craig Topper e2659d8383 [X86] Add vcvtsd2sil/vcvtsd2siq etc. InstAliases to the EVEX-encoded instructions.
This matches their VEX equivalents.

llvm-svn: 321912
2018-01-05 23:13:54 +00:00
Adrian McCarthy 74bfafa10e Re-land "Fix faulty assertion in debug info"
This had been reverted because the new test failed on non-X86 bots.  I moved
the new test to the appropriate subdirectory to correct this.

Differential Revision: https://reviews.llvm.org/D41264
Original submission:  r321122 (which was reverted by r321125)

This reverts commit 3c1639b5703c387a0d8cba2862803b4e68dff436.

llvm-svn: 321911
2018-01-05 23:01:04 +00:00
Krzysztof Parzyszek b0b52618c0 [Hexagon] Even simpler patterns for sign- and zero-extending HVX vectors
Recommit r321897 with updated testcases.

llvm-svn: 321908
2018-01-05 22:31:11 +00:00
Bjorn Pettersson 5ffb1c0ff0 [DebugInfo] Align comments in debug_loc section
Summary:
This commit updates the BufferByteStreamer, used by DebugLocStream
to buffer bytes/comments to put in the debug_loc section, to
make sure that the Buffer and Comments vectors are synced.
Previously, when an SLEB128 or ULEB128 was emitted together with
a comment, the vectors could be out-of-sync if the LEB encoding
added several entries to the Buffer vectors, while we only added
a single entry to the Comments vector.

The goal with this is to get the comments in the debug_loc
section in the .s file correctly aligned.

Example (using ARM as target):
Instead of

  .byte 144                     @ sub-register DW_OP_regx
  .byte 128                     @ 256
  .byte 2                       @ DW_OP_piece
  .byte 147                     @ 8
  .byte 8                       @ sub-register DW_OP_regx
  .byte 144                     @ 257
  .byte 129                     @ DW_OP_piece
  .byte 2                       @ 8
  .byte 147                     @
  .byte 8                       @

we now get

  .byte 144                     @ sub-register DW_OP_regx
  .byte 128                     @ 256
  .byte 2                       @
  .byte 147                     @ DW_OP_piece
  .byte 8                       @ 8
  .byte 144                     @ sub-register DW_OP_regx
  .byte 129                     @ 257
  .byte 2                       @
  .byte 147                     @ DW_OP_piece
  .byte 8                       @ 8

Reviewers: JDevlieghere, rnk, aprantl

Reviewed By: aprantl

Subscribers: davide, Ka-Ka, uabelho, aemerson, javed.absar, kristof.beyls, llvm-commits, JDevlieghere

Differential Revision: https://reviews.llvm.org/D41763

llvm-svn: 321907
2018-01-05 22:20:30 +00:00
Zachary Turner 7f5fb676c0 Fix some opt-viewer test issues and disable on Windows.
Differential Revision: https://reviews.llvm.org/D41784

llvm-svn: 321905
2018-01-05 22:05:13 +00:00
Krzysztof Parzyszek 4ed8ef6f8e Revert r321894: it requires a part of another commit that is not ready yet
Commit message:
[Hexagon] Add patterns for sext_inreg of HVX vector types

llvm-svn: 321904
2018-01-05 21:57:43 +00:00
Craig Topper 29476ab0bd [X86] Add InstAliases for 'vmovd' with GR64 registers to select EVEX encoded instructions as well.
Without this we allow "vmovd %rax, %xmm0", but not "vmovd %rax, %xmm16"

This exists due to continue a silly bug where really old versions of the GNU assembler required movd instead of movq on these instructions. This compatibility hack then crept forward to avx version too, but we didn't propagate it to avx512.

llvm-svn: 321903
2018-01-05 21:57:23 +00:00
Adrian Prantl 146ed408f4 dwarfdump: Match the --uuid output with that of Darwin dwarfdump.
This option is widely used by scripts and there is no reason to break them.

rdar://problem/36032398

llvm-svn: 321901
2018-01-05 21:44:17 +00:00
Craig Topper 004867312e [X86] Stop printing moves between VR64 and GR64 with 'movd' mnemonic. Use 'movq' instead.
This behavior existed to work with an old version of the gnu assembler on MacOS that only accepted this form. Newer versions of GNU assembler and the current LLVM derived version of the assembler on MacOS support movq as well.

llvm-svn: 321898
2018-01-05 20:55:12 +00:00
Krzysztof Parzyszek f9d01a12d1 [Hexagon] Add patterns for truncating HVX vector types
Only non-bool vectors.

llvm-svn: 321895
2018-01-05 20:48:03 +00:00
Krzysztof Parzyszek 9d0c6355a0 [Hexagon] Add patterns for sext_inreg of HVX vector types
llvm-svn: 321894
2018-01-05 20:46:41 +00:00
Krzysztof Parzyszek 0f5d976aa0 [Hexagon] Add a bitcast to required type in LowerHvxMul
llvm-svn: 321893
2018-01-05 20:45:34 +00:00
Krzysztof Parzyszek 66ee123d61 [Hexagon] Add pattern for vsplat to v8i8
llvm-svn: 321892
2018-01-05 20:43:56 +00:00
Douglas Yung 578ce90635 [llvm-cov] Change test to use FileCheck instead of grep.
Reviewed by Paul Robinson

llvm-svn: 321888
2018-01-05 20:00:18 +00:00
Serge Guelton 4c975578b4 Limit size of non-GlobalValue name
Otherwise, in some extreme test case, very long names are created and the
compiler consumes large amount of memory. Size limit is set to a relatively
high value not to disturb debugging.

Compiler flag -non-global-value-max-name-size=<value> can be used to customize
the size.

Differential Revision: https://reviews.llvm.org/D41296

llvm-svn: 321886
2018-01-05 19:41:19 +00:00
Jake Ehrlich 27a29b0290 [llvm-objcopy] Add --localize-hidden option
This change adds support in llvm-objcopy for GNU objcopy's --localize-hidden
option. This option changes every hidden or internal symbol into a local symbol.

llvm-svn: 321884
2018-01-05 19:19:09 +00:00
Sanjay Patel 5b6aacf2c1 [InstCombine] add folds for min(~a, b) --> ~max(a, b)
Besides the bug of omitting the inverse transform of max(~a, ~b) --> ~min(a, b),
the use checking and operand creation were off. We were potentially creating 
repeated identical instructions of existing values. This led to infinite
looping after I added the extra folds.

By using the simpler m_Not matcher and not creating new 'not' ops for a and b,
we avoid that problem. It's possible that not using IsFreeToInvert() here is
more limiting than the simpler matcher, but there are no tests for anything
more exotic. It's also possible that we should relax the use checking further
to handle a case like PR35834:
https://bugs.llvm.org/show_bug.cgi?id=35834
...but we can make that a follow-up if it is needed. 

llvm-svn: 321882
2018-01-05 19:01:17 +00:00
Brian Gesiak 0000060274 [llvm-mt] Remove platform-specific path in test
Summary:
Remove a platform-specific path separator added to the llvm-mt help text test
in https://reviews.llvm.org/D41732.

Test Plan: `check-llvm`

llvm-svn: 321881
2018-01-05 18:23:22 +00:00
Brian Gesiak 7b84de792b [Option] Add 'findNearest' method to catch typos
Summary:
Add a method `OptTable::findNearest`, which allows users of OptTable to
check user input for misspelled options. In addition, have llvm-mt
check for misspelled options. For example, if a user invokes
`llvm-mt /oyt:foo`, the error message will indicate that while an
option named `/oyt:` does not exist, `/out:` does.

The method ports the functionality of the `LookupNearestOption` method
from LLVM CommandLine to libLLVMOption. This allows tools like Clang
and Swift, which do not use CommandLine, to use this functionality to
suggest similarly spelled options.

As room for future improvement, the new method as-is cannot yet properly suggest
nearby "joined" options -- that is, for an option string "-FozBar", where
"-Foo" is the correct option name and "Bar" is the value being passed along
with the misspelled option, this method will calculate an edit distance of 4,
by deleting "Bar" and changing "z" to "o". It should instead calculate an edit
distance of just 1, by changing "z" to "o" and recognizing "Bar" as a
value. This commit includes a disabled test that expresses this limitation.

Test Plan: `check-llvm`

Reviewers: yamaguchi, v.g.vassilev, teemperor, ruiu, jroelofs

Reviewed By: jroelofs

Subscribers: jroelofs, llvm-commits

Differential Revision: https://reviews.llvm.org/D41732

llvm-svn: 321877
2018-01-05 17:10:39 +00:00
Max Moroz b845fe649f [llvm-cov] Temporarily disable multithreaded-report.test on Windows.
Summary: The test is failing because Windows do not support "diff -r".

Reviewers: Dor1s

Reviewed By: Dor1s

Differential Revision: https://reviews.llvm.org/D41768

llvm-svn: 321876
2018-01-05 16:43:24 +00:00
Adrian Prantl 405419fa37 add 'REQUIRES: object-emission' to test
llvm-svn: 321875
2018-01-05 16:31:22 +00:00
Adrian Prantl 33b3984d4f remove unnecessary target triple from generic test
llvm-svn: 321874
2018-01-05 16:29:24 +00:00
Davide Italiano 554f68be44 [BasicAA] Fix linearization of shifts beyond the bitwidth.
Thanks to Simon Pilgrim for the reduced testcase.
Fixes PR35821.

llvm-svn: 321873
2018-01-05 16:18:47 +00:00
Alexey Bataev fa13848da8 [SLP] Update more test checks, NFC.
llvm-svn: 321872
2018-01-05 16:15:17 +00:00
Max Moroz cc254ba4a7 [llvm-cov] Multi-threaded implementation of prepareFileReports method.
Summary:
Local testing has demonstrated a great speed improvement, compare the following:

1) Existing version:
```
$ time llvm-cov show -format=html -output-dir=report -instr-profile=... ...
The tool has been launched:                            00:00:00
Loading coverage data:                                 00:00:00
Get unique source files:                               00:00:33
Creating an index out of the source files:             00:00:34
Going into prepareFileReports:                         00:00:34
Going to emit summary information for each file:       00:28:55 <-- 28:21 min!
Going to emit links to files with no function:         00:28:55
Launching 32 threads for generating HTML files:        00:28:55

real  37m43.651s
user  112m5.540s
sys   7m39.872s
```

2) Multi-threaded version with 32 CPUs:
```
$ time llvm-cov show -format=html -output-dir=report -instr-profile=... ...
The tool has been launched:                            00:00:00
Loading coverage data:                                 00:00:00
Get unique source files:                               00:00:38
Creating an index out of the source files:             00:00:40
Going into prepareFileReports:                         00:00:40
Preparing file reports using 32 threads:               00:00:40
# Creating thread tasks for the following number of files: 16422
Going to emit summary information for each file:       00:01:57 <-- 1:17 min!
Going to emit links to files with no function:         00:01:58
Launching 32 threads for generating HTML files:        00:01:58

real  11m2.044s
user  134m48.124s
sys   7m53.388s
```

Reviewers: vsk, morehouse

Reviewed By: vsk

Subscribers: Dor1s, llvm-commits, kcc

Differential Revision: https://reviews.llvm.org/D41206

llvm-svn: 321871
2018-01-05 16:15:07 +00:00
Alexey Bataev e565ebcdad [SLP] Update test checks, NFC.
llvm-svn: 321870
2018-01-05 15:20:40 +00:00
Alexey Bataev 988db0bd50 [SLP] Update tests checks, NFC.
llvm-svn: 321869
2018-01-05 14:40:04 +00:00
Simon Pilgrim 15fcbe2d4a [X86] Regenerate illegal move test
Recommitting after fixing case-sensitive issue in the RUN command

llvm-svn: 321868
2018-01-05 14:24:03 +00:00
Momchil Velikov 7efdd090e2 [ARM] Issue an erorr when non-general-purpose registers are used in address operands
Currently the assembler would accept, e.g. `ldr r0, [s0, #12]` and similar.
This patch add checks that only general-purpose registers are used in address
operands, shifted registers, and shift amounts.

Differential revision: https://reviews.llvm.org/D39910

llvm-svn: 321866
2018-01-05 13:28:10 +00:00
Florian Hahn e970d64ec5 [AArch64] Fix -mcpu option in aarch64-combine-fmul-fsub.mir (NFC)
llvm-svn: 321865
2018-01-05 11:17:48 +00:00
Jonas Devlieghere cbf651f739 [DebugInfo] Don't crash when given invalid DWARFv5 line table prologue.
This patch replaces an assertion with an explicit check for the validity
of the FORM parameters. The assertion was triggered when the DWARFv5
line table contained a zero address size.

This fixes OSS-Fuzz Issue 4644
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=4644

Differential revision: https://reviews.llvm.org/D41615

llvm-svn: 321863
2018-01-05 10:03:02 +00:00
Sam Parker 1ad085b808 [DAGCombine] Fix for PR37563
While searching for loads to be narrowed, equal sized loads were not
added to the list, resulting in anyext loads not being converted to
zext loads.

https://bugs.llvm.org/show_bug.cgi?id=35763

Differential Revision: https://reviews.llvm.org/D41628

llvm-svn: 321862
2018-01-05 08:47:23 +00:00
Aditya Nandakumar 5710c44eee [GISel]: Don't create G_MUL with 1 during translation of GEP
When element size is 1, it's just wasteful to create MUL with 1.
https://reviews.llvm.org/D41738

llvm-svn: 321857
2018-01-05 02:56:28 +00:00
Adrian Prantl a29aac7b77 Debug Info: Support DW_AT_calling_convention on composite types.
This implements the DWARF 5 feature described at
http://www.dwarfstd.org/ShowIssue.php?issue=141215.1

This allows a consumer to understand whether a composite data type is
trivially copyable and thus should be passed by value instead of by
reference. The canonical example is being able to distinguish the
following two types:

  // S is not trivially copyable because of the explicit destructor.
  struct S {
     ~S() {}
  };

  // T is a POD type.
  struct T {
     ~T() = default;
  };

This patch adds two new (DI)flags to LLVM metadata: TypePassByValue
and TypePassByReference.

<rdar://problem/36034922>
Differential Revision: https://reviews.llvm.org/D41743

llvm-svn: 321844
2018-01-05 01:13:37 +00:00
Alexey Bataev 8040e5047b [DEBUG] Fix debug info test for NVPTX, NFC.
llvm-svn: 321835
2018-01-04 23:50:24 +00:00
Reid Kleckner cd78ddc119 Revert "[JumpThreading] Preservation of DT and LVI across the pass"
This reverts r321825, it causes crashes in Chromium. Reproducer
forthcoming.

llvm-svn: 321832
2018-01-04 23:23:46 +00:00
Alexey Bataev df73983f3e [DEBUG] Fix the test for NVPTX, NFC.
llvm-svn: 321829
2018-01-04 23:01:42 +00:00
Simon Pilgrim ab84a9a65d [X86] Add srem/udiv/urem by one combine tests
llvm-svn: 321826
2018-01-04 22:08:36 +00:00
Brian M. Rzycki cdad6c0b60 [JumpThreading] Preservation of DT and LVI across the pass
Summary:
See D37528 for a previous (non-deferred) version of this
patch and its description.

Preserves dominance in a deferred manner using a new class
DeferredDominance. This reduces the performance impact of
updating the DominatorTree at every edge insertion and
deletion. A user may call DDT->flush() within JumpThreading
for an up-to-date DT. This patch currently has one flush()
at the end of runImpl() to ensure DT is preserved across
the pass.

LVI is also preserved to help subsequent passes such as
CorrelatedValuePropagation. LVI is simpler to maintain and
is done immediately (not deferred). The code to perfom the
preversation was minimally altered and was simply marked
as preserved for the PassManager to be informed.

This extends the analysis available to JumpThreading for
future enhancements. One example is loop boundary threading.

Reviewers: dberlin, kuhar, sebpop

Reviewed By: kuhar, sebpop

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40146

llvm-svn: 321825
2018-01-04 21:57:32 +00:00
Evandro Menezes 6161a0b3b0 [AArch64] Improve code generation of vector build
Instead of using, for example, `dup v0.4s, wzr`, which transfers between
register files, use the more efficient `movi v0.4s, #0` instead.

Differential revision: https://reviews.llvm.org/D41515

llvm-svn: 321824
2018-01-04 21:43:12 +00:00
Simon Pilgrim e7c06423c1 [X86] Add scalar undef sdiv/srem/udiv/urem combine tests
llvm-svn: 321823
2018-01-04 21:33:19 +00:00
Alexey Bataev 7212b6e4b9 [DEBUG] Add initial tests for debug info for NVPTX target, NFC.
llvm-svn: 321822
2018-01-04 21:07:07 +00:00
Craig Topper dffb98e03d [X86] Correct the execution domain for AVX1 VBROADCASTF128 to be FP instead of integer.
llvm-svn: 321821
2018-01-04 20:56:21 +00:00
Amara Emerson 83e852f30d Revert "[X86] Regenerate test"
This reverts r321814 as it was failing make check.

llvm-svn: 321819
2018-01-04 20:20:44 +00:00
Max Moroz 1ef3a778ac [llvm-cov] Refactor "export" command implementation and add support for SOURCES.
Summary: Define an interface for Exporter + split JSON exporter into .h and .cpp.

Reviewers: vsk, morehouse

Reviewed By: vsk

Subscribers: llvm-commits, Dor1s, kcc

Differential Revision: https://reviews.llvm.org/D41600

llvm-svn: 321815
2018-01-04 19:33:29 +00:00
Simon Pilgrim a47289a2ee [X86] Regenerate test
llvm-svn: 321814
2018-01-04 18:48:42 +00:00
Simon Pilgrim 208fab193b [X86] Add common CHECK prefix for tests without SSE/AVX codegen
llvm-svn: 321810
2018-01-04 18:23:46 +00:00
Bjorn Pettersson 77f3299415 Teach InlineCost about address spaces
Summary:
I basically copied this patch from here:
  https://reviews.llvm.org/D1251
But I skipped some of the refactoring to make the patch more clean.

The new outer3/inner3 test case in ptr-diff.ll triggers the
following assert without this patch:
lib/IR/Constants.cpp:1834: static llvm::Constant *llvm::ConstantExpr::getCompare(unsigned short, llvm::Constant *, llvm::Constant *, bool): Assertion `C1->getType() == C2->getType() && "Op types should be identical!"' failed.

The other new test cases makes sure that there is code coverage
for all modifications in InlineCost.cpp (getting different values
due to not fetching sizes for address space zero). I only guarantee
code coverage for those tests. The tests are not written in a way
that they would break if not having the corrections in
InlineCost.cpp. I found it quite hard to fine tune the tests into
getting different results based on the pointer sizes (except for
the test case where we hit an assert if not teaching InlineCost
about address spaces).

Reviewers: chandlerc, arsenm, haicheng

Reviewed By: arsenm

Subscribers: wdng, eraman, llvm-commits, haicheng

Differential Revision: https://reviews.llvm.org/D40455

llvm-svn: 321809
2018-01-04 18:23:40 +00:00
Simon Pilgrim d6923083ba Regenerate broadcast constant comment
llvm-svn: 321808
2018-01-04 18:21:33 +00:00
Simon Pilgrim 5643b40d91 [X86] Show missed combine for X/X for SDIV/UDIV and X%X for SREM/UREM
llvm-svn: 321807
2018-01-04 18:20:46 +00:00
Matt Arsenault 35c4244aea StructurizeCFG: xfail one of the testcases from r321751
It fails with -verify-region-info. This seems to be a issue
with RegionInfo itself which existed before.

llvm-svn: 321806
2018-01-04 17:23:24 +00:00
Sanjay Patel c63f9014d6 [InstCombine] safely create a constant of the right type (PR35794)
llvm-svn: 321801
2018-01-04 14:31:56 +00:00
Oliver Stannard 7d9198b296 [ARM] Fix endianness of Thumb .inst.w directive
Wide Thumb2 instructions should be emitted into the object file as pairs of
16-bit words of the appropriate endianness, not one 32-bit word.

Differential revision: https://reviews.llvm.org/D41185

llvm-svn: 321799
2018-01-04 13:56:40 +00:00
Diana Picus 865f7fecb2 [ARM GlobalISel] Select G_PHI
Select G_PHI to PHI and manually constrain the result register. This is
very similar to how COPY is handled, so extract and reuse some of that
code.

llvm-svn: 321797
2018-01-04 13:09:25 +00:00
Diana Picus bcabda43e4 [ARM GlobalISel] Add RegBankSelect tests for G_PHI
RegBankSelect already handles G_PHI with some generic code. Add a couple
of tests for it.

llvm-svn: 321796
2018-01-04 13:09:20 +00:00
Diana Picus c768bbe2e7 [ARM GlobalISel] Legalize scalar G_PHI
Mark G_PHI as Legal for s32 and p0, and also for s64 if we have hard
float. Widen any smaller types.

llvm-svn: 321795
2018-01-04 13:09:14 +00:00
Diana Picus 37ae9f68a4 [ARM GlobalISel] Fix selection of pointer constants
We used to handle G_CONSTANT with pointer type by forcing the type of
the result register to s32 and then letting TableGen handle it.
Unfortunately, setting the type only works for generic virtual
registers, that haven't yet been constrained to a register class (e.g.
those used only by a COPY later on). If the result register has already
been constrained as a use of a previously selected instruction, then
setting the type will assert.

It would be nice to be able to teach TableGen to select pointer
constants the same as integer constants, but since it's such an edge
case (at the moment the only pointer constant that we're generally
interested in is 0, and that is mostly used for comparisons and selects,
which are also not supported by TableGen) it's probably not worth the
effort right now. Instead, handle pointer constants with some trivial
handwritten code.

llvm-svn: 321793
2018-01-04 10:54:57 +00:00
Nikolai Bozhenov 4d5e34b221 [ValueTracking] Adding missed lit-test for commit r316208
Reviewers: reames, hfinkel

Differential Revision: https://reviews.llvm.org/D34101

Patch by: Olga Chupina <olga.chupina@intel.com>

llvm-svn: 321792
2018-01-04 10:02:50 +00:00
Sam Parker 4e70c2fac8 [X86] Codegen test for PR37563
Adding test to ease review of D41628.

llvm-svn: 321791
2018-01-04 09:42:27 +00:00
Aditya Kumar 1f90cae80f [GVNHoist] Fix: PR35222 gvn-hoist incorrectly erases load in case of a loop
Reviewers:
    dberlin
    sebpop
    eli.friedman

Differential Revision: https://reviews.llvm.org/D41453

llvm-svn: 321789
2018-01-04 07:47:24 +00:00
Michael Trent ca30902ff8 Do not look up symbol names when n_strx == 0
Summary:
Historical tools for working with mach-o binaries verify the nlist field
n_strx has a non-zero value before using that value to retrieve symbol names.
Under some cirumstances, llvm-nm will attempt to display the symbol name at 
position 0, even though symbol names at that position are not well defined. 
This change addresses this problem by returning an empty string when n_strx
is zero.

rdar://problem/35750548

Reviewers: enderby, davide

Reviewed By: enderby, davide

Subscribers: davide, llvm-commits, JDevlieghere

Differential Revision: https://reviews.llvm.org/D41657

llvm-svn: 321773
2018-01-03 23:28:32 +00:00
Simon Pilgrim ec0a2fb703 [DAGCombine] Handle out of range EXTRACT_VECTOR_ELT indices
Handle this in DAGCombiner::visitEXTRACT_VECTOR_ELT the same as we already do in SelectionDAG::getNode and use APInt instead of getZExtValue.

This should also fix oss-fuzz #4910

llvm-svn: 321767
2018-01-03 22:42:33 +00:00
Philip Reames cf524a408a [PRE] Add a bunch of test cases for LICM-like PRE patterns
These were inspired by a very old review I'm about to abandon (https://reviews.llvm.org/D7061).  Several of the test cases from that worked without modification and expanding test coverage of such cases is always worthwhile.

llvm-svn: 321764
2018-01-03 22:28:26 +00:00
Matt Arsenault 8070882b4e StructurizeCFG: Fix broken backedge detection
The work order was changed in r228186 from SCC order
to RPO with an arbitrary sorting function. The sorting
function attempted to move inner loop nodes earlier. This
was was apparently relying on an assumption that every block
in a given loop / the same loop depth would be seen before
visiting another loop. In the broken testcase, a block
outside of the loop was encountered before moving onto
another block in the same loop. The testcase would then
structurize such that one blocks unconditional successor
could never be reached.

Revert to plain RPO for the analysis phase. This fixes
detecting edges as backedges that aren't really.

The processing phase does use another visited set, and
I'm unclear on whether the order there is as important.
An arbitrary order doesn't work, and triggers some infinite
loops. The reversed RPO list seems to work and is closer
to the order that was used before, minus the arbitary
custom sorting.

A few of the changed tests now produce smaller code,
and a few are slightly worse looking.

llvm-svn: 321751
2018-01-03 18:45:37 +00:00
Simon Pilgrim 3bf2d64589 [InstCombine] Check for out of range shift values using APInt before calling getZExtValue
Reduced from oss-fuzz #4871 test case

llvm-svn: 321748
2018-01-03 18:28:20 +00:00
Craig Topper cc6637b707 [X86] Use ANY_EXTEND instead of SIGN_EXTEND in lowerMasksToReg
Currently we use SIGN_EXTEND in lowerMasksToReg as part of calling convention setup, but we don't require a specific value for the upper bits.

This patch changes it to ANY_EXTEND which will be lowered as SIGN_EXTEND if it ends up sticking around.

llvm-svn: 321746
2018-01-03 18:11:01 +00:00
Dmitry Venikov 3d8cd34a5d [InstSimplify] Missed optimization in math expression: squashing exp(log), log(exp)
Summary: This patch enables folding following expressions under -ffast-math flag: exp(log(x)) -> x, exp2(log2(x)) -> x, log(exp(x)) -> x, log2(exp2(x)) -> x

Reviewers: spatel, hfinkel, davide

Reviewed By: spatel, hfinkel, davide

Subscribers: scanon, llvm-commits

Differential Revision: https://reviews.llvm.org/D41381

llvm-svn: 321710
2018-01-03 14:37:42 +00:00
Florian Hahn dcc0ba9bbb [InstCombine] Add test to remove VarArg casts (NFC)
llvm-svn: 321706
2018-01-03 13:35:43 +00:00
Amara Emerson 9de62130fd [GlobalISel][Legalizer] Fix legalization of llvm.smul.with.overflow
Previously the code for handling G_SMULO didn't properly check for the signed
multiply overflow, instead treating it the same as the unsigned G_UMULO.

Fixes PR35800.

llvm-svn: 321690
2018-01-03 04:56:56 +00:00
Jake Ehrlich 30d927a128 [llvm-objcopy] Add support for visibility
I have no clue how this was missed when symbol table support was added. This
change ensures that the visibility of symbols is preserved by default.

llvm-svn: 321681
2018-01-02 23:01:24 +00:00
Andrew Kaylor e12e08c680 Handle the case of live 16-bit subregisters in X86FixupBWInsts
Differential Revision: https://reviews.llvm.org/D40524

Change-Id: Ie3a405b28503ceae999f5f3ba07a68fa733a2400
llvm-svn: 321674
2018-01-02 21:04:38 +00:00
Sanjay Patel 24e6a8bde0 [AArch64] fix typos in comments; NFC
llvm-svn: 321673
2018-01-02 21:04:08 +00:00
Sanjay Patel 7811430588 [ValueTracking] recognize min/max of min/max patterns
This is part of solving PR35717:
https://bugs.llvm.org/show_bug.cgi?id=35717

The larger IR optimization is proposed in D41603, but we can show 
the improvement in ValueTracking using codegen tests because 
SelectionDAG creates min/max nodes based on ValueTracking. 

Any target with min/max ops should show wins here. I chose AArch64
vector ops because they're clean and uniform.

Some Alive proofs for the tests (can't put more than 2 tests in 1 
page currently because the web app says it's too long):
https://rise4fun.com/Alive/WRN
https://rise4fun.com/Alive/iPm
https://rise4fun.com/Alive/HmY
https://rise4fun.com/Alive/CNm
https://rise4fun.com/Alive/LYf

llvm-svn: 321672
2018-01-02 20:56:45 +00:00
Sanjay Patel 35a6ee86af [AArch64] add tests for min/max of min/max (PR35717); NFC
llvm-svn: 321668
2018-01-02 20:16:45 +00:00
Amara Emerson 913918cbef [AArch64][GlobalISel] Fix assert fail with unknown intrinsic.
A call may have an intrinsic name but not have a valid intrinsic ID,
for example with llvm.invariant.group.barrier. If so, treat it as a
normal call like FastISel does.

llvm-svn: 321662
2018-01-02 18:56:39 +00:00
Sanjay Patel 9a80871ffe [x86] allow pairs of PCMPEQ for vector-sized integer equality comparisons (PR33325)
This is an extension of D31156 with the goal that we'll allow memcmp() == 0 expansion 
for x86 to use 2 pairs of loads per block.

The memcmp expansion pass (formerly part of CGP) will generate this kind of pattern 
with oversized integer compares, so we want to transform these into x86-specific vector
nodes before legalization splits things into scalar chunks.

See PR33325 for more details:
https://bugs.llvm.org/show_bug.cgi?id=33325

Differential Revision: https://reviews.llvm.org/D41618

llvm-svn: 321656
2018-01-02 16:38:29 +00:00
Amara Emerson 854d10d10b [AArch64][GlobalISel] Enable GlobalISel at -O0 by default
Tests updated to explicitly use fast-isel at -O0 instead of implicitly.

This change also allows an explicit -fast-isel option to override an
implicitly enabled global-isel. Otherwise -fast-isel would have no effect at -O0.

Differential Revision: https://reviews.llvm.org/D41362

llvm-svn: 321655
2018-01-02 16:30:47 +00:00
Anna Thomas bdb9430917 [BasicBlockUtils] Check for unreachable preds before updating LI in UpdateAnalysisInformation
Summary:
We are incorrectly updating the LI when loop-simplify generates
dedicated exit blocks for a loop. The issue is that there's an implicit
assumption that the Preds passed into UpdateAnalysisInformation are
reachable. However, this is not true and breaks LI by incorrectly
updating the header of a loop.

One such case is when we generate dedicated exits when the exit block is
a landing pad (through SplitLandingPadPredecessors). There maybe other
cases as well, since we do not guarantee that Preds passed in are
reachable basic blocks.

The added test case shows how loop-simplify breaks LI for the outer loop (and DT in turn)
after we try to generate the LoopSimplifyForm.

Reviewers: davide, chandlerc, sanjoy

Reviewed By: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41519

llvm-svn: 321653
2018-01-02 16:25:50 +00:00
Krzysztof Parzyszek cfe4a3616f [Hexagon] Fix generation of vector sign extensions
llvm-svn: 321650
2018-01-02 15:28:49 +00:00
Daniel Jasper cc4903e2ba Revert r321089: "[DAG] Elide overlapping store" (and subsequent fix in r321204)
Our internal testing has revealed has discovered bugs in PPC builds.
I have forward reproduction instructions to the original author (Nirav).

llvm-svn: 321649
2018-01-02 14:38:52 +00:00
Sam Parker 3570c554b5 [DAGCombine] Fix for PR35765
Remove the acceptance of ANY_EXTEND nodes while trying to move and
nodes back to loads.

Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=35765

Differential Revision: https://reviews.llvm.org/D41625

llvm-svn: 321641
2018-01-02 10:19:01 +00:00
Sam Parker 2dea5e0f3c [X86] Codegen test for pr35765
Committing reproducer test for pr35765, fix to follow.

llvm-svn: 321640
2018-01-02 10:14:00 +00:00
Craig Topper e3b6bd337a [SelectionDAG] Teach WidenVecOp_Convert to widen the operation if a widened result type would still be legal.
llvm-svn: 321638
2018-01-02 07:30:53 +00:00
Dmitry Venikov a58d8deb3a [InstCombine] Missed optimization in math expression: squashing sqrt functions
Summary: This patch enables folding under -ffast-math flag sqrt(a) * sqrt(b) -> sqrt(a*b)

Reviewers: hfinkel, spatel, davide

Reviewed By: spatel, davide

Subscribers: davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D41322

llvm-svn: 321637
2018-01-02 05:58:11 +00:00
Simon Pilgrim 6720726d27 [ValueTracking] Don't assume shift values are in range
Reduced (as best I could...) from oss-fuzz #4857 test case

llvm-svn: 321634
2018-01-01 22:44:59 +00:00
Simon Pilgrim af35f5ec1d [InstCombine] Regenerate udiv tests.
llvm-svn: 321633
2018-01-01 22:27:49 +00:00
Craig Topper c8898b3640 [X86] Promote vXi1 fp_to_uint/fp_to_sint to vXi32 to avoid scalarization.
llvm-svn: 321632
2018-01-01 21:12:18 +00:00
Craig Topper bb8b79b0a0 [X86] Add test cases for vXi1 fptosi/fptoui.
Currently we do a lot of scalarization in these test cases.

llvm-svn: 321631
2018-01-01 21:12:10 +00:00
Sanjay Patel 18962dabb7 [x86] add runs for more vector variants; NFC
Preliminary step to see what the effects of D41618 look like.

llvm-svn: 321624
2018-01-01 16:36:47 +00:00
Simon Pilgrim e337268df7 [X86][SSE] Add test case from PR32160
llvm-svn: 321620
2018-01-01 13:04:04 +00:00
Uriel Korach c06596ced4 [X86] Regenerate test checks in sse-intrinsics-x86-upgrade with update-llc
Removing outdated checks.
NFC

llvm-svn: 321619
2018-01-01 09:00:13 +00:00
Uriel Korach e87d240699 [X86] Regenerate test checks in sse2-intrinsics-x86-upgrade with update-llc
Removing outdated checks.
NFC

llvm-svn: 321618
2018-01-01 08:47:50 +00:00
Craig Topper 0d35edda90 [X86] In LowerTruncateVecI1, don't add SHL if the input is known to be all sign bits.
If the input is all sign bits then the LSB through MSB are all the same so we don't need to be move the LSB to the MSB.

llvm-svn: 321617
2018-01-01 04:52:58 +00:00
Craig Topper fc3ce4993c [X86] Add patterns for using zmm registers for v8i32/v8f32 vselect with the false input being zero.
We can use zmm move with zero masking for this. We already had patterns for using a masked move, but we didn't check for the zero masking case separately.

llvm-svn: 321612
2018-01-01 01:11:29 +00:00
Craig Topper f78b75fb59 [X86] Use CONCAT_VECTORS instead of INSERT_SUBVECTOR for padding v4i1/v2i1 vector to v8i1 pre-legalize.
The CONCAT_VECTORS will be lowered to INSERT_SUBVECTOR later. In the modified cases this seems to be enough to trick a later DAG combine into running in a different order than allows the ANDs to be removed.

I'll admit this is a bit of a hack that happens to work, but using CONCAT_VECTORS is more consistent with other legalization code anyway.

llvm-svn: 321611
2017-12-31 19:17:52 +00:00
Simon Pilgrim b000675374 [X86][AVX2] Combine extract(broadcast(scalar_value)) --> scalar_value
As it has a scalar source we don't treat it as a target shuffle so needs special handling.

llvm-svn: 321610
2017-12-31 18:59:30 +00:00
Simon Pilgrim e940b86c5f [X86][AVX] Add test case from PR33740
llvm-svn: 321608
2017-12-31 17:16:48 +00:00
Simon Pilgrim f205ec716b [X86][SSE] Don't vectorize splat buildvector of binops (PR30780)
Don't combine buildvector(binop(),binop(),binop(),binop()) -> binop(buildvector(), buildvector()) if its a splat - keep the binop scalar and just splat the result to avoid large vector constants.

llvm-svn: 321607
2017-12-31 17:07:47 +00:00
Davide Italiano 9f074fe915 [SimplifyCFG] Stop hoisting musttail calls incorrectly.
PR35774.

llvm-svn: 321603
2017-12-31 16:47:16 +00:00
Craig Topper f0f6eefb49 [X86] Add a DAG combine to widen (i4 (bitcast (v4i1))) before type legalization sees the i4 and changes to load/store.
Same for v2i1 and i2.

llvm-svn: 321602
2017-12-31 09:50:38 +00:00
Craig Topper 7f39623533 [X86] Add a DAG combine to fix (v4i1 (bitcast (i4))) before type legalization sees the i4 and changes to load/store.
Same for i2 and v2i1.

llvm-svn: 321601
2017-12-31 08:25:50 +00:00
George Rimar 7672eb84af [MC] - Stop ignoring invalid meta data symbols.
Previously llvm-mc would silently accept code from testcase,
that contains invalid metadata symbol in section declaration.

Patch fixes the issue.

Differential revision: https://reviews.llvm.org/D41641

llvm-svn: 321599
2017-12-31 07:41:02 +00:00
Craig Topper 876ec0b558 [X86] Prevent combining (v8i1 (bitconvert (i8 load)))->(v8i1 load) if we don't have DQI.
We end up using an i8 load via an isel pattern from v8i1 anyway. This just makes it more explicit. This seems to improve codgen in some cases and I'd like to kill off some of the load patterns.

llvm-svn: 321598
2017-12-31 07:38:41 +00:00
Craig Topper a362dee774 [X86] Remove AND32ri8 from pattern for v1i1 load.
I don't think anything would actually expect the other bits to be zero.

llvm-svn: 321596
2017-12-31 07:38:33 +00:00
Craig Topper 7ba1b76854 [X86] Fix a crash when returning a <1 x i1> value>
llvm-svn: 321595
2017-12-31 07:38:30 +00:00
Philip Reames 232951dfb2 2nd attempt at "fixing" amdgpu tests after r321575​
The test needs to be changed; it was exercising UB and that likely wasn't the intent of the test author.  I simply removed the checks because I have absolutely no idea what this test was trying to accomplish.  With multiple check patterns, no explanation, and no familiarity on my part with the ISA a true fix is going to have to come from someone familiar with the target.  

llvm-svn: 321591
2017-12-31 03:34:36 +00:00
Philip Reames 3580c90458 Test fix after r321575
The test in question was checking for a particular intepretation of undefined behavior.  Relax the test to check that we simply don't crash.

Sorry for the breakage, I don't generally build AMDGPU locally and just saw the failure this morning.  

llvm-svn: 321589
2017-12-30 18:42:37 +00:00
Simon Pilgrim 06f6d262f9 [X86][SSE] Add PR30780 test cases
Broadcast of sign/zero extended scalars resulting in unnecessary vector constants

llvm-svn: 321584
2017-12-30 11:51:45 +00:00
Simon Pilgrim fa0f793c8d [X86][SSE] Add test for (v2f32 uitofp(build_vector(i32, i32))) (PR35732)
To compare against (v2f32 build_vector(f32 uitofp(i32), f32 uitofp(i32)))

llvm-svn: 321583
2017-12-30 11:20:56 +00:00
Hiroshi Inoue ca3cdd7f27 [PowerPC] fix a bug in TCO eligibility check
If the callee and caller use different calling convensions, we cannot apply TCO if the callee requires arguments on stack; e.g. C calling convention and Fast CC use the same registers for parameter passing, but the stack offset is not necessarily same.

This patch also recommit r319218 "[PowerPC] Allow tail calls of fastcc functions from C CallingConv functions." by @sfertile since the problem reported in r320106 should be fixed.

Differential Revision: https://reviews.llvm.org/D40893

llvm-svn: 321579
2017-12-30 08:09:04 +00:00
Craig Topper c5fd31a802 [X86] Custom legalize vXi1 extract_subvector with KSHIFTR.
This allows us to remove some isel patterns.

This is mostly NFC, but we now use KSHIFTB instead of KSHIFTW with DQI.

llvm-svn: 321576
2017-12-30 06:45:43 +00:00
Philip Reames e499bc3042 [instsimplify] consistently handle undef and out of bound indices for insertelement and extractelement
In one case, we were handling out of bounds, but not undef indices.  In the other, we were handling undef (with the comment making the analogy to out of bounds), but not out of bounds.  Be consistent and treat both undef and constant out of bounds indices as producing undefined results.

As a side effect, this also protects instcombine from having to handle large constant indices as we always simplify first.

llvm-svn: 321575
2017-12-30 05:54:22 +00:00
Philip Reames 8e1abe4a7d Add another test case for r321489
Went to reduce another fuzzer failure to find it's already been fixed, but the test case is slightly different so it's worth adding anyways.

Reduced from oss-fuzz #4768 test case

llvm-svn: 321573
2017-12-30 04:10:48 +00:00
Philip Reames 3e9c671923 Move tests associated with transforms moved in r321467
llvm-svn: 321572
2017-12-30 03:13:00 +00:00
Simon Atanasyan 970f686faa [mips] Replace assert by an error message
Initially, if the `c` constraint applied to the wrong data type that
causes LLVM to assert. This commit replaces the assert by an error
message.

llvm-svn: 321565
2017-12-29 19:18:24 +00:00
Matt Arsenault d94b63d765 AMDGPU: Remove mayLoad/hasSideEffects from MIMG stores
Atomics still have hasSideEffects set on them because
of the mess that is the memory properties.

llvm-svn: 321556
2017-12-29 17:18:18 +00:00