Commit Graph

96759 Commits

Author SHA1 Message Date
Sanjoy Das e30a281449 [SCEVExpander] Don't hoist divisions
Fixes PR30942.

llvm-svn: 286437
2016-11-10 07:56:09 +00:00
Craig Topper bd298c37d1 [AVX-512] Allow legacy cvtpd2dq intrinsics to select EVEX encoded instruction when available.
llvm-svn: 286435
2016-11-10 07:47:17 +00:00
Craig Topper e0845d8e8c [AVX-512][X86] Convert avx_cvtt_ps2dq_256 and sse2_cvttps2dq intrinsics to ISD::FP_TO_SINT in the intrinsics table and delete patterns. While nearby also move CVTDQ2PS patterns into their instructions.
This allows these intrinsics to also use EVEX instructons.

llvm-svn: 286434
2016-11-10 07:24:52 +00:00
Craig Topper f37b9b9b5f [X86] Convert int_x86_avx_cvtt_pd2dq_256 to fp_to_sint using the intrinsics table. Removes extra patterns and allows legacy intrinsic to select EVEX encoded instructions when available.
llvm-svn: 286433
2016-11-10 06:45:39 +00:00
Craig Topper 2afed2c790 [X86] Move some custom patterns into the currently empty pattern of their corresponding instructions. NFC
llvm-svn: 286432
2016-11-10 06:45:37 +00:00
Craig Topper 1d2e74f030 [X86] Remove some patterns still referencing int_x86_sse2_cvttpd2dq that should have been removed in r286344. NFC
llvm-svn: 286431
2016-11-10 06:45:34 +00:00
Sanjoy Das 0ae390abce [SCEV] Eta reduce some lambdas; NFC
llvm-svn: 286429
2016-11-10 06:33:54 +00:00
Sanjay Patel 4e1b5a53c7 [InstCombine] avoid infinite loop from shuffle-extract-insert sequence (PR30923)
Removing the limitation in visitInsertElementInst() causes several regressions
because we're not prepared to fold sequences of shuffles or inserts and extracts
separated by shuffles. Fixing that appears to be a difficult mission because we
are purposely trying to avoid creating shuffles with arbitrary shuffle masks
because some targets may choke on those.

https://llvm.org/bugs/show_bug.cgi?id=30923

llvm-svn: 286423
2016-11-10 00:15:14 +00:00
Peter Collingbourne 32ab3a817d Re-apply r286384, "X86: Introduce the "relocImm" ComplexPattern, which represents a relocatable immediate.", with a fix for 32-bit x86.
Teach X86InstrInfo::analyzeCompare() not to crash on CMP and SUB instructions
that take a global address operand.

llvm-svn: 286420
2016-11-09 23:53:43 +00:00
Dehao Chen 38a666d6e5 Add isHotBB helper function to ProfileSummaryInfo
Summary: This will unify all BB hotness checks.

Reviewers: eraman, davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26353

llvm-svn: 286415
2016-11-09 23:36:02 +00:00
Eli Friedman ddbf83ea14 Preserve assumption cache in loop-rotate.
No testcase included because I can't figure out how to reduce it.
(It's easy to write a testcase where rotation clones an assume,
but that doesn't actually seem to trigger the crash in opt on
its own; maybe an issue with the laziness?)

Differential Revision: https://reviews.llvm.org/D26434

llvm-svn: 286410
2016-11-09 23:05:01 +00:00
Tim Northover 09dd2496b7 GlobalISel: fix typo. NFC
llvm-svn: 286408
2016-11-09 22:40:02 +00:00
Tim Northover a9105be437 GlobalISel: translate invoke and landingpad instructions
Pretty bare-bones support for exception handling (no weird MSVC stuff, no SjLj
etc), but it should get things going.

llvm-svn: 286407
2016-11-09 22:39:54 +00:00
Mehdi Amini 67d1a41226 Make BitcodeReader::parseIdentificationBlock() robust to EOF
This method is particular: it iterates at the top-level and does
not have an enclosing block.

llvm-svn: 286394
2016-11-09 21:26:49 +00:00
Evgeny Stupachenko c2698cd903 Minor unroll pass refacoring.
Summary:
Unrolled Loop Size calculations moved to a function.
Constant representing number of optimized instructions
 when "back edge" becomes "fall through" replaced with
 variable.
Some comments added.

Reviewers: mzolotukhin

Differential Revision: http://reviews.llvm.org/D21719

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 286389
2016-11-09 19:56:39 +00:00
Sanjoy Das 26f28a2836 [Verifier] clang-format a section; NFC
Suggested in D26438 since I'm touching related code.

llvm-svn: 286388
2016-11-09 19:36:39 +00:00
Sanjoy Das 6b46a0d1e8 [SCEV] Refactor out a useful pattern; NFC
llvm-svn: 286386
2016-11-09 18:22:43 +00:00
Peter Collingbourne a9cadeddd4 Revert r286384, "X86: Introduce the "relocImm" ComplexPattern, which represents a relocatable immediate."
Suspected to be the cause of a sanitizer-windows bot failure:
Assertion failed: isImm() && "Wrong MachineOperand accessor", file C:\b\slave\sanitizer-windows\llvm\include\llvm/CodeGen/MachineOperand.h, line 420

llvm-svn: 286385
2016-11-09 18:17:50 +00:00
Peter Collingbourne 4c15db45e4 X86: Introduce the "relocImm" ComplexPattern, which represents a relocatable immediate.
A relocatable immediate is either an immediate operand or an operand that
can be relocated by the linker to an immediate, such as a regular symbol
in non-PIC code.

Start using relocImm for 32-bit and 64-bit MOV instructions, and for operands
of type "imm32_su". Remove a number of now-redundant patterns.

Differential Revision: https://reviews.llvm.org/D25812

llvm-svn: 286384
2016-11-09 17:51:58 +00:00
Krzysztof Parzyszek f817efbbb0 [Hexagon] Silence "sometimes uninitialized" warning in HexagonCopyToCombine
llvm-svn: 286383
2016-11-09 17:50:46 +00:00
Peter Collingbourne 7f00d0a125 Bitcode: Change the materializer interface to return llvm::Error.
Differential Revision: https://reviews.llvm.org/D26439

llvm-svn: 286382
2016-11-09 17:49:19 +00:00
Krzysztof Parzyszek a540997ce4 [Hexagon] Separate Hexagon subreg indices for different register classes
For pairs of 32-bit registers: isub_lo, isub_hi.
For pairs of vector registers: vsub_lo, vsub_hi.

Add generic subreg indices: ps_sub_lo, ps_sub_hi, and a function
  HexagonRegisterInfo::getHexagonSubRegIndex(RegClass, GenericSubreg)
that returns the appropriate subreg index for RegClass.

llvm-svn: 286377
2016-11-09 16:19:08 +00:00
Krzysztof Parzyszek 601d7eb11a [Hexagon] Eliminate Insert4 pseudo-instruction, use combines instead
llvm-svn: 286368
2016-11-09 14:16:29 +00:00
Jonas Paulsson e127fe7083 [SystemZ] A few fixes in scheduler files.
Review: U Weigand
llvm-svn: 286362
2016-11-09 12:47:57 +00:00
Pavel Labath c207bec388 Remove TimeValue usage from Scalar/SROA.cpp. NFC.
llvm-svn: 286361
2016-11-09 12:07:12 +00:00
Pavel Labath 775bbc3736 Zero-initialize chrono duration objects
The default duration constructor does not zero-initialize the object, we need to
do that manually.

llvm-svn: 286359
2016-11-09 11:43:57 +00:00
Jonas Paulsson 28f29487b9 [MachineScheduler] Comments fixing.
The name/comment of the third argument to the ScheduleDAGMI constructor
is RemoveKillFlags and not IsPostRA. Only the comments are changed.

Review: A Trick
llvm-svn: 286350
2016-11-09 09:59:27 +00:00
Alexandros Lamprineas 0ee3ec2fe4 [ARM] Loop Strength Reduction crashes when targeting ARM or Thumb.
Scalar Evolution asserts when not all the operands of an Add Recurrence
Expression are loop invariants. Loop Strength Reduction should only
create affine Add Recurrences, so that both the start and the step of
the expression are loop invariants.

Differential Revision: https://reviews.llvm.org/D26185

llvm-svn: 286347
2016-11-09 08:53:07 +00:00
Craig Topper f334ac19ad [AVX-512] Add lowering to cvttpd2udq/cvttps2udq for fptoui v2f64/2f32 to 2i32
This patch adds support for fptoui to 2i32 from both 2f64 and 2f32, building on Simon's change for the signed version in r284459 and using AVX-512 instructions.

If we don't have VLX support we need to use a 512-bit operation for v2f64->v2i32 and extract the result.

It also recognises that cvttpd2udq zeroes the upper 64-bits of the xmm result.

Differential Revision: https://reviews.llvm.org/D26331

llvm-svn: 286345
2016-11-09 07:48:51 +00:00
Craig Topper 731bf9c5d6 [X86] Lower AVX512 and SSE intrinsics for CVTTPD2DQ to X86ISD::CVTTPD2DQ.
Summary: This allows the SSE intrinsic to use the EVEX instruction when available. It also fixes EVEX to not use a weird (v4i32 (fp_to_sint v2f64)) node and it merges some isel patterns. This also fixes some cases that weren't combining vzmovl with cvttpd2dq to remove extra moves.

Reviewers: delena, zvi, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26330

llvm-svn: 286344
2016-11-09 07:31:32 +00:00
Craig Topper 28e3dfc02b [AVX-512] Use alignedstore256 in patterns that look for stores of the lower 256-bits of a 512-bit vector to use a 256-bit aligned store.
Previously we were only checking for 16 byte alignment instead of 32 byte alignment. Fixes PR30947.

llvm-svn: 286342
2016-11-09 05:31:57 +00:00
Craig Topper 5c842be9a0 [AVX-512] Make VBMI instruction set enabling imply that the BWI instruction set is also enabled.
Summary:
This is needed to make the v64i8 and v32i16 types legal for the 512-bit VBMI instructions. Fixes PR30912.

Reviewers: delena, zvi

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26322

llvm-svn: 286339
2016-11-09 04:50:48 +00:00
Mehdi Amini b6a11a7879 Revert "[ThinLTO] Prevent exporting of locals used/defined in module level asm"
This reverts commit r286297.
Introduces a dependency from libAnalysis to libObject, which I missed
during the review.

llvm-svn: 286329
2016-11-09 01:45:13 +00:00
Peter Collingbourne 7576cb0fa7 Bitcode: Remove the remnants of the BitcodeDiagnosticInfo class.
The BitcodeReader no longer produces BitcodeDiagnosticInfo diagnostics.
The only remaining reference was in the gold plugin; the code there has been
dead since we stopped producing InvalidBitcodeSignature error codes in r225562.
While at it remove the InvalidBitcodeSignature error code.

llvm-svn: 286326
2016-11-09 01:09:11 +00:00
Dehao Chen 947dbe1254 Enable Loop Sink pass for functions that has profile.
Summary: For functions with profile data, we are confident that loop sink will be optimal in sinking code.

Reviewers: davidxl, hfinkel

Subscribers: mehdi_amini, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D26155

llvm-svn: 286325
2016-11-09 00:58:19 +00:00
Peter Collingbourne 58f7f0759f Bitcode: Change the BitcodeReader to use llvm::Error internally.
Differential Revision: https://reviews.llvm.org/D26430

llvm-svn: 286323
2016-11-09 00:51:04 +00:00
Sanjay Patel e104554412 [ValueTracking] recognize obfuscated variants of umin/umax
The smallest tests that expose this are codegen tests (because SelectionDAGBuilder::visitSelect() uses matchSelectPattern
to create UMAX/UMIN nodes), but it's also possible to see the effects in IR alone with folds of min/max pairs.

If these were written as unsigned compares in IR, InstCombine canonicalizes the unsigned compares to signed compares. 
Ie, running the optimizer pessimizes the codegen for this case without this patch:

define <4 x i32> @umax_vec(<4 x i32> %x) {
  %cmp = icmp ugt <4 x i32> %x, <i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647>
  %sel = select <4 x i1> %cmp, <4 x i32> %x, <4 x i32> <i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647>
  ret <4 x i32> %sel
}

$ ./opt umax.ll -S | ./llc -o - -mattr=avx

vpmaxud LCPI0_0(%rip), %xmm0, %xmm0

$ ./opt -instcombine umax.ll -S | ./llc -o - -mattr=avx

vpxor %xmm1, %xmm1, %xmm1
vpcmpgtd  %xmm0, %xmm1, %xmm1
vmovaps LCPI0_0(%rip), %xmm2    ## xmm2 = [2147483647,2147483647,2147483647,2147483647]
vblendvps %xmm1, %xmm0, %xmm2, %xmm0

Differential Revision: https://reviews.llvm.org/D26096

llvm-svn: 286318
2016-11-09 00:24:44 +00:00
Greg Clayton bde0a1632b Added the ability to dump hex bytes easily into a raw_ostream.
Unit tests were added to verify this functionality keeps working correctly.

Example output for raw hex bytes:
llvm::ArrayRef<uint8_t> Bytes = ...;
llvm::outs() << format_hex_bytes(Bytes);
554889e5 4881ec70 04000048 8d051002
00004c8d 05fd0100 004c8b0d d0020000

Example output for raw hex bytes with offsets:
llvm::outs() << format_hex_bytes(Bytes, 0x100000d10);
0x0000000100000d10: 554889e5 4881ec70 04000048 8d051002
0x0000000100000d20: 00004c8d 05fd0100 004c8b0d d0020000

Example output for raw hex bytes with ASCII with offsets:
llvm::outs() << format_hex_bytes_with_ascii(Bytes, 0x100000d10);
0x0000000100000d10: 554889e5 4881ec70 04000048 8d051002 |UH.?H.?p...H....|
0x0000000100000d20: 00004c8d 05fd0100 004c8b0d d0020000 |..L..?...L..?...|

The default groups bytes into 4 byte groups, but this can be changed to 1 byte:
llvm::outs() << format_hex_bytes(Bytes, 0x100000d10, 16 /*NumPerLine*/, 1 /*ByteGroupSize*/);
0x0000000100000d10: 55 48 89 e5 48 81 ec 70 04 00 00 48 8d 05 10 02
0x0000000100000d20: 00 00 4c 8d 05 fd 01 00 00 4c 8b 0d d0 02 00 00

llvm::outs() << format_hex_bytes(Bytes, 0x100000d10, 16 /*NumPerLine*/, 2 /*ByteGroupSize*/);
0x0000000100000d10: 5548 89e5 4881 ec70 0400 0048 8d05 1002
0x0000000100000d20: 0000 4c8d 05fd 0100 004c 8b0d d002 0000

llvm::outs() << format_hex_bytes(Bytes, 0x100000d10, 8 /*NumPerLine*/, 1 /*ByteGroupSize*/);
0x0000000100000d10: 55 48 89 e5 48 81 ec 70
0x0000000100000d18: 04 00 00 48 8d 05 10 02
0x0000000100000d20: 00 00 4c 8d 05 fd 01 00
0x0000000100000d28: 00 4c 8b 0d d0 02 00 00

https://reviews.llvm.org/D26405

llvm-svn: 286316
2016-11-09 00:15:54 +00:00
Sanjay Patel 4e9d6cd354 [InstCombine] fix profitability equation for max-of-nots transform
As the test change shows, we can increase the critical path by adding
a 'not' instruction, so make sure that we're actually removing an
instruction if we do this transform.

This transform could also cause us to miss folds of min/max pairs.

llvm-svn: 286315
2016-11-09 00:13:11 +00:00
Sanjay Patel 99dc5feff1 [InstCombine] reduce indentation; NFC
llvm-svn: 286314
2016-11-08 23:49:15 +00:00
Zachary Turner 44728f4014 Fix some size_t / uint32_t ambiguity errors.
llvm-svn: 286305
2016-11-08 22:30:11 +00:00
Zachary Turner 4efa0a4201 [CodeView] Hook up CodeViewRecordIO to type serialization path.
Previously support had been added for using CodeViewRecordIO
to read (deserialize) CodeView type records.  This patch adds
support for writing those same records.  With this patch,
reading and writing of CodeView type records finally uses a single
codepath.

Differential Revision: https://reviews.llvm.org/D26253

llvm-svn: 286304
2016-11-08 22:24:53 +00:00
Adrian Prantl 3502f2089c Emit the DW_AT_type for a C++ static member definition
if it is more specific than the one in its DW_AT_specification.

If a static member is an array, the translation unit containing the
member definition may have a more specific type (including its length)
than TUs only seeing the class declaration. This patch adds a
DW_AT_type to the member's DW_TAG_variable in addition to the
DW_AT_specification in these cases. The member type in the
DW_AT_specification still shows the more generic type (without the
length) to avoid defeating type uniquing.

The DWARF standard discourages “duplicating” a DW_AT_type in a member
variable definition but doesn’t explicitly forbid it.  Having the more
specific type (with the array length) available is what allows the
debugger to print the contents of a static array member variable.

https://reviews.llvm.org/D26368
rdar://problem/28706946

llvm-svn: 286302
2016-11-08 22:11:38 +00:00
David L. Jones e09ae201f2 GlobalISel: make sure debugging variables are appropriately elided in release builds.
Summary:
There are two variables here that break. This change constrains both of them to
debug builds (via DEBUG() or #ifndef NDEBUG).

Reviewers: bkramer, t.p.northover

Subscribers: mehdi_amini, vkalintiris

Differential Revision: https://reviews.llvm.org/D26421

llvm-svn: 286300
2016-11-08 22:03:23 +00:00
Teresa Johnson 6955feebf3 [ThinLTO] Prevent exporting of locals used/defined in module level asm
Summary:
This patch uses the same approach added for inline asm in r285513 to
similarly prevent promotion/renaming of locals used or defined in module
level asm.

All static global values defined in normal IR and used in module level asm
should be included on either the llvm.used or llvm.compiler.used global.
The former were already being flagged as NoRename in the summary, and
I've simply added llvm.compiler.used values to this handling.

Module level asm may also contain defs of values. We need to prevent
export of any refs to local values defined in module level asm (e.g. a
ref in normal IR), since that also requires renaming/promotion of the
local. To do that, the summary index builder looks at all values in the
module level asm string that are not marked Weak or Global, which is
exactly the set of locals that are defined. A summary is created for
each of these local defs and flagged as NoRename.

This required adding handling to the BitcodeWriter to look at GV
declarations to see if they have a summary (rather than skipping them
all).

Finally, added an assert to IRObjectFile::CollectAsmUndefinedRefs to
ensure that an MCAsmParser is available, otherwise the module asm parse
would silently fail. Initialized the asm parser in the opt tool for use
in testing this fix.

Fixes PR30610.

Reviewers: mehdi_amini

Subscribers: johanengelen, krasin, llvm-commits

Differential Revision: https://reviews.llvm.org/D26146

llvm-svn: 286297
2016-11-08 21:53:35 +00:00
Kuba Brecka a49dcbb743 [asan] Speed up compilation of large C++ stringmaps (tons of allocas) with ASan
This addresses PR30746, <https://llvm.org/bugs/show_bug.cgi?id=30746>. The ASan pass iterates over entry-block instructions and checks each alloca whether it's in NonInstrumentedStaticAllocaVec, which is apparently slow. This patch gathers the instructions to move during visitAllocaInst.

Differential Revision: https://reviews.llvm.org/D26380

llvm-svn: 286296
2016-11-08 21:30:41 +00:00
Andrew Kaylor 9604f34996 [BasicAA] Teach BasicAA to handle the inaccessiblememonly and inaccessiblemem_or_argmemonly attributes
Differential Revision: https://reviews.llvm.org/D26382

llvm-svn: 286294
2016-11-08 21:07:42 +00:00
Matthias Braun c53cbbb1d1 AArch64DeadRegisterDefinitionsPass: Fix Changed flag
Fix a bug in the calculation of the changed flag introduced in r285488.

llvm-svn: 286293
2016-11-08 20:59:03 +00:00
Sanjoy Das 2582e690b7 [TBAA] Drop support for "old style" scalar TBAA tags
Summary:
We've had support for auto upgrading old style scalar TBAA access
metadata tags into the "new" struct path aware TBAA metadata for 3 years
now.  The only way to actually generate old style TBAA was explicitly
through the IRBuilder API.  I think this is a good time for dropping
support for old style scalar TBAA.

I'm not removing support for textual or bitcode upgrade -- if you have
IR with the old style scalar TBAA tags that go through the AsmParser orf
the bitcode parser before LLVM sees them, they will keep working as
usual.

Note:

  %val = load i32, i32* %ptr, !tbaa !N
  !N = < scalar tbaa node >

is equivalent to

  %val = load i32, i32* %ptr, !tbaa !M
  !N = < scalar tbaa node >
  !M = !{!N, !N, 0}

Reviewers: manmanren, chandlerc, sunfish

Subscribers: mcrosier, llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D26229

llvm-svn: 286291
2016-11-08 20:46:01 +00:00
Tim Northover 6cddfc14f9 GlobalISel: allow CodeGen to fallback on VReg type/class issues.
After instruction selection we perform some checks on each VReg just before
discarding the type information. These checks were assertions before, but that
breaks the fallback path so this patch moves the logic into the main flow and
reports a better error on failure.

llvm-svn: 286289
2016-11-08 20:39:03 +00:00
Ulrich Weigand 05effca2d8 [SystemZ] Add missing FP extension instructions
This completes assembler / disassembler support for all BFP
instructions provided by the floating-point extensions facility.
The instructions added here are not currently used for codegen.

llvm-svn: 286285
2016-11-08 20:18:41 +00:00
Ulrich Weigand 4006e09d1d [SystemZ] Add program mask and addressing mode instructions
Add several instructions that operate on the program mask
or the addressing mode.  These are not really needed for
code generation under Linux, but are provided for completeness
for the assembler/disassembler.

llvm-svn: 286284
2016-11-08 20:17:02 +00:00
Ulrich Weigand fffc7110d6 [SystemZ] Model access registers as LLVM registers
Add the 16 access registers as LLVM registers.  This allows removing
a lot of special cases in the assembler and disassembler where we
were handling access registers; this can all just use the generic
register code now.

Also add a bunch of instructions to operate on access registers,
for assembler/disassembler use only.  No change in code generation
intended.

llvm-svn: 286283
2016-11-08 20:15:26 +00:00
Davide Italiano 11a871b227 [LoopDistribute] Preserve GlobalsAA also in the new Pass Manager.
Differential Revision:  https://reviews.llvm.org/D26408

llvm-svn: 286280
2016-11-08 19:52:32 +00:00
Eli Friedman 06025cf6c7 Don't store Twine in a local variable.
Fixes post-commit review comment from r286177.

llvm-svn: 286275
2016-11-08 19:43:56 +00:00
Dan Gohman e81021a5cb [WebAssembly] Convert stackified IMPLICIT_DEF into constant 0.
Since IMPLIFIT_DEF instructions are omitted in the output, when the output
of an IMPLICIT_DEF instruction is stackified, the resulting register lacks
an explicit push, leading to a push/pop mismatch. Fix this by converting
such IMPLICIT_DEFs into CONST_I32 0 instructions so that they have explicit
pushes.

llvm-svn: 286274
2016-11-08 19:40:38 +00:00
Ahmed Bougacha 53a03a28c4 [GlobalISel] Dump all instructions inserted by selector.
This is helpful when multiple instructions are inserted.

llvm-svn: 286273
2016-11-08 19:27:13 +00:00
Ahmed Bougacha db273a1272 [GlobalISel] Permit select() to erase.
Erasing reverse_iterators is problematic; iterate manually.
While there, keep track of the range of inserted instructions.
It can miss instructions inserted elsewhere, but those are harder
to track.

Differential Revision: http://reviews.llvm.org/D22924

llvm-svn: 286272
2016-11-08 19:27:10 +00:00
Davide Italiano 1e77aaca8a [LibcallsShrinkWrap] This pass doesn't preserve the CFG.
For example, it invalidates the domtree, causing assertions
in later passes which need dominator infos. Make it preserve
GlobalsAA, as suggested by Eli.

Differential Revision:  https://reviews.llvm.org/D26381

llvm-svn: 286271
2016-11-08 19:18:20 +00:00
Chad Rosier fbc7b7d154 Fix typo in comment. NFC.
llvm-svn: 286270
2016-11-08 19:10:25 +00:00
Ulrich Weigand 3d07d45089 [SystemZ] Always use semantic instruction classes
Define a couple of additional semantic classes and use them
throughout the .td files to make them more consistent and
more easily readable.

No functional change.

llvm-svn: 286268
2016-11-08 18:37:48 +00:00
Ulrich Weigand bfcfa0e207 [SystemZ] Refactor InstRR* instruction format patterns
This changes the InstRR (and related) patterns to no longer
automatically add an "r" at the end of the mnemonic.  This
makes the .td files more obviously understandable, and also
allows using the patterns for those few instructions that
do not follow the *r scheme.

Also add some more sub-formats of the RRF format class, to
match operand names and sequence from the PoP better.

No functional change.

llvm-svn: 286267
2016-11-08 18:36:31 +00:00
Ulrich Weigand 37bd451a55 [SystemZ] Rename some Inst* instruction format classes
Now that we've added instruction format subclasses like
InstRIb, it makes sense to rename the old InstRI to InstRIa.

Similar for InstRX, InstRXY, InstRS, InstRSY, and InstSS.

No functional change.

llvm-svn: 286266
2016-11-08 18:32:50 +00:00
Nirav Dave e833c6c61a [MC][AArch64] Cleanup end-of-line parsing in AArch64 AsmParser.
Reviewers: t.p.northover, rengolin

Subscribers: llvm-commits, aemerson

Differential Revision: https://reviews.llvm.org/D26309

llvm-svn: 286265
2016-11-08 18:31:04 +00:00
Ulrich Weigand d2148caffc [SystemZ] Refactor branch and conditional instruction patterns
Rework patterns for branches, call & return instructions,
compare-and-branch, compare-and-trap, and conditional move
instructions.

In particular, simplify creation of patterns for the extended
opcodes of instructions that take a CC mask.

Also, use semantical instruction classes for all the instructions
instead of open-coding them in SystemZInstrInfo.td.

Adds a couple of the basic branch instructions (that are unused
for codegen) for the assembler/disassembler.

llvm-svn: 286263
2016-11-08 18:30:50 +00:00
Piotr Padlewski 01659cb9fe NFC small changes in MemDep
llvm-svn: 286260
2016-11-08 18:20:51 +00:00
Wei Mi b5cf9e53e5 [RegAllocGreedy] Another fix about NewVRegs for last chance recoloring after r281783.
About when we should move a vreg from CurrentNewVRegs to NewVRegs,
if the vreg in CurrentNewVRegs was added into RecoloringCandidate and was
evicted, it shouldn't be added to NewVRegs because its physical register
will be restored at the end of tryLastChanceRecoloring after the recoloring
failed. If the vreg in CurrentNewVRegs was not in RecoloringCandidate, i.e.
it was evicted in selectOrSplitImpl inside tryRecoloringCandidates, its
physical register will not be restored even if the recoloring failed. In
that case, we need to add the vreg to NewVRegs.

Same as r281783, the problem was seen on out-of-tree target and we didn't
have a test case that reproduce the problem with in-tree targets.

llvm-svn: 286259
2016-11-08 18:19:36 +00:00
Tim Northover 5f7dea85c2 GlobalISel: support selecting fpext/fptrunc instructions on AArch64.
llvm-svn: 286253
2016-11-08 17:44:07 +00:00
Anton Korobeynikov 243a4700ce Fix PR27500: on MSP430 the branch destination offset is measured in words, not bytes.
Summary: In addition, the branch instructions will have proper BB destinations, not offsets, like before.

Reviewers: asl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23718

llvm-svn: 286252
2016-11-08 17:19:59 +00:00
Chad Rosier c244349b85 Remove unused include. NFC.
llvm-svn: 286250
2016-11-08 16:51:19 +00:00
Dehao Chen 2ca9be330b Use the last 7 bits to represent the discriminator to fit it in 1 byte ULEB128 (NFC).
From experiments, discriminator is rarely greater than 127. Here we enforce it to be no greater than 127 so that it will always fit in 1 byte.

llvm-svn: 286245
2016-11-08 16:32:32 +00:00
Simon Pilgrim 778596bf59 [TargetLowering] Fix undef vector element issue with true/false result handling
Fixed an issue with vector usage of TargetLowering::isConstTrueVal / TargetLowering::isConstFalseVal boolean result matching.

The comment said we shouldn't handle constant splat vectors with undef elements. But the the actual code was returning false if the build vector contained no undef elements....

This patch now ignores the number of undefs (getConstantSplatNode will return null if the build vector is all undefs).

The change has also unearthed a couple of missed opportunities in AVX512 comparison code that will need to be addressed.

Differential Revision: https://reviews.llvm.org/D26031

llvm-svn: 286238
2016-11-08 15:07:01 +00:00
Pablo Barrio 9f45254138 [JumpThreading] Unfold selects that depend on the same condition
Summary:
These are good candidates for jump threading. This enables later opts
(such as InstCombine) to combine instructions from the selects with
instructions out of the selects. SimplifyCFG will fold the select
again if unfolding wasn't worth it.

Patch by James Molloy and Pablo Barrio.

Reviewers: rengolin, haicheng, sebpop

Subscribers: jojo, jmolloy, llvm-commits

Differential Revision: https://reviews.llvm.org/D26391

llvm-svn: 286236
2016-11-08 14:53:30 +00:00
Simon Pilgrim d02c55204b [VectorLegalizer] Expansion of CTLZ using CTPOP when possible
This patch avoids scalarization of CTLZ by instead expanding to use CTPOP (ref: "Hacker's Delight") when the necessary operations are available.

This also adds the necessary cost models for X86 SSE2 targets (the main beneficiary) to ensure vectorization only happens when its useful.

Differential Revision: https://reviews.llvm.org/D25910

llvm-svn: 286233
2016-11-08 14:10:28 +00:00
Roger Ferrer Ibanez 80c0f33c29 [AArch64] Fix incorrect CSEL node created
Under -enable-unsafe-fp-math, SELECT_CC lowering in AArch64
transforms floating point comparisons of the form "a == 0.0 ? 0.0 : x" to
"a == 0.0 ? a : x". But it incorrectly assumes that 'x' and 'a' have
the same type which can lead to a wrong CSEL node that crashes later
due to nonsensical copies.

Differential Revision: https://reviews.llvm.org/D26394

llvm-svn: 286231
2016-11-08 13:34:41 +00:00
Amara Emerson 0b40201e13 Adds the loop end location to the loop metadata.
This additional information can be used to improve the locations when generating remarks for loops.

Patch by Florian Hahn.

Differential Revision: https://reviews.llvm.org/D25763

llvm-svn: 286227
2016-11-08 11:18:59 +00:00
Sylvestre Ledru 3ce346d4ca Fix memory leaks (coverity issues 1365586 & 1365591)
Reviewers: hfinkel

Subscribers: george.burgess.iv, malcolm.parsons, boris.ulasevich, llvm-commits

Differential Revision: https://reviews.llvm.org/D26347

llvm-svn: 286223
2016-11-08 10:00:45 +00:00
Peter Collingbourne e2dcf7c3a1 IR, Bitcode: Change bitcode reader to no longer own its memory buffer.
Unique ownership is just one possible ownership pattern for the memory buffer
underlying the bitcode reader. In practice, as this patch shows, ownership can
often reside at a higher level. With the upcoming change to allow multiple
modules in a single bitcode file, it will no longer be appropriate for
modules to generally have unique ownership of their memory buffer.

The C API exposes the ownership relation via the LLVMGetBitcodeModuleInContext
and LLVMGetBitcodeModuleInContext2 functions, so we still need some way for
the module to own the memory buffer. This patch does so by adding an owned
memory buffer field to Module, and using it in a few other places where it
is convenient.

Differential Revision: https://reviews.llvm.org/D26384

llvm-svn: 286214
2016-11-08 06:03:43 +00:00
Peter Collingbourne 77c89b6958 Bitcode: Decouple block info block state from reader.
As proposed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-October/106630.html

Move block info block state to a new class, BitstreamBlockInfo.
Clients may set the block info for a particular cursor with the
BitstreamCursor::setBlockInfo() method.

At this point BitstreamReader is not much more than a container for an
ArrayRef<uint8_t>, so remove it and replace all uses with direct uses
of memory buffers.

Differential Revision: https://reviews.llvm.org/D26259

llvm-svn: 286207
2016-11-08 04:17:11 +00:00
Peter Collingbourne 939c7d916e Bitcode: Split out block info reading into a separate function.
We're about to make this more complicated.

llvm-svn: 286206
2016-11-08 04:16:57 +00:00
George Burgess IV 63b06e185c Add a missing break statement. NFC.
llvm-svn: 286203
2016-11-08 04:01:50 +00:00
Tim Northover 60f2349b50 GlobalISel: improve error diagnostics when IRTranslation fails.
llvm-svn: 286190
2016-11-08 01:12:17 +00:00
Tim Northover 9ac0eba672 GlobalISel: support selecting G_SELECT on AArch64.
llvm-svn: 286185
2016-11-08 00:45:29 +00:00
Tim Northover 7d88da6a46 GlobalISel: constrain PHI registers on AArch64.
Self-referencing PHI nodes need their destination operands to be constrained
because nothing else is likely to do so. For now we just pick a register class
naively.

Patch mostly by Ahmed again.

llvm-svn: 286183
2016-11-08 00:34:06 +00:00
Eli Friedman 8649fc053b [LTO] Add error message on IO error in compileOptimizedToFile.
(No testcase because it's difficult to force an error here.)

Differential Revision: https://reviews.llvm.org/D26371

llvm-svn: 286177
2016-11-07 23:43:07 +00:00
Stanislav Mekhanoshin 92e01ee90b [AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies
Codegen prepare sinks comparisons close to a user is we have only one register
for conditions. For AMDGPU we have many SGPRs capable to hold vector conditions.
Changed BE to report we have many condition registers. That way IR LICM pass
would hoist an invariant comparison out of a loop and codegen prepare will not
sink it.

With that done a condition is calculated in one block and used in another.
Current behavior is to store workitem's condition in a VGPR using v_cndmask
and then restore it with yet another v_cmp instruction from that v_cndmask's
result. To mitigate the issue a forward propagation of a v_cmp 64 bit result
to an user is implemented. Additional side effect of this is that we may
consume less VGPRs in a cost of more SGPRs in case if holding of multiple
conditions is needed, and that is a clear win in most cases.

llvm-svn: 286171
2016-11-07 23:04:50 +00:00
Adam Nemet b103fc52d3 [OptDiag, opt-viewer] Save callee's location and display as link
With this we get a new field in the YAML record if the value being
streamed out has a debug location.  For examples, please see the changes
to the tests.

This is then used in opt-viewer to display a link for the callee
function in the inlining remarks.

Differential Revision: https://reviews.llvm.org/D26366

llvm-svn: 286169
2016-11-07 22:41:13 +00:00
Sanjin Sijaric 6f020d91a1 [AArch64] Transfer memory operands when lowering vector load/store intrinsics
Summary:
Some vector loads and stores generated from AArch64 intrinsics alias each other
unnecessarily, preventing better scheduling.  We just need to transfer memory
operands during lowering.

Reviewers: mcrosier, t.p.northover, jmolloy

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: https://reviews.llvm.org/D26313

llvm-svn: 286168
2016-11-07 22:39:02 +00:00
Sanjoy Das 4aeb080db3 [TRE] Remove dead code
Address review by Eli Friedman on rL286147.

llvm-svn: 286165
2016-11-07 22:17:37 +00:00
Derek Schuff 0d41b7b3f3 [WebAssembly] Emit a BasePointer when we have overly-aligned stack objects
Because we shift the stack pointer by an unknown amount, we need an
additional pointer. In the case where we have variable-size objects
as well, we can't reuse the frame pointer, thus three pointers.

Patch by Jacob Gravelle

Differential Revision: https://reviews.llvm.org/D26263

llvm-svn: 286160
2016-11-07 22:00:48 +00:00
Dehao Chen d74e1e161d Reset debug loc to OldInduction in InnerLoopVectorizer::createInductionVariable. (NFC)
This is to prevent SetInsertionPoint from setting debug loc to Latch->getTerminator().

llvm-svn: 286159
2016-11-07 21:59:40 +00:00
Sanjoy Das e06ef141fc Avoid tail recursion elimination across calls with operand bundles
Summary:
In some specific scenarios with well understood operand bundle types
(like `"deopt"`) it may be possible to go ahead and convert recursion to
iteration, but TailRecursionElimination does not have that logic today
so avoid doing the right thing for now.

I need some input on whether `"funclet"` operand bundles should also
block tail recursion elimination.  If not, I'll allow TRE across calls
with `"funclet"` operand bundles and add a test case.

Reviewers: rnk, majnemer, nlewycky, ahatanak

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D26270

llvm-svn: 286147
2016-11-07 21:01:49 +00:00
Davide Italiano 2b5ba7bae6 [lib/Object] Modernize. NFCI.
llvm-svn: 286146
2016-11-07 21:01:42 +00:00
Evgeniy Stepanov cd729d6236 Use -fsanitize-recover instead of -mllvm -msan-keep-going.
Summary: Use -fsanitize-recover instead of -mllvm -msan-keep-going.

Reviewers: eugenis

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26352

llvm-svn: 286145
2016-11-07 21:00:10 +00:00
Davide Italiano 5df6066ec1 [AArch64] Remove dead store. Found by gcc7.
llvm-svn: 286137
2016-11-07 19:11:25 +00:00
Kuba Brecka 44e875ad5b [tsan] Cast floating-point types correctly when instrumenting atomic accesses, LLVM part
Although rare, atomic accesses to floating-point types seem to be valid, i.e. `%a = load atomic float ...`. The TSan instrumentation pass however tries to emit inttoptr, which is incorrect, we should use a bitcast here. Anyway, IRBuilder already has a convenient helper function for this.

Differential Revision: https://reviews.llvm.org/D26266

llvm-svn: 286135
2016-11-07 19:09:56 +00:00
Matt Arsenault f530e8b3f0 AMDGPU: Remove unnecessary and on conditional branch
The comment explaining why this was necessary is incorrect
in its description of v_cmp's behavior for inactive workitems.

llvm-svn: 286134
2016-11-07 19:09:33 +00:00
Matt Arsenault 52f14ec596 AMDGPU: Preserve vcc undef flags when inverting branch
If the branch was on a read-undef of vcc, passes that used
analyzeBranch to invert the branch condition wouldn't preserve
the undef flag resulting in a verifier error.

Fixes verifier failures in a future commit.

Also fix verifier error when inserting copy for vccz
corruption bug.

llvm-svn: 286133
2016-11-07 19:09:27 +00:00
Benjamin Kramer 1697d39eef [MemCpyOpt] Don't emit IR in an unspecified order
Argument evaluation order is one of the edge cases where Clang differs
from GCC, yielding different IR depending on which compiler LLVM was
built with. Make the order deterministic and tune the test to actually
verify the order instead of trying to hide it.

llvm-svn: 286126
2016-11-07 17:47:28 +00:00
Matt Arsenault 2ae5653072 AMDGPU: Try to fix (non-clang?) bot builds
llvm-svn: 286120
2016-11-07 16:52:50 +00:00
Richard Smith 857efb0880 Add -O0 support for @llvm.invariant.group.barrier by discarding it if it gets to ISel.
Differential Revision: https://reviews.llvm.org/D26292

llvm-svn: 286119
2016-11-07 16:47:20 +00:00
Matt Arsenault 314cbf7a3b AMDGPU: Refactor copyPhysReg
Separate the subregister splitting logic to re-use later.

llvm-svn: 286118
2016-11-07 16:39:22 +00:00
Chad Rosier 611b73b1f9 Fix 80-column violations. NFC.
llvm-svn: 286117
2016-11-07 16:28:04 +00:00
Sanjay Patel 86408a8048 [InstCombine] allow splat vector folds in adjustMinMax() (retry r285732)
This was reverted at r285866 because there was a crash handling a scalar
select of vectors. I added a check for that pattern and a test case based
on the example provided in the post-commit thread for r285732.

llvm-svn: 286113
2016-11-07 15:52:45 +00:00
Jonas Paulsson 4f0509fab3 [SystemZ] Correct the SchedModel regarding vector unit / instructions.
* Use a generic vector unit to model the issue unit more accurately.
* Update some vector instructions that actually use the vector unit for more
  than one cycle.

Review: Ulrich Weigand
llvm-svn: 286112
2016-11-07 15:45:06 +00:00
Amara Emerson 614b44bbe9 This patch adds support for 16 bit floating point registers to the inline asm register selection on AArch64.
Without this patch, register allocation for the example below fails.

define half @test(half %a1, half %a2) #0 {
entry:
  %0 = tail call half asm "sqrshl ${0:h}, ${1:h}, ${2:h}", "=w,w,w" (half %a1, half %a2) #1
  ret half %0
}

Patch by Florian Hahn.

Differential Revision: https://reviews.llvm.org/D25080

llvm-svn: 286111
2016-11-07 15:42:12 +00:00
Chad Rosier d6daac4746 [AArch64] Removed the narrow load merging code in the ld/st optimizer.
This feature has been disabled for some time now, so remove cruft.

Differential Revision: https://reviews.llvm.org/D26248

llvm-svn: 286110
2016-11-07 15:27:22 +00:00
Jonas Paulsson 818431a61a [SystemZ] Fixes in SchedModels for older subtargets.
IssueWidth updated to reflect the capacity of the issue unit correctly.
Correct number of FX and LS units modelled (2, was 1).

Review: Ulrich Weigand
llvm-svn: 286109
2016-11-07 14:47:25 +00:00
Chad Rosier 8f348017b0 [AliasSetTracker] Make AST smarter about assume intrinsics that don't actually affect memory.
Differential Revision: https://reviews.llvm.org/D26252

llvm-svn: 286108
2016-11-07 14:11:45 +00:00
James Molloy b03e0879fc [Thumb1] Move padding earlier when synthesizing TBBs off of the PC
When the base register (register pointing to the jump table) is the PC, we expect the jump table to directly follow the jump sequence with no intervening padding.

If there is intervening padding, the calculated offsets will not be correct. One solution would be to account for any padding in the emitted LDRB instruction, but at the moment we don't support emitting MCExprs for the load offset.

In the meantime, it's correct and only a slight amount worse to just move the padding up, from just before the jump table to just before the jump instruction sequence. We can do that by emitting code alignment before the jump sequence, as we know the number of instructions in the sequence is always 4.

llvm-svn: 286107
2016-11-07 13:38:21 +00:00
Dylan McKay c988b334b6 [AVR] Enable the ISel, frame analyzer, and alloca passes
llvm-svn: 286095
2016-11-07 06:02:55 +00:00
Craig Topper b110e04851 [AVX-512] Remove masked pmovzx/pmovsx builtins and autoupgrade them to selects and native zext/sext.
This mostly reuses earlier autoupgrade support for the sse and avx equivalents. Just needed to add the code to add the select.

llvm-svn: 286092
2016-11-07 02:12:57 +00:00
Craig Topper 96041c6ba9 [X86] Use StringRef::startswith to reduce a few compares in the intrinsic autoupgrade code.
llvm-svn: 286090
2016-11-07 00:13:42 +00:00
Craig Topper 7e545335d6 [AVX-512] Remove 128/256 masked pshufb intrinsics. Autoupgrade them to legacy intrinsics and a select.
llvm-svn: 286089
2016-11-07 00:13:39 +00:00
Krzysztof Parzyszek 39d14f3bc3 Reapply r286080 with a phony change in Hexagon's CMakeLists.txt
Cmake has not recognized that Hexagon.td has a new dependency in
HexagonPatterns.td. All changes to that file were not visible to
the build bots.

llvm-svn: 286084
2016-11-06 20:55:57 +00:00
Saleem Abdulrasool 804e12eeb5 ARM: lower fpowi appropriately for Windows ARM
This handles the last case of the builtin function calls that we would
generate code which differed from Microsoft's ABI.  Rather than
generating a call to `__pow{d,s}i2` we now promote the parameter to a
float or double and invoke `powf` or `pow` instead.

Addresses PR30825!

llvm-svn: 286082
2016-11-06 19:46:54 +00:00
Krzysztof Parzyszek f8d38d11b9 Revert r286080: it breaks build bots
llvm-svn: 286081
2016-11-06 19:36:09 +00:00
Krzysztof Parzyszek 9e3520c884 [Hexagon] Remove redundant custom selection code
The clr/set/toggle-bit instructions (with the bit index given as an
immediate operand) had both, custom selection code that generated them,
and selection patterns at the same time. The selection patterns were
not used, because the custom selection code was executed first.
This patch removes the custom code in favor of the selection patterns.
The custom code handled 64-bit registers as well with an immediate bit
index, and so new patterns were added to implement that.

It was also the same case for the instruction "Rd += asr(Rs, Rt)",
except that the custom code did not offer any additional functionality,
and was simply removed.

llvm-svn: 286080
2016-11-06 19:03:38 +00:00
Krzysztof Parzyszek c93815ef04 [Hexagon] Round 5 of selection pattern simplifications
Remove unnecessary type casts in patterns.

llvm-svn: 286079
2016-11-06 18:13:14 +00:00
Krzysztof Parzyszek f914278f8b [Hexagon] Round 4 of selection pattern simplifications
Give simpler or more meaningful names to pat frags and xforms.

llvm-svn: 286078
2016-11-06 18:09:56 +00:00
Krzysztof Parzyszek 846597d081 [Hexagon] Round 3 of selection pattern simplifications
Remove unnecessary C++ functions for SDNode transforms. Move more
pat frags to files where they are used.

llvm-svn: 286077
2016-11-06 18:05:14 +00:00
Krzysztof Parzyszek 84755104b4 [Hexagon] Round 2 of selection pattern simplifications
Add pat frags for any-, sign-, and zero-extensions.

llvm-svn: 286076
2016-11-06 17:56:48 +00:00
Simon Pilgrim 39df78e384 [SelectionDAG] Add support for vector demandedelts in XOR opcodes
llvm-svn: 286075
2016-11-06 16:49:19 +00:00
Craig Topper 46de41330c [AVX-512] Remove intrinsics for 128/256-bit masked variable shift. Instead upgrade them to a select and the older AVX2 intrinsic.
llvm-svn: 286073
2016-11-06 16:29:19 +00:00
Craig Topper af9b3fe752 [AVX-512] Remove intrinsics for 128/256-bit masked shift by immediate. Instead upgrade them to a select and the older SSE/AVX2 intrinsic.
llvm-svn: 286072
2016-11-06 16:29:14 +00:00
Simon Pilgrim dd4809a603 [SelectionDAG] Add support for vector demandedelts in OR opcodes
llvm-svn: 286071
2016-11-06 16:29:09 +00:00
Craig Topper c9467ed31e [AVX-512] Remove intrinsics for 128/256-bit masked shift by single element in xmm. Instead upgrade them to a select and the older SSE/AVX2 intrinsic.
llvm-svn: 286070
2016-11-06 16:29:08 +00:00
Simon Pilgrim b3ad5f7ebf [X86][SSE] Reuse zeroable element mask in lowerVectorShuffleAsElementInsertion. NFCI
Don't regenerate a zeroable element mask with computeZeroableShuffleElements when its already available.

llvm-svn: 286067
2016-11-06 14:20:29 +00:00
Benjamin Kramer f15a8e6a50 [BitcodeWriter] Replace a manual byteswap with read32be.
No functional change intended.

llvm-svn: 286066
2016-11-06 13:26:39 +00:00
Amaury Sechet 74084c44ca Kill deprecated attribute API
Summary:
This kill various depreacated API related to attribute :
 - The deprecated C API attribute based on LLVMAttribute enum.
 - The Raw attribute set format (planned to be removed in 4.0).

Reviewers: bkramer, echristo, mehdi_amini, void

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D23039

llvm-svn: 286062
2016-11-06 07:48:46 +00:00
Tim Shen 398f90f024 [APFloat] Make functions that produce APFloaat objects use correct semantics.
Summary:
Fixes PR30869.

In D25977 I meant to change all functions that care about lifetime. I
changed constructors, factory functions, but I missed member/free
functions that return new instances. This patch changes them.

Reviewers: hfinkel, kbarton, echristo, joerg

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D26269

llvm-svn: 286060
2016-11-06 07:38:37 +00:00
Craig Topper 5471fc29e4 [AVX-512] Add missing EVEX version of pattern for (v2f64 (extloadv2f32 addr:)) -> VCVTPS2PDZ128rm
llvm-svn: 286059
2016-11-06 04:12:52 +00:00
Craig Topper 1162857ec4 [AVX-512] Lower AVX cvtpd2ps intrinsic to ISD::FP_ROUND so it can use EVEX instruction when available.
llvm-svn: 286057
2016-11-06 04:12:46 +00:00
Craig Topper 9a4a3af5dd [AVX-512] Lower SSE/AVX cvtdq2ps intrinsics directly to ISD::SINT_TO_FP so they can use EVEX instructions when available.
llvm-svn: 286056
2016-11-06 04:12:42 +00:00
Krzysztof Parzyszek 2839b29f4b [Hexagon] Relocate pattern-related bits to proper places
llvm-svn: 286049
2016-11-05 21:44:50 +00:00
Krzysztof Parzyszek 4b4012a5c9 [Hexagon] Round 1 of selection pattern simplifications
Consistently use register class pat frags instead of spelling out
the type and class each time.

llvm-svn: 286048
2016-11-05 21:02:54 +00:00
Simon Pilgrim 4a9f210412 [X86][SSE] Reuse zeroable element mask in lowerVectorShuffleAsBlend. NFCI
Don't regenerate a zeroable element mask with computeZeroableShuffleElements when its already available.

llvm-svn: 286045
2016-11-05 18:31:57 +00:00
Simon Pilgrim 725174694a [X86][SSE] Reuse zeroable element mask in lowerVectorShuffleAsZeroOrAnyExtend. NFCI
Don't regenerate a zeroable element mask with computeZeroableShuffleElements when its already available.

llvm-svn: 286044
2016-11-05 18:22:13 +00:00
Simon Pilgrim 9f0afc6ae1 [X86][SSE] Reuse zeroable element mask in SSE4A EXTRQ/INSERTQ vector shuffle lowering. NFCI
Don't regenerate a zeroable element mask with computeZeroableShuffleElements when its already available.

llvm-svn: 286043
2016-11-05 18:05:13 +00:00
Simon Pilgrim 3cae21960e [X86][SSE] Reuse zeroable element mask in PSHUFB vector shuffle lowering. NFCI
Don't regenerate a zeroable element mask with computeZeroableShuffleElements when its already available.

llvm-svn: 286042
2016-11-05 17:53:27 +00:00
Simon Pilgrim 64a592d0a2 [X86][SSE] Reuse zeroable element mask in lowerVectorShuffleAsInsertPS. NFCI
Don't regenerate a zeroable element mask with computeZeroableShuffleElements when its already available.

llvm-svn: 286040
2016-11-05 17:27:48 +00:00
Simon Pilgrim 009befbd88 [X86][SSE] Reuse zeroable element mask in lowerVectorShuffleAsBitMask. NFCI
Don't regenerate a zeroable element mask with computeZeroableShuffleElements when its already available.

llvm-svn: 286039
2016-11-05 17:12:19 +00:00
Justin Lebar 54b0be048e [LoopStrengthReduce] Don't use a DenseSet<int64_t> when we might add any valid int64_t to the set.
Summary:
SmallSetVector uses DenseSet, but that means we need to reserve some
values for the empty and tombstone keys.

It seems to me we should have a general way to let us store full-range
ints inside of DenseSets, and furthermore that we probably shouldn't
silently let you add ints into DenseSets without explicitly promising
that they're in range.  But that's a battle for another day; for now,
just fix this code, since we currently do something Very Bad when
compiling ffmpeg.

Fixes PR30914.

Reviewers: jeremyhu

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D26323

llvm-svn: 286038
2016-11-05 16:47:25 +00:00
Simon Pilgrim 1af0fc1103 [X86][SSE] Reuse zeroable element mask instead of regenerating it. NFCI
We are repeatedly calling computeZeroableShuffleElements in many shuffle lowering calls for the same shuffle mask/inputs.

This is a first step towards reusing the zeroable result, initially just for lowerVectorShuffleAsShift calls.

llvm-svn: 286037
2016-11-05 16:40:20 +00:00
Krzysztof Parzyszek a8d63dc289 [Hexagon] Split all selection patterns into a separate file
This is just the basic separation, without any cleanup. Further changes
will follow.

llvm-svn: 286036
2016-11-05 15:01:38 +00:00
Simon Pilgrim 1b4e1ac966 Strip trailing whitespace. NFCI.
llvm-svn: 286034
2016-11-05 14:43:04 +00:00
Craig Topper 8f354a7f8b [AVX-512] Use an equality compare instead of StringRef::startswith in a few places in auto upgrade that were looking for the complete intrinsic name anyway.
llvm-svn: 286033
2016-11-05 05:35:23 +00:00
Alina Sbirlea f802c3f975 Correct mprotect page boundries to round up end page. Fixes PR30905.
Summary:
Update the boundries for mprotect.
Patch by Andrew Adams. Fixes PR30905.

Reviewers: loladiro, andrew.w.kaylor, chandlerc

Subscribers: abadams, llvm-commits

Differential Revision: https://reviews.llvm.org/D26312

llvm-svn: 286032
2016-11-05 04:22:15 +00:00
Craig Topper a2f12e0a75 [X86] Remove broken support for autoupgrading llvm.x86.fma4.* intrinsics to llvm.x86.fma.*.
It currently fires an assert if you even try. Looking back, I don't think it ever worked because it only changed the name of the function object, but not the intrinsic ID stored in it. Given that, I think it can be removed since no one has noticed or complained in the past 4 years.

llvm-svn: 286031
2016-11-05 04:00:31 +00:00
Krzysztof Parzyszek b7eb7fc892 [Hexagon] Account for <def,read-undef> when validating moves for predication
llvm-svn: 286009
2016-11-04 20:41:03 +00:00
Weiming Zhao 6100118a52 Fix 24560: assembler does not share constant pool for same constants
Summary: This patch returns the same label if the CP entry with the same value has been created.

Reviewers: eli.friedman, rengolin, jmolloy

Subscribers: majnemer, jmolloy, llvm-commits

Differential Revision: https://reviews.llvm.org/D25804

llvm-svn: 286006
2016-11-04 19:17:32 +00:00
Zvi Rackover 85bc64c734 [X86] Broadcast from memory intructions aren't unfoldable
Broadcast from memory instructions should be treated as moves. They can't be unfolded.

Fixes pr30693.

llvm-svn: 285998
2016-11-04 15:15:19 +00:00
Tom Stellard 2d2d33f1dc Revert "AMDGPU: Add VI i16 support"
This reverts commit r285939 and r285948.  These broke some conformance tests.

llvm-svn: 285995
2016-11-04 13:06:34 +00:00
Jonas Paulsson baeb402014 Comment rewording in MachineScheduler.cpp.
Author: A Trick
llvm-svn: 285991
2016-11-04 08:31:14 +00:00
Chandler Carruth d9ef4b6601 Only log the visit of a return instruction if we in fact found a return
instruction.

This avoids dereferencing null in the debug logging if the instruction
was not in fact a return instruction. This potential bug was found by
PVS-Studio.

This actually fixes the last of the "dereferenced a pointer before
checking it for null" reports in the recent PVS-Studio run. However,
there are quite a few reports of this nature that I did not do anything
to fix because they are pretty glaring false positives. They usually
took the form of quite clear correlated checks or a check made in
a separate function. I've even added asserts anywhere this correlation
wasn't pretty obvious and fundamental to the code.

llvm-svn: 285988
2016-11-04 06:59:50 +00:00
Chandler Carruth 0f139b4f71 Hoist check for TLI above all of the attempts to use it (including one
of which that is hidden inside a separate function call) and helpfully
before building expensive transaction infrastructure. This will avoid
crashing when running CGP in a generic mode if we ever managed to hit
this case.

Note that I spent some time looking at alternatives. CGP is actually
used without a TM or TLI in order to do some target-independent testing.
Further, all of the neighboring optimization techniques actually have
some paths that are effective even in the absence of TLI so this seemed
the correct scope at which to check and bypass logic. It still isn't
clear that long-term support for missing TM/TLI is the right
cost/benefit tradeoff for CGP -- we seem to get relatively little for it
and the code is just littered with checks (and assumptions which
I suspect are still missing some checks).

This at least fixes the potential bug in this code spotted by
PVS-Studio, so we've got that going for us. ;]

llvm-svn: 285987
2016-11-04 06:54:00 +00:00
Xinliang David Li f450b88191 Fix typo
llvm-svn: 285978
2016-11-04 03:00:52 +00:00
Justin Bogner 2c2c6ac7b5 X86: Move a non-null assert to before the pointer is dereferenced
llvm-svn: 285975
2016-11-03 23:55:36 +00:00
Chandler Carruth 651f019297 Sink all of the code relying on the MachO MachineModuleInfo to live
behind the test that the MachineModuleInfo analysis was
actually available and can be used.

While the MachO bits may well be reasonable to assume in the darwin
assembly printer, the analysis isn't constructively guaranteed anywhere
I could find so it seems safest to avoid crashing here.

This issue was found with PVS-Studio. Pretty sure the Clang Static
Anaylzer flags similar issues but we've probably never pointed it at
this code effectively.

llvm-svn: 285972
2016-11-03 23:33:46 +00:00
Weiming Zhao 962eaaea9c [Cortex-M0] Atomic lowering
Summary: ARMv6m supports dmb etc fench instructions but not ldrex/strex etc. So for some atomic load/store, LLVM should inline instructions instead of lowering to __sync_ calls.

Reviewers: rengolin, efriedma, t.p.northover, jmolloy

Subscribers: efriedma, aemerson, llvm-commits

Differential Revision: https://reviews.llvm.org/D26120

llvm-svn: 285969
2016-11-03 21:49:08 +00:00
Kevin Enderby 7747cb55dc Add support for the ARM_THREAD_STATE64 and
in llvm-objdump for Mach-O files add the printing of the
ARM_THREAD_STATE64 in the same format as
otool-classic(1) on darwin.

To do this the 64-bit ARM general tread state
needed to be defined in include/llvm/Support/MachO.h .

rdar://28985800

llvm-svn: 285967
2016-11-03 20:51:28 +00:00
Tony Jiang 946242b5d2 NFC - Test commit.
Delete an empty line at the end of README.txt file.

llvm-svn: 285964
2016-11-03 20:32:21 +00:00
Adrian Prantl dbfda63695 Add DWARF debug info support for C++11 inline namespaces.
This implements the DWARF 5 DW_AT_export_symbols feature:
http://dwarfstd.org/ShowIssue.php?issue=141212.1

<rdar://problem/18616046>

llvm-svn: 285959
2016-11-03 19:42:02 +00:00
Kostya Serebryany 8a56917492 [libFuzzer] fix -error_exitcode=N, now with a test
llvm-svn: 285958
2016-11-03 19:31:18 +00:00
Justin Bogner f9fb2abb01 PDB: Fix some APIs to avoid use-after-frees
The buffer is already owned by the PDBFile for all of these APIs, so
don't pass it in separately.

llvm-svn: 285953
2016-11-03 18:28:04 +00:00
Tom Stellard cc34983181 AMDGPU/SI: Re add VIInstructions.td to unbreak bots
This file is unused as of r285939, but we need to keep it around
for bots that don't do full rebuilds.   We should be able to delete this
again in a few days.

llvm-svn: 285948
2016-11-03 17:56:46 +00:00
Chandler Carruth 5589aa60c7 Remove a redundant condition found by PVS-Studio.
Filed http://llvm.org/PR30897 to teach Clang to warn on this kind of
stuff.

llvm-svn: 285945
2016-11-03 17:42:02 +00:00
Tom Stellard 2b3379cdff AMDGPU: Add VI i16 support
Patch By: Wei Ding

Differential Revision: https://reviews.llvm.org/D18049

llvm-svn: 285939
2016-11-03 17:13:50 +00:00
Chandler Carruth 30e0029904 Delete a dead store found by PVS-Studio.
Quite sad we still aren't really using aggressive dead code warnings
from Clang that we could potentially use to catch this and so many other
things.

llvm-svn: 285936
2016-11-03 17:01:38 +00:00
Chandler Carruth fca1ff0da2 Fix a bug found by inspection by PVS-Studio.
This condition is trivially always true prior to the change. The comment
at the call site makes it clear that we expect *all* of these to be '=',
'S', or 'I' so fix the code.

We have a bug I will update to track the fact that Clang doesn't warn on
this: http://llvm.org/PR13101

llvm-svn: 285930
2016-11-03 16:39:25 +00:00
Alexander Timofeev f867a40bf6 [AMDGPU][CodeGen] To improve CGEMM performance: combine LDS reads.
hange explores the fact that LDS reads may be reordered even if access
the same location.

Prior the change, algorithm immediately stops as soon as any memory
access encountered between loads that are expected to be merged
together. Although, Read-After-Read conflict cannot affect execution
correctness.

Improves hcBLAS CGEMM manually loop-unrolled kernels performance by 44%.
Also improvement expected on any massive sequences of reads from LDS.

Differential Revision: https://reviews.llvm.org/D25944

llvm-svn: 285919
2016-11-03 14:37:13 +00:00
Zvi Rackover a455864fdf Refactor creation of X86ISD::SETCC nodes to a helper function. NFC.
llvm-svn: 285917
2016-11-03 14:25:24 +00:00
Nicolai Haehnle bea772c6dc DAGCombiner: fix use-after-free when merging consecutive stores
Summary:
Have MergeConsecutiveStores explicitly return information about the stores
that were merged, so that we can safely determine whether the starting
node has been freed.

Reviewers: chandlerc, bogner, niravd

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25601

llvm-svn: 285916
2016-11-03 14:25:04 +00:00
James Molloy e7d97368f2 Revert "[Thumb] Teach ISel how to lower compares of AND bitmasks efficiently"
This reverts commit r285893. It caused (probably) http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-a15-full-sh/builds/83 .

llvm-svn: 285912
2016-11-03 14:08:01 +00:00
James Molloy b60d8b1987 [Thumb] Teach ISel how to lower compares of AND bitmasks efficiently
This recommits r281323, which was backed out for two reasons. One, a selfhost failure, and two, it apparently caused Chromium failures. Actually, the latter was a red herring. The log has expired from the former, but I suspect that was a red herring too (actually caused by another problematic patch of mine). Therefore reapplying, and will watch the bots like a hawk.

For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)).

1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS.
2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS.
3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS).
4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask.

1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win.

llvm-svn: 285893
2016-11-03 10:18:20 +00:00
George Rimar e1924f06d1 [Object/ELF] - Make getSymbol() return Error.
That is consistent with other methods around
and helps to handle error on a caller side.

Differential revision: https://reviews.llvm.org/D26247

llvm-svn: 285886
2016-11-03 08:40:55 +00:00
Craig Topper 7b9cc1474f [AVX-512] Use 'vnot' instead of 'not' in patterns involving vXi1 vectors.
This fixes selection of KANDN instructions and allows us to remove an extra set of patterns for KNOT and KXNOR.

Reviewers: delena, igorb

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26134

llvm-svn: 285878
2016-11-03 06:04:28 +00:00
Elena Demikhovsky caaceef4b3 Expandload and Compressstore intrinsics
2 new intrinsics covering AVX-512 compress/expand functionality.
This implementation includes syntax, DAG builder, operation lowering and tests.
Does not include: handling of illegal data types, codegen prepare pass and the cost model.

llvm-svn: 285876
2016-11-03 03:23:55 +00:00
Teresa Johnson 0515fb8d4b [ThinLTO] Handle distributed backend case when doing renaming
Summary:
The recent change I made to consult the summary when deciding whether to
rename (to handle inline asm) in r285513 broke the distributed build
case. In a distributed backend we will only have a portion of the
combined index, specifically for imported modules we only have the
summaries for any imported definitions. When renaming on import we were
asserting because no summary entry was found for a local reference being
linked in (def wasn't imported).

We only need to consult the summary for a renaming decision for the
exporting module. For imports, we would have prevented importing any
references to NoRename values already.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26250

llvm-svn: 285871
2016-11-03 01:07:16 +00:00
Greg Bedwell 5fc6f94591 Revert "[InstCombine] allow splat vector folds in adjustMinMax()"
This reverts commit r285732.

This change introduced a new assertion failure in the following
testcase at -O2:

typedef short __v8hi __attribute__((__vector_size__(16)));
__v8hi foo(__v8hi &V1, __v8hi &V2, unsigned mask) {
  __v8hi Result = V1;
  if (mask & 0x80)
    Result[0] = V2[0];
  return Result;
}

llvm-svn: 285866
2016-11-02 23:17:05 +00:00
Adrian McCarthy 4333daab1c Emit S_COMPILE3 record once per TU rather than once per function
This has some ripple effects in several tests.

llvm-svn: 285862
2016-11-02 21:30:35 +00:00
Kevin Enderby fbebe1632a Add the rest of the additional error checks for invalid Mach-O files when
the offsets and sizes of an element of the Mach-O file overlaps with
another element in the Mach-O file.

Some other tests for malformed Mach-O files now run into these
checks so their tests were also adjusted.

llvm-svn: 285860
2016-11-02 21:08:39 +00:00
Eli Friedman b6befc3bc4 DCE math library calls with a constant operand.
On platforms which use -fmath-errno, math libcalls without any uses
require some extra checks to figure out if they are actually dead.

Fixes https://llvm.org/bugs/show_bug.cgi?id=30464 .

Differential Revision: https://reviews.llvm.org/D25970

llvm-svn: 285857
2016-11-02 20:48:11 +00:00
Krzysztof Parzyszek ead77016d8 [Hexagon] Remove registers coalesced in expand-condsets from live intervals
llvm-svn: 285846
2016-11-02 17:59:54 +00:00
Davide Italiano 6b2bba14a9 [lli/COFF] Set the correct alignment for common symbols
Otherwise we set it always to zero, which is not correct,
and we assert inside alignTo (Assertion failed:
Align != 0u && "Align can't be 0.").

Differential Revision:  https://reviews.llvm.org/D26173

llvm-svn: 285841
2016-11-02 17:32:19 +00:00
Zachary Turner 7251ede7c5 Add CodeViewRecordIO for reading and writing.
Using a pattern similar to that of YamlIO, this allows
us to have a single codepath for translating codeview
records to and from serialized byte streams.  The
current patch only hooks this up to the reading of
CodeView type records.  A subsequent patch will hook
it up for writing of CodeView type records, and then a
third patch will hook up the reading and writing of
CodeView symbols.

Differential Revision: https://reviews.llvm.org/D26040

llvm-svn: 285836
2016-11-02 17:05:19 +00:00
Nicolai Haehnle 368972c3b3 AMDGPU: Allow additional implicit operands on MOVRELS instructions
Summary:
The post-RA scheduler occasionally uses additional implicit operands when
the vector implicit operand as a whole is killed, but some subregisters
are still live because they are directly referenced later. Unfortunately,
this seems incredibly subtle to reproduce.

Fixes piglit spec/glsl-110/execution/variable-indexing/vs-temp-array-mat2-index-wr.shader_test
and others.

Reviewers: arsenm, tstellarAMD

Subscribers: kzhuravl, wdng, yaxunl, tony-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D25656

llvm-svn: 285835
2016-11-02 17:03:11 +00:00
Malcolm Parsons 06ac79c210 Fix Clang-tidy readability-redundant-string-cstr warnings
Reviewers: beanz, lattner, jlebar

Subscribers: jholewinski, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D26235

llvm-svn: 285832
2016-11-02 16:43:50 +00:00
Nirav Dave 0a392a8e7f [ARM][MC] Cleanup ARM Target Assembly Parser
Summary:
Correctly parse end-of-statement tokens and handle preprocessor
end-of-line comments in ARM assembly processor.

Reviewers: rnk, majnemer

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: https://reviews.llvm.org/D26152

llvm-svn: 285830
2016-11-02 16:22:51 +00:00
Adrian Prantl f148d6989c Improve and cleanup comments in DwarfExpression.h
llvm-svn: 285829
2016-11-02 16:20:37 +00:00
Matt Arsenault 44deb7914e BranchRelaxation: Fix computing indirect branch block size
llvm-svn: 285828
2016-11-02 16:18:29 +00:00
Adrian Prantl 54286bda2c Simplify control flow in the the DWARF expression compiler
by refactoring common code into a DwarfExpressionCursor wrapper.

llvm-svn: 285827
2016-11-02 16:12:20 +00:00
Adrian Prantl 56527dc804 Emit DW_OP_piece also if the previous value was a constant.
This fixes a bug in the DWARF backend.

llvm-svn: 285826
2016-11-02 16:12:16 +00:00
Simon Pilgrim 93f2f7fb6c Use !operator to test if APInt is zero/non-zero. NFCI.
Avoids APInt construction and slower comparisons.

llvm-svn: 285822
2016-11-02 15:41:15 +00:00
Vasileios Kalintiris e3bb72ea78 [mips] Always run the MipsOptimizePICCall pass.
Summary:
Remove this pass from addMachineSSAOptimization() and register it unconditionally in through addPreRegAlloc(). This pass is required for generating correct PIC calls.

Reviewers: sdardis

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26036

llvm-svn: 285814
2016-11-02 15:11:27 +00:00
Joerg Sonnenberger bef3621ad0 Create the virtual register for the global base in the intersection of
GPRC and GPRC_NOR0 (or the 64bit equivalent) and not just the latter.
GPRC_NOR0 contains ZERO as alternative meaning of r0 and is therefore
not a true subclass of GPRC.

llvm-svn: 285813
2016-11-02 15:00:31 +00:00
Aaron Ballman 3ac3a7efff Removing a switch statement that contains a default label, but no case labels. Silences an MSVC warning; NFC.
llvm-svn: 285806
2016-11-02 13:58:57 +00:00
Joerg Sonnenberger 5e31b3ad93 Simplify.
llvm-svn: 285802
2016-11-02 12:45:28 +00:00
Ulrich Weigand 75d2f1b10d [SystemZ] Fix compiler warnings introduced by r285574
SystemZAsmParser::parseOperand returns a bool, not an enum.

llvm-svn: 285800
2016-11-02 11:32:28 +00:00
Kirill Bobyrev 1f1751182e [llvm] FIx if-clause -Wmisleading-indentation issue.
While bootstrapping Clang with recent `gcc 6.2.0` I found a bug related to misleading indentation.

I believe, a pair of `{}` was forgotten, especially given the above similar piece of code:

```
      if (!RDef || !HII->isPredicable(*RDef)) {
        Done = coalesceRegisters(RD, RegisterRef(S1));
        if (Done) {
          UpdRegs.insert(RD.Reg);
          UpdRegs.insert(S1.getReg());
        }
      }
```

Reviewers: kparzysz

Differential Revision: https://reviews.llvm.org/D26204

llvm-svn: 285794
2016-11-02 10:00:40 +00:00
Bjorn Pettersson 7424c8ccd1 [Reassociate] Skip analysis of dead code to avoid infinite loop.
Summary:
It was detected that the reassociate pass could enter an inifite
loop when analysing dead code. Simply skipping to analyse basic
blocks that are dead avoids such problems (and as a side effect
we avoid spending time on optimising dead code).

The solution is using the same Reverse Post Order ordering of the
basic blocks when doing the optimisations, as when building the
precalculated rank map. A nice side-effect of this solution is
that we now know that we only try to do optimisations for blocks
with ranked instructions.

Fixes https://llvm.org/bugs/show_bug.cgi?id=30818

Reviewers: llvm-commits, davide, eli.friedman, mehdi_amini

Subscribers: dberlin

Differential Revision: https://reviews.llvm.org/D26154

llvm-svn: 285793
2016-11-02 08:55:19 +00:00
Dylan McKay 7549b0a013 [AVR] Add instruction selection lowering code
Summary: This adds AVRISelLowering.cpp

Reviewers: arsenm, kparzysz

Subscribers: llvm-commits, modocache, japaric, wdng, beanz, mgorny

Differential Revision: https://reviews.llvm.org/D25034

llvm-svn: 285790
2016-11-02 06:47:40 +00:00
Peter Collingbourne ff2c2ec6b2 Bitcode: Check file size before reading bitcode header.
Should unbreak ocaml binding tests.

Also added an llvm-dis test that checks for the same thing.

llvm-svn: 285777
2016-11-02 00:39:11 +00:00
Peter Collingbourne 4e76019e34 Support: Remove MemoryObject and DataStreamer interfaces.
These interfaces are no longer used.

Differential Revision: https://reviews.llvm.org/D26222

llvm-svn: 285774
2016-11-02 00:08:37 +00:00
Peter Collingbourne 028eb5a3f8 Bitcode: Change reader interface to take memory buffers.
As proposed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-October/106595.html

This change also fixes an API oddity where BitstreamCursor::Read() would
return zero for the first read past the end of the bitstream, but would
report_fatal_error for subsequent reads. Now we always report_fatal_error
for all reads past the end. Updated clients to check for the end of the
bitstream before reading from it.

I also needed to add padding to the invalid bitcode tests in
test/Bitcode/. This is because the streaming interface was not checking that
the file size is a multiple of 4.

Differential Revision: https://reviews.llvm.org/D26219

llvm-svn: 285773
2016-11-02 00:08:19 +00:00
Alex Bradbury 6b2cca7f8f [RISCV] Add bare-bones RISC-V MCTargetDesc
This is enough to compile and link but doesn't yet do anything particularly 
useful. Once an ASM parser and printer are added in the next two patches, the 
whole thing can be usefully tested.

Differential Revision: https://reviews.llvm.org/D23562

llvm-svn: 285770
2016-11-01 23:47:30 +00:00
Alex Bradbury 24d9b13b36 [RISCV 4/10] Add basic RISCV{InstrFormats,InstrInfo,RegisterInfo,}.td
For now, only add instruction definitions for basic ALU operations. Our 
initial target is a working MC layer rather than codegen, so appropriate 
SelectionDAG patterns will come later.

Differential Revision: https://reviews.llvm.org/D23561

llvm-svn: 285769
2016-11-01 23:40:28 +00:00
Matt Arsenault c507cdb4bc AMDGPU: Handle CopyToReg in getOperandRegClass
llvm-svn: 285768
2016-11-01 23:22:17 +00:00
Matt Arsenault 663ab8c119 AMDGPU: Use brev for materializing SGPR constants
This is already done with VGPR immediates and saves 4 bytes.

llvm-svn: 285765
2016-11-01 23:14:20 +00:00
Matt Arsenault 3d463193a9 AMDGPU: Default to using scalar mov to materialize immediate
This is the conservatively correct way because it's easy to
move or replace a scalar immediate. This was incorrect in the case
when the register class wasn't known from the static instruction
definition, but still needed to be an SGPR. The main example of this
is inlineasm has an SGPR constraint.

Also start verifying the register classes of inlineasm operands.

llvm-svn: 285762
2016-11-01 22:55:07 +00:00
Eric Christopher 690f8e587e Move the initialization of PreferredLoopExit into runOnMachineFunction to be near the other function specific initializations.
llvm-svn: 285758
2016-11-01 22:15:50 +00:00
Sam McCall 2a36eee4ca Fix uninitialized access in MachineBlockPlacement.
Summary:
Currently PreferredLoopExit is set only in buildLoopChains, which is
never called if there are no MachineLoops.

MSan is currently broken by this:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/145/steps/check-llvm%20msan/logs/stdio

This is a naive fix to get things green again. iteratee: you may have a better fix.

This change will also mean PreferredLoopExit will not carry over if
buildCFGChains() is called a second time in runOnMachineFunction, this
appears to be the right thing.

Reviewers: bkramer, iteratee, echristo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26069

llvm-svn: 285757
2016-11-01 22:02:14 +00:00
Matt Arsenault a6319b82ca AMDGPU: Stop creating unused virtual registers
These are only used in the spill to VMEM path. Move them to
the one use.

llvm-svn: 285756
2016-11-01 21:58:07 +00:00
George Burgess IV 66837aba0a [MemorySSA] Tighten up types to make our API prettier. NFC.
Patch by bryant.

Differential Revision: https://reviews.llvm.org/D26126

llvm-svn: 285750
2016-11-01 21:17:46 +00:00
Sanjay Patel 9840cdad4c [ValueTracking] remove TODO comment; NFC
InstCombine should always canonicalize patterns like the one shown in the comment
when visiting 'select' insts in adjustMinMax().

Scalars were already handled there, and vector splats are handled after:
https://reviews.llvm.org/rL285732

llvm-svn: 285744
2016-11-01 20:43:00 +00:00
Matt Arsenault 2d8c289b4b AMDGPU: Workaround for instruction size with literals
Instructions with a 32-bit base encoding with an optional
32-bit literal encoded after them report their size as 4
for the disassembler. Consider these when computing the
MachineInstr size. This fixes problems caused by size estimate
consistency in BranchRelaxation.

llvm-svn: 285743
2016-11-01 20:42:24 +00:00
Sanjay Patel c3d89842ad [InstCombine] allow splat vector folds in adjustMinMax()
llvm-svn: 285732
2016-11-01 20:08:02 +00:00
Sanjay Patel c0339c77ef [InstCombine] Fold nuw left-shifts in `ugt`/`ule` comparisons.
This transforms

%a = shl nuw %x, c1
%b = icmp {ugt|ule} %a, c0

into

%b = icmp {ugt|ule} %x, (c0 >> c1)

z3:

(declare-const x (_ BitVec 64))
(declare-const c0 (_ BitVec 64))
(declare-const c1 (_ BitVec 64))

(push)
(assert (= x (bvlshr (bvshl x c1) c1)))  ; nuw
(assert (not (= (bvugt (bvshl x c1) c0)
                (bvugt x
                       (bvlshr c0 c1)))))
(check-sat)
(get-model)
(pop)

(push)
(assert (= x (bvlshr (bvshl x c1) c1)))  ; nuw
(assert (not (= (bvule (bvshl x c1) c0)
                (bvule x
                       (bvlshr c0 c1)))))
(check-sat)
(get-model)
(pop)

Patch by bryant!

Differential Revision: https://reviews.llvm.org/D25913

llvm-svn: 285729
2016-11-01 19:19:29 +00:00
Krzysztof Parzyszek 654dc11b79 [Hexagon] Rename operand/predicate names for unshifted integers
For example, rename s6Ext to s6_0Ext. The names for shifted integers
include the underscore and this will make the naming consistent. It
also exposed a few duplicates that were removed.

llvm-svn: 285728
2016-11-01 19:02:10 +00:00
Matt Arsenault cb578f84e0 BranchRelaxation: Expand unconditional branches first
It's likely if a conditional branch needs to be expanded, the following
unconditional branch will also need expansion. By expanding the
unconditional branch first, the conditional branch can be simply
inverted to jump over the inserted indirect branch block. If the
conditional branch is expanded first, it results in an additional
branch.

This avoids test regressions in future commits.

llvm-svn: 285722
2016-11-01 18:34:00 +00:00
Sanjay Patel 644d7c3b8a [InstCombine] clean up adjustMinMax(); NFCI
1. Change param names for readability
2. Change pointer param to ref
3. Early exit to reduce indent
4. Change switch to if/else

llvm-svn: 285718
2016-11-01 18:15:03 +00:00
Konstantin Zhuravlyov d971a1123f [AMDGPU] Check if type transforms to i16 (VI+) when getting AMDGPUISD::FFBH_U32
This will prevent following regression when enabling i16 support (D18049):

test/CodeGen/AMDGPU/ctlz.ll
test/CodeGen/AMDGPU/ctlz_zero_undef.ll

Differential Revision: https://reviews.llvm.org/D25802

llvm-svn: 285716
2016-11-01 17:49:33 +00:00
Sanjay Patel 7ce658388b [InstCombine] add helper function for adjustMinMax(); NFCI
This is just a cut and paste; clean-up and enhancements to follow.

llvm-svn: 285715
2016-11-01 17:46:08 +00:00
Alex Bradbury b2e5472d85 [RISCV] Add stub backend
This contains just enough for lib/Target/RISCV to compile. Notably a basic 
RISCVTargetMachine and RISCVTargetInfo. At this point you can attempt llc 
-march=riscv32 myinput.ll and will find it fails due to the lack of 
MCAsmInfo.

See http://lists.llvm.org/pipermail/llvm-dev/2016-August/103748.html for 
further discussion

Differential Revision: https://reviews.llvm.org/D23560

llvm-svn: 285712
2016-11-01 17:27:54 +00:00
Tom Stellard 9677b60288 AMDGPU: Fix buildbots broken by r285704
llvm-svn: 285711
2016-11-01 17:20:03 +00:00
Alex Bradbury 1524f62b97 [RISCV] Add RISC-V ELF defines
Add the necessary definitions for RISC-V ELF files, including relocs. Also 
make necessary trivial change to ELFYaml, llvm-objdump, and llvm-readobj in 
order to work with RISC-V ELFs.

Differential Revision: https://reviews.llvm.org/D23557

llvm-svn: 285708
2016-11-01 16:59:37 +00:00
Alex Bradbury b6e784a240 [RISCV] Recognise riscv32 and riscv64 in triple parsing code
This is the first in a series of 10 initial patches that incrementally add an 
MC layer for RISC-V to LLVM. See 
<http://lists.llvm.org/pipermail/llvm-dev/2016-August/103748.html> for more 
discussion.

Differential Revision: https://reviews.llvm.org/D23557

llvm-svn: 285707
2016-11-01 16:47:54 +00:00
Alex Bradbury 58eba09949 [TableGen] Move OperandMatchResultTy enum to MCTargetAsmParser.h
As it stands, the OperandMatchResultTy is only included in the generated
header if there is custom operand parsing. However, almost all backends
make use of MatchOperand_Success and friends from OperandMatchResultTy for
e.g. parseRegister. This is a pain when starting an AsmParser for a new
backend that doesn't yet have custom operand parsing. Move the enum to
MCTargetAsmParser.h.

This patch is a prerequisite for D23563

Differential Revision: https://reviews.llvm.org/D23496

llvm-svn: 285705
2016-11-01 16:32:05 +00:00
Tom Stellard 94c21bc088 AMDGPU: Implement expansion of f16 = FP_TO_FP16 f64
I wanted to implement this as a target independent expansion, however when
targets say they want to expand FP_TO_FP16 what they actually want is
the unsafe math expansion when possible and expansion to a libcall in all
other cases.

The only way to make this work as a target independent would be to add logic
to target's TargetLowering construction to mark theses nodes as Expand when
LegalizeDAG can use the unsafe expansion and mark them as LibCall when it
cannot.  I think this would be possible, but I think it would be too fragile
and complex as it would require targets to keep their expansion logic up
to date with the code in LegalizeDAG.

Reviewers: bogner, ab, t.p.northover, arsenm

Subscribers: wdng, llvm-commits, nhaehnle

Differential Revision: https://reviews.llvm.org/D25999

llvm-svn: 285704
2016-11-01 16:31:48 +00:00
Simon Pilgrim 6dd8fab443 [InstCombine] Folding of shifts by the sum of positive values
This patch introduces the combine:

(C1 shift (A add C2)) -> ((C1 shift C2) shift A)
iff A and C2 are both positive

If both A and C2 are know to be positive then we can safely split into 2 shifts, permitting the folding of the Inner shift.

Fix for the spec benchmark case mentioned by @nadav on PR15141 (assuming we can prove that the inputs as positive).

Differential Revision: https://reviews.llvm.org/D26000

llvm-svn: 285696
2016-11-01 15:40:30 +00:00
James Molloy 70a3d6df52 [Thumb-1] Synthesize TBB/TBH instructions to make use of compressed jump tables
[Reapplying r284580 and r285917 with fix and testing to ensure emitted jump tables for Thumb-1 have 4-byte alignment]

The TBB and TBH instructions in Thumb-2 allow jump tables to be compressed into sequences of bytes or shorts respectively. These instructions do not exist in Thumb-1, however it is possible to synthesize them out of a sequence of other instructions.

It turns out this sequence is so short that it's almost never a lose for performance and is ALWAYS a significant win for code size.

TBB example:
Before: lsls r0, r0, #2    After: add  r0, pc
        adr  r1, .LJTI0_0         ldrb r0, [r0, #6]
        ldr  r0, [r0, r1]         lsls r0, r0, #1
        mov  pc, r0               add  pc, r0
  => No change in prologue code size or dynamic instruction count. Jump table shrunk by a factor of 4.

The only case that can increase dynamic instruction count is the TBH case:

Before: lsls r0, r4, #2    After: lsls r4, r4, #1
        adr  r1, .LJTI0_0         add  r4, pc
        ldr  r0, [r0, r1]         ldrh r4, [r4, #6]
        mov  pc, r0               lsls r4, r4, #1
                                  add  pc, r4
  => 1 more instruction in prologue. Jump table shrunk by a factor of 2.

So there is an argument that this should be disabled when optimizing for performance (and a TBH needs to be generated). I'm not so sure about that in practice, because on small cores with Thumb-1 performance is often tied to code size. But I'm willing to turn it off when optimizing for performance if people want (also note that TBHs are fairly rare in practice!)

llvm-svn: 285690
2016-11-01 13:37:41 +00:00
Valery Pykhtin 8a89d3662a [AMDGPU] Expand vector mulhu/mulhs
Differential revision: https://reviews.llvm.org/D26077

llvm-svn: 285684
2016-11-01 10:26:48 +00:00
Nemanja Ivanovic e70fa63390 [PowerPC] Implement vector shift builtins - llvm portion
This patch corresponds to review https://reviews.llvm.org/D26095.
Committing on behalf of Tony Jiang.

llvm-svn: 285681
2016-11-01 09:42:32 +00:00
Serge Pavlov 6ac8e034f6 Allow resolving response file names relative to including file
If a response file included by construct @file itself includes a response file
and that file is specified by relative file name, current behavior is to resolve
the name relative to the current working directory. The change adds additional
flag to ExpandResponseFiles that may be used to resolve nested response file
names relative to including file. With the new mode a set of related response
files may be kept together and reference each other with short position
independent names.

Differential Revision: https://reviews.llvm.org/D24917

llvm-svn: 285675
2016-11-01 06:53:29 +00:00
Sanjoy Das 9c729067e9 [TBAA] Use wrapper objects instead of raw getOperand s; NFC
This is intended to make the semantic intent clearer.

The wrapper objects are now generic to avoid `const_cast` s.  Since
`const` ness is part of the API of `MDNode::getMostGenericTBAA` (and
therefore I can't make things `const` all the way through without some
code churn outside TypeBasedAliasAnalysis.cpp), this seemed like the
cleanest solution.

llvm-svn: 285665
2016-11-01 02:58:30 +00:00
Sanjoy Das a1a77f3a21 [TBAA] Rename accessors to be more idiomatic; NFC
llvm-svn: 285661
2016-11-01 01:21:57 +00:00
Peter Collingbourne d3a6c70b2d Bitcode: Simplify BitstreamWriter::EnterBlockInfoBlock() interface.
No block info block should need to define local abbreviations, so we can
always use a code width of 2.

Also change all block info block writers to use EnterBlockInfoBlock.

Differential Revision: https://reviews.llvm.org/D26168

llvm-svn: 285660
2016-11-01 01:18:57 +00:00
Matt Arsenault f3dd863031 AMDGPU: Whitespace fixes
llvm-svn: 285659
2016-11-01 00:55:14 +00:00
Sanjay Patel 70c5f02d25 [DAG] disable nsw/nuw for add/sub/mul when simplifying based on demanded bits (PR30841)
This bug was exposed by using nsw/nuw for more aggressive folds in:
https://reviews.llvm.org/rL284844

The changes mimic the IR demanded bits logic in InstCombiner::SimplifyDemandedUseBits(),
but we can't just flip flag bits in the DAG; we have to create a new node that has the
bits cleared.

This should fix:
https://llvm.org/bugs/show_bug.cgi?id=30841 

llvm-svn: 285656
2016-10-31 23:28:45 +00:00
Davide Italiano 51cbe13a3f [Hexagon] Garbage collect dead code.
llvm-svn: 285654
2016-10-31 22:56:56 +00:00
Evgeniy Stepanov 1bd9fc7098 Fix a typo.
Found with PVS-Studio here: http://www.viva64.com/en/b/0446/

llvm-svn: 285652
2016-10-31 22:42:39 +00:00
Saleem Abdulrasool e1aa782bd0 CodeGen: further loosen -O0 CG for WoA division
Generate the slowest possible codepath for noopt CodeGen.  Even trying to be
clever with the negated jump can cause out-of-range jumps.  Use a wide branch
instead. Although the code is modelled simplistically, the later optimizations
would recombine the branching into `cbz` if possible.  This re-enables the
previous optimization as well as hopefully gives us working code in all cases.

Addresses PR30356!

llvm-svn: 285649
2016-10-31 22:12:37 +00:00
Teresa Johnson 002af9bbce [ThinLTO] Disable importing and other cross-module optis at -O0
Summary:
There is no point to importing at -O0, since we won't inline. We should
also disable other cross-module optimizations.

(Plan to backport this fix to the 3.9 branch to fix PR30774)

Reviewers: pcc

Subscribers: johanengelen, mehdi_amini

Differential Revision: https://reviews.llvm.org/D25918

llvm-svn: 285648
2016-10-31 22:12:21 +00:00
Justin Lebar ed1e312f05 [NVPTX] Remove NVPTXFavorNonGenericAddrSpaces pass.
Summary:
This has been replaced by the NVPTXInferAddressSpaces pass.  We've had
the new one as the default with the old one accessible via a flag for
some months now, and we've had no problems.

Reviewers: tra

Subscribers: llvm-commits, jholewinski, jingyue, mgorny

Differential Revision: https://reviews.llvm.org/D26165

llvm-svn: 285642
2016-10-31 21:51:42 +00:00
Kevin Enderby d503940e8f More additional error checks for invalid Mach-O files when
the offsets and sizes of an element of the file overlaps with
another element in the Mach-O file.

This shows the approach to this testing for three elements
and contains for tests for their overlap.  Checking for all the
remain elements will be added next.

llvm-svn: 285632
2016-10-31 20:29:48 +00:00
Nemanja Ivanovic 60bdfe5a7c [PPC] add absolute difference altivec instructions and matching intrinsics
This patch corresponds to review https://reviews.llvm.org/D26072.
Committing on behalf of Sean Fertile.

llvm-svn: 285627
2016-10-31 19:47:52 +00:00
Victor Leschuk e1156c2eb0 DebugInfo: make DW_TAG_atomic_type valid
DW_TAG_atomic_type was already included in Dwarf.defs and emitted correctly,
however Verifier didn't recognize it as valid.
Thus we introduce the following changes:

  * Make DW_TAG_atomic_type valid tag for IR and DWARF (enabled only with -gdwarf-5)
  * Add it to related docs
  * Add DebugInfo tests

Differential Revision: https://reviews.llvm.org/D26144

llvm-svn: 285624
2016-10-31 19:09:38 +00:00
Kuba Brecka a28c9e8f09 [asan] Move instrumented null-terminated strings to a special section, LLVM part
On Darwin, simple C null-terminated constant strings normally end up in the __TEXT,__cstring section of the resulting Mach-O binary. When instrumented with ASan, these strings are transformed in a way that they cannot be in __cstring (the linker unifies the content of this section and strips extra NUL bytes, which would break instrumentation), and are put into a generic __const section. This breaks some of the tools that we have: Some tools need to scan all C null-terminated strings in Mach-O binaries, and scanning all the contents of __const has a large performance penalty. This patch instead introduces a special section, __asan_cstring which will now hold the instrumented null-terminated strings.

Differential Revision: https://reviews.llvm.org/D25026

llvm-svn: 285619
2016-10-31 18:51:58 +00:00
Tim Northover 037af52c8b GlobalISel: allow truncating pointer casts on AArch64.
llvm-svn: 285615
2016-10-31 18:31:09 +00:00
Tim Northover cdf23f1d93 GlobalISel: translate stack protector intrinsics
llvm-svn: 285614
2016-10-31 18:30:59 +00:00