Commit Graph

36550 Commits

Author SHA1 Message Date
Sanjay Patel 84a0bf64a8 revert r268751 - caused same failures on msan bot
llvm-svn: 268765
2016-05-06 17:51:37 +00:00
Simon Pilgrim b3f5cb7a65 [CostModel][X86] Tweak 'SSE2-only' test CPU as it was only disabling SSE41 not SSE3/SSSE3 etc.
llvm-svn: 268763
2016-05-06 17:50:07 +00:00
Artem Tamazov ebe71ce36a [AMDGPU][llvm-mc] Add support for sendmsg(...) syntax.
Added support for sendmsg(MSG[, OP[, STREAM_ID]]) syntax
in s_sendmsg and s_sendmsghalt instructions.
The syntax matches the SP3 assembler/disassembler rules.
That is why implicit inputs (like M0 and EXEC) are not printed
to disassembly output anymore.

sendmsg(...) allows only known message types and attributes,
even if literals are used instead of symbolic names.
However, raw literal (without "sendmsg") still can be used,
and that allows for any 16-bit value.

Tests updated/added.

Differential Revision: http://reviews.llvm.org/D19596

llvm-svn: 268762
2016-05-06 17:48:48 +00:00
Simon Pilgrim 93d9b96bdb [CostModel][X86] Added ctlz/cttz undef-zero costmodel tests
llvm-svn: 268761
2016-05-06 17:48:35 +00:00
Ahmed Bougacha 258426ca7a [X86] Teach X86FixupBWInsts to promote MOV8rr/MOV16rr to MOV32rr.
Codesize is less (16) or equal (8), and we avoid partial dependencies.

Differential Revision: http://reviews.llvm.org/D19999

llvm-svn: 268760
2016-05-06 17:42:57 +00:00
Geoff Berry f8862968db [AArch64] Fix test to specify triple and disable post-RA scheduling.
This should fix bot breakage caused by r268746:
[AArch64] Combine callee-save and local stack SP adjustment instructions.

llvm-svn: 268752
2016-05-06 17:12:38 +00:00
Sanjay Patel 6609510c32 [SimplifyCFG] propagate branch metadata when creating select (retry r268550 with possible fix)
Retrying r268550 which was reverted at r268577 due a memory sanitizer failure.
I have not been able to reproduce that failure, but I've taken a guess at fixing
the problem in this version of the patch and will watch for another failure.

Original commit message:
Unlike earlier similar fixes, we need to recalculate the branch weights
in this case.

Differential Revision: http://reviews.llvm.org/D19674

llvm-svn: 268751
2016-05-06 17:07:47 +00:00
Geoff Berry a5335647d5 [AArch64] Combine callee-save and local stack SP adjustment instructions.
Summary:
If a function needs to allocate both callee-save stack memory and local
stack memory, we currently decrement/increment the SP in two steps:
first for the callee-save area, and then for the local stack area.  This
changes the code to allocate them both at once at the very beginning/end
of the function.  This has two benefits:

1) there is one fewer sub/add micro-op in the prologue/epilogue

2) the stack adjustment instructions act as a scheduling barrier, so
moving them to the very beginning/end of the function increases post-RA
scheduler's ability to move instructions (that only depend on argument
registers) before any of the callee-save stores

This change can cause an increase in instructions if the original local
stack SP decrement could be folded into the first store to the stack.
This occurs when the first local stack store is to stack offset 0.  In
this case we are trading off one more sub instruction for one fewer sub
micro-op (along with benefits (2) and (3) above).

Reviewers: t.p.northover

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18619

llvm-svn: 268746
2016-05-06 16:34:59 +00:00
Nikolay Haustov 6eb050ea4e Revert "AMDGPU/SI: Add amdgpu_kernel calling convention. Part 2."
This reverts commit 47486d52454d60cdf6becc0b2efe533c73794380.

It broke calling OpenCL kernel from another kernel.

llvm-svn: 268739
2016-05-06 14:59:04 +00:00
Simon Pilgrim 5122c64fd8 [CostModel][X86] Added costmodel tests for vector ctpop/ctlz/cttz/bitreverse/bswap
llvm-svn: 268738
2016-05-06 14:38:14 +00:00
Daniel Sanders 8de3d3cad6 [mips] Fix inconsistent .cprestore behaviour between direct object emission and assembling.
Summary:
Direct object emission has an initialization order problem where an
InitMCObjectFile is called after MipsTargetELFStreamer determines whether
PIC is enabled by default or not. There doesn't seem to be point that
initializes all cases so split the responsibility between
MipsTargetELFStreamer and MipsAsmPrinter.

Reviewers: sdardis

Subscribers: dsanders, llvm-commits, sdardis

Differential Revision: http://reviews.llvm.org/D19728

llvm-svn: 268737
2016-05-06 14:37:24 +00:00
Chad Rosier 4ab37c0037 [SimplifyCFG] Prefer a simplification based on a dominating condition.
Rather than merge two branches with a common destination.
Differential Revision: http://reviews.llvm.org/D19743

llvm-svn: 268735
2016-05-06 14:25:14 +00:00
Daniel Sanders a463d31a64 [mips] Correct the ordering of HI/LO pairs in the relocation table.
Summary:
There seems to have been a misunderstanding as to the meaning of 'offset' in
the rules laid down by our ABI. The previous code believed that 'offset' meant
the offset within the section that the relocation is applied to. However, it
should have meant the offset from the symbol used in the relocation expression.

This patch adds two fields to ELFRelocationEntry and uses them to correct the
order of relocations for MIPS. These fields contain:
* The original symbol before shouldRelocateWithSymbol() is considered. This
  ensures that R_MIPS_GOT16 is able to correctly distinguish between local and
  external symbols, allowing us to tell whether %got() requires a matching
  %lo() or not (local symbols require one, external symbols don't). It also
  prevents confusing cases where the fuzzy matching rules cause things like
  %hi(foo)/%lo(foo+3) and %hi(bar)/%lo(bar+1) to swap their %lo()'s.
* The original offset before shouldRelocateWithSymbol() is considered. The
  existing Addend field is always zero when the object uses in place addends
  (because it's already moved it to the encoding) but MIPS needs to use the
  original offset to ensure that the linker correctly calculates the carry-in
  bit for %hi() and %got().

IAS ensures that unmatchable %hi()/%got() relocations are placed at the end of
the table to ensure that the linker rejects the table (we're unable to report
such errors directly). The alternatives to this risk accidental matching
against inappropriate relocations which may silently compute incorrect values
due to an incorrect carry bit between the %lo() and %hi()/%got().

Reviewers: sdardis

Subscribers: dsanders, sdardis, rafael, llvm-commits

Differential Revision: http://reviews.llvm.org/D19718

llvm-svn: 268733
2016-05-06 13:49:25 +00:00
Daniel Sanders f9d8b8ccc5 [mips][mips16] Use isUnconditionalBranch() in AnalyzeBranch() and constant island pass.
Summary:
This stops it misidentifying unconditional branches as conditional branches
which fixes a -verify-machineinstrs error about exiting a function via fall through.

Reviewers: sdardis

Subscribers: dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D19864

llvm-svn: 268731
2016-05-06 13:23:51 +00:00
Daniel Sanders a6cda12179 [mips][fastisel] Conditional moves do not have implicit operands.
Reviewers: sdardis

Subscribers: dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D19862

llvm-svn: 268730
2016-05-06 12:57:26 +00:00
Ryan Govostes 3f37df0326 [asan] add option to set shadow mapping offset
Allowing overriding the default ASAN shadow mapping offset with the
-asan-shadow-offset option, and allow zero to be specified for both offset and
scale.

Patch by Aaron Carroll <aaronc@apple.com>.

llvm-svn: 268724
2016-05-06 10:25:22 +00:00
Nikolay Haustov dc1bb79b92 AMDGPU/SI: Add amdgpu_kernel calling convention. Part 2.
Summary:
    Check calling convention in AMDGPUMachineFunction::isKernel

    This will be used for AMDGPU_HSA_KERNEL symbol type in output ELF.

    Also, in the future unused non-kernels may be optimized.

    Reviewers: tstellarAMD, arsenm

    Subscribers: arsenm, joker.eph, llvm-commits

    Differential Revision: http://reviews.llvm.org/D19917

llvm-svn: 268719
2016-05-06 09:23:13 +00:00
Nikolay Haustov 1f7732abfa AMDGPU/SI: Add amdgpu_kernel calling convention. Part 1.
Summary:
This will be used for AMDGPU_HSA_KERNEL symbol type in output ELF.

Also, in the future unused non-kernels may be optimized.

For now, also accept SPIR_KERNEL for HCC frontend.

Also, add bitcode compatibility tests for missing calling conventions
except AVR_BUILTIN which doesn't have parse code.

Reviewers: tstellarAMD, arsenm

Subscribers: arsenm, joker.eph, llvm-commits
llvm-svn: 268717
2016-05-06 09:07:29 +00:00
Mehdi Amini 3b132e34b0 ThinLTO: fix assertion and refactor check for hidden use from inline ASM in a helper function
This test was crashing, and currently it breaks bootstrapping clang with debuginfo

Differential Revision: http://reviews.llvm.org/D20008

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 268715
2016-05-06 08:25:33 +00:00
Zlatko Buljan 31c9ebe281 [mips][microMIPS] Add CodeGen support for MUL* and DMUL* instructions
Differential Revision: http://reviews.llvm.org/D15744

llvm-svn: 268714
2016-05-06 08:24:14 +00:00
Xinliang David Li 8aebf44c97 [PM] port IR based PGO prof-gen pass to new pass manager
llvm-svn: 268710
2016-05-06 05:49:19 +00:00
Xinliang David Li 779dd2db95 [profile] Remove another unneeded field in raw profile reader
DataValueSize is now removed. The change is consolidated
with previous raw version bump.

llvm-svn: 268703
2016-05-06 02:13:00 +00:00
Ahmed Bougacha 16547c4e31 [CodeGen] Round [SU]INT_TO_FP result when promoting from f16.
If we don't, values that aren't precisely representable in f16 could
be used as-is in a promoted f32 operation, which would produce
incorrect results.

AArch64 had the correct behavior; add a focused test.

Fixes http://llvm.org/PR26871

llvm-svn: 268700
2016-05-06 00:58:00 +00:00
Xinliang David Li 28a932742c [PM] port Branch Frequency Analaysis pass to new PM
llvm-svn: 268687
2016-05-05 21:13:27 +00:00
Davide Italiano f54f2f0893 [PM] Port Interprocedural SCCP to the new pass manager.
llvm-svn: 268684
2016-05-05 21:05:36 +00:00
Dehao Chen f50c67ce7c Revert http://reviews.llvm.org/D19926 as it breaks tests.
llvm-svn: 268681
2016-05-05 20:47:53 +00:00
Dan Gohman 450a80754f [WebAssembly] Don't emit epilogue code in the middle of stackified code.
llvm-svn: 268679
2016-05-05 20:41:15 +00:00
Dehao Chen e48b4ee98c Simplify CFG before assigning discriminator.
Summary: We need to clean up CFG before assigning discriminator to minimize the impact of optimization on debug info.

Reviewers: davidxl, dblaikie, dnovillo

Subscribers: dnovillo, danielcdh, llvm-commits

Differential Revision: http://reviews.llvm.org/D19926

llvm-svn: 268675
2016-05-05 20:18:49 +00:00
Marcin Koscielnicki 60061c21cb [MSan] [MIPS64] Fix vararg helper for >1 fixed argument.
This fixes http://llvm.org/PR27646 on Mips64.

Differential Revision: http://reviews.llvm.org/D19989

llvm-svn: 268673
2016-05-05 20:13:17 +00:00
Matt Arsenault 6689abe632 AMDGPU: Run r600 tests last
llvm-svn: 268672
2016-05-05 20:07:37 +00:00
Tim Northover df43264cf7 ARM: don't attempt to merge litpools referencing different PC-anchors.
Given something like:

    ldr r0, .LCPI0_0 (== pc-rel var)
    add r0, pc

    ldr r1, .LCPI0_1 (== pc-rel var)
    add r1, pc

we cannot combine the 2 ldr instructions and litpools because they get added to
a different pc to form the correct address. I think the original logic came
from a time when we fused the LDRpci/PICADD instructions into one
pseudo-instruction so the PC was always immediately at-hand. That's no longer
the case.

Should fix general-dynamic TLS access on Linux, and quite possibly other -fPIC
code that relies on litpools (e.g. v6m and -Oz compilations) though trivial
tweaks of the .ll test didn't provoke anything.

llvm-svn: 268662
2016-05-05 18:38:53 +00:00
Krzysztof Parzyszek f7a4bd4068 [Hexagon] Add aliases for vector loads/stores with no explicit offset
The mem(r0) instructions are treated as mem(r0+#0).

llvm-svn: 268661
2016-05-05 18:38:35 +00:00
Vitaly Buka 1df2338bb6 Revert "[ThinLTO] Emit individual index files for distributed backends"
MemorySanitizer: use-of-uninitialized-value in lib/Bitcode/Writer/BitcodeWriter.cpp:364:70
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/12544/steps/check-llvm%20msan/logs/stdio

This reverts commit 0c4a898ea550699d1b2f4fe3767251c8f9a48d52.

llvm-svn: 268660
2016-05-05 18:31:00 +00:00
Kevin Enderby b34e3a1877 Clean up the specific error message for a malformed Mach-O files with bad segment
load commands.

The existing test case in test/Object/macho-invalid.test for
macho-invalid-too-small-segment-load-command has a cmdsize of 55, while
being too small also it is not a multiple of 4.  So when that check is added
this test case will produce a different error. So I constructed a new test case
that will trigger the intended error.

I also changed the error message to be consistent with the other malformed Mach-O
file error messages which prints the load command index.  I also removed both
object_error::macho_load_segment_too_small and
object_error::macho_load_segment_too_many_sections from Object/Error.h
as they are not needed and can just use object_error::parse_failed and let the
error message string distinguish the specific error.

llvm-svn: 268652
2016-05-05 17:43:35 +00:00
Nicolai Haehnle ffbd56a1c9 AMDGPU: Uniform branch conditions can originate with intrinsics
Summary:
Discovered by Dave Airlie, fixes an assertion in Khronos OpenGL CTS
GL43-CTS.shader_storage_buffer_object.advanced-matrix.

In this particular case, the buffer load intrinsic fed into a uniform
conditional branch, and led the brcond lowering down the wrong path.

Reviewers: tstellarAMD, arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19931

llvm-svn: 268650
2016-05-05 17:36:36 +00:00
Tom Stellard fcfaea4cff AMDGPU/SI: Add support for AMD code object version 2.
Summary:
Version 2 is now the default.  If you want to emit version 1, use
the amdgcn--amdhsa-amdcov1 triple.

Reviewers: arsenm, kzhuravl

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19283

llvm-svn: 268647
2016-05-05 17:03:33 +00:00
Chad Rosier 25cfb7dbd6 [ValueTracking] Improve isImpliedCondition for matching LHS and Imm RHSs.
llvm-svn: 268636
2016-05-05 15:39:18 +00:00
Silviu Baranga c05bab8a9c [LV] Identify more induction PHIs by coercing expressions to AddRecExprs
Summary:
Some PHIs can have expressions that are not AddRecExprs due to the presence
of sext/zext instructions. In order to prevent the Loop Vectorizer from
bailing out when encountering these PHIs, we now coerce the SCEV
expressions to AddRecExprs using SCEV predicates (when possible).

We only do this when the alternative would be to not vectorize.

Reviewers: mzolotukhin, anemet

Subscribers: mssimpso, sanjoy, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D17153

llvm-svn: 268633
2016-05-05 15:20:39 +00:00
James Y Knight 0c145c0c3a Remove bit-rotten CppBackend.
This backend was supposed to generate C++ code which will re-construct
the LLVM IR passed as input. This seems to me to have very marginal
usefulness in the first place.

However, the code has never been updated to use IRBuilder, which makes
its current value negative -- people who look at the output may be
steered to use the *wrong* C++ APIs to construct IR.

Furthermore, it's generated code that doesn't compile since at least
2013.

Differential Revision: http://reviews.llvm.org/D19942

llvm-svn: 268631
2016-05-05 14:35:40 +00:00
Nirav Dave 996fc133b7 Fix Mips Parser error reporting
[mips] On error, ParseDirective should always return false to signify that the
directive was understood.

Reviewers: dsanders, vkalintiris, sdardis

Subscribers: dsanders, llvm-commits, sdardis

Differential Revision: http://reviews.llvm.org/D19929

llvm-svn: 268630
2016-05-05 14:15:46 +00:00
Teresa Johnson f8cbd6591f Fix Windows bot failures from r268627
Remove "/" path separator from expected pattern which should fix a
couple of Windows bots that have failed:

http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/4816
http://bb.pgr.jp/builders/ninja-clang-i686-msc19-R/builds/2610

llvm-svn: 268629
2016-05-05 14:10:57 +00:00
Teresa Johnson 9254ebe3c0 [ThinLTO] Emit individual index files for distributed backends
Summary:
When launching ThinLTO backends in a distributed build (currently
supported in gold via the thinlto-index-only plugin option), emit
an individual index file for each backend process as described here:
http://lists.llvm.org/pipermail/llvm-dev/2016-April/098272.html

The individual index file encodes the summary and module information
required for implementing the importing/exporting decisions made
for a given module in the thin link step.
This is in place of the current mechanism that uses the combined index
to make importing decisions in each back end independently. It is an
enabler for doing global summary based optimizations in the thin link
step (which will be recorded in the individual index files), and reduces
the size of the index that must be sent to each backend process, and
the amount of work to scan it in the backends.

Rather than create entirely new ModuleSummaryIndex structures (and all
the included unique_ptrs) for each backend index file, a map is created
to record all of the GUID and summary pointers needed for a particular
index file. The IndexBitcodeWriter walks this map instead of the full
index (hiding the details of managing the appropriate summary iteration
in a new iterator subclass). This is more efficient than walking the
entire combined index and filtering out just the needed summaries during
each backend bitcode index write.

Depends on D19481.

Reviewers: joker.eph

Subscribers: llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D19556

llvm-svn: 268627
2016-05-05 13:44:56 +00:00
Marcin Koscielnicki 0275fac2c9 [X86] Extend some Linux special cases to cover kFreeBSD.
Both Linux and kFreeBSD use glibc, so follow similiar code paths.
Add isTargetGlibc to check for this, and use it instead of isTargetLinux
in a few places.

Fixes PR22248 for kFreeBSD.

Differential Revision: http://reviews.llvm.org/D19104

llvm-svn: 268624
2016-05-05 11:35:51 +00:00
Igor Kudrin 27d8dd39cf [Coverage] Combine counts of expansion regions if there are no code regions for the same area.
Differential Revision: http://reviews.llvm.org/D18831

llvm-svn: 268620
2016-05-05 09:39:45 +00:00
David Majnemer 911d0e3c21 [X86] Use the right type when folding xor (truncate (shift)) -> setcc
The result type of setcc is dependent on whether or not AVX512 is
present.
We had an X86-specific DAG-combine which assumed that the result type
should be i8 when it could be i1.
This meant that we would generate illegal setccs which LowerSETCC did
not like.

Instead, use an appropriate type and zero extend to i8.

Also, there were some scenarios where the fold should have fired but
didn't because we were overly cautious about the types.  This meant that
we generated:

        shrl    $31, %edi
        andl    $1, %edi
        kmovw   %edi, %k0
        kxnorw  %k0, %k0, %k1
        kshiftrw        $15, %k1, %k1
        kxorw   %k1, %k0, %k0
        kmovw   %k0, %eax

instead of:

        testl   %edi, %edi
        setns   %al

This fixes PR27638.

llvm-svn: 268609
2016-05-05 06:00:56 +00:00
Mehdi Amini 022b5bcb7a LTOCodeGenerator: add linkonce(_odr) to "llvm.compiler.used" when present in "MustPreserve" set
If the linker requested to preserve a linkonce function, we should
honor this even if we drop all uses.
We explicitely avoid turning them into weak_odr (unlike the first
version of this patch in r267644), because the codegen can be
different on Darwin: because of `llvm::canBeOmittedFromSymbolTable()`
we may emit the symbol as weak_def_can_be_hidden instead of
weak_definition.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 268607
2016-05-05 05:14:24 +00:00
Mehdi Amini 752ffe9c5f Revert "LTOCodeGenerator: turns linkonce(_odr) into weak_(odr) when present "MustPreserve" set"
This reverts commit r267644. Turning linkonce_odr into weak_odr is
a sementic change on Darwin: because of
`llvm::canBeOmittedFromSymbolTable()` we may emit the symbol as
weak_def_can_be_hidden instead of weak_definition.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 268606
2016-05-05 05:14:20 +00:00
Xinliang David Li 78d61b11e3 [Profile] Raw profile header clean up
Remove dead ValueDataBegin field in raw header.

llvm-svn: 268602
2016-05-05 04:07:30 +00:00
Xinliang David Li 6e5dd41481 [PM] Port Branch Probability Analysis pass to the new pass manager.
Differential Revision: http://reviews.llvm.org/D19839

llvm-svn: 268601
2016-05-05 02:59:57 +00:00
Davide Italiano 344e838fea [PM] Port EliminateAvailableExternally pass to the new pass manager.
llvm-svn: 268599
2016-05-05 02:37:32 +00:00
Ryan Govostes 8c21be6b3e Revert "[asan] add option to set shadow mapping offset"
This reverts commit ba89768f97b1d4326acb5e33c14eb23a05c7bea7.

llvm-svn: 268588
2016-05-05 01:27:04 +00:00
Ryan Govostes 097c5b051c [asan] add option to set shadow mapping offset
Allowing overriding the default ASAN shadow mapping offset with the
-asan-shadow-offset option, and allow zero to be specified for both offset and
scale.

llvm-svn: 268586
2016-05-05 01:14:39 +00:00
Davide Italiano 164b9bc6fe [PM] Port ConstantMerge to the new pass manager.
llvm-svn: 268582
2016-05-05 00:51:09 +00:00
Marcin Koscielnicki ad1482c6f1 [SystemZ] Implement backchain attribute (recommit with fix).
This introduces a SystemZ-specific "backchain" attribute on function, which
enables writing the frame backchain link as specified by the ABI.  This will
be used to implement -mbackchain option in clang.

Differential Revision: http://reviews.llvm.org/D19889

Fixed in this version: added RegState::Define and RegState::Kill on R1D
in prologue.

llvm-svn: 268581
2016-05-05 00:37:30 +00:00
Adam Nemet 3c5eabfcbc [LoopDataPrefetch] Add optimization remark
With -Rpass=loop-data-prefetch, show the memory access that got
prefetched.

llvm-svn: 268578
2016-05-05 00:08:15 +00:00
Vitaly Buka fdcea9d78a Revert "[SimplifyCFG] propagate branch metadata when creating select"
MemorySanitizer: use-of-uninitialized-value
0x4910e47 in count /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/include/llvm/Support/MathExtras.h:159:12
0x4910e47 in countLeadingZeros<unsigned long> /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/include/llvm/Support/MathExtras.h:183
0x4910e47 in FitWeights /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/Transforms/Utils/SimplifyCFG.cpp:855
0x4910e47 in SimplifyCondBranchToCondBranch /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/Transforms/Utils/SimplifyCFG.cpp:2895

This reverts commit 609f4dd4bf3bc735c8c047a4d4b0a8e9e4d202e2.

llvm-svn: 268577
2016-05-04 23:59:33 +00:00
Marcin Koscielnicki 12037b4e9d Revert "[SystemZ] Implement backchain attribute."
This reverts commit rL268571.

It caused failures in register scavenger.

llvm-svn: 268576
2016-05-04 23:54:53 +00:00
Marcin Koscielnicki 9de88d9bbe [SystemZ] Implement llvm.get.dynamic.area.offset
To be used for AddressSanitizer.

Differential Revision: http://reviews.llvm.org/D19817

llvm-svn: 268572
2016-05-04 23:31:26 +00:00
Marcin Koscielnicki 835d927938 [SystemZ] Implement backchain attribute.
This introduces a SystemZ-specific "backchain" attribute on function, which
enables writing the frame backchain link as specified by the ABI.  This will
be used to implement -mbackchain option in clang.

Differential Revision: http://reviews.llvm.org/D19889

llvm-svn: 268571
2016-05-04 23:31:20 +00:00
Quentin Colombet 0c5bfd0514 [X86] Add a few register classes for x32 address accesses.
The new register classes allow to tell the machine verifier that it is
fine to use RIP for address accesses in x32 mode. Prior to that patch,
we would complain that we are using a GR64 in place of GR32, whereas it
is actually fine to use GR64 for x32 as long as the 32 high bits are 0s.
RIP has this property and is used for RIP-relative addressing.

This partially fixes http://llvm.org/PR27481.

llvm-svn: 268567
2016-05-04 22:45:31 +00:00
Simon Pilgrim 1f5ad702f8 [SelectionDAG] BITREVERSE vector legalization of bit operations (REAPPLIED)
Some vector bit operations are promoted instead of having custom lowering. This patch changes the isOperationLegalOrCustom tests for vector AND/OR operations to use a new TLI helper isOperationLegalOrCustomOrPromote instead, allowing the SSE implementations to stay on the simd unit.

Differential Revision: http://reviews.llvm.org/D19805

llvm-svn: 268561
2016-05-04 22:08:51 +00:00
Balaram Makam 569eaec5f3 "Reapply r268521 "[InstCombine] Canonicalize icmp instructions based on dominating conditions.""
This reapplies commit r268521, that was reverted in r268530 due to a test failure in select-implied.ll
Modified the test case to reflect the new change.

llvm-svn: 268557
2016-05-04 21:32:14 +00:00
Sanjay Patel 7e8c285814 [SimplifyCFG] propagate branch metadata when creating select
Unlike earlier similar fixes, we need to recalculate the branch weights
in this case.

Differential Revision: http://reviews.llvm.org/D19674

llvm-svn: 268550
2016-05-04 20:48:24 +00:00
Evandro Menezes bcb95cd0ed [AArch64] Use the reciprocal estimation machinery
This patch adds support for estimating the square root, its reciprocal and
division or reciprocal using the combiner generic reciprocal machinery.

llvm-svn: 268539
2016-05-04 20:18:27 +00:00
Vitaly Buka 6b5c89262a Revert r268529 because it caused use-of-uninitialized-value
Summary: This reverts commit d88cc0862bf7da64850b89e9bb5ea9f95e7f1184.

#0 0xfed467 in llvm::ARMFrameLowering::determineCalleeSaves(llvm::MachineFunction&, llvm::BitVector&, llvm::RegScavenger*) const /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/Target/ARM/ARMFrameLowering.cpp:1625:52
#1 0x330d4cc in (anonymous namespace)::PEI::runOnMachineFunction(llvm::MachineFunction&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/CodeGen/PrologEpilogInserter.cpp:186:3
#2 0x3193e12 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/CodeGen/MachineFunctionPass.cpp:60:13
#3 0x396237d in llvm::FPPassManager::runOnFunction(llvm::Function&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/LegacyPassManager.cpp:1526:23
#4 0x3962a23 in llvm::FPPassManager::runOnModule(llvm::Module&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/LegacyPassManager.cpp:1547:16
#5 0x3963d52 in runOnModule /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/LegacyPassManager.cpp:1603:23
#6 0x3963d52 in llvm::legacy::PassManagerImpl::run(llvm::Module&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/LegacyPassManager.cpp:1706
#7 0x6bb910 in compileModule(char**, llvm::LLVMContext&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/tools/llc/llc.cpp:412:5
#8 0x6b3c25 in main /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/tools/llc/llc.cpp:218:22
#9 0x7fd4a7d37ec4 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21ec4)
#10 0x625c93 in _start (/mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm_build_msan/bin/llc+0x625c93)

Reviewers:

Subscribers:

llvm-svn: 268536
2016-05-04 19:44:11 +00:00
Hal Finkel e2b89118bd [ConstantFold] Don't try to strip fp -> int bitcasts to simplify icmps
ConstantFold has logic to take icmp (bitcast x to y), null and strip the
bitcast. This makes sense in general, but not if x has floating-point type. In
this case, we'd need a fcmp, not an icmp, and the code will assert. We normally
don't see this situation because we constant fold fp -> int bitcasts, however,
we'll see it for bitcasts of ppc_fp128 -> i128. This is because that bitcast is
Endian-dependent, and as a result, we don't simplify it in ConstantFold (we
could, but no one has yet added the necessary logic). Regardless, ConstantFold
should not depend on that canonicalization for correctness.

llvm-svn: 268534
2016-05-04 19:37:08 +00:00
Sanjay Patel 13d57b94bb [x86] add tests to show current codegen for obscured fneg/fabs
llvm-svn: 268533
2016-05-04 19:06:03 +00:00
Marcin Koscielnicki cc9676a821 [MSan] [Mips64] Add tests for vararg handling.
Differential Revision: http://reviews.llvm.org/D19919

llvm-svn: 268531
2016-05-04 18:39:14 +00:00
Balaram Makam 31e7e13789 Revert "[InstCombine] Canonicalize icmp instructions based on dominating conditions."
This reverts commit 573a40f79b35cf3e71db331bb00f6a84f03b835d.

llvm-svn: 268530
2016-05-04 18:37:35 +00:00
Weiming Zhao 2373f769ce [ARM] Fix Scavenger assert due to underestimated stack size
Summary:
Currently, when checking if a stack is "BigStack" or not, it doesn't count into spills and arguments. Therefore, LLVM won't reserve spill slot for this actually "BigStack". This may cause scavenger failure.

Reviewers: rengolin

Subscribers: aemerson, rengolin, tberghammer, danalbert, srhines, llvm-commits

Differential Revision: http://reviews.llvm.org/D19896

llvm-svn: 268529
2016-05-04 18:19:33 +00:00
Simon Pilgrim 1a14f0d25c Revert r268504
llvm-svn: 268526
2016-05-04 17:49:14 +00:00
Marianne Mailhot-Sarrasin b192670279 Adding test cases showing the behavior of LoopUnrollPass according to optnone and optsize attributes
The unroll pass was disabled by clang in /Os. Those new test cases shows that the pass will behave correctly even if it is not fully disabled. This patch is related in some way to the clang commit (http://reviews.llvm.org/D19827), which re-enables the pass in /Os.

Differential Revision: http://reviews.llvm.org/D19870

llvm-svn: 268524
2016-05-04 17:45:40 +00:00
Balaram Makam cf3bcb2625 [InstCombine] Canonicalize icmp instructions based on dominating conditions.
Summary:
    This patch canonicalizes conditions based on the constant range information
    of the dominating branch condition.
    For example:

      %cmp = icmp slt i64 %a, 0
      br i1 %cmp, label %land.lhs.true, label %lor.rhs
      lor.rhs:
        %cmp2 = icmp sgt i64 %a, 0

    Would now be canonicalized into:

      %cmp = icmp slt i64 %a, 0
      br i1 %cmp, label %land.lhs.true, label %lor.rhs
      lor.rhs:
        %cmp2 = icmp ne i64 %a, 0

Reviewers: mcrosier, gberry, t.p.northover, llvm-commits, reames, hfinkel, sanjoy, majnemer

Subscribers: MatzeB, majnemer, mcrosier

Differential Revision: http://reviews.llvm.org/D18841

llvm-svn: 268521
2016-05-04 17:34:20 +00:00
Reid Kleckner b034526853 Reland "Use ScopedPrinter in llvm-pdbdump"
This reverts r268508 and reinstates r268506 with an additional cast from
TypeLeafKind to unsigned to allow conversion to HexNumber.

llvm-svn: 268517
2016-05-04 16:09:04 +00:00
Nemanja Ivanovic 1a2b2f03e7 [PowerPC] Generate VSX version of splat word
This patch corresponds to review:
http://reviews.llvm.org/D18592

It allows the PPC back end to generate the xxspltw instruction where we
previously only emitted vspltw.

llvm-svn: 268516
2016-05-04 16:04:02 +00:00
Simon Pilgrim bc0e1d7492 [X86][SSE] Regenerate vector bswap tests
llvm-svn: 268514
2016-05-04 15:45:48 +00:00
Hans Wennborg 0c3518e84b [SimplifyCFG] isSafeToSpeculateStore now ignores debug info
This patch fixes PR27615.

@llvm.dbg.value instructions no longer count towards the maximum number of
instructions to look back at in the instruction list when searching for a
store instruction. This should make the output consistent between debug and
non-debug build.

Patch by Henric Karlsson <henric.karlsson@ericsson.com>!

Differential Revision: http://reviews.llvm.org/D19912

llvm-svn: 268512
2016-05-04 15:40:57 +00:00
Chad Rosier 20dbbf3542 Revert "Use ScopedPrinter in llvm-pdbdump"
This reverts commit r268506 due to build breakage.

llvm-svn: 268508
2016-05-04 15:25:06 +00:00
Zachary Turner cdd313ca19 Use ScopedPrinter in llvm-pdbdump
When printing raw PDB file fields, streams, and records, use the
ScopedPrinter class so we have consistency with llvm-readobj's output
format.

For the most part this is pretty mechanical, but I had to fix up the test
file to conform to the new YAMLesque output format. i added a few
additional helper functions to the ScopedPrinter such as one to print a
dotted version, etc.

Differential Revision: http://reviews.llvm.org/D19897
Reviewed By: rnk

llvm-svn: 268506
2016-05-04 15:05:12 +00:00
Simon Pilgrim b97c06210b [SelectionDAG] BITREVERSE vector legalization of bit operations
Vector bit operations are typically promoted instead of having custom lowering. This patch changes the isOperationLegalOrCustom tests for vector AND/OR operations to use isOperationLegalOrPromote instead, allowing the SSE implementations to stay on the simd unit.

Differential Revision: http://reviews.llvm.org/D19805

llvm-svn: 268504
2016-05-04 15:01:13 +00:00
Igor Laevsky fb1811d3a0 [RS4GC] Use SetVector/MapVector instead of DenseSet/DenseMap to guarantee stable ordering
Goal of this change is to guarantee stable ordering of the statepoint arguments and other 
newly inserted values such as gc.relocates. Previously we had explicit sorting in a couple
of places. However for unnamed values ordering was partial and overall we didn't have any 
strong invariant regarding it. This change switches all data structures to use SetVector's
and MapVector's which provide possibility for deterministic iteration over them.
Explicit sorting is now redundant and was removed.

Differential Revision: http://reviews.llvm.org/D19669

llvm-svn: 268502
2016-05-04 14:55:36 +00:00
Elena Demikhovsky 24aba1ca38 The test files are auto-generated by update_llc_test_checks.py utility.
No functional changes.

llvm-svn: 268498
2016-05-04 14:31:18 +00:00
Daniel Sanders c07f06aeee [mips][ias] Only round section sizes when explicitly requested.
As requested by Rafael Espindola in his post-commit comments on r268036. This
makes the previous behaviour the default while still allowing verification of
IAS.

llvm-svn: 268496
2016-05-04 13:21:06 +00:00
Chris Dewhurst 8338d90ba3 [Sparc] Allow taking of function address into a register.
Modification of previously existing code (variable rename only), with unit test added.

Differential Revision: http://reviews.llvm.org/D19368

llvm-svn: 268493
2016-05-04 12:11:05 +00:00
Zlatko Buljan 4807f829b4 [mips][microMIPS] Add CodeGen support for microMIPSr6 ROTR and ROTRV and add tests for LL, SC, SYSCALL, ROTR, ROTRV, LWM32, SWM32 and MOVEP instructions
Differential Revision: http://reviews.llvm.org/D19857

llvm-svn: 268491
2016-05-04 12:02:12 +00:00
Chris Dewhurst 69fa1926db [Sparc] Implement __builtin_setjmp, __builtin_longjmp back-end.
This code implements builtin_setjmp and builtin_longjmp exception handling intrinsics for 32-bit Sparc back-ends.

The code started as a mash-up of the PowerPC and X86 versions, although there are sufficient differences to both that had to be made for Sparc handling.

Note: I have manual tests running. I'll work on a unit test and add that to the rest of this diff in the next day.

Also, this implementation is only for 32-bit Sparc. I haven't focussed on a 64-bit version, although I have left the code in a prepared state for implementing this, including detecting pointer size and comments indicating where I suspect there may be differences.

Differential Revision: http://reviews.llvm.org/D19798

llvm-svn: 268483
2016-05-04 09:33:30 +00:00
Daniel Sanders 04468f2914 [mips] Remove -mattr=+n64 and fix indentation in tailcall.ll RUN lines. NFC.
-mattr=+n64 isn't the correct way to specify the ABI and N64 is already the
default for the RUN line concerned.

llvm-svn: 268482
2016-05-04 09:08:35 +00:00
David Majnemer 3918cdd2a1 [ConstantFolding, ValueTracking] Fold constants involving bitcasts of ConstantVector
We assumed that ConstantVectors would be rather uninteresting from the
perspective of analysis.  However, this is not the case due to a quirk
of how LLVM handles vectors of i1.  Vectors of i1 are not
ConstantDataVectors like vectors of i8, i16, i32 or i64 because i1's
SizeInBits differs from it's StoreSizeInBytes.  This leads to it being
categorized as a ConstantVector instead of a ConstantDataVector.

Instead, treat ConstantVector more uniformly.

This fixes PR27591.

llvm-svn: 268479
2016-05-04 06:13:33 +00:00
Simon Atanasyan 8a71b53ea9 [llvm-readobj] Print MIPS .MIPS.options section content
.MIPS.options section specifies miscellaneous options to be applied
to an object file. LLVM as well as modern versions of GNU tools emit
the only type of the options - ODK_REGINFO. The patch teaches llvm-readobj
to print details of the ODK_REGINFO and skip contents of other options.

llvm-svn: 268478
2016-05-04 05:58:57 +00:00
David Majnemer 2c5aeabedd [X86] Lower zext i1 arguments
i1 is now a legal type for X86 with AVX512.
There were some paths in X86FastISel which were not quite ready to see
an i1 value: they were not quite sure how to deal with sign/zero extends
for call arguments.
DTRT by extending to i8 for zeroext and bailing out of FastISel for
signext.

This fixes PR27591.

llvm-svn: 268470
2016-05-04 00:22:23 +00:00
David Majnemer 95549497ec [GlobalDCE, Misc] Don't remove functions referenced by ifuncs
We forgot to consider the target of ifuncs when considering if a
function was alive or dead.

N.B. Also update a few auxiliary tools like bugpoint and
verify-uselistorder.

This fixes PR27593.

llvm-svn: 268468
2016-05-04 00:20:48 +00:00
Kevin Enderby a8e3ab0c56 Produce another specific error message for a malformed Mach-O file when a load
command has a size less than 8 bytes.

I think the existing test case in test/Object/macho-invalid.test for
macho64-invalid-too-small-load-command was trying to test for this but that
test case triggered a different error given how it was constructed.  So I
constructed a new test case that would trigger this specific error.

I also changed the error message to be consistent with the other malformed Mach-O
file error messages.  I also removed object_error::macho_small_load_command from
Object/Error.h as it is not needed and can just use object_error::parse_failed
and let the error message string distinguish the error.

llvm-svn: 268463
2016-05-03 23:13:50 +00:00
Andrew Kaylor 50271f787e Add opt-bisect support to additional passes that can be skipped
Differential Revision: http://reviews.llvm.org/D19882

llvm-svn: 268457
2016-05-03 22:32:30 +00:00
Justin Bogner d0d2341f30 PM: Port LoopRotation to the new loop pass manager
llvm-svn: 268452
2016-05-03 22:02:31 +00:00
Simon Pilgrim fb1766ad68 [X86][XOP] Add placeholder VPERMIL2 combining tests
llvm-svn: 268450
2016-05-03 21:55:37 +00:00
Justin Bogner ab6a513b4e PM: Port LoopSimplifyCFG to the new pass manager
llvm-svn: 268446
2016-05-03 21:47:32 +00:00
Tim Northover d2ecbccf27 X86-Darwin: start emitting data-region directives for jump-tables.
The surrounding tools can cope these days, and they were invented for a reason.

llvm-svn: 268437
2016-05-03 21:03:41 +00:00
Sanjoy Das 8a004551d0 [RS4GC] Add a test case around calling conventions; NFC
llvm-svn: 268436
2016-05-03 20:58:10 +00:00
Davide Italiano 66228c4cf1 [IPO/GlobalDCE] Port to the new pass manager.
Differential Revision:  http://reviews.llvm.org/D19782

llvm-svn: 268425
2016-05-03 19:39:15 +00:00
Jack Liu f101c0f7a1 [SROA] Function canConvertValue needs to check whether both NewTy and OldTy pointers are
pointing to the same addr space. This can prevent SROA from creating a bitcast
between pointers with different addr spaces.

Differential Revision: http://reviews.llvm.org/D19697

llvm-svn: 268424
2016-05-03 19:30:48 +00:00
Jack Liu 430e2c2140 Revert 268409 due to missing comment.
llvm-svn: 268421
2016-05-03 19:15:02 +00:00
Quentin Colombet 26dab3a485 [ImplicitNullChecks] Account for implicit-defs as well when updating the liveness.
The replaced load may have implicit-defs and those defs may be used
in the block of the original load. Make sure to update the liveness
accordingly.

This is a generalization of r267817.

llvm-svn: 268412
2016-05-03 18:09:06 +00:00
Jack Liu 1ff4a0b7ee (no commit message)
llvm-svn: 268409
2016-05-03 18:01:43 +00:00
Sanjoy Das 4ae3920c5b [LICM] Kill SCEV loop dispositions if needed
SCEV caches whether SCEV expressions are loop invariant, variant or
computable.  LICM breaks this cache, almost by definition; so clear the
SCEV disposition cache if LICM changed anything.

llvm-svn: 268408
2016-05-03 17:50:11 +00:00
Sanjoy Das 7e7a5a050a Use all_of instead of a raw loop; NFC
Added some tests despite being NFC, since it looks like nothing was
exercising the "all incoming values to exit PHIs are same" logic.

llvm-svn: 268407
2016-05-03 17:50:06 +00:00
Sanjoy Das 905fc27ebf [LoopDeletion] Clear SCEV loop dispositions
`Loop::makeLoopInvariant` can hoist instructions out of loops, so loop
dispositions for the loop it operated on may need to be cleared.  We can
be smarter here (especially around how `forgetLoopDispositions` is
implemented), but let's be correct first.

Fixes PR27570.

llvm-svn: 268406
2016-05-03 17:50:02 +00:00
Sanjoy Das 013a4ac4aa [SCEV] Tweak the output format and content of -analyze
In the "LoopDispositions:" section:

 - Instead of printing out a list, print out a "dictionary" to make it
   obvious by inspection which disposition is for which loop.  This is
   just a cosmetic change.

 - Print dispositions for parent _and_ sibling loops.  I will use this
   to write a test case.

llvm-svn: 268405
2016-05-03 17:49:57 +00:00
Kevin Enderby 368e714907 Produce another specific error message for a malformed Mach-O file when a load
command other than the first one is past the end of the load commands.

This is like the test case in test/Object/macho-invalid.test for
macho64-invalid-incomplete-load-command but it is the second load command
that is past the end of all the load commands instead of the first.

The code in the constructor for MachOObjectFile that loops over the load
commands used getNextLoadCommandInfo() which was not producing
a good error message.  So that was fixed and a test case was added.

llvm-svn: 268403
2016-05-03 17:16:08 +00:00
Mehdi Amini 7f7d8be518 Move "Eliminate Available Externally" immediately after the inliner
This pass is supposed to reduce the size of the IR for compile time
purpose. We should run it ASAP, except when we prepare for LTO or
ThinLTO, and we want to keep them available for link-time inline.

Differential Revision: http://reviews.llvm.org/D19813

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 268394
2016-05-03 15:46:00 +00:00
Simon Pilgrim d2752708a3 [X86][SSE] Added target shuffle combine to MOVQ
llvm-svn: 268391
2016-05-03 15:05:13 +00:00
Anna Thomas 43d7e1cbff Fold compares irrespective of whether allocation can be elided
Summary
When a non-escaping pointer is compared to a global value, the
comparison can be folded even if the corresponding malloc/allocation
call cannot be elided.
We need to make sure the global value is not null, since comparisons to
null cannot be folded.

In future, we should also handle cases when the the comparison
instruction dominates the pointer escape.

Reviewers: sanjoy
Subscribers s.egerton, llvm-commits

Differential Revision: http://reviews.llvm.org/D19549

llvm-svn: 268390
2016-05-03 14:58:21 +00:00
Daniel Sanders 01bcefd983 [mips][fastisel] ADJCALLSTACKUP has a second immediate operand.
Summary:
It's always zero for SelectionDAG and is never read by the MIPS backend so
do the same for FastISel.

Reviewers: sdardis

Subscribers: dsanders, llvm-commits, sdardis

Differential Revision: http://reviews.llvm.org/D19863

llvm-svn: 268386
2016-05-03 14:19:26 +00:00
Daniel Sanders fe98b2f54b [mips] Use MipsMCExpr instead of MCSymbolRefExpr for all relocations.
Summary:
This is much closer to the way MIPS relocation expressions work
(%hi(foo + 2) rather than %hi(foo) + 2) and removes the need for the
various bodges in MipsAsmParser::evaluateRelocExpr().

Removing those bodges ensures that the constant stored in MCValue is the
full 32 or 64-bit (depending on ABI) offset from the symbol. This will be used
to correct the %hi/%lo matching needed to sort the relocation table correctly.

As part of this:
* Gave MCExpr::print() the ability to omit parenthesis when emitting a
  symbol reference inside a MipsMCExpr operator like %hi(X). Without this
  we print things like %lo(($L1)).
* %hi(%neg(%gprel(X))) is now three MipsMCExpr's instead of one. Most of
  the related special cases have been removed or moved to MipsMCExpr. We
  can remove the rest as we gain support for the less common relocations
  when they are not part of this specific combination.
* Renamed MipsMCExpr::VariantKind and the enum prefix ('VK_') to avoid confusion
  with MCSymbolRefExpr::VariantKind and its prefix (also 'VK_').
* fixup_Mips_GOT_Local and fixup_Mips_GOT_Global were found to be identical
  and merged into fixup_Mips_GOT.
* MO_GOT16 and MO_GOT turned out to be identical and have been merged into
  MO_GOT.
* VK_Mips_GOT and VK_Mips_GOT16 turned out to be the same thing so they
  have been merged into MEK_GOT

Reviewers: sdardis

Subscribers: dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D19716

llvm-svn: 268379
2016-05-03 13:35:44 +00:00
Simon Pilgrim 32e78c3ff7 [X86][SSSE3] Missing combine opportunity to simplify to a MOVQ shuffle
llvm-svn: 268378
2016-05-03 13:12:44 +00:00
Igor Breger 58c07806ae [AVX512] Add support for commutative MAX/MIN . In general VMAX{PS,PD} and VMIN{PS,PD} instruction are not commutative . In combine pass only if UnsafeFPMath are used VMAX/VMAX are converted to commutative nodes VMAXC/VMAXC.
Differential Revision: http://reviews.llvm.org/D19860

llvm-svn: 268375
2016-05-03 11:51:45 +00:00
Kristof Beyls c08f70588d Mark that SpeculativeExecution preserves Globals Alias Analysis.
A few benchmarks with lots of accesses to global variables in the hot
loops regressed a lot since r266399, which added the
SpeculativeExecution pass to the default pipeline. The problem is that
this pass doesn't mark Globals Alias Analysis as preserved. Globals
Alias Analysis is computed in a module pass, whereas
SpeculativeExecution is a function pass, and a lot of passes dependent
on the Globals Alias Analysis to optimize these benchmarks are also
function passes. As such, the Globals Alias Analysis information cannot
be recomputed between SpeculativeExecution and the following function
passes needing that information.

SpeculativeExecution doesn't invalidate Globals Alias Analysis, so mark
it as such to fix those performance regressions.

Differential Revision: http://reviews.llvm.org/D19806

llvm-svn: 268370
2016-05-03 08:33:26 +00:00
Igor Breger ab076c683c [AVX512] Fix lowerV4X128VectorShuffle to select correctly input operands .
Differential Revision: http://reviews.llvm.org/D19803

llvm-svn: 268368
2016-05-03 08:08:44 +00:00
Matthias Braun bb85aef77d Fix uppercase typo
llvm-svn: 268362
2016-05-03 05:21:53 +00:00
Matthias Braun e25bbd0bb8 AArch64/optimizeCondBranch: Remove earlier kill flag when forming TBZ
This fixes -verify-machineinstrs complaints when compiling
test-suite/SingleSource/Benchmarks/Shootout-C++/wordfreq.cpp

llvm-svn: 268360
2016-05-03 04:54:16 +00:00
Jack Liu cd777c8b35 test commit
llvm-svn: 268358
2016-05-03 04:06:24 +00:00
David Majnemer 3d90bb79c4 [LoopUnroll] Unroll loops which have exit blocks to EH pads
We were overly cautious in our analysis of loops which have invokes
which unwind to EH pads.  The loop unroll transform is safe because it
only clones blocks in the loop body, it does not try to split critical
edges involving EH pads.  Instead, move the necessary safety check to
LoopUnswitch.

N.B. The safety check for loop unswitch is covered by an existing test
which fails without it.

llvm-svn: 268357
2016-05-03 03:57:40 +00:00
Zachary Turner f5c59654f7 Parse the TPI (type information) stream of PDB files.
This parses the TPI stream (stream 2) from the PDB file. This stream
contains some header information followed by a series of codeview records.
There is some additional complexity here in that alongside this stream of
codeview records is a serialized hash table in order to efficiently query
the types. We parse the necessary bookkeeping information to allow us to
reconstruct the hash table, but we do not actually construct it yet as
there are still a few things that need to be understood first.

Differential Revision: http://reviews.llvm.org/D19840
Reviewed By: ruiu, rnk

llvm-svn: 268343
2016-05-03 00:28:21 +00:00
Mehdi Amini 5b85d8d67b ThinLTO: do not import function whose linkage prevents inlining.
There is not point in importing a "weak" or a "linkonce" function
since we won't be able to inline it anyway.
We already had a targeted check for WeakAny, this is using the
same check on GlobalValue as the inline, i.e.
isMayBeOverriddenLinkage()

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 268341
2016-05-03 00:27:28 +00:00
Wolfgang Pieb a4e71bd11a Moved test case for r268323 to DebugInfo/X86 to unbreak aarch64.
llvm-svn: 268339
2016-05-03 00:22:09 +00:00
Reid Kleckner 97837b7b09 [MC] Create unique .pdata sections for every .text section
Summary:
This adds a unique ID to the COFF section uniquing map, similar to the
one we have for ELF.  The unique id is not currently exposed via the
assembler because we don't have a use case for it yet. Users generally
create .pdata with the .seh_* family of directives, and the assembler
internally needs to produce .pdata and .xdata sections corresponding to
the code section.

The association between .text sections and the assembler-created .xdata
and .pdata sections is maintained as an ID field of MCSectionCOFF. The
CFI-related sections are created with the given unique ID, so if more
code is added to the same text section, we can find and reuse the CFI
sections that were already created.

Reviewers: majnemer, rafael

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19376

llvm-svn: 268331
2016-05-02 23:22:18 +00:00
Quentin Colombet 776e6de516 [MachineBlockPlacement] Let the target optimize the branches at the end.
After the layout of the basic blocks is set, the target may be able to get rid
of unconditional branches to fallthrough blocks that the generic code does not
catch. This happens any time TargetInstrInfo::AnalyzeBranch is not able to
analyze all the branches involved in the terminators sequence, while still
understanding a few of them.

In such situation, AnalyzeBranch can directly modify the branches if it has been
instructed to do so.

This patch takes advantage of that.

llvm-svn: 268328
2016-05-02 22:58:59 +00:00
Quentin Colombet 4e1d389ac5 [X86] Model FAULTING_LOAD_OP as a terminator and branch.
This operation may branch to the handler block and we do not want it
to happen anywhere within the basic block.
Moreover, by marking it "terminator and branch" the machine verifier
does not wrongly assume (because of AnalyzeBranch not knowing better)
the branch is analyzable. Indeed, the target was seeing only the
unconditional branch and not the faulting load op and thought it was
a simple unconditional block.
The machine verifier was complaining because of that and moreover,
other optimizations could have done wrong transformation!

In the process, simplify the representation of the handler block in
the faulting load op. Now, we directly reference the handler block
instead of using a label. This has the benefits of:
1. MC knows how to issue a label for a BB, so leave that to it.
2. Accessing the target BB from its label is painful, whereas it is
   direct from a MBB operand.

Note: The 2 bytes offset in implicit-null-check.ll comes from the
fact the unconditional jumps are not removed anymore, as the whole
terminator sequence is not analyzable anymore.

Will fix it in a subsequence commit.

llvm-svn: 268327
2016-05-02 22:58:54 +00:00
Wolfgang Pieb 56aa4b0629 DebugInfo: Avoid propagating incorrect debug locations in SelectionDAG via CSE.
Summary:
When SelectionDAG performs CSE it is possible that the context's source
location is different from that of the selected node. This can lead to
incorrect line number records. We update the debug location to the
one that occurs earlier in the instruction sequence.

This fixes PR21006.

Reviewers: echristo, sdmitrouk

Subscribers: jevinskie, asl, llvm-commits

Differential Revision: http://reviews.llvm.org/D12094

llvm-svn: 268323
2016-05-02 22:50:51 +00:00
Mehdi Amini 1e918c9cb3 Revert "ThinLTO: do not import function whose linkage prevents inlining."
This reverts commit r268315, the tests are not passing.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 268317
2016-05-02 22:26:04 +00:00
Mehdi Amini bda9b2ae9e ThinLTO: do not import function whose linkage prevents inlining.
There is not point in importing a "weak" or a "linkonce" function
since we won't be able to inline it anyway.
We already had a targeted check for WeakAny, this is using the
same check on GlobalValue as the inline, i.e.
isMayBeOverriddenLinkage()

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 268315
2016-05-02 22:11:27 +00:00
Kevin Enderby 64f7a995b0 Fix llvm-size to exit with non zero when it can’t open a file.
rdar://26027819

llvm-svn: 268313
2016-05-02 21:41:03 +00:00
Rafael Espindola 21507a4a5a Don't try to create thin bsd archives.
Not such variant has been specified yet.

llvm-svn: 268305
2016-05-02 21:06:57 +00:00
Frederic Riss bd126df21f [dsymutil] Create the temporary files in the system temp directory.
llvm-dsymutil used to create the temporary files in the output directory.
This works fine except when the output directory contains a '%' char, which
is then replaced by llvm::sys::fs::createUniqueFile() generating an invalid
path.
Just use the default temp dir for those files.

llvm-svn: 268304
2016-05-02 21:06:14 +00:00
Kevin Enderby 7bd8d99497 Thread Expected<...> up from libObject’s getType() for symbols to allow llvm-objdump to produce a good error message.
Produce another specific error message for a malformed Mach-O file when a symbol’s
section index is more than the number of sections.  The existing test case in test/Object/macho-invalid.test
for macho-invalid-section-index-getSectionRawName now reports the error with the message indicating
that a symbol at a specific index has a bad section index and that bad section index value.

Again converting interfaces to Expected<> from ErrorOr<> does involve
touching a number of places. Where the existing code reported the error with a
string message or an error code it was converted to do the same.

Also there some were bugs in the existing code that did not deal with the
old ErrorOr<> return values.  So now with Expected<> since they must be
checked and the error handled, I added a TODO and a comment:
"// TODO: Actually report errors helpfully" and a call something like
consumeError(NameOrErr.takeError()) so the buggy code will not crash
since needed to deal with the Error.

llvm-svn: 268298
2016-05-02 20:28:12 +00:00
Matt Arsenault bcdfee7030 AMDGPU: Custom lower v2i32 loads and stores
This will allow us to split up 64-bit private accesses when
necessary.

llvm-svn: 268296
2016-05-02 20:13:51 +00:00
Tom Stellard 154c9cdd24 AMDGPU/SI: Use v_readfirstlane_b32 when restoring SGPRs spilled to scratch
We were using v_readlane_b32 with the lane set to zero, but this won't
work if thread 0 is not active.

Differential Revision: http://reviews.llvm.org/D19745

llvm-svn: 268295
2016-05-02 20:11:44 +00:00
Matt Arsenault 2b957b5a6f AMDGPU: Make i64 loads/stores promote to v2i32
Now that unaligned access expansion should not attempt
to produce i64 accesses, we can remove the hack in
PreprocessISelDAG where this is done.

This allows splitting i64 private accesses while
allowing the new add nodes indexing the vector components
can be folded with the base pointer arithmetic.

llvm-svn: 268293
2016-05-02 20:07:26 +00:00
Simon Pilgrim 21b2c5660e [X86][AVX2] Added 128-bit wide shuffle test
Demonstrate missing 128-bit wide shuffle combine support

llvm-svn: 268290
2016-05-02 19:46:58 +00:00
Reid Kleckner bca59d2a43 Revert "[SimplifyCFG] Extend TryToSimplifyUncondBranchFromEmptyBlock for empty block including lifetime intrinsics"
This reverts commit r268254.

This change causes assertion failures while building Chromium. Reduced
test case coming soon.

llvm-svn: 268288
2016-05-02 19:43:22 +00:00
Tim Northover c08db1840c ARM: fix handling of SUB immediates in peephole opt.
We were negating an immediate that was going to be used in a SUBri form
unnecessarily. Since ADD/SUB are very similar we *can* do that, but we have to
change the SUB to an ADD at the same time. This also applies to ADD, and allows
us to handle a slightly larger range of immediates for those two operations.

rdar://25992245

llvm-svn: 268276
2016-05-02 18:30:08 +00:00
Justin Holewinski 9a6ea2c256 [NVPTX] Fix sign/zero-extending ldg/ldu instruction selection
Summary:
We don't have sign-/zero-extending ldg/ldu instructions defined,
so we need to emulate them with explicit CVTs. We were originally
handling the i8 case, but not any other cases.

Fixes PR26185

Reviewers: jingyue, jlebar

Subscribers: jholewinski

Differential Revision: http://reviews.llvm.org/D19615

llvm-svn: 268272
2016-05-02 18:12:02 +00:00
Zachary Turner 0eace0bae5 Parse PDB Name Hash Table
PDB has a lot of similar data structures.  We already have code
for parsing a Name Map, but PDB seems to have a different but
very similar structure that is a hash table.  This is the
beginning of code needed in order to parse the name hash table,
but it is not yet complete.  It parses the basic metadata of
the hash table, the bucket array, and the names buffer, but
doesn't use any of these fields yet as the data structure
requires a non-trivial amount of work to understand.

llvm-svn: 268268
2016-05-02 18:09:14 +00:00
Tom Stellard 1f520e5c98 AMDGPU/SI: Use the hazard recognizer to break SMEM soft clauses
Summary:
Add support for detecting hazards in SMEM soft clauses, so that we only
break the clauses when necessary, either by adding s_nop or re-ordering
other alu instructions.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18870

llvm-svn: 268260
2016-05-02 17:39:06 +00:00
Nicolai Haehnle 119d3d80cb AMDGPU: llvm.SI.fs.constant is a source of divergence
Summary:
This intrinsic is used to get flat-shaded fragment shader inputs. Those are
uniform across a primitive, but a fragment shader wave may process pixels from
multiple primitives (as indicated by the prim_mask), and so that's where
divergence can arise.

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19747

llvm-svn: 268259
2016-05-02 17:37:01 +00:00
Derek Schuff 31680dd832 [WebAssembly] Rename memory_size intrinsic to current_memory
This follows the recent renaming in the wasm spec.

llvm-svn: 268255
2016-05-02 17:25:22 +00:00
Hans Wennborg b7599329fc [SimplifyCFG] Extend TryToSimplifyUncondBranchFromEmptyBlock for empty block including lifetime intrinsics
Make it possible that TryToSimplifyUncondBranchFromEmptyBlock merges empty
basic block including lifetime intrinsics as well as phi nodes and
unconditional branch into its successor or predecessor(s).

If successor of empty block has single predecessor, all contents including
lifetime intrinsics are sinked into the successor. Otherwise, they are
hoisted into its predecessor(s) and then merged into the predecessor(s).

Patch by Josh Yoon <josh.yoon@samsung.com>!

Differential Revision: http://reviews.llvm.org/D19257

llvm-svn: 268254
2016-05-02 17:22:54 +00:00
Mehdi Amini 45c7b3ecb5 Move createReversePostOrderFunctionAttrsPass right after the inliner is done
This is where it was originally, until LoopVersioningLICM was
inserted before in r259986, I don't believe it was on purpose.

Differential Revision: http://reviews.llvm.org/D19809

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 268252
2016-05-02 16:53:16 +00:00
Pete Cooper 228b1e9a1f Add llvm-pdbdump to the tool substitutions list in lit. NFC.
This adds llvm-pdbdump to the list of tools which get printed with
the full path in verbose mode.  This makes it easier to take the
whole run line from verbose output and run it again without prepending
with the builds bin directory.

llvm-svn: 268250
2016-05-02 16:51:26 +00:00
Chad Rosier 84567343bc Remove extra whitespace. NFC.
llvm-svn: 268248
2016-05-02 16:45:00 +00:00
Tom Stellard a27007eb4f AMDGPU/SI: Use hazard recognizer to detect DPP hazards
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18603

llvm-svn: 268247
2016-05-02 16:23:09 +00:00
Sanjay Patel ec41cd2461 remove blank lines
llvm-svn: 268246
2016-05-02 15:49:09 +00:00
Sanjay Patel ebc0faa8d4 [InstCombine] regenerate checks
llvm-svn: 268245
2016-05-02 15:32:10 +00:00
Sanjay Patel 0d0181006a [InstCombine] regenerate checks
llvm-svn: 268244
2016-05-02 15:25:49 +00:00
Sanjay Patel 0b75fd81e1 [InstCombine] regenerate checks
llvm-svn: 268242
2016-05-02 15:21:41 +00:00
Sanjay Patel 933f9da43d [InstCombine] regenerate checks
llvm-svn: 268241
2016-05-02 15:18:13 +00:00
Sanjay Patel b193fe943f [InstCombine] regenerate checks
llvm-svn: 268239
2016-05-02 15:06:55 +00:00
Sanjay Patel 1540b19407 [InstCombine] regenerate checks
llvm-svn: 268232
2016-05-02 14:21:55 +00:00
David L Kreitzer 0fe4632bd7 Enable the X86 call frame optimization for the 64-bit targets that allow it.
Fixes PR27241.

Differential Revision: http://reviews.llvm.org/D19688

llvm-svn: 268227
2016-05-02 13:45:25 +00:00
Jonas Paulsson 1eb3486a7a [SystemZ] Temporarily disable codegen test int-add-12.ll.
This checks for AGSI transformation, which is temporarily disabled.

llvm-svn: 268219
2016-05-02 10:42:47 +00:00
Davide Italiano 22b3ad8630 [llvm-readobj] Dump hash as part of -version-info.
llvm-svn: 268210
2016-05-02 02:30:18 +00:00
Davide Italiano 4f277763cf [GlobalDCE] Modernize. Use FileCheck instead of grep.
llvm-svn: 268207
2016-05-01 22:51:14 +00:00
Simon Pilgrim ca140b17cb [InstCombine][SSE] Added support to VPERMD/VPERMPS to shuffle combine to accept UNDEF elements.
llvm-svn: 268206
2016-05-01 20:43:02 +00:00
Simon Pilgrim c590492075 Dropped FIXME comment
llvm-svn: 268205
2016-05-01 20:33:25 +00:00
Simon Pilgrim eeacc40e27 [InstCombine][SSE] Added support to VPERMILVAR to shuffle combine to accept UNDEF elements.
llvm-svn: 268204
2016-05-01 20:22:42 +00:00
Simon Pilgrim cc7f567b6a [InstCombine][AVX] Fixed PERMILVAR identity tests and added additional decode tests
llvm-svn: 268203
2016-05-01 20:06:47 +00:00
Simon Pilgrim e5e8c2fde0 [InstCombine][SSE] Added support to PSHUFB to shuffle combine to accept UNDEF elements.
llvm-svn: 268202
2016-05-01 19:26:21 +00:00
Simon Pilgrim cae3e70707 [InstCombine][SSE] Regenerate MOVSX/MOVZX tests
llvm-svn: 268201
2016-05-01 18:28:45 +00:00
Craig Topper b6da65403a [AVX512] VPACKUSWB/VPACKSSWB should not be encoded with EVEX.W=1. While there fix the execution domain for VPACKSSDW/VPACKUSDW.
llvm-svn: 268200
2016-05-01 17:38:32 +00:00
Simon Pilgrim 8cddf8b3c6 [InstCombine][AVX2] Combine VPERMD/VPERMPS intrinsics with constant masks to shufflevector.
llvm-svn: 268199
2016-05-01 16:41:22 +00:00
Igor Breger 110af565c7 getelementptr instruction, support index vector of EVT.
Differential Revision: http://reviews.llvm.org/D19775

llvm-svn: 268195
2016-05-01 13:29:12 +00:00
Igor Breger 131008fbcb Change AVX512 braodcastsd/ss patterns interaction with spilling . New implementation take a scalar register and generate a vector without COPY_TO_REGCLASS (turn it into a VR128 register ) .The issue is that during register allocation we may spill a scalar value using 128-bit loads and stores, wasting cache bandwidth.
Differential Revision: http://reviews.llvm.org/D19579

llvm-svn: 268190
2016-05-01 08:40:00 +00:00
Craig Topper e430de8be6 [AVX512] Prefer AVX512 VPACK instructions over AVX/AVX2 instructions when VLX and BWI are supported.
llvm-svn: 268189
2016-05-01 06:52:19 +00:00
Sanjoy Das f2f00fb11a [SCEV] When printing via -analysis, dump loop disposition
There are currently some bugs in tree around SCEV caching an incorrect
loop disposition.  Printing out loop dispositions will let us write
whitebox tests as those are fixed.

The dispositions are printed as a list in "inside out" order,
i.e. innermost loop first.

llvm-svn: 268177
2016-05-01 04:51:05 +00:00
Simon Pilgrim c179435055 [InstCombine][AVX2] Added VPERMD/VPERMPS shuffle combining placeholder tests.
For future support for VPERMD/VPERMPS to generic shuffles combines

llvm-svn: 268166
2016-04-30 20:41:52 +00:00
Simon Pilgrim 8e38a5439b [InstCombine][AVX] Split off VPERMILVAR tests and added additional tests for UNDEF mask elements
llvm-svn: 268159
2016-04-30 07:32:19 +00:00
Tom Stellard c51e4468b7 AMDGPU/SI: Remove wait state handling for SMRD in SIInsertWaits
This was supposed to be part of r268143.

llvm-svn: 268154
2016-04-30 04:04:48 +00:00
Amjad Aboud 72da9391f0 Reverting 268054 & 268063 as they caused PR27579.
llvm-svn: 268150
2016-04-30 01:44:07 +00:00
Sanjoy Das 47cf2affbd [LowerGuardIntrinsics] Keep track of !make.implicit metadata
If a guard call being lowered by LowerGuardIntrinsics has the
`!make.implicit` metadata attached, then reattach the metadata to the
branch in the resulting expanded form of the intrinsic.  This allows us
to implement null checks as guards and still get the benefit of implicit
null checks.

llvm-svn: 268148
2016-04-30 00:55:59 +00:00
Lawrence Hu 1befea2bdc Reroll loops with multiple IV and negative step part 3
support multiple induction variables

    This patch enable loop reroll for the following case:
        for(int i=0;  i<N; i += 2) {
           S += *a++;
           S += *a++;
        };

Differential Revision: http://reviews.llvm.org/D16550

llvm-svn: 268147
2016-04-30 00:51:22 +00:00
Tom Stellard cb6ba62d6f AMDGPU/SI: Enable the post-ra scheduler
Summary:
This includes a hazard recognizer implementation to replace some of
the hazard handling we had during frame index elimination.

Reviewers: arsenm

Subscribers: qcolombet, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18602

llvm-svn: 268143
2016-04-30 00:23:06 +00:00
Sanjoy Das 52c68bb0f5 [LowerGuardIntrinsics] Preserve calling conv when lowering
llvm-svn: 268142
2016-04-30 00:17:47 +00:00
Sanjay Patel bc6fad0bdf add minimal test to show dropped metadata
llvm-svn: 268141
2016-04-30 00:12:54 +00:00
Sanjay Patel 6748ec49e9 remove the metadata added with r267827
We can demonstrate the 'select' bug and fix with a simpler test case.
The merged weight values are already tested in another test.

llvm-svn: 268139
2016-04-30 00:02:36 +00:00
Sanjoy Das 107aefc2fc Mark guards on true as "trivially dead"
This moves some logic added to EarlyCSE in rL268120 into
`llvm::isInstructionTriviallyDead`.  Adds a test case for DCE to
demonstrate that passes other than EarlyCSE can now pick up on the new
information.

llvm-svn: 268126
2016-04-29 22:23:16 +00:00
Haicheng Wu 4afe0425db [MBP] Use Function::optForSize() instead of checking OptimizeForSize directly.
Fix a FIXME.  Disable loop alignment if compiled with -Oz now.

llvm-svn: 268121
2016-04-29 22:01:10 +00:00
Sanjoy Das ee81b23fe7 [EarlyCSE] Simplify guard intrinsics
Summary:
This change teaches EarlyCSE some basic properties of guard intrinsics:

 - Guard intrinsics read all memory, but don't write to any memory
 - After a guard has executed, the condition it was guarding on can be
   assumed to be true
 - Guard intrinsics on a constant `true` are no-ops

Reviewers: reames, hfinkel

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19578

llvm-svn: 268120
2016-04-29 21:52:58 +00:00
Matt Arsenault 701c21ea10 AMDGPU: Fix crash with unreachable terminators.
If a block has no successors because it ends in unreachable,
this was accessing an invalid iterator.

Also stop counting instructions that don't emit any
real instructions.

llvm-svn: 268119
2016-04-29 21:52:13 +00:00
Sriraman Tallam 7da9b445ea Differential Revision: http://reviews.llvm.org/D19733
llvm-svn: 268106
2016-04-29 21:19:16 +00:00
Matt Arsenault dc4ebad6d4 AMDGPU: Add kernarg.segment.ptr intrinsic
llvm-svn: 268105
2016-04-29 21:16:52 +00:00
Chad Rosier cd62bf5821 [InstCombine] Determine the result of a select based on a dominating condition.
Differential Revision: http://reviews.llvm.org/D19550

llvm-svn: 268104
2016-04-29 21:12:31 +00:00
Matt Arsenault ab2232cf73 DAGCombiner: Reduce truncated shl width
llvm-svn: 268094
2016-04-29 19:53:16 +00:00
David Majnemer d2a074b1f4 [ValueTracking] matchSelectPattern needs to be more careful around FP
matchSelectPattern attempts to see through casts which mask min/max
patterns from being more obvious.  Under certain circumstances, it would
misidentify a sequence of instructions as a min/max because it assumed
that folding casts would preserve the result.  This is not the case for
floating point <-> integer casts.

This fixes PR27575.

llvm-svn: 268086
2016-04-29 18:40:34 +00:00
Artem Tamazov 5d3ae19bdf [AMDGPU][llvm-mc] Add some missing testcases to trap.s
Differential Revision: http://reviews.llvm.org/D19602

llvm-svn: 268073
2016-04-29 17:41:44 +00:00
Geoff Berry b92cd5293e [BasicAA] Treat llvm.assume as not accessing memory in getModRefBehavior(Function)
Reviewers: dberlin, chandlerc, hfinkel, reames, sanjoy

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19730

llvm-svn: 268068
2016-04-29 17:18:28 +00:00
Artem Tamazov 38e496b175 Fixed/Recommitted r267733 "[AMDGPU][llvm-mc] Add support of TTMP quads. Rework M0 exclusion for SMRD."
Previously reverted by r267752.

r267733 review:
Differential Revision: http://reviews.llvm.org/D19342

llvm-svn: 268066
2016-04-29 17:04:50 +00:00
Guozhi Wei fa3e04298b [PPC] Enable shuffling of VSX vectors
This patch fixes PR27078 by enabling shuffling of vectors if VSX is available.

llvm-svn: 268064
2016-04-29 17:00:54 +00:00
Amjad Aboud ee04164599 Fixed LIT tests that was broken after change in r268054.
llvm-svn: 268063
2016-04-29 16:54:18 +00:00
Sanjay Patel 362dcf9615 auto-generate checks
llvm-svn: 268061
2016-04-29 16:39:37 +00:00
Daniel Sanders 7225cd52e7 [mips][ias] Move createCpRestoreMemOp to MipsTargetStreamer. NFC.
Summary:
This removes the temporary call to isIntegratedAssemblerRequired() which was
added recently. It's effect is now acheived directly in the MipsTargetStreamer
hierarchy.

Reviewers: sdardis

Subscribers: dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D19715

llvm-svn: 268058
2016-04-29 16:16:49 +00:00
Amjad Aboud 293ee8bba1 Recommitted r264280 "Supporting all entities declared in lexical scope in LLVM debug info."
After fixing PR26942 in r267004.

llvm-svn: 268054
2016-04-29 16:07:55 +00:00
Simon Dardis d8bceb9d3a [mips][FastISel] A store is not a load.
Correct trivial error. One of the failing tests from PR/27458.

Reviewers: dsanders, vkalintiris, mcrosier

Differential Review: http://reviews.llvm.org/D19726

llvm-svn: 268053
2016-04-29 16:07:47 +00:00
Krzysztof Parzyszek f5cbac93eb [Hexagon] Optimize addressing modes for load/store
Patch by Jyotsna Verma.

llvm-svn: 268051
2016-04-29 15:49:13 +00:00
Tom Stellard 92b24f324b AMDGPU/SI: Add offset field to ds_permute/ds_bpermute instructions
Summary:
These instructions can add an immediate offset to the address, like other
ds instructions.

Reviewers: arsenm

Subscribers: arsenm, scchan

Differential Revision: http://reviews.llvm.org/D19233

llvm-svn: 268043
2016-04-29 14:34:26 +00:00
Daniel Sanders fba875f902 [mips][ias] Split expandMemInst between MipsAsmParser and MipsTargetStreamer. Almost NFC.
Summary:
The portion in MipsAsmParser is responsible for figuring out which expansion to
use, while the portion in MipsTargetStreamer is responsible for emitting it.

This allows us to remove the call to isIntegratedAssemblerRequired() which is
currently ensuring the effect of .cprestore only occurs when writing objects.

The small functional change is that the memory offsets are now correctly
printed as signed values.

Reviewers: sdardis

Subscribers: dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D19714

llvm-svn: 268042
2016-04-29 13:43:45 +00:00
Daniel Sanders 9db710a171 [mips][ias] Make section sizes a multiple of the alignment.
Reviewers: sdardis

Subscribers: dsanders, llvm-commits, sdardis

Differential Revision: http://reviews.llvm.org/D19008

llvm-svn: 268036
2016-04-29 12:44:07 +00:00
Simon Pilgrim 07a691c706 [InstCombine][SSE] Added x86 pshufb undef mask tests
FIXME: We currently don't support folding constant pshufb shuffle masks containing undef elements.
llvm-svn: 268016
2016-04-29 09:13:53 +00:00
Nikolay Haustov 4f672a34ed AMDGPU/SI: Assembler: Unify parsing/printing of operands.
Summary:
The goal is for each operand type to have its own parse function and
at the same time share common code for tracking state as different
instruction types share operand types (e.g. glc/glc_flat, etc).

Introduce parseAMDGPUOperand which can parse any optional operand.
DPP and Clamp/OMod have custom handling for now. Sam also suggested
to have class hierarchy for operand types instead of table. This
can be done in separate change.

Remove parseVOP3OptionalOps, parseDS*OptionalOps, parseFlatOptionalOps,
parseMubufOptionalOps, parseDPPOptionalOps.
Reduce number of definitions of AsmOperand's and MatchClasses' by using common base class.
Rename AsmMatcher/InstPrinter methods accordingly.
Print immediate type when printing parsed immediate operand.
Use 'off' if offset/index register is unused instead of skipping it to make it more readable (also agreed with SP3).
Update tests.

Reviewers: tstellarAMD, SamWot, artem.tamazov

Subscribers: qcolombet, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19584

llvm-svn: 268015
2016-04-29 09:02:30 +00:00
Simon Pilgrim 5779fb61b0 [InstCombine][SSE] Regenerated x86 pshufb tests
llvm-svn: 268014
2016-04-29 08:53:35 +00:00
Zlatko Buljan 531809d340 [mips][microMIPS] Fix offsets for LLE, LWE, SBE, SCE and SHE instructions
Differential Revision: http://reviews.llvm.org/D18645

llvm-svn: 268012
2016-04-29 08:36:54 +00:00
David Majnemer 1a5799fe3e [DeadArgumentElimination] Propagate operand bundles to promoted call sites
We neglected to transfer operand bundles when performing argument
promotion.

llvm-svn: 268008
2016-04-29 07:22:36 +00:00
Adam Nemet c60d8f8fd0 [LoopDist] Add missing RUN line in test from r268006
llvm-svn: 268007
2016-04-29 07:16:00 +00:00
Adam Nemet 88ec491830 [LoopDist] Also emit optimization remark on success (-Rpass=)
The option -Rpass=loop-distribute now reports the loops that were
distributed.

llvm-svn: 268006
2016-04-29 07:10:46 +00:00
David Majnemer 13d5526392 [SLPVectorizer] Add operand bundles to vectorized functions
SLPVectorizing a call site should result in further propagation of its
bundles.

llvm-svn: 268004
2016-04-29 07:09:51 +00:00
David Majnemer 50ddc0e1b6 [LoopVectorize] Add operand bundles to vectorized functions
Also, do not crash when calculating a cost model for loop-invariant
token values.

llvm-svn: 268003
2016-04-29 07:09:48 +00:00
Matt Arsenault 7d1b6c81af AMDGPU: Stop reporting an addressing mode for unknown addrspace
This was being treated the same as private, which has an immediate
offset. For unknown, it probably means it's for a computation not
actually being used for accessing memory, so it should not have a
nontrivial addressing mode.

llvm-svn: 268002
2016-04-29 06:25:10 +00:00
Matt Arsenault 790eb1c490 DivergenceAnalysis: Fix crash with unreachable blocks
Unreachable blocks may not be in the dominator tree,
so don't crash on them.

llvm-svn: 268001
2016-04-29 06:17:47 +00:00
David Majnemer cd24bb1d3a [ArgumentPromotion] Propagate operand bundles to promoted call sites
We neglected to transfer operand bundles when performing argument
promotion.

This fixes PR27568.

llvm-svn: 267986
2016-04-29 04:56:12 +00:00
Michael Zolotukhin 1816d03b7d [PR25281] Remove AAResultsWrapper from preserved analyses of loop vectorizer.
We don't preserve AAResults, because, for one, we don't preserve SCEV-AA.
That fixes PR25281.

llvm-svn: 267980
2016-04-29 03:31:25 +00:00
Matthias Braun f3619b8212 RegisterPressure: Fix default lanemask for missing regunit intervals
In case of missing live intervals for a physical registers
getLanesWithProperty() would report 0 which was not a safe default in
all situations. Add a parameter to pass in a safe default.
No testcase because in-tree targets do not skip computing register unit
live intervals.

Also cleanup the getXXX() functions to not perform the
RequireLiveIntervals checks anymore so we do not even need to return
safe defaults.

llvm-svn: 267977
2016-04-29 02:44:54 +00:00
Vedant Kumar 78b8bc29db [llvm-cov] Don't emit 'nan%' in reports
llvm-svn: 267971
2016-04-29 01:31:49 +00:00
Hal Finkel 1b66f7e3c8 [LoopVectorize] Keep hints from original loop on the vector loop
We need to keep loop hints from the original loop on the new vector loop.
Failure to do this meant that, for example:

  void foo(int *b) {
  #pragma clang loop unroll(disable)
    for (int i = 0; i < 16; ++i)
      b[i] = 1;
  }

this loop would be unrolled. Why? Because we'd vectorize it, thus dropping the
hints that unrolling should be disabled, and then we'd unroll it.

llvm-svn: 267970
2016-04-29 01:27:40 +00:00
David Majnemer 1573b242ae [llvm-pdbdump] Restore error messages, handle bad block sizes
We lost the ability to report errors, bring it back.  Also, correctly
validate the block size.

llvm-svn: 267955
2016-04-28 23:47:27 +00:00
David Majnemer 5baa2bc2e1 [llvm-pdbdump] Correctly read data larger than a block
A bug was introduced when the code was refactored which resulted in a
bad memory access.

This fixes PR27565.

llvm-svn: 267953
2016-04-28 23:24:23 +00:00
Adam Nemet 0ba164bbcb [LoopDist] Emit optimization remarks (-Rpass*)
I closely followed the precedents set by the vectorizer:

* With -Rpass-missed, the loop is reported with further details pointing
to -Rpass--analysis.

* -Rpass-analysis reports the details why distribution has failed.

* Regardless of -Rpass*, when distribution fails for a loop where
distribution was forced with the pragma, a warning is produced according
to -Wpass-failed.  In this case the analysis info is also printed even
without -Rpass-analysis.

llvm-svn: 267952
2016-04-28 23:08:32 +00:00
Hal Finkel 50316d95a9 [Inliner] Preserve llvm.mem.parallel_loop_access metadata
When inlining a call site with llvm.mem.parallel_loop_access metadata, this
metadata needs to be propagated to all cloned memory-accessing instructions.
Otherwise, inlining parts of the loop body will invalidate the annotation.

With this functionality, we now vectorize the following as expected:

  void Body(int *res, int *c, int *d, int *p, int i) {
    res[i] = (p[i] == 0) ? res[i] : res[i] + d[i];
  }

  void Test(int *res, int *c, int *d, int *p, int n) {
    int i;

  #pragma clang loop vectorize(assume_safety)
    for (i = 0; i < 1600; i++) {
      Body(res, c, d, p, i);
    }
  }

llvm-svn: 267949
2016-04-28 23:00:04 +00:00
Dehao Chen 1b54fce319 Read discriminators correctly from object file.
Summary:
This is the follow-up patch for http://reviews.llvm.org/D19436
* Update the discriminator reading algorithm to match the assignment algorithm.
* Add test to cover the new algorithm.

Reviewers: dnovillo, echristo, dblaikie

Subscribers: danielcdh, dblaikie, echristo, llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D19522

llvm-svn: 267945
2016-04-28 22:09:37 +00:00
Marcin Koscielnicki 7b32957852 [PowerPC] Fix the EH_SjLj_Setup pseudo.
This instruction is just a control flow marker - it should not
actually exist in the object file.  Unfortunately, nothing catches
it before it gets to AsmPrinter.  If integrated assembler is used,
it's considered to be a normal 4-byte instruction, and emitted as
an all-0 word, crashing the program.  With external assembler,
a comment is emitted.

Fixed by setting Size to 0 and handling it in MCCodeEmitter - this
means the comment will still be emitted if integrated assembler
is not used.

This broke an ASan test, which has been disabled for a long time
as a result (see the discussion on D19657).  We can reenable it
once this lands.

llvm-svn: 267943
2016-04-28 21:24:37 +00:00
Kevin Enderby 1be37a3522 Fix a bug in llvm-objdump for -private-headers printing the LC_CODE_SIGNATURE Mach-O load command.
rdar://25985653

llvm-svn: 267940
2016-04-28 21:07:20 +00:00
Krzysztof Parzyszek c5a4e26410 [RDF] Improve handling of inline-asm
- Keep implicit defs from inline-asm instructions.
- Treat register references from inline-asm as fixed.

llvm-svn: 267936
2016-04-28 20:33:33 +00:00
Kevin Enderby 4b627beea8 Update llvm-objdump for disassembly of ARM Mach-O files to always include the opcode bytes.
As this is the expected behavior of the old darwin otool(1) for ARM Mach-O files.

rdar://25896249

llvm-svn: 267929
2016-04-28 20:14:13 +00:00
Zachary Turner 84c3a8ba3d Read the rest of the DBI substreams, and parse source info.
We now read out the rest of the substreams from the DBI streams.  One of
these substreams, the FileInfo substream, contains information about which
source files contribute to each module (aka compiland).  This patch
additionally parses out the file information from that substream, and
dumps it in llvm-pdbdump.

Differential Revision: http://reviews.llvm.org/D19634
Reviewed by: ruiu

llvm-svn: 267928
2016-04-28 20:05:18 +00:00
Kit Barton 7a1a9e01ad This reverts commit r265505.
Revert "[Power9] Implement add-pc, multiply-add, modulo, extend-sign-shift, random number, set bool, and dfp test significance".
This patch has caused a functional regression in SPEC2k6 namd, and a performance regression in mesa-pipe.

llvm-svn: 267927
2016-04-28 20:00:42 +00:00
Krzysztof Parzyszek e5fcce2d2b [Hexagon] Add instruction aliases for vector unsigned compare-equal
Unsigned compare-equal instructions are mapped to signed compare-equal.

llvm-svn: 267925
2016-04-28 19:49:18 +00:00
Matt Arsenault 1c4d0efe56 AMDGPU: Emit error if too much LDS is used
llvm-svn: 267922
2016-04-28 19:37:35 +00:00
Krzysztof Parzyszek 7ea9a529aa Reset the TopRPTracker's position in ScheduleDAGMILive::initQueues
ScheduleDAGMI::initQueues changes the RegionBegin to the first non-debug
instruction. Since it does not track register pressure, it does not affect
any RP trackers. ScheduleDAGMILive inherits initQueues from ScheduleDAGMI,
and it does reset the TopTPTracker in its schedule method. Any derived,
target-specific scheduler will need to do it as well, but the TopRPTracker
is only exposed as a "const" object to derived classes. Without the ability
to modify the tracker directly, this leaves a derived scheduler with a
potential of having the TopRPTracker out-of-sync with the CurrentTop.

The symptom of the problem:
  void llvm::ScheduleDAGMILive::scheduleMI(llvm::SUnit *, bool):
  Assertion `TopRPTracker.getPos() == CurrentTop && "out of sync"' failed.

Differential Revision: http://reviews.llvm.org/D19438

llvm-svn: 267918
2016-04-28 19:17:44 +00:00
Matt Arsenault c5fce69031 AMDGPU: Fix mishandling array allocations when promoting alloca
The canonical form for allocas is a single allocation of the array type.
In case we see a non-canonical array alloca, make sure we aren't
replacing this with an array N times smaller.

llvm-svn: 267916
2016-04-28 18:38:48 +00:00
Krzysztof Parzyszek 0e7d2d339d [Hexagon] Define certain aliases for vector instructions
Specifically:
  Vd = #0   -> Vd = vxor(Vd, Vd)
  Vdd = #0  -> Vdd.w = vsub(Vdd.w, Vdd.w)
  Vdd = Vss -> Vdd = vcombine(Vss.H, Vss.L)

llvm-svn: 267901
2016-04-28 16:43:16 +00:00
Simon Dardis a2d8cc3db9 [mips][atomics] Fix partword atomic binary operation implementation
Currently Mips::emitAtomicBinaryPartword() does not properly respect the
width of pointers. For MIPS64 this causes the memory address that the ll/sc
sequence uses to be truncated. At runtime this causes a segmentation fault.

This can be fixed by applying similar changes as r266204, so that a full 64bit
pointer is loaded.

Reviewers: dsanders

Differential Review: http://reviews.llvm.org/D19651

llvm-svn: 267900
2016-04-28 16:26:43 +00:00
Arch D. Robison 0e61034018 [SLPVectorizer] Extend SLP Vectorizer to deal with aggregates.
The refactoring portion part was done as r267748.

http://reviews.llvm.org/D14185

llvm-svn: 267899
2016-04-28 16:11:45 +00:00
Krzysztof Parzyszek e737b86f8c [Hexagon] Handle double-vector registers as new-value producers
Patch by Colin LeMahieu.

llvm-svn: 267897
2016-04-28 15:54:48 +00:00
Adrian Prantl e5447574c8 Debug Info: Restore the pre-r240853 behavior for DWARF2 bitfields.
The DWARF2 specification of DW_AT_bit_offset is ambiguous for
little-endian machines, but by restoring to the old behavior
we match what debuggers expect and what other popular compilers
generate.

llvm-svn: 267896
2016-04-28 15:37:52 +00:00
Adrian Prantl f393d313ec Debug info: Support DWARF4 bitfields via DW_AT_data_bit_offset.
The DWARF2 specification of DW_AT_bit_offset was written from the perspective of
a big-endian machine with unclear semantics for other systems.  DWARF4
deprecated DW_AT_bit_offset and introduced a new attribute DW_AT_data_bit_offset
that simply counts the number of bits from the beginning of the containing
entity regardless of endianness.

After this patch LLVM emits DW_AT_bit_offset for DWARF 2 or 3 and
DW_AT_data_bit_offset when DWARF 4 or later is requested.

llvm-svn: 267895
2016-04-28 15:37:48 +00:00
Krzysztof Parzyszek efd72857a3 [RDF] Handle undefined registers in RDF copy propagation
When updating the graph, make sure that new uses without reaching defs
are handled correctly.

llvm-svn: 267891
2016-04-28 15:09:19 +00:00
Simon Pilgrim bd4a3be7d2 [InstCombine][SSE] Add MOVMSK support to SimplifyDemandedUseBits
The MOVMSK instructions copies a vector elements' sign bits to the low bits of a scalar register and zeros the high bits.

This patch adds MOVMSK support to SimplifyDemandedUseBits so that its aware that the upper bits are known to be zero. It also removes the call to MOVMSK if none of the lower bits are actually required and just returns zero.

Differential Revision: http://reviews.llvm.org/D19614

llvm-svn: 267873
2016-04-28 12:22:53 +00:00
Matthias Braun fbe85ae12e CodeGen: Add DetectDeadLanes pass.
The DetectDeadLanes pass performs a dataflow analysis of used/defined
subregister lanes across COPY instructions and instructions that will
get lowered to copies. It detects dead definitions and uses reading
undefined values which are obscured by COPY and subregister usage.

These dead definitions cause trouble in the register coalescer which
cannot deal with definitions suddenly becoming dead after coalescing
COPY instructions.

For now the pass only adds dead and undef flags to machine operands. It
should be possible to extend it in the future to remove the dead
instructions and redo the analysis for the affected virtual
registers.

Differential Revision: http://reviews.llvm.org/D18427

llvm-svn: 267851
2016-04-28 03:07:16 +00:00
Sanjay Patel 21bd38a07b Update test to use FileCheck
Also, add some metadata to show what that currently looks like.

llvm-svn: 267827
2016-04-28 00:29:27 +00:00
Bryan Chan 893110ecaf [SystemZ] Support Swift Calling Convention
Summary:
Port rL265480, rL264754, rL265997 and rL266252 to SystemZ, in order to enable the Swift port on the architecture. SwiftSelf and SwiftError are assigned to R10 and R9, respectively, which are normally callee-saved registers. For more information, see:

RFC: Implementing the Swift calling convention in LLVM and Clang
https://groups.google.com/forum/#!topic/llvm-dev/epDd2w93kZ0

Reviewers: kbarton, manmanren, rjmccall, uweigand

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19414

llvm-svn: 267823
2016-04-28 00:17:23 +00:00
Peter Collingbourne edf8432480 LTO: Don't bother trying to mangle unnamed globals, as they can't be preserved with MustPreserveSymbols.
Summary: Should fix sanitizer-windows bot.

Reviewers: joker.eph

Subscribers: llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D19635

llvm-svn: 267820
2016-04-27 23:48:11 +00:00
Kevin Enderby 8eccdad5ec Fix bugs in llvm-objdump printing the last word for -section in non i386 and x86 files.
Two problems, 1) for the last 4 bytes it would print them as separate bytes not a word
and 2) it would print the same last byte for those bytes less than a word.

rdar://25938224

llvm-svn: 267819
2016-04-27 23:43:00 +00:00
Zachary Turner 1822af542f Parse module information from DBI stream.
This gets more data out of the DBI strema of the PDB.  In
particular it extracts the metadata for the list of modules
(compilands) that this PDB contains info about, and adds support
for dumping these fields to llvm-pdbdump.

Differential Revision: http://reviews.llvm.org/D19570
Reviewed By: ruiu

llvm-svn: 267818
2016-04-27 23:41:42 +00:00
Rong Xu a4c3f67fe8 more buildbot failure fix to r267792
__llvm_prf_nm length is embedded in llvm_used. Relax llvm_used check.

llvm-svn: 267816
2016-04-27 23:23:53 +00:00
Rong Xu 6e34c490ff [PGO] Promote indirect calls to conditional direct calls with value-profile
This patch implements the transformation that promotes indirect calls to
conditional direct calls when the indirect-call value profile meta-data is
available.

Differential Revision: http://reviews.llvm.org/D17864

llvm-svn: 267815
2016-04-27 23:20:27 +00:00
Mitch Bodart e60465ddf7 [X86] Enable the post-RA-scheduler for clang's default 32-bit cpu.
For compilations with no explicit cpu specified, this exhibits
nice gains on Silvermont, with neutral performance on big cores.

Differential Revision: http://reviews.llvm.org/D19138

llvm-svn: 267809
2016-04-27 22:52:35 +00:00
Kevin Enderby c493085c8d Fix a bug in llvm-objdump printing of 32-bit addresses for -section in non i386 and x86 files.
rdar://25896202

llvm-svn: 267807
2016-04-27 22:36:18 +00:00
Quentin Colombet bf200688de [X86][FastISel] Make sure we use the right register class when we select stores.
llvm-svn: 267806
2016-04-27 22:33:42 +00:00
Rong Xu 4b1dc5d60b Fix buildbot failure due to r267792
Relax the test check as some targets do not have name compression.

llvm-svn: 267803
2016-04-27 22:06:35 +00:00
Colin LeMahieu a3782da3e3 [Hexagon] Merging nops in to previous packet rather than always creating a new one.
llvm-svn: 267798
2016-04-27 21:37:44 +00:00
Quentin Colombet d6dbec4c6f [X86] Fix the lowering of TLS calls.
The callseq_end node must be glued with the TLS calls, otherwise,
the generic code will miss the uses of the returned value and will
mark it dead.
Moreover, TLSCall 64-bit pseudo must not set an implicit-use on RDI,
the pseudo uses the symbol address at this point not RDI and the
lowering will do the right thing.

llvm-svn: 267797
2016-04-27 21:37:37 +00:00
Rong Xu af5aebaa32 [PGO] Prohibit address recording if the function is both internal and COMDAT
Differential Revision: http://reviews.llvm.org/D19515

llvm-svn: 267792
2016-04-27 21:17:30 +00:00
Matt Arsenault 0547b016b1 AMDGPU: Account for globals in AMDGPUPromoteAlloca pass
Patch by Bas Nieuwenhuizen

llvm-svn: 267791
2016-04-27 21:05:08 +00:00
Kevin Enderby 60c4e6aafa Add a test case for the crash fixed with r267037. David Blaikie said it would be nice to have!
This was crashing llvm-objdump with -macho -objc-meta-data when trying dump a non-existent section.
So the test binary is simply created from an empty .s file compiled with: clang -arch armv7 empty.s -c

llvm-svn: 267782
2016-04-27 20:37:06 +00:00
Ahmed Bougacha 9e71425f54 [AArch64] Set correct successors in CMPXCHG pseudo expansion.
transferSuccessors() would LoadCmpBB a successor of DoneBB,
whereas it should be a successor of the original MBB.

Follow-up to r266339.

Unfortunately, it's tricky to catch this in the verifier.

llvm-svn: 267779
2016-04-27 20:33:02 +00:00
Ahmed Bougacha b4af107239 [ARM] Set correct successors in CMPXCHG pseudo expansion.
transferSuccessors() would LoadCmpBB a successor of DoneBB, whereas
it should be a successor of the original MBB.

The testcase changes are caused by Thumb2SizeReduction, which
was previously confused by the broken CFG.

Follow-up to r266679.

Unfortunately, it's tricky to catch this in the verifier.

llvm-svn: 267778
2016-04-27 20:32:54 +00:00
Simon Pilgrim 3f595aabe2 [InstCombine][AVX2] Add AVX2 per-element vector shift tests
At the moment we don't simplify PSRAV/PSRLV/PSLLV intrinsics to generic IR for constant shift amounts, but we could.

llvm-svn: 267777
2016-04-27 20:25:34 +00:00
Kevin B. Smith c378a99ba5 [X86]: Quit promoting 16 bit loads to 32 bit.
Differential Revision: http://reviews.llvm.org/D19592

llvm-svn: 267773
2016-04-27 19:58:03 +00:00
David Majnemer 0c80e2eac6 [CodeGenPrepare] Don't sink a cast past its user
The sink cast machinery is supposed to sink casts as close to their user
as possible.  However, an EH pad is the first instruction in it's basic
block.  Don't sink if the user is an EH pad.

This fixes PR27536.

llvm-svn: 267767
2016-04-27 19:36:38 +00:00
Ahmed Bougacha ace97c1f7d [LIR] Set attributes on memset_pattern16.
"inferattrs" will deduce the attribute, but it will be too late for
many optimizations. Set it ourselves when creating the call.

Differential Revision: http://reviews.llvm.org/D17598

llvm-svn: 267762
2016-04-27 19:04:50 +00:00
Ahmed Bougacha 44c19876c7 [InferAttrs] Mark memset_pattern16 params nocapture.
Differential Revision: http://reviews.llvm.org/D19471

llvm-svn: 267760
2016-04-27 19:04:43 +00:00
Chad Rosier 03e1647d19 Revert "[AMDGPU][llvm-mc] Add support of TTMP quads. Rework M0 exclusion for SMRD."
This reverts commit r267733 due to a -Werror,-Wunused-function error.

llvm-svn: 267752
2016-04-27 18:29:11 +00:00
Matthew Simpson 622b95be7b [LV] Reallow positive-stride interleaved load groups with gaps
We previously disallowed interleaved load groups that may cause us to
speculatively access memory out-of-bounds (r261331). We did this by ensuring
each load group had an access corresponding to the first and last member.
Instead of bailing out for these interleaved groups, this patch enables us to
peel off the last vector iteration, ensuring that we execute at least one
iteration of the scalar remainder loop. This solution was proposed in the
review of the previous patch.

Differential Revision: http://reviews.llvm.org/D19487

llvm-svn: 267751
2016-04-27 18:21:36 +00:00
Marcin Koscielnicki 7efdca5622 [Mips] Add support for llvm.thread.pointer intrinsic.
This will be used to implement __builtin_thread_pointer in clang.

Differential Revision: http://reviews.llvm.org/D19569

llvm-svn: 267743
2016-04-27 17:21:49 +00:00
Gerolf Hoflehner 88017c08a6 [InstCombine] Sharpended test case in pr21210.ll
llvm-svn: 267742
2016-04-27 17:19:54 +00:00
Artem Tamazov 3896f8f83d [AMDGPU][llvm-mc] Add support of TTMP quads. Rework M0 exclusion for SMRD.
Added support of TTMP quads.
Reworked M0 exclusion machinery for SMRD and similar instructions
to enable usage of TTMP registers in those instructions as destinations.
Tests added.

Differential Revision: http://reviews.llvm.org/D19342

llvm-svn: 267733
2016-04-27 16:20:23 +00:00
Reid Kleckner 0336cc05e7 [PDB] Fix function names for private symbols in PDBs
Summary:
llvm-symbolizer wants to get linkage names of functions for historical
reasons. Linkage names are only recorded in the PDB for public symbols,
and the linkage name is apparently stored separately in some "public
symbol" record. We had a workaround in PDBContext which would look for
such symbols when the user requested linkage names.

However, when given an address that was truly in a private function and
public funciton, we would accidentally find nearby public symbols and
return those function names. The fix is to look for both function
symbols and public symbols and only prefer the public symbol name if the
addresses of the symbols agree.

Fixes PR27492

Reviewers: zturner

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19571

llvm-svn: 267732
2016-04-27 16:10:29 +00:00
Nicolai Haehnle f66bdb5ea8 AMDGPU/SI: Add llvm.amdgcn.s.waitcnt.all intrinsic
Summary:
So it appears that to guarantee some of the ordering requirements of a GLSL
memoryBarrier() executed in the shader, we need to emit an s_waitcnt.

(We can't use an s_barrier, because memoryBarrier() may appear anywhere in
the shader, in particular it may appear in non-uniform control flow.)

Reviewers: arsenm, mareko, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19203

llvm-svn: 267729
2016-04-27 15:46:01 +00:00
Matthew Simpson e5dfb08fcb [TTI] Add hook for vector extract with extension
This change adds a new hook for estimating the cost of vector extracts followed
by zero- and sign-extensions. The motivating example for this change is the
SMOV and UMOV instructions on AArch64. These instructions move data from vector
to general purpose registers while performing the corresponding extension
(sign-extend for SMOV and zero-extend for UMOV) at the same time. For these
operations, TargetTransformInfo can assume the extensions are free and only
report the cost of the vector extract. The SLP vectorizer has been updated to
make use of the new hook.

Differential Revision: http://reviews.llvm.org/D18523

llvm-svn: 267725
2016-04-27 15:20:21 +00:00
Artem Tamazov 5cd55b1784 [AMDGPU][llvm-mc] s_getreg/setreg* - Support symbolic names of hardware registers.
Possibility to specify code of hardware register kept.
Disassemble to symbolic name, if name is known.
Tests updated/added.

Differential Revision: http://reviews.llvm.org/D19335

llvm-svn: 267724
2016-04-27 15:17:03 +00:00
Nico Weber e69b9548b8 Revert r267649, it caused PR27539.
llvm-svn: 267723
2016-04-27 15:16:54 +00:00
Kristof Beyls a14e9a5576 Remove size 1 from check as that isn't part of what the test is meant to be testing.
This test also runs on e.g. ARM-native builds when the X86 backend is also
built.  This test produces code for the default instruction set, even though it
is in a "X86" sub-directory. Given that this test doesn't seem to be testing
anything architecture-specific, it seems it's best to adapt the check to not
check for an architecture-dependent value (the size of the function), rather
than hard-code the test to target x86.

llvm-svn: 267722
2016-04-27 15:03:09 +00:00
Teresa Johnson 02e98331c0 [ThinLTO] Use valueid instead of bitcode offsets in combined index file
Summary:
With the removal of support for lazy parsing of combined index summary
records (e.g. r267344), we no longer need to include the summary record
bitcode offset in the VST entries for definitions. Change the combined
index format to be similar to the per-module index format in using value
ids to cross-reference from the summary record to the VST entry (rather
than the summary record bitcode offset to cross-reference in the other
direction).

The visible changes are:
1) Add the value id to the combined summary records
2) Remove the summary offset from the combined VST records, which has
the following effects:
- No longer need the VST_CODE_COMBINED_GVDEFENTRY record, as all
  combined index VST entries now only contain the value id and
  corresponding GUID.
- No longer have duplicate VST entries in the case where there are
  multiple definitions of a symbol (e.g. weak/linkonce), as they all
  have the same value id and GUID.

An implication of #2 above is that in order to hook up an alias to the
correct aliasee based on the value id of the aliasee recorded in the
combined index alias record, we need to scan the entries in the index
for that GUID to find the one from the same module (i.e. the case where
there are multiple entries for the aliasee). But the reader no longer
has to maintain a special map to hook up the alias/aliasee.

Reviewers: joker.eph

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D19481

llvm-svn: 267712
2016-04-27 13:28:35 +00:00
Simon Pilgrim f23aa2a9c9 [InstCombine][SSE] Regenerated vector shift tests
llvm-svn: 267699
2016-04-27 12:04:44 +00:00
Zlatko Buljan de0bbe6d1c [mips][microMIPS] Add CodeGen support for SUBU16, SUB, SUBU, DSUB and DSUBU instructions
Differential Revision: http://reviews.llvm.org/D16676

llvm-svn: 267694
2016-04-27 11:31:44 +00:00
Zlatko Buljan 29813620bc [mips][microMIPS] Add CodeGen support for SLL16, SRL16, SLL, SLLV, SRA, SRAV, SRL and SRLV instructions
Differential Revision: http://reviews.llvm.org/D17989

llvm-svn: 267693
2016-04-27 11:02:23 +00:00
Artur Pilipenko c97eac6555 Use DL preferred alignment for alloca in Value::getPointerAlignment
Teach Value::getPointerAlignment that allocas with no explicit alignment are aligned to preferred alignment of the allocated type.

Reviewed By: hfinkel

Differential Revision: http://reviews.llvm.org/D17569

llvm-svn: 267689
2016-04-27 10:42:29 +00:00
Simon Pilgrim d2ea708739 [InstCombine][SSE] Added DemandedBits tests for MOVMSK instructions
MOVMSK zeros the upper bits of the gpr - we should be able to use this.

llvm-svn: 267686
2016-04-27 09:53:09 +00:00
Adam Nemet d2fa414718 [LoopDist] Add llvm.loop.distribute.enable loop metadata
Summary:
D19403 adds a new pragma for loop distribution.  This change adds
support for the corresponding metadata that the pragma is translated to
by the FE.

As part of this I had to rethink the flag -enable-loop-distribute.  My
goal was to be backward compatible with the existing behavior:

  A1. pass is off by default from the optimization pipeline
  unless -enable-loop-distribute is specified

  A2. pass is on when invoked directly from opt (e.g. for unit-testing)

The new pragma/metadata overrides these defaults so the new behavior is:

  B1. A1 + enable distribution for individual loop with the pragma/metadata

  B2. A2 + disable distribution for individual loop with the pragma/metadata

The default value whether the pass is on or off comes from the initiator
of the pass.  From the PassManagerBuilder the default is off, from opt
it's on.

I moved -enable-loop-distribute under the pass.  If the flag is
specified it overrides the default from above.

Then the pragma/metadata can further modifies this per loop.

As a side-effect, we can now also use -enable-loop-distribute=0 from opt
to emulate the default from the optimization pipeline.  So to be precise
this is the new behavior:

  C1. pass is off by default from the optimization pipeline
  unless -enable-loop-distribute or the pragma/metadata enables it

  C2. pass is on when invoked directly from opt
  unless -enable-loop-distribute=0 or the pragma/metadata disables it

Reviewers: hfinkel

Subscribers: joker.eph, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D19431

llvm-svn: 267672
2016-04-27 05:28:18 +00:00
Mehdi Amini c7b950171d Revert "Support "preserving" the summary information when using setModule() API in LTOCodeGenerator"
This reverts commit r267665.
ASAN shows that there is a use of undefined value.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267668
2016-04-27 05:11:44 +00:00
Mehdi Amini 360ed847bc Support "preserving" the summary information when using setModule() API in LTOCodeGenerator
Another attempt at r267655...

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267665
2016-04-27 04:24:10 +00:00
Mehdi Amini a1b8b6cd56 Revert "Support "preserving" the summary information when using setModule() API in LTOCodeGenerator"
This reverts commit r267657, r267656, and r267655.
The test does not pass on multiple bots, I'm unsure why yet but let's unbreak them.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267664
2016-04-27 03:34:28 +00:00
Evgeny Stupachenko 23ce61b663 The patch fixes PR27392.
Summary:
 It is incorrect to compare TripCount (which is BECount + 1)
  with extraiters (or Count) to check if we should enter unrolled
  loop or not, because TripCount can potentially overflow
  (when BECount is max unsigned integer).
 While comparing BECount with (Count - 1) is overflow safe and
  therefore correct.

Reviewer: hfinkel

Differential Revision: http://reviews.llvm.org/D19256

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 267662
2016-04-27 03:04:54 +00:00
Chuang-Yu Cheng 8676c3d599 [ppc64] fix bug in prologue that mfocrf's cr operand should be explict state instead of implicit
This fixes PR27414

Reviewers: kbarton mgrang tjablin

http://reviews.llvm.org/D19255

llvm-svn: 267660
2016-04-27 02:59:28 +00:00
Mehdi Amini 4960919723 Fix the test from r267656: Support "preserving" the summary information when using setModule() API in LTOCodeGenerator
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267657
2016-04-27 01:49:11 +00:00
Mehdi Amini 1f65039f06 Add a test for r267655: Support "preserving" the summary information when using setModule() API in LTOCodeGenerator
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267656
2016-04-27 01:47:46 +00:00
Ahmed Bougacha 19a2ee591a [X86] Don't assume that MMX extractelts are from index 0.
It's probably the case for all 3 MMX users out there, but with
hand-crafted IR, you can trigger selection failures. Fix that.

llvm-svn: 267652
2016-04-27 01:35:29 +00:00
Ahmed Bougacha e68363a03c [X86] Re-enable MMX i32 extractelt combine.
This effectively adds back the extractelt combine removed by r262358:
the direct case can still occur (because x86_mmx is special, see
r262446), but it's the indirect case that's now superseded by the
generic combine.

llvm-svn: 267651
2016-04-27 01:35:25 +00:00
Cong Hou 6f879d9eb1 Detects the SAD pattern on X86 so that much better code will be emitted once the pattern is matched.
Differential revision: http://reviews.llvm.org/D14840

llvm-svn: 267649
2016-04-27 01:29:18 +00:00
Mehdi Amini b4e1e8297b ThinLTO: do not promote GlobalVariable that have a specific section.
Differential Revision: http://reviews.llvm.org/D18298

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267646
2016-04-27 00:32:13 +00:00
Mehdi Amini da168fbc2e LTOCodeGenerator: turns linkonce(_odr) into weak_(odr) when present "MustPreserve" set
Summary:
If the linker requested to preserve a linkonce function, we should
honor this even if we drop all uses.

Reviewers: dexonsmith

Subscribers: llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D19527

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267644
2016-04-27 00:32:02 +00:00
Philip Reames 3f83dbeed9 [LVI] Reduce compile time by lazily scanning blocks if needed
When encountering a non-local pointer, LVI would eagerly scan the block for dereferences of the given object to prove the pointer to be non null.  That's all well and good, but *then* we'd go recurse through our input blocks.  As a result, we could end up scanning each and every block we traverse, even if the final definition was obviously non null or we found a constant value somewhere up the chain.  The previous code papered over this by using the isKnownNonNull routine from value tracking.  This made the duplication less painful in the common case.

Instead, we know do the block scan only *after* we've gotten the recursive results back.  This lets us stop scanning individual blocks as soon as we've determined it to be non-null in any predecessor block and use our usual merge rules to propagate that information cheaply through successor blocks.  For a pointer which can be found non-null, this does strictly less work and sometimes substaintially so.

Note that the case where we *can't* prove something non-null is still the really expensive case.  We end up scanning each and every block looking for a dereference and never end up finding one.

llvm-svn: 267642
2016-04-27 00:30:55 +00:00
Quentin Colombet 4ff3cfb673 [X86] Make sure it is safe to clobber EFLAGS, if need be, when choosing
the prologue.

Do not use basic blocks that have EFLAGS live-in as prologue if we need
to realign the stack. Realigning the stack uses AND instruction and this
clobbers EFLAGS.

An other alternative would have been to save and restore EFLAGS around
the stack realignment code, but this is likely inefficient.

Fixes PR27531.

llvm-svn: 267634
2016-04-26 23:44:14 +00:00
Justin Bogner c2bf63d29d PM: Port Reassociate to the new pass manager
llvm-svn: 267631
2016-04-26 23:39:29 +00:00
Mitch Bodart 807e13379b [X86] Replace -mcpu with -mattr in several tests
Differential Revision: http://reviews.llvm.org/D19568

llvm-svn: 267629
2016-04-26 23:36:38 +00:00
Sanjay Patel 29dea0d230 [SimplifyCFG] propagate branch metadata when creating select
llvm-svn: 267624
2016-04-26 23:15:48 +00:00
Quentin Colombet 08e79990a0 [MachineBasicBlock] Take advantage of the partially dead information.
Thanks to that information we wouldn't lie on a register being live whereas it
is not.

llvm-svn: 267622
2016-04-26 23:14:29 +00:00
Quentin Colombet 3f19245015 [MachineInstrBundle] Improvement the recognition of dead definitions.
Now, it is possible to know that partial definitions are dead definitions and
recognize that clobbered registers are also dead.

llvm-svn: 267621
2016-04-26 23:14:24 +00:00
Philip Reames 053c2a6f25 [LVI] Apply transfer rule for overdefine inputs for binary operators
As pointed out by John Regehr over in http://reviews.llvm.org/D19485, LVI was being incredibly stupid about applying its transfer rules.  Rather than gathering local facts from the expression itself, it was simply giving up entirely if one of the inputs was overdefined.  This greatly impacts the precision of the overall analysis and makes it far more fragile as well.

This patch builds on 267609 which did the same thing for unary casts.

llvm-svn: 267620
2016-04-26 23:10:35 +00:00
Philip Reames e5030e85ea [LVI] A better fix for the assertion error introduced by 267609
Essentially, I was using the wrong size function.  For types which were sized, but not primitive, I wasn't getting a useful size for the operand and failed an assert.  I fixed this, and also added a guard that the input is a sized type.  Test case is for the original mistake.  I'm not sure how to actually exercise the sized type check.

llvm-svn: 267618
2016-04-26 22:52:30 +00:00
Sanjay Patel d2d2aa52cd [LowerExpectIntrinsic] make default likely/unlikely ratio bigger
We need the default ratio to be sufficiently large that it triggers transforms 
based on block frequency info (BFI) and plays well with the recently introduced
BranchProbability used by CGP.

Differential Revision: http://reviews.llvm.org/D19435

llvm-svn: 267615
2016-04-26 22:23:38 +00:00
Philip Reames 38c87c2e50 [LVI] Infer local facts from unary expressions
As pointed out by John Regehr over in http://reviews.llvm.org/D19485, LVI was being incredibly stupid about applying its transfer rules. Rather than gathering local facts from the expression itself, it was simply giving up entirely if one of the inputs was overdefined. This greatly impacts the precision of the overall analysis and makes it far more fragile as well.

This patch implements only the unary operation case. Once this is in, I'll implement the same for the binary operations.

Differential Revision: http://reviews.llvm.org/D19492

llvm-svn: 267609
2016-04-26 21:48:16 +00:00
David Majnemer abb9f55c80 Revert "[SimplifyLibCalls] sprintf doesn't copy null bytes"
The destination buffer that sprintf uses is restrict qualified, we do
not need to worry about derived pointers referenced via format
specifiers.

This reverts commit r267580.

llvm-svn: 267605
2016-04-26 21:04:47 +00:00
Nico Weber 2f1459cbb7 Try to get ResponseFile.ll passing on Windows after r267556.
llvm-svn: 267599
2016-04-26 20:32:51 +00:00
Elena Demikhovsky 308a7eb0d2 Masked Store in Loop Vectorizer - bugfix
Fixed a bug in loop vectorization with conditional store.

Differential Revision: http://reviews.llvm.org/D19532

llvm-svn: 267597
2016-04-26 20:18:04 +00:00
Justin Bogner 4563a06cee PM: Port Internalize to the new pass manager
llvm-svn: 267596
2016-04-26 20:15:52 +00:00
Zachary Turner 53a65ba5c9 Parse and dump PDB DBI Stream Header Information
The DBI stream contains a lot of bookkeeping information for other
streams. In particular it contains information about section contributions
and linked modules. This patch is a first attempt at parsing some of the
information out of the DBI stream. It currently only parses and dumps the
headers of the DBI stream, so none of the module data or section
contribution data is pulled out.

This is just a proof of concept that we understand the basic properties of
the DBI stream's metadata, and followup patches will try to extract more
detailed information out.

Differential Revision: http://reviews.llvm.org/D19500
Reviewed By: majnemer, ruiu

llvm-svn: 267585
2016-04-26 18:42:34 +00:00
Krzysztof Parzyszek 4773f647bd [Tail duplication] Handle source registers with subregisters
When a block is tail-duplicated, the PHI nodes from that block are
replaced with appropriate COPY instructions. When those PHI nodes
contained use operands with subregisters, the subregisters were
dropped from the COPY instructions, resulting in incorrect code.

Keep track of the subregister information and use this information
when remapping instructions from the duplicated block.

Differential Revision: http://reviews.llvm.org/D19337

llvm-svn: 267583
2016-04-26 18:36:34 +00:00
Tim Northover 4397837be2 Reapply: "ARM: put correct symbol index on indirect pointers in __thread_ptr.""
A latent bug in llvm-objdump used the wrong format specifier on 32-bit
targets, causing the test to fail. This fixes the issue.

llvm-svn: 267582
2016-04-26 18:29:16 +00:00
David Majnemer 8cd77baebc [SimplifyLibCalls] sprintf doesn't copy null bytes
sprintf doesn't read or copy the terminating null byte from it's string
operands.  sprintf will append it's own after processing all of the
format specifiers.

This fixes PR27526.

llvm-svn: 267580
2016-04-26 18:16:49 +00:00
Manman Ren 1c3f65a18c Swift Calling Convention: use %RAX for sret.
We don't need to copy the sret argument into %rax upon return.
rdar://25671494

llvm-svn: 267579
2016-04-26 18:08:06 +00:00
Saleem Abdulrasool 4c6c4e2bbb tests: tweak MIR for ARM tests to correct MI issues
The Machine Instruction Verifier flagged some issues in the serialized MIR.
Adjust the input to correct them.

Fixes the remaining portion of PR27480.

llvm-svn: 267578
2016-04-26 17:54:21 +00:00
Saleem Abdulrasool 601e029ba3 test: remove some bleeding whitespace
Kill bleeding whitespace.  NFC

llvm-svn: 267577
2016-04-26 17:54:16 +00:00
Sanjay Patel d66607bd8c [CodeGenPrepare] use branch weight metadata to decide if a select should be turned into a branch
This is part of solving PR27344:
https://llvm.org/bugs/show_bug.cgi?id=27344

CGP should undo the SimplifyCFG transform for the same reason that earlier patches have used this
same mechanism: it's possible that passes between SimplifyCFG and CGP may be able to optimize the
IR further with a select in place.

For the TLI hook default, >99% taken or not taken is chosen as the default threshold for a highly
predictable branch. Even the most limited HW branch predictors will be correct on this branch almost
all the time, so even a massive mispredict penalty perf loss would be overcome by the win from all
the times the branch was predicted correctly.

As a follow-up, we could make the default target hook less conservative by using the SchedMachineModel's
MispredictPenalty. Or we could just let targets override the default by implementing the hook with that
and other target-specific options. Note that trying to statically determine mispredict rates for 
close-to-balanced profile weight data is generally impossible if the HW is sufficiently advanced. Ie, 
50/50 taken/not-taken might still be 100% predictable.

Finally, note that this patch as-is will not solve PR27344 because the current __builtin_unpredictable()
branch weight default values are 4 and 64. A proposal to change that is in D19435.

Differential Revision: http://reviews.llvm.org/D19488

llvm-svn: 267572
2016-04-26 17:11:17 +00:00
Zachary Turner f34e01624a Refactor some more PDB reading code into DebugInfoPDB.
Differential Revision: http://reviews.llvm.org/D19445
Reviewed By: David Majnemer

llvm-svn: 267564
2016-04-26 16:20:00 +00:00
Konstantin Zhuravlyov 1d99c4d03c [AMDGPU] Reserve VGPRs for trap handler usage if instructed
Differential Revision: http://reviews.llvm.org/D19235

llvm-svn: 267563
2016-04-26 15:43:14 +00:00
Sam Kolton 3025e7f25f [AMDGPU] Assembler: basic support for SDWA instructions
Support for SDWA instructions for VOP1 and VOP2 encoding.
Not done yet:
  - converters for support optional operands and modifiers
  - VOPC
  - sext() modifier
  - intrinsics
  - VOP2b (see vop_dpp.s)
  - V_MAC_F32 (see vop_dpp.s)

Differential Revision: http://reviews.llvm.org/D19360

llvm-svn: 267553
2016-04-26 13:33:56 +00:00
Andrey Turetskiy b405606432 [X86] PR27502: Fix the LEA optimization pass.
Handle MachineBasicBlock as a memory displacement operand in the LEA optimization pass.

Differential Revision: http://reviews.llvm.org/D19409

llvm-svn: 267551
2016-04-26 12:18:12 +00:00
Marcin Koscielnicki 0cfb612413 [PowerPC] Add support for llvm.thread.pointer
Differential Revision: http://reviews.llvm.org/D19304

llvm-svn: 267546
2016-04-26 10:37:22 +00:00
Marcin Koscielnicki 33571e2c41 [SPARC] [SSP] Add support for LOAD_STACK_GUARD.
This fixes PR22248 on sparc.

Differential Revision: http://reviews.llvm.org/D19386

llvm-svn: 267545
2016-04-26 10:37:14 +00:00
Marcin Koscielnicki fafb44951a [SPARC] Add support for llvm.thread.pointer.
Differential Revision: http://reviews.llvm.org/D19387

llvm-svn: 267544
2016-04-26 10:37:01 +00:00
Mehdi Amini aa309b1a81 ThinLTOCodeGenerator: preserve linkonce when in "MustPreserved" set
If the linker specifically requested for a linkonce to be preserved,
we need to make sure we won't drop it even if all the uses in the
current module disappear.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267543
2016-04-26 10:35:01 +00:00
Renato Golin 5a55a029c0 Revert "ARM: put correct symbol index on indirect pointers in __thread_ptr."
This reverts commit r267488, as it broke some ARM buildbots.

llvm-svn: 267541
2016-04-26 10:02:02 +00:00
Sanjoy Das 51df5fae4a Symbolize operand bundle blocks for bcanalyzer
Reviewers: joker.eph

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19523

llvm-svn: 267524
2016-04-26 05:59:08 +00:00
Craig Topper c5551bfc26 [AArch64] Expand v1i64 and v2i64 ctlz.
The default is legal, which results in 'Cannot select' errors.

llvm-svn: 267522
2016-04-26 05:26:51 +00:00
Craig Topper d8d6be4f99 [ARM] Expand vector ctlz_zero_undef so it becomes ctlz.
The default is Legal, which results in 'Cannot select' errors.

llvm-svn: 267521
2016-04-26 05:04:37 +00:00
Craig Topper edb4a6ba98 [ARM] Expand v1i64 and v2i64 ctlz.
The default is legal, which results in 'Cannot select' errors.

llvm-svn: 267520
2016-04-26 05:04:33 +00:00
Dehao Chen 5d6d4841ed Tune basic block annotation algorithm.
Summary:
Instead of using maximum IR weight as the basic block weight, this patch uses the voting algorithm to find the most likely weight for the basic block. This can effectively avoid the cases when some IRs are annotated incorrectly due to code motion of the profiled binary.

This patch also updates propagate.ll unittest to include discriminator in the input file so that it is testing something meaningful.

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19301

llvm-svn: 267519
2016-04-26 04:59:11 +00:00
Bill Seurer d6e92135bd [powerpc] mark JIT tests as UNSUPPORTED on powerpc64 big endian
Some of the JIT tests began failing with "[llvm] r266663 - [Orc] Re-commit 
r266581 with fixes for MSVC, and format cleanups." on powerpc64 big endian.  
To get the buildbots running I am marking these as UNSUPPORTED for now.

If this is fixed remove the UNSUPPORTED flag "powerpc64-unknown-linux-gnu".

In r267516 I marked these as XFAIL but they succeed on some of the bots
on stage1.

llvm-svn: 267518
2016-04-26 03:59:19 +00:00
Richard Trieu d7f31a31d1 Pass the test file in through stdin instead of by filename.
When passed in via filename, this test will fail if the path to the test
has the strings "f1" and "f2" in somewhere.  Pass the file through stdin
to prevent test failures due to coincidences in path names.

llvm-svn: 267517
2016-04-26 03:43:49 +00:00
Bill Seurer ab5171f988 [powerpc] mark JIT tests as XFAIL on powerpc64 big endian
Some of the JIT tests began failing with "[llvm] r266663 - [Orc] Re-commit 
r266581 with fixes for MSVC, and format cleanups." on powerpc64 big endian.  
To get the buildbots running I am marking these as XFAIL for now.

If this is fixed remove the XFAIL flag "powerpc64-unknown-linux-gnu".

llvm-svn: 267516
2016-04-26 02:33:22 +00:00
Hal Finkel e4c0c1679b [SimplifyCFG] Preserve !llvm.mem.parallel_loop_access when merging
When SimplifyCFG merges identical instructions from both sides of a diamond, it
can preserve !llvm.mem.parallel_loop_access (as it does with most of the other
metadata). There's no real data or control dependency change in this case.

llvm-svn: 267515
2016-04-26 02:06:06 +00:00
Hal Finkel 411d31ad72 [LoopVectorize] Don't consider conditional-load dereferenceability for marked parallel loops
I really thought we were doing this already, but we were not. Given this input:

void Test(int *res, int *c, int *d, int *p) {
  for (int i = 0; i < 16; i++)
    res[i] = (p[i] == 0) ? res[i] : res[i] + d[i];
}

we did not vectorize the loop. Even with "assume_safety" the check that we
don't if-convert conditionally-executed loads (to protect against
data-dependent deferenceability) was not elided.

One subtlety: As implemented, it will still prefer to use a masked-load
instrinsic (given target support) over the speculated load. The choice here
seems architecture specific; the best option depends on how expensive the
masked load is compared to a regular load. Ideally, using the masked load still
reduces unnecessary memory traffic, and so should be preferred. If we'd rather
do it the other way, flipping the order of the checks is easy.

The LangRef is updated to make explicit that llvm.mem.parallel_loop_access also
implies that if conversion is okay.

Differential Revision: http://reviews.llvm.org/D19512

llvm-svn: 267514
2016-04-26 02:00:36 +00:00
Dan Gohman f456290fca [WebAssembly] Account for implicit operands when computing operand indices.
llvm-svn: 267511
2016-04-26 01:40:56 +00:00
Sanjay Patel a31b0c0ece [CodeGenPrepare] don't convert an unpredictable select into control flow
Suggested in the review of D19488:
http://reviews.llvm.org/D19488

llvm-svn: 267504
2016-04-26 00:47:39 +00:00
Justin Bogner 1a07501379 PM: Port GlobalOpt to the new pass manager
llvm-svn: 267499
2016-04-26 00:28:01 +00:00
Ahmed Bougacha 5cf735a5b1 [X86] Use LivePhysRegs in X86FixupBWInsts.
Kill-flags, which computeRegisterLiveness uses, are not reliable.
LivePhysRegs is.

Differential Revision: http://reviews.llvm.org/D19472

llvm-svn: 267495
2016-04-26 00:00:48 +00:00
Justin Bogner 6f6c5f2a02 GlobalOpt: Convert a bunch of tests from grep to FileCheck
llvm-svn: 267493
2016-04-25 23:36:50 +00:00
Sanjay Patel 82059090d3 Add check for "branch_weights" with prof metadata
While we're here, fix the comment and variable names to make it
clear that these are raw weights, not percentages.

llvm-svn: 267491
2016-04-25 23:15:16 +00:00
James Y Knight 51208eaccc [Sparc] Fix double-float fabs and fneg on little endian CPUs.
The SparcV8 fneg and fabs instructions interestingly come only in a
single-float variant. Since the sign bit is always the topmost bit no
matter what size float it is, you simply operate on the high
subregister, as if it were a single float.

However, the layout of double-floats in the float registers is reversed
on little-endian CPUs, so that the high bits are in the second
subregister, rather than the first.

Thus, this expansion must check the endianness to use the correct
subregister.

llvm-svn: 267489
2016-04-25 22:54:09 +00:00
Tim Northover cbba0aba16 ARM: put correct symbol index on indirect pointers in __thread_ptr.
Otherwise the linker has no idea what should be resolved.

llvm-svn: 267488
2016-04-25 22:36:07 +00:00
Arch D. Robison be0490a6e8 Optimize store of "bitcast" from vector to aggregate.
This patch is what was the "instcombine" portion of D14185, with an additional 
test added (see julia_pseudovec in test/Transforms/InstCombine/insert-val-extract-elem.ll). 
The patch causes instcombine to replace sequences of extractelement-insertvalue-store 
that act essentially like a bitcast followed by a store.

Differential review: http://reviews.llvm.org/D14260

llvm-svn: 267482
2016-04-25 22:22:39 +00:00
Tim Northover 5c3140f745 ARM: put extern __thread stubs in a special section.
The linker needs to know that the symbols are thread-local to do its job
properly.

llvm-svn: 267473
2016-04-25 21:12:04 +00:00