Commit Graph

153074 Commits

Author SHA1 Message Date
Florian Hahn a5ba4ee8bc [Triple] Add isThumb and isARM functions.
Summary:
isThumb returns true for Thumb triples (little and big endian), isARM
returns true for ARM triples (little and big endian).
There are a few more checks using arm/thumb that are not covered by
those functions, e.g. that the architecture is either ARM or Thumb
(little endian) or ARM/Thumb little endian only.

Reviewers: javed.absar, rengolin, kristof.beyls, t.p.northover

Reviewed By: rengolin

Subscribers: llvm-commits, aemerson

Differential Revision: https://reviews.llvm.org/D34682

llvm-svn: 310781
2017-08-12 17:40:18 +00:00
Simon Pilgrim 32546d1434 [X86] Regenerate merge store tests. NFCI.
Gives us a much better idea of what is going on than just relying on a few checks.

llvm-svn: 310780
2017-08-12 17:27:35 +00:00
Sanjay Patel fe346f9f5b [BDCE] clear poison generators after turning a value into zero (PR33695, PR34037)
nsw, nuw, and exact carry implicit assumptions about their operands, so we need
to clear those after trivializing a value. We decided there was no danger for
llvm.assume or metadata, so there's just a comment about that.

This fixes miscompiles as shown in:
https://bugs.llvm.org/show_bug.cgi?id=33695
https://bugs.llvm.org/show_bug.cgi?id=34037

Differential Revision: https://reviews.llvm.org/D36592

llvm-svn: 310779
2017-08-12 16:41:08 +00:00
Sylvestre Ledru 3655495b49 Fix some minor typos in the llvm XRay exemple
llvm-svn: 310777
2017-08-12 15:08:11 +00:00
Richard Smith 3704eba1d1 D36604: PR34148: Do not assume we can use a copy relocation for an `external_weak` global
An `external_weak` global may be intended to resolve as a null pointer if it's
not defined, so it doesn't make sense to use a copy relocation for it.

Differential Revision: https://reviews.llvm.org/D36604

llvm-svn: 310773
2017-08-11 23:52:28 +00:00
Kostya Serebryany 0873be2ad0 [libFuzzer] experimental support for Clang's coverage (fprofile-instr-generate), Linux-only
llvm-svn: 310771
2017-08-11 23:03:22 +00:00
Sanjay Patel 2b452c7192 [x86] add tests for rotate left/right with masked shifter; NFC
As noted in the test comment, instcombine now produces the masked
shift value even when it's not included in the source, so we should
handle this.

Although the AMD/Intel docs don't say it explicitly, over-rotating
the narrow ops produces the same results. An existence proof that
this works as expected on all x86 comes from gcc 4.9 or later:
https://godbolt.org/g/K6rc1A

llvm-svn: 310770
2017-08-11 22:38:40 +00:00
John Baldwin eebcc47500 [MIPS] Use ABI to determine stack alignment.
Summary:
The stack alignment depends on the ABI (16 bytes for N32 and N64 and 8
bytes for O32), not the CPU type.

Reviewers: sdardis

Reviewed By: sdardis

Subscribers: atanasyan, arichardson, llvm-commits

Differential Revision: https://reviews.llvm.org/D36326

llvm-svn: 310768
2017-08-11 22:07:56 +00:00
Sanjay Patel 5d6df36fde [x86] regenerate test checks, add 64-bit run; NFC
llvm-svn: 310767
2017-08-11 22:05:33 +00:00
Eugene Zelenko 530851c2bc [Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 310766
2017-08-11 21:30:02 +00:00
Zachary Turner b57884e818 Fix some broken tests.
These were pending in a separate patch but I forgot to squash them
before comitting, and this one didn't go through.

llvm-svn: 310764
2017-08-11 21:14:01 +00:00
Eli Friedman 51cf2604b6 [OptDiag] Updating Remarks in SampleProfile
Updating remark API to newer OptimizationDiagnosticInfo API. This
allows remarks to show up in diagnostic yaml file, and enables use
of opt-viewer tool.

Hotness information for remarks (L505 and L751) do not display hotness
information, most likely due to profile information not being
propagated yet. Unsure if this is the desired outcome.

Patch by Tarun Rajendran.

Differential Revision: https://reviews.llvm.org/D36127

llvm-svn: 310763
2017-08-11 21:12:04 +00:00
Craig Topper ac217b7aa3 [X86] Don't use fsin/fcos/fsincos instructions ever
Summary:
Previously we would use these instructions if sse was disabled and fastmath was enabled.

As mentioned in D28335, this is a bad idea.

Reviewers: efriedma, scanon, DavidKreitzer

Reviewed By: DavidKreitzer

Subscribers: zvi, llvm-commits

Differential Revision: https://reviews.llvm.org/D36344

llvm-svn: 310762
2017-08-11 20:55:29 +00:00
Rafael Espindola b8956a70d3 Fix access to undefined weak symbols in pic code
When the access to a weak symbol is not a call, the access has to be
able to produce the value 0 at runtime.

We were sometimes producing code sequences where that was not possible
if the code was leaded more than 4g away from 0.

llvm-svn: 310756
2017-08-11 20:49:27 +00:00
Zachary Turner 28e31ee45e Output S_SECTION symbols to the Linker module.
PDBs need to contain 1 module for each object file/compiland,
and a special one synthesized by the linker.  This one contains
a symbol record for each output section in the executable with
its address information.  This patch adds such symbols to the
linker module.  Note that we also are supposed to add an
S_COFFGROUP symbol for what appears to be each input section that
contributes to each output section, but it's not entirely clear
how to generate these yet, so I'm leaving that for a separate
patch.

llvm-svn: 310754
2017-08-11 20:46:28 +00:00
Matt Arsenault 71bcbd451f AMDGPU: Start adding tail call support
Handle the sibling call cases.

llvm-svn: 310753
2017-08-11 20:42:08 +00:00
Kostya Serebryany a85ab2e5a1 [libFuzzer] recommend Clang Coverage for coverage visualization
llvm-svn: 310751
2017-08-11 20:32:47 +00:00
George Karpenkov d20e8b4edb [libFuzzer] Re-enable coverage.test on Darwin.
llvm-svn: 310750
2017-08-11 20:30:52 +00:00
Daniel Sanders e6c216ed5b Revert r310716 (and r310735): [globalisel][tablegen] Support zero-instruction emission.
Two of the Windows bots are failing test\CodeGen\X86\GlobalISel\select-inc.mir
which should not have been affected by the change. Reverting while I investigate.

Also reverted r310735 because it builds on r310716.

llvm-svn: 310745
2017-08-11 19:19:21 +00:00
Zachary Turner 460ed0a0c5 Add documentation for llvm-pdbutil.
llvm-svn: 310744
2017-08-11 19:00:22 +00:00
Zachary Turner ee9906d884 [LLD/PDB] Write actual records to the globals stream.
Previously we were writing an empty globals stream.  Windows
tools interpret this as "private symbols are not present in
this PDB", even when they are, so we need to fix this.  Regardless,
without it we don't have information about global variables, so
we need to fix it anyway.  This patch does that.

With this patch, the "lm" command in WinDbg correctly reports
that we have private symbols available, but the "dv" command
still refuses to display local variables.

Differential Revision: https://reviews.llvm.org/D36535

llvm-svn: 310743
2017-08-11 19:00:03 +00:00
John Baldwin 3a1a951800 [mips] clang-format MipsSubtarget.cpp.
This only fixes a few things and serves as my initial test commit.

llvm-svn: 310742
2017-08-11 18:35:19 +00:00
Brian Gesiak fd6c89dc36 [opt-viewer] Decode HTML bytes for Python 3
Summary:
When using Python 3, `pygments.highlight()` returns a `bytes` object, not
a `str`, causing the call to `str.replace` on the following line to fail
with a runtime exception:
`TypeError: 'str' does not support the buffer interface`. Decode the
bytes into a string in order to fix the exception.

Test Plan:
Run `opt-viewer.py` with Python 3.4, and confirm no runtime error occurs
when calling `str.replace`.

Reviewers: anemet

Reviewed By: anemet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36624

llvm-svn: 310741
2017-08-11 18:05:26 +00:00
Brian Gesiak 34f07f9e0e [opt-viewer] Use Python 3-compatible iteritems
Summary:
Replace a usage of a Python 2-specific `dict.iteritems()` with the
Python 3-compatible definition provided at the top of the same file.

Test Plan:
Run `opt-viewer.py` using Python 3 and confirm it no longer encounters a
runtime error when calling `dict.iteritems()`.

Reviewers: anemet

Reviewed By: anemet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36623

llvm-svn: 310740
2017-08-11 18:02:07 +00:00
Brian Gesiak efd227f3c7 [opt-viewer] Use Python 3-compatible `intern()`
Summary:
In Python 2, `intern()` is a builtin function available to all programs.
In Python 3, it was moved into the `sys` module, available as
`sys.intern`. Import it such that, within `optrecord.py`, `intern()` is
available whether run using Python 2 or 3.

Test Plan:
Run `opt-viewer.py` using Python 3, confirm it no longer
encounters a runtime error when `intern()` is called.

Reviewers: anemet

Reviewed By: anemet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36622

llvm-svn: 310739
2017-08-11 17:56:57 +00:00
Stanislav Mekhanoshin 976cedda26 [AMDGPU] Fix santizer error after last commit
Removed useless assert.

llvm-svn: 310738
2017-08-11 17:54:43 +00:00
Xinliang David Li 24524f314c Fix typo /NFC
llvm-svn: 310737
2017-08-11 17:49:20 +00:00
Daniel Sanders 6ac981151e [globalisel][tablegen] Generate TypeObject table. NFC
Summary:
Generate the type table from the types used by a target rather than hard-coding
the union of types used by all targets.

Depends on D36084

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Reviewed By: rovka

Subscribers: kristof.beyls, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D36085

llvm-svn: 310735
2017-08-11 17:30:37 +00:00
George Karpenkov 73b7e78350 Update libFuzzer documentation for -fsanitize=fuzzer-no-link flag
Differential Revision: https://reviews.llvm.org/D36602

llvm-svn: 310734
2017-08-11 17:23:45 +00:00
Stanislav Mekhanoshin 7f37794ebd [AMDGPU] Ported and adopted AMDLibCalls pass
The pass does simplifications of well known AMD library calls.
If given -amdgpu-prelink option it works in a pre-link mode which
allows to reference new library functions which will be linked in
later.

In addition it also used to process traditional AMD option
-fuse-native which allows to replace some of the functions with
their fast native implementations from the library.

The necessary glue to pass the prelink option and translate
-fuse-native is to be added to the driver.

Differential Revision: https://reviews.llvm.org/D36436

llvm-svn: 310731
2017-08-11 16:42:09 +00:00
David Blaikie 32512e161f Orc: PR33769: Don't rely on comparisons with default constructed iterators
llvm-svn: 310729
2017-08-11 16:38:28 +00:00
Craig Topper 561092f233 [AVX512] Remove and autoupgrade many of the broadcast intrinsics
Summary:
This autoupgrades most of the broadcast intrinsics. They've been unused in clang for some time.

This leaves the 32x2 intrinsics because they are still used in clang.

Reviewers: RKSimon, zvi, igorb

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36606

llvm-svn: 310725
2017-08-11 16:22:45 +00:00
Craig Topper 0f30fe9634 [x86] Enable some support for lowerVectorShuffleWithUndefHalf with AVX-512
Summary:
This teaches 512-bit shuffles to detect unused halfs in order to reduce shuffle size.

We may need to refine the 512-bit exit point. I couldn't remember if we had good cross lane shuffles for 8/16 bit with AVX-512 or not.

I believe this is step towards being able to handle D36454 without a special case.

From here we need to improve our ability to combine extract_subvector with insert_subvector and other extract_subvectors. And we need to support narrowing binary operations where we don't demand all elements. This may be improvements to DAGCombiner::narrowExtractedVectorBinOp(by recognizing an insert_subvector in addition to concat) or we may need a target specific combiner.

Reviewers: RKSimon, zvi, delena, jbhateja

Reviewed By: RKSimon, jbhateja

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36601

llvm-svn: 310724
2017-08-11 16:20:05 +00:00
Sanjay Patel 169dae70a6 [x86] use more shift or LEA for select-of-constants (2nd try)
The previous rev (r310208) failed to account for overflow when subtracting the
constants to see if they're suitable for shift/lea. This version add a check
for that and more test were added in r310490.

We can convert any select-of-constants to math ops:
http://rise4fun.com/Alive/d7d

For this patch, I'm enhancing an existing x86 transform that uses fake multiplies
(they always become shl/lea) to avoid cmov or branching. The current code misses
cases where we have a negative constant and a positive constant, so this is just
trying to plug that hole.

The DAGCombiner diff prevents us from hitting a terrible inefficiency: we can start
with a select in IR, create a select DAG node, convert it into a sext, convert it
back into a select, and then lower it to sext machine code.

Some notes about the test diffs:

1. 2010-08-04-MaskedSignedCompare.ll - We were creating control flow that didn't exist in the IR.
2. memcmp.ll - Choose -1 or 1 is the case that got me looking at this again. We could avoid the 
   push/pop in some cases if we used 'movzbl %al' instead of an xor on a different reg? That's a 
   post-DAG problem though.
3. mul-constant-result.ll - The trade-off between sbb+not vs. setne+neg could be addressed if
   that's a regression, but those would always be nearly equivalent.
4. pr22338.ll and sext-i1.ll - These tests have undef operands, so we don't actually care about these diffs.
5. sbb.ll - This shows a win for what is likely a common case: choose -1 or 0.
6. select.ll - There's another borderline case here: cmp+sbb+or vs. test+set+lea? Also, sbb+not vs. setae+neg shows up again.
7. select_const.ll - These are motivating cases for the enhancement; replace cmov with cheaper ops.

Assembly differences between movzbl and xor to avoid a partial reg stall are caused later by the X86 Fixup SetCC pass.

Differential Revision: https://reviews.llvm.org/D35340

llvm-svn: 310717
2017-08-11 15:44:14 +00:00
Daniel Sanders 1fb1ce0c87 [globalisel][tablegen] Support zero-instruction emission.
Summary:
Support the case where an operand of a pattern is also the whole of the
result pattern. In this case the original result and all its uses must be
replaced by the operand. However, register class restrictions can require
a COPY. This patch handles both cases by always emitting the copy and
leaving it for the register allocator to optimize.

Depends on D35833

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Subscribers: javed.absar, kristof.beyls, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D36084

llvm-svn: 310716
2017-08-11 15:40:32 +00:00
Simon Dardis ae5b53e7cd [mips] Lift the assertion on the types that can be used with MipsGPRel
Post commit review of rL308619 highlighted the need for handling N64
with -fno-pic. Testing reveale a stale assert when generating a GP
relative addressing mode.

This patch removes that assert and adds the necessary patterns for
MIPS64 to perform gp relative addressing with -fno-pic
(and the implicit -mno-abicalls + -mgpopt).

Reviewers: atanasyan, nitesh.jain

Differential Revision: https://reviews.llvm.org/D36472

llvm-svn: 310713
2017-08-11 14:36:05 +00:00
Michal Gorny 348240536f [cmake] Expose the dependencies of ExecutionEngine as PUBLIC
Expose the dependencies of LLVMExecutionEngine library as PUBLIC rather
than PRIVATE when building a shared library. This is necessary because
the library is not contained but exposes API of other LLVM libraries via
its headers.

This causes other libraries to fail to link if the linker verifies for
correctness of -l flags (i.e. fails on indirect dependencies). This e.g.
happens when building LLDB against shared LLVM:

  lib64/liblldbExpression.a(IRExecutionUnit.cpp.o):(.data.rel.ro._ZTIN4llvm18MCJITMemoryManagerE[_ZTIN4llvm18MCJITMemoryManagerE]+0x10): undefined reference to `typeinfo for llvm::RuntimeDyld::MemoryManager'
  lib64/liblldbExpression.a(IRExecutionUnit.cpp.o):(.data.rel.ro._ZTVN4llvm18MCJITMemoryManagerE[_ZTVN4llvm18MCJITMemoryManagerE]+0x60): undefined reference to `llvm::RuntimeDyld::MemoryManager::anchor()'
  lib64/liblldbExpression.a(IRExecutionUnit.cpp.o):(.data.rel.ro._ZTVN12lldb_private15IRExecutionUnit13MemoryManagerE[_ZTVN12lldb_private15IRExecutionUnit13MemoryManagerE]+0x48): undefined reference to `llvm::RTDyldMemoryManager::deregisterEHFrames()'
  lib64/liblldbExpression.a(IRExecutionUnit.cpp.o):(.data.rel.ro._ZTVN12lldb_private15IRExecutionUnit13MemoryManagerE[_ZTVN12lldb_private15IRExecutionUnit13MemoryManagerE]+0x60): undefined reference to `llvm::RuntimeDyld::MemoryManager::anchor()'
  lib64/liblldbExpression.a(IRExecutionUnit.cpp.o):(.data.rel.ro._ZTVN12lldb_private15IRExecutionUnit13MemoryManagerE[_ZTVN12lldb_private15IRExecutionUnit13MemoryManagerE]+0xd0): undefined reference to `llvm::JITSymbolResolver::anchor()'
  collect2: error: ld returned 1 exit status

Declaring the dependencies as PUBLIC guarantees that any package using
the ExecutionEngine library will also get explicit -l flags for
the dependent libraries guaranteeing that the symbols exposed in headers
could be resolved.

Patch originally written by NAKAMURA Takumi.

Differential Revision: https://reviews.llvm.org/D36211

llvm-svn: 310712
2017-08-11 13:25:20 +00:00
Nirav Dave 0a48e5d506 Improve handling of insert_subvector of bitcast values
Fix insert_subvector / extract_subvector merges of bitcast values.

Reviewers: efriedma, craig.topper, RKSimon

Subscribers: RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D34571

llvm-svn: 310711
2017-08-11 13:21:41 +00:00
Nirav Dave d1b3f09faa [X86][DAG] Switch X86 Target to post-legalized store merge
Move store merge to happen after intrinsic lowering to allow lowered
stores to be merged.

Some regressions due in MergeConsecutiveStores to missing
insert_subvector that are addressed in follow up patch.

Reviewers: craig.topper, efriedma, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34559

llvm-svn: 310710
2017-08-11 13:21:35 +00:00
Sam Parker 6d42de7847 [AArch64] Enable ARMv8.3-A pointer authentication
Add assembler and disassembler support for the ARMv8.3-A pointer
authentication instructions.

Differential Revision: https://reviews.llvm.org/D36517

llvm-svn: 310709
2017-08-11 13:14:00 +00:00
Sjoerd Meijer d7129f9a4c [AArch64] Remove dotprod from base extension list
Dot product is an optional ARMv8.2a extension; remove it from the ARMv8.2a base
extension list. This was introduced in commit r310480.

Differential Revision: https://reviews.llvm.org/D36609

llvm-svn: 310708
2017-08-11 13:12:49 +00:00
Sjoerd Meijer 7426c97bc6 [ARM] Assembler support for the ARMv8.2a dot product instructions
Commit r310480 added the AArch64 ARMv8.2a dot product instructions;
this adds the AArch32 instructions.

Differential Revision: https://reviews.llvm.org/D36575

llvm-svn: 310701
2017-08-11 09:52:30 +00:00
Simon Pilgrim 83cf3a29b5 [DAGCombiner] Remove shuffle support from simplifyShuffleMask
rL310372 enabled simplifyShuffleMask to support undef shuffle mask inputs, but its causing hangs.

Removing support until I can triage the problem

llvm-svn: 310699
2017-08-11 08:37:00 +00:00
Mikael Holmen 8b10680922 [IfConversion] Maintain the CFG when predicating/merging blocks in IfConvert*
Summary:
This fixes PR32721 in IfConvertTriangle and possible similar problems in
IfConvertSimple, IfConvertDiamond and IfConvertForkedDiamond.

In PR32721 we had a triangle

   EBB
   | \
   |  |
   | TBB
   |  /
   FBB

where FBB didn't have any successors at all since it ended with an
unconditional return. Then TBB and FBB were be merged into EBB, but EBB
would still keep its successors, and the use of analyzeBranch and
CorrectExtraCFGEdges wouldn't help to remove them since the return
instruction is not analyzable (at least not on ARM).

The edge updating code and branch probability updating code is now pushed
into MergeBlocks() which allows us to share the same update logic between
more callsites. This lets us remove several dependencies on analyzeBranch
and completely eliminate RemoveExtraEdges.

One thing that showed up with this patch was that IfConversion sometimes
left a successor with 0% probability even if there was no branch or
fallthrough to the successor.

One such example from the test case ifcvt_bad_zero_prob_succ.mir. The
indirect branch tBRIND can only jump to bb.1, but without the patch we
got:

  bb.0:
    successors: %bb.1(0x80000000)

  bb.1:
    successors: %bb.1(0x80000000), %bb.2(0x00000000)
    tBRIND %r1, 1, %cpsr
    B %bb.1

  bb.2:

There is no way to jump from bb.1 to bb2, but still there is a 0% edge
from bb.1 to bb.2.

With the patch applied we instead get the expected:

  bb.0:
    successors: %bb.1(0x80000000)

  bb.1:
    successors: %bb.1(0x80000000)
    tBRIND %r1, 1, %cpsr
    B %bb.1

Since bb.2 had no predecessor at all, it was removed.

Several testcases had to be updated due to this since the removed
successor made the "Branch Probability Basic Block Placement" pass
sometimes place blocks in a different order.

Finally added a couple of new test cases:

* PR32721_ifcvt_triangle_unanalyzable.mir:
  Regression test for the original problem dexcribed in PR 32721.

* ifcvt_triangleWoCvtToNextEdge.mir:
  Regression test for problem that caused a revert of my first attempt
  to solve PR 32721.

* ifcvt_simple_bad_zero_prob_succ.mir:
  Test case showing the problem where a wrong successor with 0% probability
  was previously left.

* ifcvt_[diamond|forked_diamond|simple]_unanalyzable.mir
  Very simple test cases for the simple and (forked) diamond cases
  involving unanalyzable branches that can be nice to have as a base if
  wanting to write more complicated tests.

Reviewers: iteratee, MatzeB, grosser, kparzysz

Reviewed By: kparzysz

Subscribers: kbarton, davide, aemerson, nemanjai, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34099

llvm-svn: 310697
2017-08-11 06:57:08 +00:00
Chandler Carruth 19913b22c0 [PM] Switch the CGSCC debug messages to use the standard LLVM debug
printing techniques with a DEBUG_TYPE controlling them.

It was a mistake to start re-purposing the pass manager `DebugLogging`
variable for generic debug printing -- those logs are intended to be
very minimal and primarily used for testing. More detailed and
comprehensive logging doesn't make sense there (it would only make for
brittle tests).

Moreover, we kept forgetting to propagate the `DebugLogging` variable to
various places making it also ineffective and/or unavailable. Switching
to `DEBUG_TYPE` makes this a non-issue.

llvm-svn: 310695
2017-08-11 05:47:13 +00:00
Jessica Paquette 6315d2d21d [MachineOutliner] Add RegState::Define to LDRXpost in insertOutlinedCall
This fixes a MachineVerifier failure in machine-outliner.mir. Not explicitly
adding RegState::Define to the LR argument makes it unhappy because an explicit
definition is marked as a use.

Build failure:
http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-expensive/7496/testReport/junit/LLVM/CodeGen_AArch64/machine_outliner_mir/

llvm-svn: 310671
2017-08-10 23:11:24 +00:00
Ahmed Bougacha a24e4cda00 Revert "[AsmParser] Hash is not a comment on some targets"
This reverts commit r310457.

It causes clang-produced IR to fail llvm codegen.

llvm-svn: 310662
2017-08-10 21:23:00 +00:00
Reid Kleckner 6be1ed05c7 Disable some IR death tests when SEH is available
They hang for me locally. I suspect that there is a use-after-free when
attempting to destroy an LLVMContext after asserting from the middle of
metadata tracking. It doesn't seem worth debugging it further.

llvm-svn: 310660
2017-08-10 21:14:07 +00:00
Nirav Dave f8556ad48f Revert "[DAG] Cleanup unused nodes after store merge. NFCI."
This reverts commit r310648 which causes an unexpected assertion failure

llvm-svn: 310659
2017-08-10 21:03:36 +00:00
Craig Topper 9a6110b2d3 [InstCombine] Make (X|C1)^C2 -> X^(C1^C2) iff X&~C1 == 0 work for splat vectors
This also corrects the description to match what was actually implemented. The old comment said X^(C1|C2), but it implemented X^((C1|C2)&~(C1&C2)). I believe ((C1|C2)&~(C1&C2)) is equivalent to (C1^C2).

Differential Revision: https://reviews.llvm.org/D36505

llvm-svn: 310658
2017-08-10 20:35:34 +00:00
Nirav Dave 4d28c0ff4f [DAG] Relax type restriction for store merge
Summary: Allow stores of bitcastable types to be merged by peeking through BITCAST nodes and recasting stored values constant and vector extract nodes as necessary.

Reviewers: jyknight, hfinkel, efriedma, RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34569

llvm-svn: 310655
2017-08-10 19:52:45 +00:00
Simon Pilgrim b59c2d9d73 [CostModel][X86] Add SSE2 two-src shuffle costs
llvm-svn: 310654
2017-08-10 19:32:35 +00:00
Eli Friedman e9499dd4c7 [ARM] Clarify legal addressing modes for ARM and Thumb2. NFC
The existing code is very clever, but not clear, which seems
like the wrong tradeoff here.

Differential Revision: https://reviews.llvm.org/D36559

llvm-svn: 310653
2017-08-10 19:31:27 +00:00
Benjamin Kramer dbbe5756c1 [gold-plugin] Use more StringRef. No functionality change intended.
llvm-svn: 310652
2017-08-10 19:28:00 +00:00
Simon Pilgrim 7354531b82 [CostModel][X86] Add avx1 two-src shuffle costs
llvm-svn: 310650
2017-08-10 19:02:51 +00:00
Nirav Dave 99d9d24553 [DAG] Cleanup unused nodes after store merge. NFCI.
llvm-svn: 310648
2017-08-10 18:53:14 +00:00
Simon Pilgrim ac2e50a4ca [CostModel][X86] Add avx2 two-src shuffle costs
llvm-svn: 310645
2017-08-10 18:29:34 +00:00
Taewook Oh f5040b9685 Make .file directive to have basename only
Summary:
Currently LLVM puts directory along with the filename in .file directive, but this behavior doesn't match gcc. There's a no clear description about which one is right (https://sourceware.org/binutils/docs/as/File.html#File), but one document (https://sourceware.org/gdb/current/onlinedocs/stabs/ELF-Linker-Relocation.html) suggests that STT_FILE symbol in elf file is expected to have basename only, which should have a same sting file .file directive according to (https://docs.oracle.com/cd/E26502_01/html/E28388/eoiyg.html).

This also affects badly on the build system that uses hashing, as the directory info could be differnt from developer to developer even when they're working on same file.

Reviewers: pcc, mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36018

llvm-svn: 310642
2017-08-10 18:17:11 +00:00
Simon Pilgrim 2f529412e1 [CostModel][X86] Extend two src shuffle cost tests
Cover most 128/256/512/1024-bit cases for vXf64/vXi64, vXf32/vXi32, vXi16 + vXi8

llvm-svn: 310641
2017-08-10 18:02:45 +00:00
Craig Topper 57b4d8646b [InstCombine] Fix a crash in getSelectCondition if we happen to have two inverse vectors of i1 constants.
We used to try to truncate the constant vector to vXi1, but if it's already i1 this would fail. Instead we now use IRBuilder::getZExtOrTrunc which should check the type and only create a trunc if needed. I believe this should trigger constant folding in the IRBuilder and ultimately do the same thing just with the additional type check.

llvm-svn: 310639
2017-08-10 17:48:14 +00:00
Craig Topper cd13ebca5f [InstCombine] Add a DEBUG_COUNTER to InstCombine to limit how many instructions are visited for debug
Sometimes it would be nice to stop InstCombine mid way through its combining to see the current IR. By using a debug counter we can place an upper limit on how many instructions to process.

This will also allow skipping the first X combines, but that has the potential to change later combines since earlier canonicalizations might have been skipped.

Differential Revision: https://reviews.llvm.org/D36553

llvm-svn: 310638
2017-08-10 17:48:12 +00:00
Craig Topper 9cd976d041 [DebugCounter] Move the semicolon out of the DEBUG_COUNTER macro and require it to be placed at the end of each use.
This make it consistent with STATISTIC which it will often appears near.

While there move one DEBUG_COUNTER instance out of an anonymous namespace. It's already declaring a static variable so the namespace is unnecessary.

llvm-svn: 310637
2017-08-10 17:48:11 +00:00
Benjamin Kramer 74fbf45f4c [gold-plugin] Avoid race condition when creating temporary files.
This is both a potential security issue and a potential functionality
issue because we create temporary files from multiple threads. Use
the safe version of createTemporaryFile instead.

llvm-svn: 310636
2017-08-10 17:38:41 +00:00
Simon Pilgrim fe67612eba [CostModel][X86] Add avx512vbmi broadcast/reverse/single-src shuffle cost tests
llvm-svn: 310633
2017-08-10 17:33:25 +00:00
Simon Pilgrim 702e5fa391 [CostModel][X86] Improve single src shuffle costs
Add missing SK_PermuteSingleSrc costs for AVX2 targets and earlier, also added some of the simpler SK_PermuteTwoSrc costs to support splitting of SK_PermuteSingleSrc shuffles

llvm-svn: 310632
2017-08-10 17:27:20 +00:00
Simon Pilgrim c3e546fe42 Fix 'not all control paths return' warning on windows builds. NFCI.
llvm-svn: 310631
2017-08-10 17:20:09 +00:00
Marek Sokolowski d0c5bfa233 Fixup for r310621: Hint the compilers about unreachable code.
llvm-svn: 310623
2017-08-10 16:46:52 +00:00
Marek Sokolowski 719e22d4f4 Add .rc scripts tokenizer.
This extends the shell of llvm-rc tool with the ability of tokenization
of the input files. Currently, ASCII and ASCII-compatible UTF-8 files
are supported.

Thanks to Nico Weber (thakis) for his original work in this area.

Differential Revision: https://reviews.llvm.org/D35957

llvm-svn: 310621
2017-08-10 16:21:44 +00:00
Krzysztof Parzyszek bea30c6286 Add "Restored" flag to CalleeSavedInfo
The liveness-tracking code assumes that the registers that were saved
in the function's prolog are live outside of the function. Specifically,
that registers that were saved are also live-on-exit from the function.
This isn't always the case as illustrated by the LR register on ARM.

Differential Revision: https://reviews.llvm.org/D36160

llvm-svn: 310619
2017-08-10 16:17:32 +00:00
Sanjay Patel 71de9d7cab [InstCombine] add memcpy expansion tests with potential DL dependency; NFC
Current behavior is to transform these independently of the datalayout.

There's a proposal to change this in D35035:
https://reviews.llvm.org/D35035

llvm-svn: 310611
2017-08-10 15:37:26 +00:00
Marcello Maggioni 61f48ca35e [unittests] Adding a unittest for ChangeTaTargetIndex. NFC
Differential Revision: https://reviews.llvm.org/D36565

llvm-svn: 310610
2017-08-10 15:35:25 +00:00
Nirav Dave 06242a99ce [DAG] Rewrite expression. NFC.
llvm-svn: 310608
2017-08-10 15:29:33 +00:00
Simon Pilgrim 419215abb7 [CostModel][X86] Added v2f64/v2i64 single src shuffle model tests
Fixed label checks for all prefixes

llvm-svn: 310606
2017-08-10 15:25:08 +00:00
Nirav Dave 926e2d39bf [X86] Keep dependencies when constructing loads in combineStore
Summary:
Preserve chain dependecies between old and new loads constructed to
prevent loads from reordering below later stores.

Fixes PR34088.

Reviewers: craig.topper, spatel, RKSimon, efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36528

llvm-svn: 310604
2017-08-10 15:12:32 +00:00
Sanjay Patel 6470011e9c [InstCombine] regenerate test checks; NFC
llvm-svn: 310603
2017-08-10 15:07:37 +00:00
Krzysztof Parzyszek 709e4f9b73 [Hexagon] Use isMetaInstruction instead of isDebugValue
llvm-svn: 310601
2017-08-10 15:00:30 +00:00
Alexander Potapenko 5241081532 [sanitizer-coverage] Change cmp instrumentation to distinguish const operands
This implementation of SanitizerCoverage instrumentation inserts different
callbacks depending on constantness of operands:

  1. If both operands are non-const, then a usual
     __sanitizer_cov_trace_cmp[1248] call is inserted.
  2. If exactly one operand is const, then a
     __sanitizer_cov_trace_const_cmp[1248] call is inserted. The first
     argument of the call is always the constant one.
  3. If both operands are const, then no callback is inserted.

This separation comes useful in fuzzing when tasks like "find one operand
of the comparison in input arguments and replace it with the other one"
have to be done. The new instrumentation allows us to not waste time on
searching the constant operands in the input.

Patch by Victor Chibotaru.

llvm-svn: 310600
2017-08-10 15:00:13 +00:00
Sanjay Patel 8dfe8e21e8 [InstCombine] regenerate test checks, add comments; NFC
llvm-svn: 310598
2017-08-10 14:51:42 +00:00
Chad Rosier a5508e3119 [NewGVN] Add CL option to control the generation of phi-of-ops (disable by default).
Differential Revision: https://reviews.llvm.org/D36478539

llvm-svn: 310594
2017-08-10 14:12:57 +00:00
Guy Blank 136b543745 [SelectionDAG] Allow constant folding for implicitly truncating BUILD_VECTOR nodes.
In FoldConstantArithmetic, handle BUILD_VECTOR nodes that do implicit truncation on the elements.

This is similar to what is done in FoldConstantVectorArithmetic.

Differential Revision:
https://reviews.llvm.org/D36506

llvm-svn: 310593
2017-08-10 14:09:50 +00:00
Alexander Potapenko 7235bcdf8f [libFuzzer] Update LibFuzzer w.r.t. the new comparisons instrumentation API
Added the _sanitizer_cov_trace_const_cmp[1248] callbacks.
For now they are implemented the same way as _sanitizer_cov_trace_cmp[1248].
For more details, please see https://reviews.llvm.org/D36465.

Patch by Victor Chibotaru.

llvm-svn: 310592
2017-08-10 14:01:45 +00:00
Oleg Ranevskyy fa66a340eb [CMake][LLVM] Remove duplicated library mask. Broken clang linking against clangShared
Summary:
The `LLVM${c}Info` mask is listed twice in LLVM-Config.cmake. This results in the libraries such as LLVMARMInfo, LLVMAArch4Info, etc appearing twice in `extract_symbols.py` command line while building `clangShared`. `Extract_symbols.py` does not work well in such a case and completely ignores the symbols from the duplicated libraries. Thus, the LLVM(...)Info symbols do not get exported from `clangShared` and linking clang against it fails with unresolved dependencies.

Seems to be a mere copy-paste mistake.

Reviewers: beanz, chapuni

Reviewed By: chapuni

Subscribers: chapuni, aemerson, mgorny, kristof.beyls, llvm-commits, asl

Differential Revision: https://reviews.llvm.org/D36119

llvm-svn: 310590
2017-08-10 13:37:58 +00:00
Nikolai Bozhenov d97136c182 [ValueTracking] Enabling ValueTracking patch by default (recommit). Part 2.
The original patch was an improvement to IR ValueTracking on non-negative
integers. It has been checked in to trunk (D18777, r284022). But was disabled by
default due to performance regressions.
Perf impact has improved. The patch would be enabled by default.
 
Reviewers: reames, hfinkel
 
Differential Revision: https://reviews.llvm.org/D34101
 
Patch by: Olga Chupina <olga.chupina@intel.com>

llvm-svn: 310583
2017-08-10 11:24:57 +00:00
Zoran Jovanovic f4f2d084c6 [mips][microMIPS] Extending size reduction pass with XOR16
Author: milena.vujosevic.janicic
Reviewers: sdardis
The patch extends size reduction pass for MicroMIPS.
XOR instruction is transformed into 16-bit instruction XOR16, if possible.
Differential Revision: https://reviews.llvm.org/D34239

llvm-svn: 310579
2017-08-10 10:27:29 +00:00
Sam Parker 71a474d563 [AArch64] Assembler support for v8.3 RCpc
Added assembler and disassembler support for the new Release
Consistent processor consistent instructions, introduced with ARM
v8.3-A for AArch64.

Differential Revision: https://reviews.llvm.org/D36522

llvm-svn: 310575
2017-08-10 09:52:55 +00:00
Sam Parker 9d95764c3b [ARM][AArch64] ARMv8.3-A enablement
The beta ARMv8.3 ISA specifications have been released for AArch64
and AArch32, these can be found at:
https://developer.arm.com/products/architecture/a-profile/exploration-tools

An introduction to this architecture update can be found at:
https://community.arm.com/processors/b/blog/posts/armv8-a-architecture-2016-additions

This patch is the first in a series which will add ARM v8.3-A support
in LLVM and Clang. It adds the necessary changes that create targets
for both the ARM and AArch64 backends.

Differential Revision: https://reviews.llvm.org/D36514

llvm-svn: 310561
2017-08-10 09:41:00 +00:00
Elad Cohen 22ba97a0a6 [SelectionDAG] When scalarizing vselect, don't assert on
a legal cond operand.

When scalarizing the result of a vselect, the legalizer currently expects
to already have scalarized the operands. While this is true for the true/false
operands (which have the same type as the result), it is not case for the
condition operand. On X86 AVX512, v1i1 is legal - this leads to operations such
as '< N x type> vselect < N x i1> < N x type> < N x type>' where < N x type > is
illegal to hit an assertion during the scalarization.

The handling is similar to r205625.
This also exposes the fact that (v1i1 extract_subvector) should be legal
and selectable on AVX512 - We do this by custom lowering to vector_extract_elt.
This still leaves us in some cases with redundant dag nodes which will be
combined in a separate soon to come patch.

This fixes pr33349.

Differential revision: https://reviews.llvm.org/D36511

llvm-svn: 310552
2017-08-10 07:44:23 +00:00
Dehao Chen 2f4e2e2758 Revert part of r310296 to make it really NFC for instrumentation PGO.
Summary: Part of r310296 will disable PGOIndirectCallPromotion in ThinLTO backend if PGOOpt is None. However, as PGOOpt is not passed down to ThinLTO backend for instrumentation based PGO, that change would actually disable ICP entirely in ThinLTO backend, making it behave differently in instrumentation PGO mode. This change reverts that change, and only disable ICP there when it is SamplePGO.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: sanjoy, mehdi_amini, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D36566

llvm-svn: 310550
2017-08-10 05:10:32 +00:00
Chandler Carruth 9c161e894a [LCG] Fix an assert in a on-scope-exit lambda that checked the contents
of the returned value.

Checking the returned value from inside of a scoped exit isn't actually
valid. It happens to work when NRVO fires and the stars align, which
they reliably do with Clang but don't, for example, on MSVC builds.

llvm-svn: 310547
2017-08-10 03:05:21 +00:00
Hiroshi Yamauchi ccd412f48d [LVI] Fix LVI compile time regression around constantFoldUser()
Summary:
Avoid checking each operand and calling getValueFromCondition() before calling
constantFoldUser() when the instruction type isn't supported by
constantFoldUser().

This fixes a large compile time regression in an internal build.

Reviewers: sanjoy

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36552

llvm-svn: 310545
2017-08-10 02:23:14 +00:00
Peter Collingbourne 00f808ffcc Linker: Create a function declaration when moving a non-prevailing alias of function type.
We were previously creating a global variable of function type,
which is invalid IR. This issue was exposed by r304690, in which we
started asserting that global variables were of a valid type.

Fixes PR33462.

Differential Revision: https://reviews.llvm.org/D36438

llvm-svn: 310543
2017-08-10 01:07:44 +00:00
Craig Topper ba69187988 [InstSimplify] Add test cases that show that simplifySelectWithICmpCond doesn't work with non-canonical comparisons.
llvm-svn: 310542
2017-08-10 01:02:02 +00:00
Eugene Zelenko c8fbf6ffea [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 310541
2017-08-10 00:46:15 +00:00
Evgeniy Stepanov 3427d17cf8 Fix thinlto cache key computation for cfi-icall.
Summary:
Fixed PR33966.

CFI code generation for users (not just callers) of a function depends
on whether this function has a jumptable entry or not. This
information needs to be encoded in of thinlto cache key.

We filter the jumptable list against functions that are actually
referenced in the current module.

Subscribers: mehdi_amini, inglorion, eraman, hiraditya

Differential Revision: https://reviews.llvm.org/D36346

llvm-svn: 310536
2017-08-09 23:24:07 +00:00
Matthias Braun a88587ce0c ARM: Fix CMP_SWAP expansion
Clean up after my misguided attempt in r304267 to "fix" CMP_SWAP
returning an uninitialized status value.

- I was always using tMOVi8 to zero the status register which cannot
  encode higher register numbers and llvm would silently miscompile)

- Nobody was ever looking at that status value outside the expansion.
  ARMDAGToDAGISel::SelectCMP_SWAP() the only place creating CMP_SWAP
  instructions was not mapping anything to it. (The cmpxchg status value
  from llvm IR is lowered to a manual comparison after the CMP_SWAP)

So this:
- Renames the register from "status" to "temp" it make it obvious that
  it isn't used outside the expansion.
- Remove the zeroing status/temp register.
- Keep the live-in list improvements from r304267

Fixes http://llvm.org/PR34056

llvm-svn: 310534
2017-08-09 22:22:05 +00:00
Matthias Braun 93f2b4bdde LangRef: Fix/improve cmpxchg wording
llvm-svn: 310533
2017-08-09 22:22:04 +00:00
Benjamin Kramer 324d96b83a [Path] Sink predicate computations to their uses. NFCI.
llvm-svn: 310531
2017-08-09 22:06:32 +00:00
Coby Tayree 7683ca04eb [X86][Asm] Allow negative immediate to appear before bracketed expression
Currently, only non-negative immediate is allowed prior to a brac expression (memory reference).
MASM / GAS does not have any problem cope with the left side of the real line, so we should be able to as well.

Differntial Revision: https://reviews.llvm.org/D36229

llvm-svn: 310528
2017-08-09 21:49:17 +00:00
Krzysztof Parzyszek 1966fd79a7 [Hexagon] Ignore DBG_VALUEs when counting instructions in hexagon-early-if
llvm-svn: 310524
2017-08-09 21:22:05 +00:00
Benoit Belley b1a9ad81c5 [Linker] PR33527 - Linker::LinkOnlyNeeded should import AppendingLinkage globals
Linker::LinkOnlyNeeded should always import globals with
AppendingLinkage.

This resolves PR33527.

Differential Revision: https://reviews.llvm.org/D34448

llvm-svn: 310522
2017-08-09 20:58:39 +00:00
Craig Topper f983a410c9 [Docs] Remove a stray period from a code example in the Programmer's Manual.
llvm-svn: 310520
2017-08-09 20:55:33 +00:00
Eli Friedman c0c182cce1 [llvm-cov] Rearrange entries in report index.
Files which don't contain any functions are likely useless; don't
include them in the main table. Put the links at the bottom of the
page, in case someone wants to figure out coverage for code inside
a macro.

Not sure if this is the best solution, but it seems like an
improvement.

Differential Revision: https://reviews.llvm.org/D36298

llvm-svn: 310518
2017-08-09 20:43:31 +00:00
Lang Hames 14a22a442d [RuntimeDyld][ORC] Add support for Thumb mode to RuntimeDyldMachOARM.
This patch adds support for thumb relocations to RuntimeDyldMachOARM, and adds
a target-specific flags field to JITSymbolFlags (so that on ARM we can record
whether each symbol is Thumb-mode code).

RuntimeDyldImpl::emitSection is modified to ensure that stubs memory is
correctly aligned based on the size returned by getStubAlignment().

llvm-svn: 310517
2017-08-09 20:19:27 +00:00
Matt Arsenault 36cd1859f3 AMDGPU: Fix assert on n inline asm constraint
llvm-svn: 310515
2017-08-09 20:09:35 +00:00
Krzysztof Parzyszek 688843d657 [Hexagon] Tie implicit uses to defs in predicated instructions
llvm-svn: 310514
2017-08-09 19:58:00 +00:00
Sanjay Patel 3239518b90 [SimplifyCFG] remove checks for crasher test from r310481
Not sure why the earlier version would fail, but trying to get the bots
(and my local machine) to pass again.

llvm-svn: 310510
2017-08-09 18:56:26 +00:00
Sanjay Patel c50e55d0e6 [InstCombine] narrow rotate left/right patterns to eliminate zext/trunc (PR34046)
I couldn't find any smaller folds to help the cases in:
https://bugs.llvm.org/show_bug.cgi?id=34046
after:
rL310141

The truncated rotate-by-variable patterns elude all of the existing transforms because 
of multiple uses and knowledge about demanded bits and knownbits that doesn't exist 
without the whole pattern. So we need an unfortunately large pattern match. But by 
simplifying this pattern in IR, the backend is already able to generate 
rolb/rolw/rorb/rorw for x86 using its existing rotate matching logic (although
there is a likely extraneous 'and' of the rotate amount). 

Note that rotate-by-constant doesn't have this problem - smaller folds should already 
produce the narrow IR ops.

Differential Revision: https://reviews.llvm.org/D36395

llvm-svn: 310509
2017-08-09 18:37:41 +00:00
David Blaikie 2988479ffb PointerLikeTypeTraits: class->struct & remove the base definition
This simplifies implementations and removing the base definition paves
the way for detecting whether a type is 'pointer like'.

llvm-svn: 310507
2017-08-09 18:34:21 +00:00
David Blaikie 76fb649b0d Reduce variable scope by moving declaration into if clause
llvm-svn: 310506
2017-08-09 18:34:18 +00:00
Matt Morehouse 49e5acab33 [asan] Fix instruction emission ordering with dynamic shadow.
Summary:
Instrumentation to copy byval arguments is now correctly inserted
after the dynamic shadow base is loaded.

Reviewers: vitalybuka, eugenis

Reviewed By: vitalybuka

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D36533

llvm-svn: 310503
2017-08-09 17:59:43 +00:00
Mandeep Singh Grang 083b505f32 [COFF, ARM64] Add MS builtins __dmb, __dsb, __isb
Reviewers: mstorsjo, rnk, ruiu, compnerd, efriedma

Reviewed By: efriedma

Subscribers: efriedma, aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D36110

llvm-svn: 310502
2017-08-09 17:58:39 +00:00
Guy Blank 7f60c991ae [X86][AVX512] Choose correct registers in vpbroadcastb/w
Fixes the vpbroadcastb/w instructions which use GPRs as source operands, to use the correct registers.
The full GPR should be used, and not the subregister, as it happens before the patch.

Fixes pr33795

Differential Revision:
https://reviews.llvm.org/D36479

llvm-svn: 310498
2017-08-09 17:21:01 +00:00
Dmitry Preobrazhensky 1e32550de6 [AMDGPU][MC][GFX9] Added 16-bit renamed and "_legacy" VALU opcodes
See Bug 33629: https://bugs.llvm.org//show_bug.cgi?id=33629

Reviewers: vpykhtin, SamWot, arsenm

Differential Revision: https://reviews.llvm.org/D36322

llvm-svn: 310497
2017-08-09 17:10:47 +00:00
Nuno Lopes 7829506731 CFLAA: return MustAlias when pointers p, q are equal, i.e.,
must-alias(p, sz_p, p, sz_q)  irrespective of access sizes sz_p, sz_q

As discussed a couple of weeks ago on the ML.
This makes the behavior consistent with that of BasicAA.
AA clients already check the obj size themselves and may not require the
obj size to match exactly the access size (e.g., in case of store forwarding)

llvm-svn: 310495
2017-08-09 17:02:18 +00:00
Davide Italiano 1a943a90f5 [ValueTracking] Turn a test into an assertion.
As discussed with Chad, this should never happen, but this
assertion is basically free, so, keep it around just in case.

llvm-svn: 310493
2017-08-09 16:06:54 +00:00
Davide Italiano 5753239307 [ValueTracking] Update tests to unbreak the bots.
llvm-svn: 310492
2017-08-09 16:06:04 +00:00
Sanjay Patel 6f80d6b46f [x86] add more tests for select-of-constants; NFC
This is to help recommit a fixed version of r310208. As shown in PR34097, 
we could miscompile if subtraction of the constants overflowed.

llvm-svn: 310490
2017-08-09 15:57:02 +00:00
Florian Hahn d68bc7ae8d [ARM] Emit error when ARM exec mode is not available.
Summary:
A similar error message has been removed from the ARMTargetMachineBase
constructor in r306939. With this patch, we generate an error message
for the example below, compiled with -mcpu=cortex-m0, which does not
have ARM execution mode.

    __attribute__((target("arm"))) int foo(int a, int b)
    {
        return a + b % a;
    }

    __attribute__((target("thumb"))) int bar(int a, int b)
    {
        return a + b % a;
    }

By adding this error message to ARMBaseTargetMachine::getSubtargetImpl,
we can deal with functions that set -thumb-mode in target-features.
At the moment it seems like Clang does not have access to target-feature
specific information, so adding the error message to the frontend will
be harder.

Reviewers: echristo, richard.barton.arm, t.p.northover, rengolin, efriedma

Reviewed By: echristo, efriedma

Subscribers: efriedma, aemerson, javed.absar, kristof.beyls

Differential Revision: https://reviews.llvm.org/D35627

llvm-svn: 310486
2017-08-09 15:39:10 +00:00
Coby Tayree 3638655325 [X86][Asm]Allow far jmp/call to be picked when using explicit FWORD size specifier
Currently, far jmp/call which utilizes a 48bit memory operand would have been invoked via the 'lcall/ljmp' mnemonic (intel style).
This patch align those variants to formal intel spec

Differential Revision: https://reviews.llvm.org/D35846

llvm-svn: 310485
2017-08-09 15:34:55 +00:00
Davide Italiano 30e5194287 [ValueTracking] Honour recursion limit.
The recently improved support for `icmp` in ValueTracking
(r307304) exposes the fact that `isImplied` condition doesn't
really bail out if we hit the recursion limit (and calls
`computeKnownBits` which increases the depth and asserts).

Differential Revision:  https://reviews.llvm.org/D36512

llvm-svn: 310481
2017-08-09 15:13:50 +00:00
Sjoerd Meijer 7987633263 [AArch64] Assembler support for the ARMv8.2a dot product instructions
Dot product is an optional ARMv8.2a extension, see also the public architecture
specification here:
https://developer.arm.com/products/architecture/a-profile/exploration-tools.
This patch adds AArch64 assembler support for these dot product instructions.

Differential Revision: https://reviews.llvm.org/D36515

llvm-svn: 310480
2017-08-09 14:59:54 +00:00
Florian Hahn fc4b3951e9 [ARM] Remove FeatureNoARM implies ModeThumb.
Summary:
By removing FeatureNoARM implies ModeThumb, we can detect cases where a
function's target-features contain -thumb-mode (enables ARM codegen for the
function), but the architecture does not support ARM mode. Previously, the
implication caused the FeatureNoARM bit to be cleared for functions with
-thumb-mode, making the assertion in ARMSubtarget::ARMSubtarget [1]
pointless for such functions.

This assertion is the only guard against generating ARM code for
architectures without ARM codegen support. Is there a place where we
could easily generate error messages for the user? At the moment, we
would generate ARM code for Thumb-only architectures. X86 has the same
behavior as ARM, as in it only has an assertion and no error message,
but I think for ARM an error message would be helpful. What do you
think?

For the example below, `llc -mtriple=armv7m-eabi test.ll -o -` will
generate ARM assembler (or fail with an assertion error with this patch).
Note that if we run the resulting assembler through llvm-mc, we get
an appropriate error message, but not when codegen is handled
through clang.

```
define void @bar() #0 {
entry:
  ret void
}

attributes #0 = { "target-features"="-thumb-mode" }
```

[1] c1f7b54cef/lib/Target/ARM/ARMSubtarget.cpp (L147)

Reviewers: t.p.northover, rengolin, peter.smith, aadg, silviu.baranga, richard.barton.arm, echristo

Reviewed By: rengolin, echristo

Subscribers: efriedma, aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35569

llvm-svn: 310476
2017-08-09 13:53:28 +00:00
Benoit Belley d9017cc65e [Support] PR33388 - Fix formatv_object move constructor
formatv_object currently uses the implicitly defined move constructor,
but it is buggy. In typical use-cases, the problem doesn't show-up
because all calls to the move constructor are elided. Thus, the buggy
constructors are never invoked.

The issue especially shows-up when code is compiled using the
-fno-elide-constructors compiler flag. For instance, this is useful when
attempting to collect accurate code coverage statistics.

The exact issue is the following:

The Parameters data member is correctly moved, thus making the
parameters occupy a new memory location in the target
object. Unfortunately, the default copying of the Adapters blindly
copies the vector of pointers, leaving each of these pointers
referencing the parameters in the original object instead of the copied
one. These pointers quickly become dangling when the original object is
deleted. This quickly leads to crashes.

The solution is to update the Adapters pointers when performing a move.
The copy constructor isn't useful for format objects and can thus be
deleted.

This resolves PR33388.

Differential Revision: https://reviews.llvm.org/D34463

llvm-svn: 310475
2017-08-09 13:47:01 +00:00
Nirav Dave 6110d3ad00 [DAG] Explicitly cleanup merged load values during store merge. NFCI.
llvm-svn: 310474
2017-08-09 13:37:07 +00:00
Haojian Wu c1cae0bd64 Fix -Wpessimizing-move warning.
llvm-svn: 310469
2017-08-09 12:49:20 +00:00
Coby Tayree 3bfb365f52 [AsmParser][AVX512]Enhance OpMask/Zero/Merge syntax check rubostness
Adopt a more strict approach regarding what marks should/can appear after a destination register, when operating upon an AVX512 platform.

Differential Revision: https://reviews.llvm.org/D35785

llvm-svn: 310467
2017-08-09 12:32:05 +00:00
Jonas Paulsson 6228aeda65 [LSR / TTI / SystemZ] Eliminate TargetTransformInfo::isFoldableMemAccess()
isLegalAddressingMode() has recently gained the extra optional Instruction*
parameter, and therefore it can now do the job that previously only
isFoldableMemAccess() could do.

The SystemZ implementation of isLegalAddressingMode() has gained the
functionality of checking for offsets, which used to be done with
isFoldableMemAccess().

The isFoldableMemAccess() hook has been removed everywhere.

Review: Quentin Colombet, Ulrich Weigand
https://reviews.llvm.org/D35933

llvm-svn: 310463
2017-08-09 11:28:01 +00:00
Jonas Paulsson 5052771af3 [LoopStrengthReduce] Don't neglect the Fixup.Offset in isAMCompletelyFolded().
In the recursive call to isAMCompletelyFolded(), the passed offset should be
the sum of F.BaseOffset and Fixup.Offset.

Review: Quentin Colombet.
llvm-svn: 310462
2017-08-09 11:27:46 +00:00
Simon Dardis c6be2251b7 [mips] PR34083 - Wimplicit-fallthrough warning in MipsAsmParser.cpp
Assert that a binary expression is actually a binary expression,
rather than potientially incorrectly attempting to handle it as a
unary expression.

This resolves PR34083.

Thanks to Simonn Pilgrim for reporting the issue!

llvm-svn: 310460
2017-08-09 10:47:52 +00:00
Gabor Horvath 18bda5b0f2 Suppress a warning. NFC.
llvm-svn: 310459
2017-08-09 10:38:53 +00:00
Oliver Stannard 7f569a2d54 [AsmParser] Hash is not a comment on some targets
The '#' token is not a comment for all targets (on ARM and AArch64 it marks an
immediate operand), so we shouldn't treat it as such.

Comments are already converted to AsmToken::EndOfStatement by
AsmLexer::LexLineComment, so this check was unnecessary.

Differential Revision: https://reviews.llvm.org/D36405

llvm-svn: 310457
2017-08-09 09:40:51 +00:00
Chandler Carruth 2cd28b2ba0 [LCG] Completely remove the map-based association of post-order numbers
to Nodes when removing ref edges from a RefSCC.

This map based association turns out to be pretty expensive for large
RefSCCs and pointless as we already have embedded data members inside
nodes that we use to track the DFS state. We can reuse one of those and
the map becomes unnecessary.

This also fuses the update of those numbers into the scan across the
pending stack of nodes so that we don't walk the nodes twice during the
DFS.

With this I expect the new PM to be faster than the old PM for the test
case I have been optimizing. That said, it also seems simpler and more
direct in many ways. The side storage was always pretty awkward.

The last remaining hot-spot in the profile of the LCG once this is done
will be the edge iterator walk in the DFS. I'll take a look at improving
that next.

llvm-svn: 310456
2017-08-09 09:37:39 +00:00
Davide Italiano c163fac184 [GlobalOpt] Switch an explicit loop to llvm::all_of(). NFCI.
llvm-svn: 310453
2017-08-09 09:23:29 +00:00
Chandler Carruth 9c3deaa653 [LCG] Special case when removing a ref edge from a RefSCC leaves
that RefSCC still connected.

This is common and can be handled much more efficiently. As soon as we
know we've covered every node in the RefSCC with the DFS, we can simply
reset our state and return. This avoids numerous data structure updates
and other complexity.

On top of other changes, this appears to get new PM back to parity with
the old PM for a large protocol buffer message source code. The dense
map updates are very hot in this function.

llvm-svn: 310451
2017-08-09 09:14:34 +00:00
Chandler Carruth 23c2f44cc7 [LCG] Switch one of the update methods for the LazyCallGraph to support
limited batch updates.

Specifically, allow removing multiple reference edges starting from
a common source node. There are a few constraints that play into
supporting this form of batching:

1) The way updates occur during the CGSCC walk, about the most we can
   functionally batch together are those with a common source node. This
   also makes the batching simpler to implement, so it seems
   a worthwhile restriction.
2) The far and away hottest function for large C++ files I measured
   (generated code for protocol buffers) showed a huge amount of time
   was spent removing ref edges specifically, so it seems worth focusing
   there.
3) The algorithm for removing ref edges is very amenable to this
   restricted batching. There are just both API and implementation
   special casing for the non-batch case that gets in the way. Once
   removed, supporting batches is nearly trivial.

This does modify the API in an interesting way -- now, we only preserve
the target RefSCC when the RefSCC structure is unchanged. In the face of
any splits, we create brand new RefSCC objects. However, all of the
users were OK with it that I could find. Only the unittest needed
interesting updates here.

How much does batching these updates help? I instrumented the compiler
when run over a very large generated source file for a protocol buffer
and found that the majority of updates are intrinsically updating one
function at a time. However, nearly 40% of the total ref edges removed
are removed as part of a batch of removals greater than one, so these
are the cases batching can help with.

When compiling the IR for this file with 'opt' and 'O3', this patch
reduces the total time by 8-9%.

Differential Revision: https://reviews.llvm.org/D36352

llvm-svn: 310450
2017-08-09 09:05:27 +00:00
Craig Topper b049158a55 [X86] Add the rest of the ADC and SBB instructions to isDefConvertible.
I don't know if this really affects anything. Just thought it was weird that we had all of the ADD/SUB/AND/OR/XOR instructions.

llvm-svn: 310447
2017-08-09 06:17:49 +00:00
Craig Topper 5706c01c0b [InstCombine] Use regular dyn_cast instead of a matcher for a simple case. NFC
llvm-svn: 310446
2017-08-09 06:17:48 +00:00
Serguei Katkov 6ea2e81cf6 [ImplicitNullCheck] Fix the bug when dependent instruction accesses memory
It is possible that dependent instruction may access memory.
In this case we must reject optimization because the memory change will
be visible in null handler basic block. So we will execute an instruction which
we must not execute if check fails.

Reviewers: sanjoy, reames
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36392

llvm-svn: 310443
2017-08-09 05:17:02 +00:00
Zachary Turner 2a77b266b5 Fix broken pdb test.
For some reason I didn't see this failure the first time.  The
output format changed slightly, so we just have to update the
test for the new format.

llvm-svn: 310442
2017-08-09 04:48:16 +00:00
Zachary Turner e7a3463d2a Fix -Wreorder-fields warning.
llvm-svn: 310440
2017-08-09 04:34:11 +00:00
Zachary Turner 5448dabbdd [PDB] Fix an issue writing the publics stream.
In the refactor to merge the publics and globals stream, a bug
was introduced that wrote the wrong value for one of the fields
of the PublicsStreamHeader.  This caused debugging in WinDbg
to break.

We had no way of dumping any of these fields, so in addition to
fixing the bug I've added dumping support for them along with a
test that verifies the correct value is written.

llvm-svn: 310439
2017-08-09 04:23:59 +00:00
Zachary Turner 946204c83e [PDB] Merge Global and Publics Builders.
The publics stream and globals stream are very similar. They both
contain a list of hash buckets that refer into a single shared stream,
the symbol record stream. Because of the need for each builder to manage
both an independent hash stream as well as a single shared record
stream, making the two builders be independent entities is not the right
design. This patch merges them into a single class, of which only a
single instance is needed to create all 3 streams.  PublicsStreamBuilder
and GlobalsStreamBuilder are now merged into the single GSIStreamBuilder
class, which writes all 3 streams at once.

Note that this patch does not contain any functionality change. So we're
still not yet writing any records to the globals stream. All we're doing
is making it so that when we do start writing records to the globals,
this refactor won't have to be part of that patch.

Differential Revision: https://reviews.llvm.org/D36489

llvm-svn: 310438
2017-08-09 04:23:25 +00:00
Craig Topper 24a6951187 [InstCombine] Add a test case for a missed opportunity to turn a select into logic ops.
llvm-svn: 310434
2017-08-09 01:30:22 +00:00
Eugene Zelenko d8e3efd5c2 [AMDGPU] Revert r310429 changes in AMDKernelCodeT.h which broke some build bots.
llvm-svn: 310430
2017-08-09 00:06:29 +00:00
Eugene Zelenko d16eff816b [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 310429
2017-08-08 23:53:55 +00:00
Quentin Colombet 8dd90fb54b Revert "[GlobalISel] Remove the GISelAccessor API."
This reverts commit r310115.

It causes a linker failure for the one of the unittests of AArch64 on one
of the linux bot:
http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/3429

: && /home/fedora/gcc/install/gcc-7.1.0/bin/g++   -fPIC
-fvisibility-inlines-hidden -Werror=date-time -std=c++11 -Wall -W
-Wno-unused-parameter -Wwrite-strings -Wcast-qual
-Wno-missing-field-initializers -pedantic -Wno-long-long
-Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment
-ffunction-sections -fdata-sections -O2
-L/home/fedora/gcc/install/gcc-7.1.0/lib64 -Wl,-allow-shlib-undefined
-Wl,-O3 -Wl,--gc-sections
unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o  -o
unittests/Target/AArch64/AArch64Tests
lib/libLLVMAArch64CodeGen.so.6.0.0svn lib/libLLVMAArch64Desc.so.6.0.0svn
lib/libLLVMAArch64Info.so.6.0.0svn lib/libLLVMCodeGen.so.6.0.0svn
lib/libLLVMCore.so.6.0.0svn lib/libLLVMMC.so.6.0.0svn
lib/libLLVMMIRParser.so.6.0.0svn lib/libLLVMSelectionDAG.so.6.0.0svn
lib/libLLVMTarget.so.6.0.0svn lib/libLLVMSupport.so.6.0.0svn -lpthread
lib/libgtest_main.so.6.0.0svn lib/libgtest.so.6.0.0svn -lpthread
-Wl,-rpath,/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1/lib
&& :
unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o:(.toc+0x0):
undefined reference to `vtable for llvm::LegalizerInfo'
unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o:(.toc+0x8):
undefined reference to `vtable for llvm::RegisterBankInfo'

The particularity of this bot is that it is built with
BUILD_SHARED_LIBS=ON

However, I was not able to reproduce the problem so far.
Reverting to unblock the bot.

llvm-svn: 310425
2017-08-08 22:22:30 +00:00
Nemanja Ivanovic 3f98fb4a3f My commit r310346 introduced some valid warnings. This cleans them up.
llvm-svn: 310424
2017-08-08 22:17:31 +00:00
Jessica Paquette d36945bf3a [MachineOutliner] Ensure AArch64 outliner doesn't mess with W30 or LR
Before, the outliner would mark all instructions that read from/modify LR as
illegal. This doesn't handle W30, which overlaps with LR. This shouldn't be
outlined.

This commit fixes that by making modifiesRegister() and readsRegister() look at
W30 + take in a TRI argument. This makes sure that modifiesRegister() and
readsRegister() won't outline either of W30 and LR.

https://reviews.llvm.org/D36435

llvm-svn: 310422
2017-08-08 21:51:26 +00:00
Wei Mi bb9106ac4b [GVN] Remove stale entries in phitranslate cache when new phi is generated for PRE
When a new phi is generated for scalarpre of an expression, the phiTranslate cache
will become stale: Before PRE, the candidate expression must not be available in a
predecessor block, and phitranslate will cache the information. After PRE, the
expression will become available in all predecessor blocks, so the related entries
in phiTranslate cache becomes stale. The patch will simply remove the stale entries
so phiTranslate can be recomputed next time.

The stale entries in phitranslate cache will not affect correctness but will cause
missing PRE opportunity for later instructions.

Differential Revision: https://reviews.llvm.org/D36124

llvm-svn: 310421
2017-08-08 21:40:14 +00:00
Nuno Lopes 598d1632e1 BasicAA: assert on another case where aliasGEP shouldn't get a PartialAlias response
llvm-svn: 310420
2017-08-08 21:25:26 +00:00
Dehao Chen 34cfcb29aa Make ICP uses PSI to check for hotness.
Summary: Currently, ICP checks the count against a fixed value to see if it is hot enough to be promoted. This does not work for SamplePGO because sampled count may be much smaller. This patch uses PSI to check if the count is hot enough to be promoted.

Reviewers: davidxl, tejohnson, eraman

Reviewed By: davidxl

Subscribers: sanjoy, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D36341

llvm-svn: 310416
2017-08-08 20:57:33 +00:00
Reid Kleckner e2e82061f9 [codeview] Emit nested enums and typedefs from classes
Previously we limited ourselves to only emitting nested classes, but we
need other kinds of types as well.

This fixes the Visual Studio STL visualizers, so that users can
visualize std::string and other objects.

llvm-svn: 310410
2017-08-08 20:30:14 +00:00
Craig Topper 364359e4fc [InstCombine] Support pulling left shifts through a subtract with constant LHS
We already support pulling through an add with constant RHS. We can do the same for subtract.

Differential Revision: https://reviews.llvm.org/D36443

llvm-svn: 310407
2017-08-08 20:14:11 +00:00
Nirav Dave 8a813cf646 [DAG] Introduce peekThroughBitcast function. NFCI.
llvm-svn: 310405
2017-08-08 20:01:18 +00:00
Nirav Dave 515116d7c2 [DAG] Update comments. NFC.
llvm-svn: 310404
2017-08-08 19:52:19 +00:00
Connor Abbott 249fc7bd2a [AMDGPU] Add llvm.amdgpu.update.dpp intrinsic
Summary:
Now that we've made all the necessary backend changes, we can add a new
intrinsic which exposes the new capabilities to IR producers. Since
llvm.amdgpu.update.dpp is a strict superset of llvm.amdgpu.mov.dpp, we
should deprecate the former. We also add tests for all the functionality
that was added in previous changes, now that we can access it via an IR
construct.

Reviewers: tstellar, arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D34718

llvm-svn: 310399
2017-08-08 18:52:22 +00:00
Chad Rosier 4d852597f8 [NewGVN] Use a cast instead of a dyn_cast.
Differential Revision: https://reviews.llvm.org/D36478

llvm-svn: 310397
2017-08-08 18:41:49 +00:00
Zachary Turner 59e3ae827d [PDB] Fix linking of function symbols and local variables.
The compiler outputs PROC32_ID symbols into the object files
for functions, and these symbols have an embedded type index
which, when copied to the PDB, refer to the IPI stream.  However,
the symbols themselves are also converted into regular symbols
(e.g. S_GPROC32_ID -> S_GPROC32), and type indices in the regular
symbol records refer to the TPI stream.  So this patch applies
two fixes to function records.
  1. It converts ID symbols to the proper non-ID record type.
  2. After remapping the type index from the object file's index
     space to the PDB file/IPI stream's index space, it then
     remaps that index to the TPI stream's index space by.

Besides functions, during the remapping process we were also
discarding symbol record types which we did not recognize.
In particular, we were discarding S_BPREL32 records, which is
what MSVC uses to describe local variables on the stack.  So
this patch fixes that as well by copying them to the PDB.

Differential Revision: https://reviews.llvm.org/D36426

llvm-svn: 310394
2017-08-08 18:34:44 +00:00
Adrian Prantl e502f00538 dsymutil: support dwarf version mismatches between object and clang module
This adds a missing call to maybeUpdateMaxDwarfVersion when visitng a
clang module. Failing to do so will cause a failure when emitting
DWARF 4 forms into a CU that AsmPrinter believes to be DWARF 2.

rdar://problem/33666528

llvm-svn: 310392
2017-08-08 18:26:12 +00:00
Anna Thomas 9b6e12f3dc [LoopVectorize] Fix assertion failure in Fcmp vectorization
Summary:
When vectorizing fcmps we can trip on incorrect cast assertion when setting the
FastMathFlags after generating the vectorized FCmp.
This can happen if the FCmp can be folded to true or false directly. The fix
here is to set the FastMathFlag using the FastMathFlagBuilder *before* creating
the FCmp Instruction. This is what's done by other optimizations such as
InstCombine.
Added a test case which trips on cast assertion without this patch.

Reviewers: Ayal, mssimpso, mkuper, gilr

Reviewed by: Ayal, mssimpso

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D36244

llvm-svn: 310389
2017-08-08 18:07:44 +00:00
Tim Northover f370f2e3c6 Revert "[ARM] Fix assembly and disassembly for VMRS/VMSR"
This reverts r310243. Only MVFR2 is actually restricted to v8 and it'll be a
little while before we can get a proper fix together. Better that we allow
incorrect code than reject correct in the meantime.

llvm-svn: 310384
2017-08-08 17:16:46 +00:00
Sanjoy Das 6c14b84404 [DomTree] Use a non-recursive DFS instead of a recursive one; NFC
Summary: The recursive DFS can stack overflow in pathological cases.

Reviewers: kuhar

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D36442

llvm-svn: 310383
2017-08-08 17:15:29 +00:00
Craig Topper b498a23f0e [KnownBits][ValueTracking] Move the math for calculating known bits for add/sub into a static method in KnownBits object
I want to reuse this code in SimplifyDemandedBits handling of Add/Sub. This will make that easier.

Wonder if we should use it in SelectionDAG's computeKnownBits too.

Differential Revision: https://reviews.llvm.org/D36433

llvm-svn: 310378
2017-08-08 16:29:35 +00:00
Alex Bradbury 6b34069ab7 [RISCV] Fix warning about unused getSubtargetFeatureName()
llvm-svn: 310375
2017-08-08 16:20:39 +00:00
Nuno Lopes c7d4110aa7 BasicAA: aliasGEP shouldn't get a PartialAlias response here
add an assert() to ensure that's the case (as I'm not convinced it won't happen)

llvm-svn: 310373
2017-08-08 16:13:24 +00:00
Simon Pilgrim 91b7b991d4 [DAGCombiner] simplifyShuffleMask - handle UNDEF inputs from shuffles as well as BUILD_VECTOR
Minor extension to D36393

llvm-svn: 310372
2017-08-08 16:10:33 +00:00
Alex Bradbury 04f06d922f [RISCV] Add basic RISCVAsmParser (missing files)
This commit adds the files missing from rL310361. Apologies for the noise.

Differential Revision: https://reviews.llvm.org/D23563

llvm-svn: 310363
2017-08-08 14:43:36 +00:00
Alex Bradbury 1a4272914d [RISCV] Add basic RISCVAsmParser
This doesn't yet support parsing things like %pcrel_hi(foo), but will handle
basic instructions with register or immediate operands.

Differential Revision: https://reviews.llvm.org/D23563

llvm-svn: 310361
2017-08-08 14:32:35 +00:00
Nemanja Ivanovic 979dcb6f09 [PowerPC] Don't crash on larger splats achieved through 1-byte splats
We've implemented a 1-byte splat using XXSPLTISB on P9. However, LLVM will
produce a 1-byte splat even for wider element BUILD_VECTOR nodes. This patch
prevents crashing in that situation.

Differential Revision: https://reviews.llvm.org/D35650

llvm-svn: 310358
2017-08-08 13:52:45 +00:00
Daniel Sanders 75b84fc5ce [globalisel][tablegen] Remove unnecessary ; to satisfy ubuntu-gcc7.1-werror.
llvm-svn: 310357
2017-08-08 13:21:26 +00:00
Nemanja Ivanovic bed7136eee Appease compilers that have the -Wcovered-switch-default switch.
llvm-svn: 310356
2017-08-08 12:41:56 +00:00
Amjad Aboud 6fa6813aec [X86] Improved X86::CMOV to Branch heuristic.
Resolved PR33954.
This patch contains two more constraints that aim to reduce the noise cases where we convert CMOV into branch for small gain, and end up spending more cycles due to overhead.

Differential Revision: https://reviews.llvm.org/D36081

llvm-svn: 310352
2017-08-08 12:17:56 +00:00
Nemanja Ivanovic 809fbfa6a1 [PowerPC] Eliminate compares - add i32 sext/zext handling for SETLE/SETGE
Adds handling for SETLE/SETGE comparisons on i32 values. Furthermore, it adds
the handling for the special case where RHS == 0.

Differential Revision: https://reviews.llvm.org/D34048

llvm-svn: 310346
2017-08-08 11:20:44 +00:00
Simon Pilgrim ef44228acb [DAGCombiner] Simplify shuffle mask index if the referenced input element is UNDEF
Fixes one of the cases in PR34041.

Differential Revision: https://reviews.llvm.org/D36393

llvm-svn: 310344
2017-08-08 11:03:30 +00:00
Daniel Sanders 0554004698 [globalisel][tablegen] Add support for importing 'imm' operands.
Summary:
This patch enables the import of rules containing 'imm' operands that do not
constrain the acceptable values using predicates. Support for ImmLeaf will
arrive in a later patch.

Depends on D35681

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Reviewed By: rovka

Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D35833

llvm-svn: 310343
2017-08-08 10:44:31 +00:00
Chandler Carruth 6e35c31d2d [PM] Fix a likely more critical infloop bug in the CGSCC pass manager.
This was just a bad oversight on my part. The code in question should
never have worked without this fix. But it turns out, there are
relatively few places that involve libfunctions that participate in
a single SCC, and unless they do, this happens to not matter.

The effect of not having this correct is that each time through this
routine, the edge from write_wrapper to write was toggled between a call
edge and a ref edge. First time through, it becomes a demoted call edge
and is turned into a ref edge. Next time it is a promoted call edge from
a ref edge. On, and on it goes forever.

I've added the asserts which should have always been here to catch silly
mistakes like this in the future as well as a test case that will
actually infloop without the fix.

The other (much scarier) infinite-inlining issue I think didn't actually
occur in practice, and I simply misdiagnosed this minor issue as that
much more scary issue. The other issue *is* still a real issue, but I'm
somewhat relieved that so far it hasn't happened in real-world code
yet...

llvm-svn: 310342
2017-08-08 10:13:23 +00:00
Craig Topper 8e351e9018 [InstCombine] Cast to BinaryOperator earlier in foldSelectIntoOp to simplify the code.
We no longer need the explicit operand count check or the later dynamic cast.

llvm-svn: 310339
2017-08-08 06:19:24 +00:00
Tom Stellard 03aa3aee11 AMDGPU: Fix warnings introduced by r310336
llvm-svn: 310337
2017-08-08 05:52:00 +00:00
Tom Stellard 20287697f8 AMDGPU: Move R600 parts of AMDGPUISelDAGToDAG into their own class
Summary: This refactoring is required in order to split the R600 and GCN tablegen files.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D36286

llvm-svn: 310336
2017-08-08 04:57:55 +00:00
Konstantin Zhuravlyov 6cbcb27b5e AMDGPU: Also remove SI from docs
Differential Revision: https://reviews.llvm.org/D36424

llvm-svn: 310335
2017-08-08 04:28:31 +00:00
Chandler Carruth e32ebcaa87 [PM] Relax the spelling of a pass name slightly in this test.
I forgot that MSVC doesn't preserve this typedef, my bad.

llvm-svn: 310334
2017-08-08 02:27:49 +00:00
Chandler Carruth 7c888dca46 [PM] Fix new LoopUnroll function pass by invalidating loop analysis
results when a loop is completely removed.

This is very hard to manifest as a visible bug. You need to arrange for
there to be a subsequent allocation of a 'Loop' object which gets the
exact same address as the one which the unroll deleted, and you need the
LoopAccessAnalysis results to be significant in the way that they're
stale. And you need a million other things to align.

But when it does, you get a deeply mysterious crash due to actually
finding a stale analysis result. This fixes the issue and tests for it
by directly checking we successfully invalidate things. I have not been
able to get *any* test case to reliably trigger this. Changes to LLVM
itself caused the only test case I ever had to cease to crash.

I've looked pretty extensively at less brittle ways of fixing this and
they are actually very, very hard to do. This is a somewhat strange and
unusual case as we have a pass which is deleting an IR unit, but is not
running within that IR unit's pass framework (which is what handles this
cleanly for the normal loop unroll). And where there isn't a definitive
way to clear *all* of the stale cache entries. And where the pass *is*
updating the core analysis that provides the IR units!

For example, we don't have any of these problems with Function analyses
because it is easy to clear out function analyses when the functions
themselves may have been deleted -- we clear an entire module's worth!
But that is too heavy of a hammer down here in the LoopAnalysisManager
layer.

A better long-term solution IMO is to require that AnalysisManager's
make their keys durable to this kind of thing. Specifically, when
caching an analysis for one IR unit that is conceptually "owned" by
a higher level IR unit, the AnalysisManager should incorporate this into
its data structures so that we can reliably clear these results without
having to teach each and every pass to do so manually as we do here. But
that is a change for another day as it will be a fairly invasive change
to the AnalysisManager infrastructure. Until then, this fortunately
seems to be quite rare.

llvm-svn: 310333
2017-08-08 02:24:20 +00:00
Eugene Zelenko 59e128266c [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 310328
2017-08-08 00:47:13 +00:00
Kostya Serebryany e863796dca [libFuzzer] simplify code, NFC
llvm-svn: 310326
2017-08-08 00:17:20 +00:00
Kostya Serebryany 22e5f9a16a [libFuzzer] remove stale code
llvm-svn: 310325
2017-08-08 00:14:49 +00:00
Kostya Serebryany 854be98c93 [libFuzzer] simplify the implementation of -print_coverage=1
llvm-svn: 310324
2017-08-08 00:12:09 +00:00
Craig Topper 0ff0d74187 [KnownBits] Fix copy pasto in comment. NFC
llvm-svn: 310320
2017-08-07 22:35:55 +00:00
Simon Pilgrim 8c1167df5c [X86][AVX] Added test for broadcast shuffle from binary sources with undefs (D36393)
llvm-svn: 310317
2017-08-07 22:20:06 +00:00
Matt Arsenault bd57cea6e4 AMDGPU: Implement getMinimumNopSize
llvm-svn: 310310
2017-08-07 22:00:58 +00:00
Reid Kleckner ad7dc6e31f [Object] Initialize LoadConfig member to null
Executables may not contain a load config, and clients should be able to
test for nullability. Previously we'd return uninitialized memory. Now
getLoadConfig32/64 return valid pointers or null.

Fixes PR34108

llvm-svn: 310308
2017-08-07 21:23:38 +00:00
George Karpenkov 00e25c5459 Do not instrument libFuzzer itself when built with -DLLVM_USE_SANITIZE_COVERAGE
Fixes regression from https://reviews.llvm.org/D36295

Differential Revision: https://reviews.llvm.org/D36428

llvm-svn: 310305
2017-08-07 20:56:11 +00:00
Zachary Turner 489a7a09ad [llvm-pdbutil] Don't crash when a section contrib's isect is invalid.
llvm-svn: 310298
2017-08-07 20:24:01 +00:00
Dehao Chen 08f8831e57 Move the SampleProfileLoader right after EarlyFPM.
Summary: SampleProfileLoader pass do need to happen after some early cleanup passes so that inlining can happen correctly inside the SampleProfileLoader pass.

Reviewers: chandlerc, davidxl, tejohnson

Reviewed By: chandlerc, tejohnson

Subscribers: sanjoy, mehdi_amini, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D36333

llvm-svn: 310296
2017-08-07 20:23:20 +00:00
Evgeny Stupachenko c675290680 Reapply fix PR23384 (part 3 of 3) r304824 (was reverted in r305720).
The root cause of reverting was fixed - PR33514.

Summary:
The patch makes instruction count the highest priority for
 LSR solution for X86 (previously registers had highest priority).

Reviewers: qcolombet

Differential Revision: http://reviews.llvm.org/D30562

From: Evgeny Stupachenko <evstupac@gmail.com>
                         <evgeny.v.stupachenko@intel.com>
llvm-svn: 310289
2017-08-07 19:56:34 +00:00
Aaron Ballman 428f0fe910 Removing an unused variable that was missed with the refactoring in r310272; NFC.
llvm-svn: 310285
2017-08-07 19:26:17 +00:00
Connor Abbott 79f3ade51a [AMDGPU] Add pseudo "old" source to all DPP instructions
Summary:
All instructions with the DPP modifier may not write to certain lanes of
the output if bound_ctrl=1 is set or any bits in bank_mask or row_mask
aren't set, so the destination register may be both defined and modified.
The right way to handle this is to add a constraint that the destination
register is the same as one of the inputs. We could tie the destination
to the first source, but that would be too restrictive for some use-cases
where we want the destination to be some other value before the
instruction executes. Instead, add a fake "old" source and tie it to the
destination. Effectively, the "old" source defines what value unwritten
lanes will get. We'll expose this functionality to users with a new
intrinsic later.

Also, we want to use DPP instructions for computing derivatives, which
means we need to set WQM for them. We also need to enable the entire
wavefront when using DPP intrinsics to implement nonuniform subgroup
reductions, since otherwise we'll get incorrect results in some cases.
To accomodate this, add a new operand to all DPP instructions which will
be interpreted by the SI WQM pass. This will be exposed with a new
intrinsic later. We'll also add support for Whole Wavefront Mode later.

I also fixed llvm.amdgcn.mov.dpp to overwrite the source and fixed up
the test. However, I could also keep the old behavior (where lanes that
aren't written are undefined) if people want it.

Reviewers: tstellar, arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D34716

llvm-svn: 310283
2017-08-07 19:10:56 +00:00
Matt Arsenault 36b4b0bed7 AMDGPU: Remove -mcpu=SI
Leftover from before amdgcn/r600 split.

llvm-svn: 310277
2017-08-07 18:30:35 +00:00
Matt Arsenault 9d288e69f5 AMDGPU: Remove redundant opt level check
addOptimizedRegAlloc isn't used for -O0 already.

llvm-svn: 310275
2017-08-07 18:12:48 +00:00
Matt Arsenault 3db456820d AMDGPU: Remove FixControlFlowLiveIntervals pass
This hasn't done anything in a long time. This was
running after the the control flow pseudos were expanded,
so this would never find them. The control flow pseudo
expansion was moved to solve the problem this pass was
supposed to solve in the first place, except handling
it earlier also fixes it for fast regalloc which doesn't
use LiveIntervals.

Noticed by checking LCOV reports.

llvm-svn: 310274
2017-08-07 18:12:47 +00:00
Craig Topper 7091a743b4 [InstCombine] Support (X | C1) & C2 --> (X & C2^(C1&C2)) | (C1&C2) for vector splats
Note the original code I deleted incorrectly listed this as (X | C1) & C2 --> (X & C2^(C1&C2)) | C1 Which is only valid if C1 is a subset of C2. This relied on SimplifyDemandedBits to remove any extra bits from C1 before we got to that code.

My new implementation avoids relying on that behavior so that it can be naively verified with alive.

Differential Revision: https://reviews.llvm.org/D36384

llvm-svn: 310272
2017-08-07 18:10:39 +00:00