Commit Graph

95831 Commits

Author SHA1 Message Date
Quentin Colombet b3f5a8c644 [AArch64][RegisterBankInfo] Switch to fully static opds mapping for G_BITCAST.
NFC.

llvm-svn: 284146
2016-10-13 18:46:38 +00:00
Teresa Johnson 7943fecee8 Add interface to compute number of physical cores on host system
Summary:
For now I have only added support for x86_64 Linux, but other systems
can be added incrementally.

This is to be used for setting the default parallelism for ThinLTO
backends (instead of thread::hardware_concurrency which includes
hyperthreading and is too aggressive). I'll send this as a follow-on
patch, and it will fall back to hardware_concurrency when the new
getHostNumPhysicalCores returns -1 (when not supported for a given
host system).

I also added an interface to MemoryBuffer to force reading a file
as a stream - this is required for /proc/cpuinfo which is a special
file that looks like a normal file but appears to have 0 size.
The existing readers of this file in Host.cpp are reading the first
1024 or so bytes from it, because the necessary info is near the top.
But for the new functionality we need to be able to read the entire
file. I can go back and change the other readers to use the new
getFileAsStream as a follow-on patch since it seems much more robust.

Added a unittest.

Reviewers: mehdi_amini

Subscribers: beanz, mgorny, llvm-commits, modocache

Differential Revision: https://reviews.llvm.org/D25564

llvm-svn: 284138
2016-10-13 17:43:20 +00:00
Reid Kleckner edfc9dcf42 Truncate long names in type records
In the MS ABI, the frontend is supposed to MD5 such pathologically long
names. LLVM should still defend itself from long names, though.

Fixes part of PR29098.

llvm-svn: 284136
2016-10-13 17:33:22 +00:00
Igor Breger 8409c356ad [X86][AVX512] Fix sext v32i1 -> v32i8 lowering.
Fix PR30600.

Differential Revision: https://reviews.llvm.org/D25554

llvm-svn: 284134
2016-10-13 17:20:38 +00:00
Kostya Serebryany 17d176e16b [libFuzzer] reapply r283946: refactoring to speed things up, NFC. Now with a fix for gcc build
llvm-svn: 284132
2016-10-13 16:19:09 +00:00
Reid Kleckner 468e793fea Fix for PR30687. Avoid dereferencing MBB.end().
We don't need to return a MachineInstr* from these stack probe insertion
calls anyway. If we ever need to add it back, we can return an iterator
instead.

Based on a patch by David Kreitzer

This bug is a consequence of

r279314 | dexonsmith | 2016-08-19 13:40:12 -0700 (Fri, 19 Aug 2016) | 110 lines

We hit the "Assertion `!NodePtr->isKnownSentinel()' failed" assertion,
but only when inserting a stack probe call at the end of an MBB, which
isn't necessarily a common situation.

Differential Revision: https://reviews.llvm.org/D25566

llvm-svn: 284130
2016-10-13 15:48:48 +00:00
Eric Liu af5a28fe87 Do not delete leading ../ in remove_dots.
Reviewers: bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25561

llvm-svn: 284129
2016-10-13 15:07:14 +00:00
Javed Absar 85874a9360 [ARM]: Assign cost of scaling used in addressing mode for ARM cores
This patch assigns cost of the scaling used in addressing.
On many ARM cores, a negated register offset takes longer than a
non-negated register offset, in a register-offset addressing mode.

For instance:

LDR R0, [R1, R2 LSL #2]
LDR R0, [R1, -R2 LSL #2]

Above, (1) takes less cycles than (2).

By assigning appropriate scaling factor cost, we enable the LLVM
to make the right trade-offs in the optimization and code-selection phase.

Differential Revision: http://reviews.llvm.org/D24857

Reviewers: jmolloy, rengolin
llvm-svn: 284127
2016-10-13 14:57:43 +00:00
Matthew Simpson 1d4b163fc0 [LV] Account for predicated stores in instruction costs
This patch ensures that we scale the estimated cost of predicated stores by
block probability. This is a follow-on patch for r284123.

llvm-svn: 284126
2016-10-13 14:54:31 +00:00
Matthew Simpson 6cdb5a6f96 [LV] Avoid rounding errors for predicated instruction costs
This patch modifies the cost calculation of predicated instructions (div and
rem) to avoid the accumulation of rounding errors due to multiple truncating
integer divisions. The calculation for predicated stores will be addressed in a
follow-on patch since we currently don't scale the cost of predicated stores by
block probability.

Differential Revision: https://reviews.llvm.org/D25333

llvm-svn: 284123
2016-10-13 14:19:48 +00:00
Simon Pilgrim cb59b5257c [DAGCombiner] Add vector support to (mul (shl X, Y), Z) -> (shl (mul X, Z), Y) style combines
llvm-svn: 284122
2016-10-13 14:04:35 +00:00
Matt Arsenault 253640e18d AMDGPU: Assume spilling will occur at -O0
Because everything live is spilled at the end of a
block by fast regalloc, assume this will happen and
avoid the copies of the resource descriptor.

llvm-svn: 284119
2016-10-13 13:10:00 +00:00
Simon Pilgrim fa8fadc0e5 [DAGCombiner] Add vector support to C2-(A+C1) -> (C2-C1)-A folding
llvm-svn: 284117
2016-10-13 12:49:31 +00:00
Matt Arsenault dac31db12f AMDGPU: Fix truncate to bool warnings
llvm-svn: 284116
2016-10-13 12:45:16 +00:00
Simon Dardis 515e8699f4 [mips] Add IAS support for dvp, evp
These instructions were only defined for microMIPSR6 previously. Add
definitions for MIPSR6, correct definitions for microMIPSR6, flag these
instructions as having unmodelled side effects (they disable/enable
virtual processors) and add missing disassember tests for microMIPSR6.

Reviewers: vkalintiris

Differential Review: https://reviews.llvm.org/D24291

llvm-svn: 284115
2016-10-13 12:12:56 +00:00
Simon Pilgrim 833b8a2071 [DAGCombiner] Add vector support to (sub -1, x) -> (xor x, -1) canonicalization
Improves commutation potential

llvm-svn: 284113
2016-10-13 12:05:20 +00:00
Oren Ben Simhon 92ccbf20ff [X86] Basic additions to support RegCall Calling Convention.
The Register Calling Convention (RegCall) was introduced by Intel to optimize parameter transfer on function call.
This calling convention ensures that as many values as possible are passed or returned in registers.
This commit presents the basic additions to LLVM CodeGen in order to support RegCall in X86.

Differential Revision: http://reviews.llvm.org/D25022

llvm-svn: 284108
2016-10-13 07:53:43 +00:00
Daniel Jasper bee9dea306 Silence unused warning in non-assert builds.
llvm-svn: 284107
2016-10-13 06:39:44 +00:00
Craig Topper ff23af4299 [AVX-512] Teach shuffle lowering to recognize 512-bit zero extends.
llvm-svn: 284105
2016-10-13 05:29:41 +00:00
Craig Topper 8cb2efa58a [X86] Simplify the lowering code for extracting and inserting subvectors.
We don't need to check if AVX is enabled. It's implied by the operation action being set to Custom.
We don't need to check both the input and output type widths. We only need to check the type that's being inserted or extracted. The other type is known to be a legal type and we can assume its a different width.

llvm-svn: 284102
2016-10-13 04:14:47 +00:00
Sebastian Pop 5068d7a338 Memory-SSA: strengthen defClobbersUseOrDef interface
As Danny pointed out, defClobbersUseOrDef should use MemoryLocOrCall to make
sure fences are properly handled.

llvm-svn: 284099
2016-10-13 03:23:33 +00:00
Sebastian Pop 5ba9f24ed7 commit back "GVN-hoist: fix store past load dependence analysis (PR30216, PR30499)"
This is with an extra change to avoid calling MemoryLocation::get() on a call instruction.

Differential Revision: https://reviews.llvm.org/D25542

llvm-svn: 284098
2016-10-13 01:39:10 +00:00
Quentin Colombet 6b87a3109c [AArch64][RegisterBankInfo] Provide alternative mappings for 64-bit load
This allows RegBankSelect in greedy mode to get rid some of the cross
register bank copies when loads are involved in the chain of
computation.

llvm-svn: 284097
2016-10-13 01:01:23 +00:00
Reid Kleckner 741d8a21d3 Correct PrivateLinkage for COFF
- Use storage class C_STAT for 'PrivateLinkage' The storage class for
  PrivateLinkage should equal to the Internal Linkage.

- Set 'PrivateGlobalPrefix' from "L" to ".L" for MM_WinCOFF (includes
  x86_64) MM_WinCOFF has empty GlobalPrefix '\0' so PrivateGlobalPrefix
  "L" may conflict to the normal symbol name starting with 'L'.

Based on a patch by Han Sangjin! Manually updated test cases.

llvm-svn: 284096
2016-10-13 00:55:24 +00:00
Quentin Colombet cd80e97e88 [AArch64][RegisterBankInfo] Provide alternative mappings for G_BITCASTs.
Thanks to this patch, RegBankSelect is able to get rid of some register
bank copies as demonstrated in the test case.

llvm-svn: 284094
2016-10-13 00:34:48 +00:00
Reid Kleckner 8958f6a529 Revert "GVN-hoist: fix store past load dependence analysis (PR30216, PR30499)"
This CL didn't actually address the test case in PR30499, and clang
still crashes.

Also revert dependent change "Memory-SSA cleanup of clobbers interface, NFC"

Reverts r283965 and r283967.

llvm-svn: 284093
2016-10-13 00:18:26 +00:00
Quentin Colombet 45c9c1432f [AArch64][RegisterBankInfo] Describe cross regbank copies statically.
NFC.

llvm-svn: 284091
2016-10-13 00:12:06 +00:00
Quentin Colombet 9e64919b7c [AArch64][RegisterBankInfo] Use static mapping for same bank G_BITCAST.
NFC.

llvm-svn: 284090
2016-10-13 00:12:04 +00:00
Quentin Colombet db643d9091 [AArch64][MachineLegalizer] Mark more G_BITCAST as legal.
Basically any vector types that fits in a 32-bit register is also valid
as far as copies are concerned.

llvm-svn: 284089
2016-10-13 00:12:01 +00:00
Quentin Colombet f760799c40 [AArch64][RegisterBankInfo] Bump the cost of vector loads.
This does not change anything yet, because we do not offer any
alternative mapping.

llvm-svn: 284088
2016-10-13 00:11:59 +00:00
Quentin Colombet f35a8c5bdc [AArch64][RegisterBankInfo] Use a proper cost for cross regbank G_BITCASTs.
This does not change anything yet, because we do not offer any
alternative mapping.

llvm-svn: 284087
2016-10-13 00:11:57 +00:00
Quentin Colombet 27b40356f7 [AArch64][RegisterBankInfo] Provide more realistic copy costs.
llvm-svn: 284086
2016-10-13 00:11:55 +00:00
Krzysztof Parzyszek abc0662f04 Handle lane masks in LivePhysRegs when adding live-ins
Differential Revision: https://reviews.llvm.org/D25533

llvm-svn: 284076
2016-10-12 22:53:41 +00:00
Tim Northover fb8d989818 GlobalISel: support G_TRUNC selection on AArch64.
Ahmed's patch again.

llvm-svn: 284075
2016-10-12 22:49:15 +00:00
Tim Northover 69271c64d5 GlobalISel: support int <-> float conversions on AArch64.
More of Ahmed's work.

llvm-svn: 284074
2016-10-12 22:49:11 +00:00
Tim Northover 7dd378dd08 GlobalISel: select G_FCMP instructions on AArch64.
Another of Ahmed's patches.

llvm-svn: 284073
2016-10-12 22:49:07 +00:00
Tim Northover 6c02ad5e4f GlobalISel: support selection of G_ICMP on AArch64.
Patch from Ahmed Bougaca again.

llvm-svn: 284072
2016-10-12 22:49:04 +00:00
Tim Northover 5e3dbf326c GlobalISel: select G_BRCOND instructions on AArch64.
llvm-svn: 284071
2016-10-12 22:49:01 +00:00
Tim Northover 6aacd27cd7 GlobalISel: mark G_BRCOND on s1 as legal.
It's going to be a TBNZ (at -O0) anyway, so the high bits don't matter.

llvm-svn: 284070
2016-10-12 22:48:36 +00:00
Vedant Kumar 68216d7bd3 [Coverage] Factor out logic to create FunctionRecords (NFC)
llvm-svn: 284063
2016-10-12 22:27:45 +00:00
Albert Gutowski 795d7d6381 Create llvm.addressofreturnaddress intrinsic
Summary: We need a new LLVM intrinsic to implement MS _AddressOfReturnAddress builtin on 64-bit Windows.

Reviewers: majnemer, rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25293

llvm-svn: 284061
2016-10-12 22:13:19 +00:00
Reid Kleckner fb58be862c Update _MSC_VER equality checks for msdiaNNN.dll
Use inequality instead of equality to defend against minor version
increases in _MSC_VER. An _MSC_VER value of 1901 should still use
msdia140.dll, as described in this blog post:
https://blogs.msdn.microsoft.com/vcblog/2016/10/05/visual-c-compiler-version/

llvm-svn: 284058
2016-10-12 21:51:14 +00:00
Haicheng Wu 1ef17e90b2 Reapply "[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop"
Reappy r284044 after revert in r284051. Krzysztof fixed the error in r284049.

The original summary:

This patch tries to fully unroll loops having break statement like this

for (int i = 0; i < 8; i++) {
    if (a[i] == value) {
        found = true;
        break;
    }
}

GCC can fully unroll such loops, but currently LLVM cannot because LLVM only
supports loops having exact constant trip counts.

The upper bound of the trip count can be obtained from calling
ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the
refactoring work in SCEV to prevent duplicating code.

The feature of using the upper bound is enabled under the same circumstance
when runtime unrolling is enabled since both are used to unroll loops without
knowing the exact constant trip count.

llvm-svn: 284053
2016-10-12 21:29:38 +00:00
Krzysztof Parzyszek d62669d791 [MIRParser] Parse lane masks for register live-ins
Differential Revision: https://reviews.llvm.org/D25530

llvm-svn: 284052
2016-10-12 21:06:45 +00:00
Haicheng Wu 45e4ef737d Revert "[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop"
This reverts commit r284044.

llvm-svn: 284051
2016-10-12 21:02:22 +00:00
Haicheng Wu 6cac34fd41 [LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop
This patch tries to fully unroll loops having break statement like this

for (int i = 0; i < 8; i++) {
    if (a[i] == value) {
        found = true;
        break;
    }
}

GCC can fully unroll such loops, but currently LLVM cannot because LLVM only
supports loops having exact constant trip counts.

The upper bound of the trip count can be obtained from calling
ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the
refactoring work in SCEV to prevent duplicating code.

The feature of using the upper bound is enabled under the same circumstance
when runtime unrolling is enabled since both are used to unroll loops without
knowing the exact constant trip count.

Differential Revision: https://reviews.llvm.org/D24790

llvm-svn: 284044
2016-10-12 20:24:32 +00:00
Peter Collingbourne 46aafc16e8 LTO: Use the correct mangler function in LTOCodeGenerator::applyScopeRestrictions().
We need to use the overload of Mangler::getNameWithPrefix that takes a
GlobalValue in order to mangle in the stdcall stack byte count for Windows
targets.

Differential Revision: https://reviews.llvm.org/D25529

llvm-svn: 284040
2016-10-12 20:12:19 +00:00
Krzysztof Parzyszek 8271be9a1d Do not remove implicit defs in BranchFolder
Branch folder removes implicit defs if they are the only non-branching
instructions in a block, and the branches do not use the defined registers.
The problem is that in some cases these implicit defs are required for
the liveness information to be correct.

Differential Revision: https://reviews.llvm.org/D25478

llvm-svn: 284036
2016-10-12 19:50:57 +00:00
Matt Arsenault d486d3f8d1 AMDGPU: Initial implementation of VGPR indexing mode
This is the most basic handling of the indirect access
pseudos using GPR indexing mode. This currently only enables
the mode for a single v_mov_b32 and then disables it.
This is much more complicated to use than the movrel instructions,
so a new optimization pass is probably needed to fold the access
into the uses and keep the mode enabled for them.

llvm-svn: 284031
2016-10-12 18:49:05 +00:00
Teresa Johnson 4b9b379172 [ThinLTO] Don't link module level assembly when importing
Module inline asm was always being linked/concatenated
when running the IRLinker. This is correct for full LTO but not when
we are importing for ThinLTO, as it can result in multiply defined
symbols when the module asm defines a global symbol.

In order to test with llvm-lto2, I had to work around PR30396,
where a symbol that is defined in module assembly but defined in the
LLVM IR appears twice. Added workaround to llvm-lto2 with a FIXME.

Fixes PR30610.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25359

llvm-svn: 284030
2016-10-12 18:39:29 +00:00