Summary:
For now I have only added support for x86_64 Linux, but other systems
can be added incrementally.
This is to be used for setting the default parallelism for ThinLTO
backends (instead of thread::hardware_concurrency which includes
hyperthreading and is too aggressive). I'll send this as a follow-on
patch, and it will fall back to hardware_concurrency when the new
getHostNumPhysicalCores returns -1 (when not supported for a given
host system).
I also added an interface to MemoryBuffer to force reading a file
as a stream - this is required for /proc/cpuinfo which is a special
file that looks like a normal file but appears to have 0 size.
The existing readers of this file in Host.cpp are reading the first
1024 or so bytes from it, because the necessary info is near the top.
But for the new functionality we need to be able to read the entire
file. I can go back and change the other readers to use the new
getFileAsStream as a follow-on patch since it seems much more robust.
Added a unittest.
Reviewers: mehdi_amini
Subscribers: beanz, mgorny, llvm-commits, modocache
Differential Revision: https://reviews.llvm.org/D25564
llvm-svn: 284138
In the MS ABI, the frontend is supposed to MD5 such pathologically long
names. LLVM should still defend itself from long names, though.
Fixes part of PR29098.
llvm-svn: 284136
We don't need to return a MachineInstr* from these stack probe insertion
calls anyway. If we ever need to add it back, we can return an iterator
instead.
Based on a patch by David Kreitzer
This bug is a consequence of
r279314 | dexonsmith | 2016-08-19 13:40:12 -0700 (Fri, 19 Aug 2016) | 110 lines
We hit the "Assertion `!NodePtr->isKnownSentinel()' failed" assertion,
but only when inserting a stack probe call at the end of an MBB, which
isn't necessarily a common situation.
Differential Revision: https://reviews.llvm.org/D25566
llvm-svn: 284130
This patch assigns cost of the scaling used in addressing.
On many ARM cores, a negated register offset takes longer than a
non-negated register offset, in a register-offset addressing mode.
For instance:
LDR R0, [R1, R2 LSL #2]
LDR R0, [R1, -R2 LSL #2]
Above, (1) takes less cycles than (2).
By assigning appropriate scaling factor cost, we enable the LLVM
to make the right trade-offs in the optimization and code-selection phase.
Differential Revision: http://reviews.llvm.org/D24857
Reviewers: jmolloy, rengolin
llvm-svn: 284127
This patch modifies the cost calculation of predicated instructions (div and
rem) to avoid the accumulation of rounding errors due to multiple truncating
integer divisions. The calculation for predicated stores will be addressed in a
follow-on patch since we currently don't scale the cost of predicated stores by
block probability.
Differential Revision: https://reviews.llvm.org/D25333
llvm-svn: 284123
Because everything live is spilled at the end of a
block by fast regalloc, assume this will happen and
avoid the copies of the resource descriptor.
llvm-svn: 284119
These instructions were only defined for microMIPSR6 previously. Add
definitions for MIPSR6, correct definitions for microMIPSR6, flag these
instructions as having unmodelled side effects (they disable/enable
virtual processors) and add missing disassember tests for microMIPSR6.
Reviewers: vkalintiris
Differential Review: https://reviews.llvm.org/D24291
llvm-svn: 284115
The Register Calling Convention (RegCall) was introduced by Intel to optimize parameter transfer on function call.
This calling convention ensures that as many values as possible are passed or returned in registers.
This commit presents the basic additions to LLVM CodeGen in order to support RegCall in X86.
Differential Revision: http://reviews.llvm.org/D25022
llvm-svn: 284108
We don't need to check if AVX is enabled. It's implied by the operation action being set to Custom.
We don't need to check both the input and output type widths. We only need to check the type that's being inserted or extracted. The other type is known to be a legal type and we can assume its a different width.
llvm-svn: 284102
This is with an extra change to avoid calling MemoryLocation::get() on a call instruction.
Differential Revision: https://reviews.llvm.org/D25542
llvm-svn: 284098
This allows RegBankSelect in greedy mode to get rid some of the cross
register bank copies when loads are involved in the chain of
computation.
llvm-svn: 284097
- Use storage class C_STAT for 'PrivateLinkage' The storage class for
PrivateLinkage should equal to the Internal Linkage.
- Set 'PrivateGlobalPrefix' from "L" to ".L" for MM_WinCOFF (includes
x86_64) MM_WinCOFF has empty GlobalPrefix '\0' so PrivateGlobalPrefix
"L" may conflict to the normal symbol name starting with 'L'.
Based on a patch by Han Sangjin! Manually updated test cases.
llvm-svn: 284096
This CL didn't actually address the test case in PR30499, and clang
still crashes.
Also revert dependent change "Memory-SSA cleanup of clobbers interface, NFC"
Reverts r283965 and r283967.
llvm-svn: 284093
Summary: We need a new LLVM intrinsic to implement MS _AddressOfReturnAddress builtin on 64-bit Windows.
Reviewers: majnemer, rnk
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D25293
llvm-svn: 284061
Reappy r284044 after revert in r284051. Krzysztof fixed the error in r284049.
The original summary:
This patch tries to fully unroll loops having break statement like this
for (int i = 0; i < 8; i++) {
if (a[i] == value) {
found = true;
break;
}
}
GCC can fully unroll such loops, but currently LLVM cannot because LLVM only
supports loops having exact constant trip counts.
The upper bound of the trip count can be obtained from calling
ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the
refactoring work in SCEV to prevent duplicating code.
The feature of using the upper bound is enabled under the same circumstance
when runtime unrolling is enabled since both are used to unroll loops without
knowing the exact constant trip count.
llvm-svn: 284053
This patch tries to fully unroll loops having break statement like this
for (int i = 0; i < 8; i++) {
if (a[i] == value) {
found = true;
break;
}
}
GCC can fully unroll such loops, but currently LLVM cannot because LLVM only
supports loops having exact constant trip counts.
The upper bound of the trip count can be obtained from calling
ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the
refactoring work in SCEV to prevent duplicating code.
The feature of using the upper bound is enabled under the same circumstance
when runtime unrolling is enabled since both are used to unroll loops without
knowing the exact constant trip count.
Differential Revision: https://reviews.llvm.org/D24790
llvm-svn: 284044
We need to use the overload of Mangler::getNameWithPrefix that takes a
GlobalValue in order to mangle in the stdcall stack byte count for Windows
targets.
Differential Revision: https://reviews.llvm.org/D25529
llvm-svn: 284040
Branch folder removes implicit defs if they are the only non-branching
instructions in a block, and the branches do not use the defined registers.
The problem is that in some cases these implicit defs are required for
the liveness information to be correct.
Differential Revision: https://reviews.llvm.org/D25478
llvm-svn: 284036
This is the most basic handling of the indirect access
pseudos using GPR indexing mode. This currently only enables
the mode for a single v_mov_b32 and then disables it.
This is much more complicated to use than the movrel instructions,
so a new optimization pass is probably needed to fold the access
into the uses and keep the mode enabled for them.
llvm-svn: 284031
Module inline asm was always being linked/concatenated
when running the IRLinker. This is correct for full LTO but not when
we are importing for ThinLTO, as it can result in multiply defined
symbols when the module asm defines a global symbol.
In order to test with llvm-lto2, I had to work around PR30396,
where a symbol that is defined in module assembly but defined in the
LLVM IR appears twice. Added workaround to llvm-lto2 with a FIXME.
Fixes PR30610.
Reviewers: mehdi_amini
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D25359
llvm-svn: 284030