Commit Graph

81236 Commits

Author SHA1 Message Date
Rafael Espindola 2b05416be8 Add a herper function. NFC.
llvm-svn: 242100
2015-07-14 01:06:16 +00:00
Alex Lorenz 418f3ec17d MIR Serialization: Serialize the variable sized stack objects.
llvm-svn: 242095
2015-07-14 00:26:26 +00:00
Reid Kleckner 486fa3977a Update enforceKnownAlignment after the isWeakForLinker semantic change
Previously we would refrain from attempting to increase the linkage of
available_externally globals because they were considered weak for the
linker. Now they are treated more like a declaration instead of a weak
definition.

This was causing SSE alignment faults in Chromuim, when some code
assumed it could increase the alignment of a dllimported global that it
didn't control.  http://crbug.com/509256

llvm-svn: 242091
2015-07-14 00:11:08 +00:00
Alex Lorenz 2eacca86ef MIR Serialization: Serialize the sub register indices.
This commit serializes the sub register indices from the register machine
operands.

Reviewers: Duncan P. N. Exon Smith
llvm-svn: 242084
2015-07-13 23:24:34 +00:00
Rafael Espindola c60d0d2a15 Fix reading archive members with / in the name.
This is important for thin archives.

llvm-svn: 242082
2015-07-13 23:07:05 +00:00
Bill Schmidt 15deb803b4 [PPC64LE] More improvements to VSX swap optimization
This patch allows VSX swap optimization to succeed more frequently.
Specifically, it is concerned with common code sequences that occur
when copying a scalar floating-point value to a vector register.  This
patch currently handles cases where the floating-point value is
already in a register, but does not yet handle loads (such as via an
LXSDX scalar floating-point VSX load).  That will be dealt with later.

A typical case is when a scalar value comes in as a floating-point
parameter.  The value is copied into a virtual VSFRC register, and
then a sequence of SUBREG_TO_REG and/or COPY operations will convert
it to a full vector register of the class required by the context.  If
this vector register is then used as part of a lane-permuted
computation, the original scalar value will be in the wrong lane.  We
can fix this by adding a swap operation following any widening
SUBREG_TO_REG operation.  Additional COPY operations may be needed
around the swap operation in order to keep register assignment happy,
but these are pro forma operations that will be removed by coalescing.

If a scalar value is otherwise directly referenced in a computation
(such as by one of the many XS* vector-scalar operations), we
currently disable swap optimization.  These operations are
lane-sensitive by definition.  A MentionsPartialVR flag is added for
use in each swap table entry that mentions a scalar floating-point
register without having special handling defined.

A common idiom for PPC64LE is to convert a double-precision scalar to
a vector by performing a splat operation.  This ensures that the value
can be referenced as V[0], as it would be for big endian, whereas just
converting the scalar to a vector with a SUBREG_TO_REG operation
leaves this value only in V[1].  A doubleword splat operation is one
form of an XXPERMDI instruction, which takes one doubleword from a
first operand and another doubleword from a second operand, with a
two-bit selector operand indicating which doublewords are chosen.  In
the general case, an XXPERMDI can be permitted in a lane-swapped
region provided that it is properly transformed to select the
corresponding swapped values.  This transformation is to reverse the
order of the two input operands, and to reverse and complement the
bits of the selector operand (derivation left as an exercise to the
reader ;).

A new test case that exercises the scalar-to-vector and generalized
XXPERMDI transformations is added as CodeGen/PowerPC/swaps-le-5.ll.
The patch also requires a change to CodeGen/PowerPC/swaps-le-3.ll to
use CHECK-DAG instead of CHECK for two independent instructions that
now appear in reverse order.

There are two small unrelated changes that are added with this patch.
First, the XXSLDWI instruction was incorrectly omitted from the list
of lane-sensitive instructions; this is now fixed.  Second, I observed
that the same webs were being rejected over and over again for
different reasons.  Since it's sufficient to reject a web only once, I
added a check for this to speed up the compilation time slightly.

llvm-svn: 242081
2015-07-13 22:58:19 +00:00
Pete Cooper 90d95edbb4 Loop idiom recognizer was replacing too many uses of popcount.
When spotting that a loop can use ctpop, we were incorrectly replacing all uses of a value with a value derived from ctpop.

The bug here was exposed because we were replacing a use prior to the ctpop with the ctpop value and so we have a use before def, i.e., we changed

 %tobool.5 = icmp ne i32 %num, 0
 store i1 %tobool.5, i1* %ptr
 br i1 %tobool.5, label %for.body.lr.ph, label %for.end

to

 store i1 %1, i1* %ptr
 %0 = call i32 @llvm.ctpop.i32(i32 %num)
 %1 = icmp ne i32 %0, 0
 br i1 %1, label %for.body.lr.ph, label %for.end

Even if we inserted the ctpop so that it dominates the store here, that would still be incorrect.  The store doesn’t want the result of ctpop.

The fix is very simple, and involves replacing only the branch condition with the ctpop instead of all uses.

Reviewed by Hal Finkel.

llvm-svn: 242068
2015-07-13 21:25:33 +00:00
Reid Kleckner 9a1a919465 [WinEH] Emit the LSDA even if no lpads remain but outlining occurred
The outlined funclets call intrinsics which reference labels from the
LSDA. This situation can easily arise in small functions with a single
cleanup at -O0, where Clang marks a definition as nounwind, and then
WinEHPrepare "discovers" that the landingpad is dead by accident and
deletes it.

We now need to ask the LLVM IR Function for it's personality directly,
rather than going through MachineModuleInfo.

Fixes PR23892.

llvm-svn: 242063
2015-07-13 20:41:46 +00:00
Benjamin Kramer d8861517bf [Hexagon] Move BitTracker into the llvm namespace and remove redundant qualifications
No functional change intended.

llvm-svn: 242062
2015-07-13 20:38:16 +00:00
Rafael Espindola 6a8e86f26e Add support deterministic output in llvm-ar and make it the default.
llvm-svn: 242061
2015-07-13 20:38:09 +00:00
Matt Arsenault ca95d44110 AMDGPU: Minor cleanups to always inline pass
llvm-svn: 242053
2015-07-13 19:08:36 +00:00
David Majnemer 1305e2c0f5 [MC] Correctly escape .safeseh's symbol
This fixes PR24107.

llvm-svn: 242050
2015-07-13 18:51:15 +00:00
Mark Heffernan 4c8ca53f7e Enable partial and runtime loop unrolling for NVPTX.
Enable partial and runtime loop unrolling for NVPTX backend via
TTI::UnrollingPreferences with a small threshold. This partially unrolls
small loops which are often unrolled by the PTX to SASS compiler
and unrolling earlier can be beneficial.

llvm-svn: 242049
2015-07-13 18:33:21 +00:00
Mark Heffernan d7ebc24112 Enable runtime unrolling with unroll pragma metadata
Enable runtime unrolling for loops with unroll count metadata ("#pragma unroll N")
and a runtime trip count. Also, do not unroll loops with unroll full metadata if the
loop has a runtime loop count. Previously, such loops would be unrolled with a
very large threshold (pragma-unroll-threshold) if runtime unrolled happened to be
enabled resulting in a very large (and likely unwise) unroll factor.

llvm-svn: 242047
2015-07-13 18:26:27 +00:00
Adrian Prantl 857237ee70 Service the doxygen comments in DwarfUnit and DwarfDebug.
llvm-svn: 242046
2015-07-13 18:25:29 +00:00
Alex Lorenz de491f0515 MIR Serialization: Serialize the fixed stack objects.
This commit serializes the fixed stack objects, including fixed spill slots.
The fixed stack objects are serialized using a YAML sequence of YAML inline
mappings. Each mapping has the object's ID, type, size, offset, and alignment.
The objects that aren't spill slots also serialize the isImmutable and isAliased
flags.

The fixed stack objects are a part of the machine function's YAML mapping.

Reviewers: Duncan P. N. Exon Smith
llvm-svn: 242045
2015-07-13 18:07:26 +00:00
Reid Kleckner 5f4dd92209 [WinEH] Strip the \01 character from the __CxxFrameHandler3 thunk name
Add another C++ 32-bit EH table test.

llvm-svn: 242044
2015-07-13 17:55:14 +00:00
Benjamin Kramer a667d1adb7 Remove macro guards for extern template instantiations.
This is a C++11 feature that both GCC and MSVC have supported as ane extension
long before C++11 was approved.

llvm-svn: 242042
2015-07-13 17:21:31 +00:00
Benjamin Kramer e448b5be05 Avoid using Loop::getSubLoopsVector.
Passes should never modify it, just use the const version. While there
reduce copying in LoopInterchange. No functional change intended.

llvm-svn: 242041
2015-07-13 17:21:14 +00:00
James Y Knight 46f91c8457 Fix handling of the 'n' asm constraint with invalid operands.
It had accidently accepted a symbol+offset value (and emitted
incorrect code for it, keeping only the offset part) instead of
properly reporting the constraint as invalid.

Differential Revision: http://reviews.llvm.org/D11039

llvm-svn: 242040
2015-07-13 16:36:22 +00:00
Tom Stellard db5a11f698 AMDGPU/SI: Select mad patterns to v_mac_f32
The two-address instruction pass will convert these back to v_mad_f32
if necessary.

Differential Revision: http://reviews.llvm.org/D11060

llvm-svn: 242038
2015-07-13 15:47:57 +00:00
Logan Chien 0a43abc9f8 ARM: Fix cttz expansion on vector types.
The 64/128-bit vector types are legal if NEON instructions are
available.  However, there was no matching patterns for @llvm.cttz.*()
intrinsics and result in fatal error.

This commit fixes the problem by lowering cttz to:
a. ctpop((x & -x) - 1)
b. width - ctlz(x & -x) - 1

llvm-svn: 242037
2015-07-13 15:37:30 +00:00
Scott Douglass 69bf1ce03a [ARM] Handle commutativity when converting to tADDhirr in Thumb2
Also, run thumb_rewrite.s tests in Thumb2 now that they pass.

Differential Revision: http://reviews.llvm.org/D11132

llvm-svn: 242036
2015-07-13 15:31:48 +00:00
Scott Douglass d9d8d26458 [ARM] Add Thumb2 ADD with SP narrowing from 3 operand to 2
Differential Revision: http://reviews.llvm.org/D11131

llvm-svn: 242035
2015-07-13 15:31:40 +00:00
Scott Douglass 039f768c42 [ARM] Small refactor of tryConvertingToTwoOperandForm (nfc)
Also, add more Thumb2 ADD tests requested during review of
http://reviews.llvm.org/D11053.

Differential Revision: http://reviews.llvm.org/D11130

llvm-svn: 242034
2015-07-13 15:31:33 +00:00
Silviu Baranga a647c30f88 Cleanup after r241809 - remove uncessary call to std::sort
Summary:
The iteration order within a member of DepCands is deterministic
and therefore we don't have to sort the accesses within a member.
We also don't have to copy the indices of the pointers into a
vector, since we can iterate over the members of the class.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11145

llvm-svn: 242033
2015-07-13 14:48:24 +00:00
Rafael Espindola c1d63f7499 Remove unused variable.
Sorry I missed it in the previous commit.

llvm-svn: 242032
2015-07-13 14:43:33 +00:00
Rafael Espindola 5895d6845e Aliases don't have available_externally linkage.
Allowing that is probably a good idea, but currently we don't, so
this is dead code.

llvm-svn: 242031
2015-07-13 14:39:02 +00:00
Rafael Espindola 237c3a6def Don't change the visibility when converting a definition to a declaration.
llvm-svn: 242030
2015-07-13 14:18:22 +00:00
Aaron Ballman 6d8f785073 Removing several -Wunused-but-set-variable warnings; NFC intended.
llvm-svn: 242028
2015-07-13 14:04:30 +00:00
Rafael Espindola 7068cbbc1a Print the visibility of available_externally functions.
We were already printing it for declarations, but not available_externally.

llvm-svn: 242027
2015-07-13 13:55:18 +00:00
Manuel Klimek 779cf85a4f Revert r241981 "Revert "Revert r236894 "[BasicAA] Fix zext & sext handling"""
The repros from PR23626 still fail.

llvm-svn: 242025
2015-07-13 13:50:55 +00:00
Elena Demikhovsky 0f370936a0 AVX-512: Added all AVX-512 forms of Vector Convert for Float/Double/Int/Long types.
In this patch I have only encoding. Intrinsics and DAG lowering will be in the next patch.
I temporary removed the old intrinsics test (just to split this patch).
Half types are not covered here.

Differential Revision: http://reviews.llvm.org/D11134

llvm-svn: 242023
2015-07-13 13:26:20 +00:00
Jingyue Wu 9a92d4fb04 [LSR] don't attempt to promote ephemeral values to indvars
Summary:
This at least saves compile time. I also encountered a case where
ephemeral values affect whether other variables are promoted, causing
performance issues. It may be a bug in LSR, but I didn't manage to
reduce it yet. Anyhow, I believe it's in general not worth considering
ephemeral values in LSR.

Reviewers: atrick, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11115

llvm-svn: 242011
2015-07-13 03:28:53 +00:00
David Majnemer 599ca4426c [InstSimplify] Teach InstSimplify how to simplify extractelement
llvm-svn: 242008
2015-07-13 01:15:53 +00:00
David Majnemer 25a796e148 [InstSimplify] Teach InstSimplify how to simplify extractvalue
llvm-svn: 242007
2015-07-13 01:15:46 +00:00
Renato Golin 1ef7a0f7c0 [ARM] Add support for nest attribute using r12
Register r12 ('ip') is used by GCC for this purpose
and hence is used here. As discussed on the GCC mailing
list, the register choice is an ABI issue and so
choosing the same register as GCC means
__builtin_call_with_static_chain is compatible.

A similar patch has just gone in the AArch64 backend,
so this is just the ARM counterpart, following the same
discussion.

Patch by Stephen Cross.

llvm-svn: 241996
2015-07-12 18:16:40 +00:00
Simon Pilgrim 4f500525ef [X86][SSE] (V)PMINSB is commutable.
(V)PMINSB is no different to the other (V)PMIN/(V)PMAX B/D/W instructions - it is fully commutable.

llvm-svn: 241994
2015-07-12 16:44:11 +00:00
Simon Pilgrim ae5cd2773d Trim trailing whitespaces. NFC.
llvm-svn: 241990
2015-07-12 11:17:33 +00:00
Simon Pilgrim 64cc4ad0a2 [X86][SSE] Vectorized v4i32 non-uniform shifts.
While the v4i32 shl operation is already vectorized using a cvttps2dq/pmulld pattern, the lshr/ashr opeations are still scalarized.

This patch adds vectorization support for non-uniform v4i32 shift operations - it splats constant shift amounts to allow them to use the immediate sse shift instructions, or extracts/zero-extends non-constant shift amounts. The individual results are then blended together.

Differential Revision: http://reviews.llvm.org/D11063

llvm-svn: 241989
2015-07-12 11:15:19 +00:00
David Majnemer 6bc83e0f43 [LICM] Don't try to sink values out of loops without any exits
There is no suitable basic block to sink instructions in loops without
exits.  The only way an instruction in a loop without exits can be used
is as an incoming value to a PHI.  In such cases, the incoming block for
the corresponding value is unreachable.

This fixes PR24013.

Differential Revision: http://reviews.llvm.org/D10903

llvm-svn: 241987
2015-07-12 03:53:05 +00:00
Hal Finkel cbf08925ef [PowerPC] Make use of the TargetRecip system
r238842 added the TargetRecip system for controlling use of reciprocal
estimates for sqrt and division using a set of parameters that can be set by
the frontend. Clang now supports a sophisticated -mrecip option, and this will
allow that option to effectively control the relevant code-generation
functionality of the PPC backend.

llvm-svn: 241985
2015-07-12 02:33:57 +00:00
Hal Finkel 965cea5670 [PowerPC] Support the nest parameter attribute
This adds support for the 'nest' attribute, which allows the static chain
register to be set for functions calls under non-Darwin PPC/PPC64 targets. r11
is the chain register (which the PPC64 ELF ABI calls the "environment
pointer"). For indirect calls under PPC64 ELFv1, this would normally be loaded
from the function descriptor, but providing an explicit 'nest' parameter will
override that process and use the value provided.

This allows __builtin_call_with_static_chain to work as expected on PowerPC.

llvm-svn: 241984
2015-07-12 00:37:44 +00:00
Hal Finkel ef28aad9f4 Revert "Revert r236894 "[BasicAA] Fix zext & sext handling""
r236894 caused PR23626 (Clang miscompiles webkit's base64 decoder), and was
reverted in r237984. This reapplies the patch with an additional test case for
PR23626 and the associated fix (both scales and offsets in the
BasicAliasAnalysis::constantOffsetHeuristic should initially be zero).

Patch by Nick White, thanks!

llvm-svn: 241981
2015-07-11 11:04:54 +00:00
Hal Finkel 9cf58c4095 Move getStrideFromPointer and friends from LoopVectorize to VectorUtils
The following functions are moved from the LoopVectorizer to VectorUtils:

  - getGEPInductionOperand
  - stripGetElementPtr
  - getUniqueCastUse
  - getStrideFromPointer

These used to be static functions in LoopVectorize, but will also be used by
the upcoming loop versioning LICM transformation.

Patch by Ashutosh Nema!

llvm-svn: 241980
2015-07-11 10:52:42 +00:00
Igor Laevsky 39d662f7ba Add argmemonly attribute.
This change adds new attribute called "argmemonly". Function marked with this attribute can only access memory through it's argument pointers. This attribute directly corresponds to the "OnlyAccessesArgumentPointees" ModRef behaviour in alias analysis.

Differential Revision: http://reviews.llvm.org/D10398

llvm-svn: 241979
2015-07-11 10:30:36 +00:00
Chandler Carruth 00ebdbcc47 [PM/AA] Completely remove the AliasAnalysis::copyValue interface.
No in-tree alias analysis used this facility, and it was not called in
any particularly rigorous way, so it seems unlikely to be correct.

Note that one of the only stateful AA implementations in-tree,
GlobalsModRef is completely broken currently (and any AA passes like it
are equally broken) because Module AA passes are not effectively
invalidated when a function pass that fails to update the AA stack runs.

Ultimately, it doesn't seem like we know how we want to build stateful
AA, and until then trying to support and maintain correctness for an
untested API is essentially impossible. To that end, I'm planning to rip
out all of the update API. It can return if and when we need it and know
how to build it on top of the new pass manager and as part of *tested*
stateful AA implementations in the tree.

Differential Revision: http://reviews.llvm.org/D10889

llvm-svn: 241975
2015-07-11 04:39:00 +00:00
Tyler Nowicki 3960d85262 Renamed some uses of unroll to interleave in the vectorizer.
llvm-svn: 241971
2015-07-11 00:31:11 +00:00
Adrian Prantl 12d528493e Cleanup a couple of comments in DIBuilder.cpp
llvm-svn: 241966
2015-07-10 23:26:02 +00:00
Duncan P. N. Exon Smith e463e470f8 MC: Only allow changing feature bits in MCSubtargetInfo
Disallow all mutation of `MCSubtargetInfo` expect the feature bits.

Besides deleting the assignment operators -- which were dead "code" --
this restricts `InitMCProcessorInfo()` to subclass initialization
sequences, and exposes a new more limited function called
`setDefaultFeatures()` for use by the ARMAsmParser `.cpu` directive.

There's a small functional change here: ARMAsmParser used to adjust
`MCSubtargetInfo::CPUSchedModel` as a side effect of calling
`InitMCProcessorInfo()`, but I've removed that suspicious behaviour.
Since the AsmParser shouldn't be doing any scheduling, there shouldn't
be any observable change...

llvm-svn: 241961
2015-07-10 22:52:15 +00:00