Commit Graph

83082 Commits

Author SHA1 Message Date
Sanjay Patel 42574203e5 use "unpredictable" metadata in fast-isel when splitting compares
This patch uses the metadata defined in D12341 to avoid creating an unpredictable branch.

Differential Revision: http://reviews.llvm.org/D12342

llvm-svn: 246692
2015-09-02 19:23:23 +00:00
Sanjay Patel fff7c6dc73 use "unpredictable" metadata in SelectionDAG when splitting compares
This patch uses the metadata defined in D12341 to avoid creating an unpredictable branch.

Differential Revision: http://reviews.llvm.org/D12343

llvm-svn: 246691
2015-09-02 19:17:25 +00:00
Kostya Serebryany a9346c2e65 [libFuzzer] honour -only_ascii=1 when reading the initial corpus. Also, remove ugly #ifdef
llvm-svn: 246689
2015-09-02 19:08:08 +00:00
Sanjay Patel a99ab1f536 add unpredictable metadata type for control flow
This patch defines 'unpredictable' metadata. This metadata can be used to signal to the optimizer
or backend that a branch or switch is unpredictable, and therefore, it's probably better to not
split a compound predicate into multiple branches such as in CodeGenPrepare::splitBranchCondition().

This was discussed in:
https://llvm.org/bugs/show_bug.cgi?id=23827

Dependent patches to alter codegen and expose this in clang to follow.

Differential Revision; http://reviews.llvm.org/D12341

llvm-svn: 246688
2015-09-02 19:06:43 +00:00
Ahmed Bougacha 63fae0e58b [AArch64] More consistently separate asm opc and operands with '\t'.
Somehow missed these in r246686.

llvm-svn: 246687
2015-09-02 18:52:54 +00:00
Ahmed Bougacha cca07716f5 [AArch64] Consistently separate asm opc and operands with '\t'.
Some of the instructions use ' ', which drives OCD-me nuts.
Let's put an end to this.

NFC-ish: hopefully nobody cares about whitespace.
llvm-svn: 246686
2015-09-02 18:38:36 +00:00
Justin Bogner 58e0823ee9 IR: Invert a condition to make it more legible. NFC
Also updates the style to more modern conventions.

llvm-svn: 246681
2015-09-02 17:54:41 +00:00
James Molloy 569cea65f0 [ValueTracking] Look through casts when both operands are casts.
We only looked through casts when one operand was a constant. We can also look through casts when both operands are non-constant, but both are in fact the same cast type. For example:

%1 = icmp ult i8 %a, %b
%2 = zext i8 %a to i32
%3 = zext i8 %b to i32
%4 = select i1 %1, i32 %2, i32 %3

llvm-svn: 246678
2015-09-02 17:25:25 +00:00
Hal Finkel 77c8b7ffd3 [PowerPC] Don't always consider P8Altivec-only masks in LowerVECTOR_SHUFFLE
LowerVECTOR_SHUFFLE needs to decide whether to pass a vector shuffle off to the
TableGen-generated matching code, and it does this by testing the same
predicates used by the TableGen files. Unfortunately, when we added new
P8Altivec-only predicates, we started universally testing them in
LowerVECTOR_SHUFFLE, and if then matched when targeting a system prior to a P8,
we'd end up with a selection failure.

llvm-svn: 246675
2015-09-02 16:52:37 +00:00
Sanjay Patel fbcd189f8a [x86] fix allowsMisalignedMemoryAccesses() for 8-byte and smaller accesses
This is a continuation of the fix from:
http://reviews.llvm.org/D10662

and discussion in:
http://reviews.llvm.org/D12154

Here, we distinguish slow unaligned SSE (128-bit) accesses from slow unaligned
scalar (64-bit and under) accesses. Other lowering (eg, getOptimalMemOpType) 
assumes that unaligned scalar accesses are always ok, so this changes 
allowsMisalignedMemoryAccesses() to match that behavior.

Differential Revision: http://reviews.llvm.org/D12543

llvm-svn: 246658
2015-09-02 15:42:49 +00:00
Asaf Badouh d2c3599c5f [X86][AVX512VLBW] add support in byte shift and SAD
add byte shift left/right
add SAD - compute sum of absolute differences

Differential Revision: http://reviews.llvm.org/D12479

llvm-svn: 246654
2015-09-02 14:21:54 +00:00
Joseph Tremoulet 917c7382c1 [TableGen] Allow TokenTy in intrinsic signatures
Summary:
Add the necessary plumbing so that llvm_token_ty can be used as an
argument/return type in intrinsic definitions and correspondingly require
TokenTy in function types.  TokenTy is an opaque type that has no target
lowering, but can be used in machine-independent intrinsics.  It is
required for the upcoming llvm.eh.padparam intrinsic.

Reviewers: majnemer, rnk

Subscribers: stoklund, llvm-commits

Differential Revision: http://reviews.llvm.org/D12532

llvm-svn: 246651
2015-09-02 13:36:25 +00:00
Igor Breger 1e58e8adf6 AVX512: Implemented encoding and intrinsics for VGETMANTPD/S , VGETMANTSD/S instructions
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11593

llvm-svn: 246642
2015-09-02 11:18:55 +00:00
Igor Breger a6297c701e AVX512: Implemented encoding and intrinsics for vshufps/d.
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11709

llvm-svn: 246640
2015-09-02 10:50:58 +00:00
James Molloy 1e583704f5 [LV] Don't bail to MiddleBlock if a runtime check fails, bail to ScalarPH instead
We were bailing to two places if our runtime checks failed. If the initial overflow check failed, we'd go to ScalarPH. If any other check failed, we'd go to MiddleBlock. This caused us to have to have an extra PHI per induction and reduction as the vector loop's exit block was not dominated by its latch.

There's no need to have this behavior - if we just always go to ScalarPH we can get rid of a bunch of complexity.

llvm-svn: 246637
2015-09-02 10:15:39 +00:00
James Molloy f2523e38d8 [LV] Move some code around slightly to make the intent of the function more clear.
NFC.

llvm-svn: 246636
2015-09-02 10:15:32 +00:00
James Molloy aca2f400ba [LV] Cleanup: Sink an IRBuilder closer to its uses.
NFC.

llvm-svn: 246635
2015-09-02 10:15:27 +00:00
James Molloy cba9230507 [LV] Refactor all runtime check emissions into helper functions.
This reduces the complexity of createEmptyBlock() and will open the door to further refactoring.

The test change is simply because we're now constant folding a trivial test.

llvm-svn: 246634
2015-09-02 10:15:22 +00:00
James Molloy ff623dce39 [LV] Pull creation of trip counts into a helper function.
... and do a tad of tidyup while we're at it. Because StartIdx must now be zero, there's no difference between Count and EndIdx.

llvm-svn: 246633
2015-09-02 10:15:16 +00:00
James Molloy 239ff5d193 [LV] Factor the creation of the loop induction variable out of createEmptyLoop()
It makes things easier to understand if this is in a helper method. This is part of my ongoing spaghetti-removal operation on createEmptyLoop.

llvm-svn: 246632
2015-09-02 10:15:09 +00:00
James Molloy a860a2216a [LV] Never widen an induction variable.
There's no need to widen canonical induction variables. It's just as efficient to create a *new*, wide, induction variable.

Consider, if we widen an indvar, then we'll have to truncate it before its uses anyway (1 trunc). If we create a new indvar instead, we'll have to truncate that instead (1 trunc) [besides which IndVars should go and clean up our mess after us anyway on principle].

This lets us remove a ton of special-casing code.

llvm-svn: 246631
2015-09-02 10:15:05 +00:00
James Molloy c07701b017 [LV] Switch to using canonical induction variables.
Vectorized loops only ever have one induction variable. All induction PHIs from the scalar loop are rewritten to be in terms of this single indvar.

We were trying very hard to pick an indvar that already existed, even if that indvar wasn't canonical (didn't start at zero). But trying so hard is really fruitless - creating a new, canonical, indvar only results in one extra add in the worst case and that add is trivially easy to push through the PHI out of the loop by instcombine.

If we try and be less clever here and instead let instcombine clean up our mess (as we do in many other places in LV), we can remove unneeded complexity.

llvm-svn: 246630
2015-09-02 10:14:54 +00:00
Elena Demikhovsky 9f83c7346f AVX-512: store <4 x i1> and <2 x i1> values in memory
Enabled DAG pattern lowering for SKX with DQI predicate.

Differential Revision: http://reviews.llvm.org/D12550

llvm-svn: 246625
2015-09-02 09:20:58 +00:00
Elena Demikhovsky 1b9d6914d3 Optimization for Gather/Scatter with uniform base
Vector 'getelementptr' with scalar base is an opportunity for gather/scatter intrinsic to generate a better sequence.
While looking for uniform base, we want to use the scalar base pointer of GEP, if exists.

Differential Revision: http://reviews.llvm.org/D11121

llvm-svn: 246622
2015-09-02 08:39:13 +00:00
Yaron Keren 611c7cff53 Move createEliminateAvailableExternallyPass earlier in the pass pipeline
to save running many ModulePasses on available external functions that
are thrown away anyhow.

llvm-svn: 246619
2015-09-02 06:34:11 +00:00
Vedant Kumar b5c2fd7257 [CodeGen] Fix FREM on 32-bit MSVC on x86
Patch by Dylan McKay!

Differential Revision: http://reviews.llvm.org/D12099

llvm-svn: 246615
2015-09-02 01:31:58 +00:00
David Majnemer 088ba020dd [MC] Generate a timestamp for COFF object files
The MS incremental linker seems to inspect the timestamp written into
the object file to determine whether or not it's contents need to be
considered.  Failing to set the timestamp to a date newer than the
executable will result in the object file not participating in
subsequent links.  To ameliorate this, write the current time into the
object file's TimeDateStamp field.

llvm-svn: 246607
2015-09-01 23:46:11 +00:00
David Majnemer 83c862ad52 [MC] Remove MCAssembler's copy of OS
We can just ask the ObjectWriter for it's stream instead of caching
around our own reference to it.  No functionality change is intended.

llvm-svn: 246604
2015-09-01 23:19:38 +00:00
Ahmed Bougacha 699a9dd7c3 [ARM] Don't abort on variable-idx extractelt in ReconstructShuffle.
The code introduced in r244314 assumed that EXTRACT_VECTOR_ELT only
takes constant indices, but it does accept variables.
Bail out for those: we can't use them, as the shuffles we want to
reconstruct do require constant masks.

llvm-svn: 246594
2015-09-01 21:56:00 +00:00
David Majnemer 6ddc636862 [MC] Add support for generating COFF CRCs
COFF sections are accompanied with an auxiliary symbol which includes a
checksum.  This checksum used to be filled with just zero but this seems
to upset LINK.exe when it is processing a /INCREMENTAL link job.
Instead, fill the CheckSum field with the JamCRC of the section
contents.  This matches MSVC's behavior.

This fixes PR19666.

N.B.  A rather simple implementation of JamCRC is given.  It implements
a byte-wise calculation using the method given by Sarwate.  There are
implementations with higher throughput like slice-by-eight and making
use of PCLMULQDQ.  We can switch to one of those techniques if it turns
out to be a significant use of time.

llvm-svn: 246590
2015-09-01 21:23:58 +00:00
Sanjay Patel 30145677a8 rename "slow-unaligned-mem-under-32" to slow-unaligned-mem-16" (NFCI)
This is a follow-on suggested by:
http://reviews.llvm.org/D12154 ( http://reviews.llvm.org/rL245729 )
http://reviews.llvm.org/D10662 ( http://reviews.llvm.org/rL245075 )

This makes the attribute name match most of the existing lowering logic
and regression test expectations.

But the current use of this attribute is inconsistent; see the FIXME
comment for "allowsMisalignedMemoryAccesses()". That change will
result in functional changes and should be coming soon.

llvm-svn: 246585
2015-09-01 20:51:51 +00:00
Hans Wennborg dada1d20ba DeadArgElim: don't eliminate arguments from naked functions
Differential Revision: http://reviews.llvm.org/D12534

llvm-svn: 246564
2015-09-01 18:06:46 +00:00
Artem Belevich 020d4fb17f New bitcode linker flags:
-only-needed -- link in only symbols needed by destination module
-internalize -- internalize linked symbols

Differential Revision: http://reviews.llvm.org/D12459

llvm-svn: 246561
2015-09-01 17:55:55 +00:00
Ahmed Bougacha b0ff6437cb [AArch64] Lower READCYCLECOUNTER using MRS PMCCTNR_EL0.
This matches the ARM behavior. In both cases, the register is part
of the optional Performance Monitors extension, so, add the feature,
and enable it for the A-class processors we support.

Differential Revision: http://reviews.llvm.org/D12425

llvm-svn: 246555
2015-09-01 16:23:45 +00:00
David Majnemer abdb2d2aba [MC] Allow MCObjectWriter's output stream to be swapped out
There are occasions where it is useful to consider the entirety of the
contents of a section.  For example, compressed debug info needs the
entire section available before it can compress it and write it out.
The compressed debug info scenario was previously implemented by
mirroring the implementation of writeSectionData in the ELFObjectWriter.

Instead, allow the output stream to be swapped on demand.  This lets
callers redirect the output stream to a more convenient location before
it hits the object file.

No functionality change is intended.

Differential Revision: http://reviews.llvm.org/D12509

llvm-svn: 246554
2015-09-01 16:19:03 +00:00
Igor Breger f6f1bb6ddc AVX512: Implemented intrinsics for valign.
Differential Revision: http://reviews.llvm.org/D12526

llvm-svn: 246551
2015-09-01 15:27:18 +00:00
Silviu Baranga 755ec0e027 [AArch64] Turn on by default interleaved access vectorization
Summary:
This change turns on by default interleaved access vectorization
for AArch64.

We also clean up some tests which were spedifically enabling this
behaviour.

Reviewers: rengolin

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D12149

llvm-svn: 246542
2015-09-01 11:26:46 +00:00
Silviu Baranga e748c9ef55 [ARM] Turn on by default interleaved access vectorization
Summary:
This change turns on by default interleaved access vectorization on ARM,
as it has shown to be beneficial on ARM.

Reviewers: rengolin

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D12146

llvm-svn: 246541
2015-09-01 11:19:15 +00:00
Silviu Baranga 6d3f05c04b [ARM][AArch64] Turn on by default interleaved access lowering
Summary:
Interleaved access lowering removes a memory operation and a
sequence of vector shuffles and replaces it with a series of
memory operations. This should be always beneficial.

This pass in only enabled on ARM/AArch64.

Reviewers: rengolin

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D12145

llvm-svn: 246540
2015-09-01 11:12:35 +00:00
Yaron Keren 55f5c3d43b Fix typo.
llvm-svn: 246538
2015-09-01 10:13:49 +00:00
Matt Arsenault 51d2d0f668 AMDGPU: Fix adding redundant implicit operands
These are already added during the MachineInstr construction,
so this was adding the implicit registers twice.

llvm-svn: 246525
2015-09-01 02:02:21 +00:00
Cong Hou 511298b919 Distribute the weight on the edge from switch to default statement to edges generated in lowering switch.
Currently, when edge weights are assigned to edges that are created when lowering switch statement, the weight on the edge to default statement (let's call it "default weight" here) is not considered. We need to distribute this weight properly. However, without value profiling, we have no idea how to distribute it. In this patch, I applied the heuristic that this weight is evenly distributed to successors.

For example, given a switch statement with cases 1,2,3,5,10,11,20, and every edge from switch to each successor has weight 10. If there is a binary search tree built to test if n < 10, then its two out-edges will have weight 4x10+10/2 = 45 and 3x10 + 10/2 = 35 respectively (currently they are 40 and 30 without considering the default weight). Each distribution (which is 5 here) will be stored in each SwitchWorkListItem for further distribution.

There are some exceptions:

For a jump table header which doesn't have any edge to default statement, we don't distribute the default weight to it.
For a bit test header which covers a contiguous range and hence has no edges to default statement, we don't distribute the default weight to it.
When the branch checks a single value or a contiguous range with no edge to default statement, we don't distribute the default weight to it.
In other cases, the default weight is evenly distributed to successors.

Differential Revision: http://reviews.llvm.org/D12418

llvm-svn: 246522
2015-09-01 01:42:16 +00:00
Duncan P. N. Exon Smith f4967754a5 LTO: Cleanup parameter names and header docs, NFC
Follow LLVM style for the parameter names (`CamelCase` not `camelCase`),
and surface the header docs in doxygen.  No functionality change
intended.

llvm-svn: 246509
2015-08-31 23:44:06 +00:00
Hal Finkel 1baec5323b [DAGCombine] Fixup SETCC legality checking
SETCC is one of those special node types for which operation actions (legality,
etc.) is keyed off of an operand type, not the node's value type. This makes
sense because the value type of a legal SETCC node is determined by its
operands' value type (via the TLI function getSetCCResultType). When the
SDAGBuilder creates SETCC nodes, it either creates them with an MVT::i1 value
type, or directly with the value type provided by TLI.getSetCCResultType.

The first problem being fixed here is that DAGCombine had several places
querying TLI.isOperationLegal on SETCC, but providing the return of
getSetCCResultType, instead of the operand type directly. This does not mean
what the author thought, and "luckily", most in-tree targets have SETCC with
Custom lowering, instead of marking them Legal, so these checks return false
anyway.

The second problem being fixed here is that two of the DAGCombines could create
SETCC nodes with arbitrary (integer) value types; specifically, those that
would simplify:

  (setcc a, b, op1) and|or (setcc a, b, op2) -> setcc a, b, op3
     (which is possible for some combinations of (op1, op2))

If the operands of the and|or node are actual setcc nodes, then this is not an
issue (because the and|or must share the same type), but, the relevant code in
DAGCombiner::visitANDLike and DAGCombiner::visitORLike actually calls
DAGCombiner::isSetCCEquivalent on each operand, and that function will
recognise setcc-like select_cc nodes with other return types. And, thus, when
creating new SETCC nodes, we need to be careful to respect the value-type
constraint. This is even true before type legalization, because it is quite
possible for the SELECT_CC node to have a legal type that does not happen to
match the corresponding TLI.getSetCCResultType type.

To be explicit, there is nothing that later fixes the value types of SETCC
nodes (if the type is legal, but does not happen to match
TLI.getSetCCResultType). Creating SETCCs with an MVT::i1 value type seems to
work only because, either MVT::i1 is not legal, or it is what
TLI.getSetCCResultType returns if it is legal. Fixing that is a larger change,
however. For the time being, restrict the relevant transformations to produce
only SETCC nodes with a value type matching TLI.getSetCCResultType (or MVT::i1
prior to type legalization).

Fixes PR24636.

llvm-svn: 246507
2015-08-31 23:15:04 +00:00
Sanjay Patel 719b3e6a3e don't set a legal vector type if we know we can't use that type (NFCI)
Added benefit: the 'if' logic now matches the text of the comment that describes it.

llvm-svn: 246506
2015-08-31 22:59:03 +00:00
Quentin Colombet 5989bc6f41 [BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!

This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.

This should fix PR24596.

Original commit message for r221876:
Let's try this again...

This reverts r219432, plus a bug fix.

Description of the bug in r219432 (by Nick):

The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).

Nick also adds this comment:

ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the  == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.

And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).

Original commit message:

This reverts r218944, which reverted r218714, plus a bug fix.

Description of the bug in r218714 (by Nick):

The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.

Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.

Original commit message:

Two related things:

1. Fixes a bug when calculating the offset in GetLinearExpression. The code
   previously used zext to extend the offset, so negative offsets were converted
   to large positive ones.

2. Enhance aliasGEP to deduce that, if the difference between two GEP
   allocations is positive and all the variables that govern the offset are also
   positive (i.e. the offset is strictly after the higher base pointer), then
   locations that fit in the gap between the two base pointers are NoAlias.

Patch by Nick White!

Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).

Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!

llvm-svn: 246502
2015-08-31 22:32:47 +00:00
JF Bastien 73ff6afa87 WebAssembly: generate load/store
Summary: This handles all load/store operations that WebAssembly defines, and handles those necessary for C++ such as i1. I left a FIXME for outstanding features which aren't required for now.

Reviewers: sunfish

Subscribers: jfb, llvm-commits, dschuff
llvm-svn: 246500
2015-08-31 22:24:11 +00:00
Sanjay Patel 218cbd5a48 generalize helper function of MergeConsecutiveStores to handle vector types (NFCI)
This was part of D7208 (r227242), but that commit was reverted because it exposed
a bug in AArch64 lowering. I should have that fixed and the rest of the commit
reinstated soon.

llvm-svn: 246493
2015-08-31 21:50:16 +00:00
Karl Schimpf 4da0e12968 Fix bug in method LLLexer::FP80HexToIntPair
llvm-svn: 246489
2015-08-31 21:36:14 +00:00
Hans Wennborg 043bf5b296 Fix Windows build by including raw_ostream.h
llvm-svn: 246486
2015-08-31 21:19:18 +00:00
Naomi Musgrave 21c1bc46ae Rollback of commit "Repress sanitization on User dtor."
This would have suppressed bug 24578, about use-after-
destroy on User and MDNode. Rolled back suppression for
the sake of code cleanliness, in preferance for bug
tracking to keep track of this issue.

This reverts commit 6ff2baabc4625d5b0a8dccf76aa0f72d930ea6c0.

llvm-svn: 246484
2015-08-31 21:06:08 +00:00
Hal Finkel 2483f2060a [DAGCombine] Use getSetCCResultType utility function
DAGCombine has a utility wrapper around TLI's getSetCCResultType; use it in the
one place in DAGCombine still directly calling the TLI function. NFC.

llvm-svn: 246482
2015-08-31 20:42:38 +00:00
Sanjay Patel d9a5c225d1 [x86] enable machine combiner reassociations for scalar 'or' insts
llvm-svn: 246481
2015-08-31 20:27:03 +00:00
Reid Kleckner e00faf8ce1 [EH] Handle non-Function personalities like unknown personalities
Also delete and simplify a lot of MachineModuleInfo code that used to be
needed to handle personalities on landingpads.  Now that the personality
is on the LLVM Function, we no longer need to track it this way on MMI.
Certainly it should not live on LandingPadInfo.

llvm-svn: 246478
2015-08-31 20:02:16 +00:00
Philip Reames a88caeab6c [FunctionAttr] Infer nonnull attributes on returns
Teach FunctionAttr to infer the nonnull attribute on return values of functions which never return a potentially null value. This is done both via a conservative local analysis for the function itself and a optimistic per-SCC analysis. If no function in the SCC returns anything which could be null (other than values from other functions in the SCC), we can conclude no function returned a null pointer. Even if some function within the SCC returns a null pointer, we may be able to locally conclude that some don't.

Differential Revision: http://reviews.llvm.org/D9688

llvm-svn: 246476
2015-08-31 19:44:38 +00:00
Quentin Colombet a80b9c824e [AArch64][CollectLOH] Remove an invalid assertion and add a test case exposing it.
rdar://problem/22491525

llvm-svn: 246472
2015-08-31 19:02:00 +00:00
Naomi Musgrave 763468baec Undo reversion on commit: Revert "Revert "Repress sanitization on User dtor.
Modify msan macros for applying attribute""

This reverts commit 020e70a79878c96457e6882bcdfaf6628baf32b7.

llvm-svn: 246470
2015-08-31 18:49:31 +00:00
Hal Finkel a894266d28 [DAGCombine] Remove some old dead code for forming SETCC nodes
This code was dead when it was committed in r23665 (Oct 7, 2005), and before it
reaches its 10th anniversary, it really should go. We can always bring it back
if we'd like, but it forms more SETCC nodes, and the way we do legality
checking on SETCC nodes is wrong in a number of places, and removing this means
fewer places to fix. NFC.

llvm-svn: 246466
2015-08-31 18:38:55 +00:00
Philip Reames bb11d62a5a [LazyValueInfo] Look through Phi nodes when trying to prove a predicate
If asked to prove a predicate about a value produced by a PHI node, LazyValueInfo was unable to do so even if the predicate was known to be true for each input to the PHI. This prevented JumpThreading from eliminating a provably redundant branch.

The problematic test case looks something like this:
ListNode *p = ...;
while (p != null) {
  if (!p) return;
  x = g->x; // unrelated
  p = p->next
}

The null check at the top of the loop is redundant since the value of 'p' is null checked on entry to the loop and before executing the backedge. This resulted in us a) executing an extra null check per iteration and b) not being able to LICM unrelated loads after the check since we couldn't prove they would execute or that their dereferenceability wasn't effected by the null check on the first iteration.

Differential Revision: http://reviews.llvm.org/D12383

llvm-svn: 246465
2015-08-31 18:31:48 +00:00
Kit Barton d3cc1678e8 Rework of the new interface for shrink wrapping
Based on comments from Hal
(http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150810/292978.html),
I've changed the interface to add a callback mechanism to the
TargetFrameLowering class to query whether the specific target
supports shrink wrapping.  By default, shrink wrapping is disabled by
default. Each target can override the default behaviour using the
TargetFrameLowering::targetSupportsShrinkWrapping() method. Shrink
wrapping can still be explicitly enabled or disabled from the command
line, using the existing -enable-shrink-wrap=<true|false> option.

Phabricator: http://reviews.llvm.org/D12293
llvm-svn: 246463
2015-08-31 18:26:45 +00:00
Matthias Braun 0acbd08f3c AArch64: Fix loads to lower NEON vector lanes using GPR registers
The ISelLowering code turned insertion turned the element for the
lowest lane of a BUILD_VECTOR into an INSERT_SUBREG, this prohibited
the patterns for SCALAR_TO_VECTOR(Load) to match later. Restrict this
to cases without a load argument.

Reported in rdar://22223823

Differential Revision: http://reviews.llvm.org/D12467

llvm-svn: 246462
2015-08-31 18:25:15 +00:00
Matthias Braun 818c78d0cc X86: Fix FastISel SSESelect register class
X86FastISel has been using the wrong register class for VBLENDVPS which
produces a VR128 and needs an extra copy to the target register. The
problem was already hit by the existing test cases when using
> llvm-lit -Dllc="llc -verify-machineinstr"

llvm-svn: 246461
2015-08-31 18:25:11 +00:00
Filipe Cabecinhas 984fefdd81 [BitcodeReader] Ensure we can read constant vector selects with an i1 condition
Summary:
Constant vectors weren't allowed to have an i1 condition in the
BitcodeReader. Make sure we have the same restrictions that are
documented, not more.

Reviewers: nlewycky, rafael, kschimpf

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12440

llvm-svn: 246459
2015-08-31 18:00:30 +00:00
Vedant Kumar 86dbd92334 [MC/AsmParser] Avoid setting MCSymbol.IsUsed in some cases
Avoid marking some MCSymbols as used in MC/AsmParser.cpp when no uses
exist. This fixes a bug in parseAssignmentExpression() which
inadvertently sets IsUsed, thereby triggering:

    "invalid re-assignment of non-absolute variable"

on otherwise valid code. No other functionality change intended.

The original version of this patch touched many calls to MCSymbol
accessors. On rafael's advice, I have stripped this patch down a bit.

As a follow-up, I intend to find the call sites which intentionally set
IsUsed and force them to do so explicitly.

Differential Revision: http://reviews.llvm.org/D12347

llvm-svn: 246457
2015-08-31 17:44:53 +00:00
Karl Schimpf 36440082f8 Change comment to verify commit accesss.
llvm-svn: 246451
2015-08-31 16:43:55 +00:00
Naomi Musgrave 5f79c6653d Revert "Repress sanitization on User dtor. Modify msan macros for applying attribute"
This reverts commit 5e3bfbb38eb3fb6f568b107f6b239e0aa4c5f334.

llvm-svn: 246450
2015-08-31 16:26:44 +00:00
Naomi Musgrave d8c1a064e5 Repress sanitization on User dtor. Modify msan macros for applying attribute
to repress sanitization. Move attribute for repressing sanitization to
operator delete for User, MDNode.

Summary: In response to bug 24578, reported against failing LLVM test.

Reviewers: chandlerc, rsmith, eugenis

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12335

llvm-svn: 246449
2015-08-31 15:57:40 +00:00
Benjamin Kramer efeddcc552 [SectionMemoryManager] Use range-based for loops. No functional change intended.
llvm-svn: 246440
2015-08-31 13:39:14 +00:00
Igor Breger 5ea0a68115 AVX512: ktest implemantation
Added tests for encoding.

Differential Revision: http://reviews.llvm.org/D11979

llvm-svn: 246439
2015-08-31 13:30:19 +00:00
Igor Breger f3ded811b2 AVX512: Implemented encoding and intrinsics for vdbpsadbw
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D12491

llvm-svn: 246436
2015-08-31 13:09:30 +00:00
Igor Breger 59ac339357 AVX512: kadd implementation
Added tests for encoding.

Differential Revision: http://reviews.llvm.org/D11973

llvm-svn: 246432
2015-08-31 11:50:23 +00:00
Igor Breger 2ae0fe3ac3 AVX512: Implemented encoding and intrinsics for vpalignr
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D12270

llvm-svn: 246428
2015-08-31 11:14:02 +00:00
Hal Finkel e0a28e54c7 [AggressiveAntiDepBreaker] Check for EarlyClobber on defining instruction
AggressiveAntiDepBreaker was doing some EarlyClobber checking, but was not
checking that the register being potentially renamed was defined by an
early-clobber def where there was also a use, in that instruction, of the
register being considered as the target of the rename. Fixes PR24014.

llvm-svn: 246423
2015-08-31 07:51:36 +00:00
Jingyue Wu e84f671830 [JumpThreading] make jump threading respect convergent annotation.
Summary:
JumpThreading shouldn't duplicate a convergent call, because that would move a convergent call into a control-inequivalent location. For example,
  if (cond) {
    ...
  } else {
    ...
  }
  convergent_call();
  if (cond) {
    ...
  } else {
    ...
  }
should not be optimized to
  if (cond) {
    ...
    convergent_call();
    ...
  } else {
    ...
    convergent_call();
    ...
  }

Test Plan: test/Transforms/JumpThreading/basic.ll

Patch by Xuetian Weng. 

Reviewers: resistor, arsenm, jingyue

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12484

llvm-svn: 246415
2015-08-31 06:10:27 +00:00
Peter Collingbourne 592ee15e14 Support: Support LLVM_ENABLE_THREADS=0 in llvm/Support/thread.h.
Specifically, the header now provides llvm::thread, which is either a
typedef of std::thread or a replacement that calls the function synchronously
depending on the value of LLVM_ENABLE_THREADS.

llvm-svn: 246402
2015-08-31 00:09:01 +00:00
Hal Finkel a2cdbce661 [PowerPC] Fixup SELECT_CC (and SETCC) patterns with i1 comparison operands
There were really two problems here. The first was that we had the truth tables
for signed i1 comparisons backward. I imagine these are not very common, but if
you have:
  setcc i1 x, y, LT
this has the '0 1' and the '1 0' results flipped compared to:
  setcc i1 x, y, ULT
because, in the signed case, '1 0' is really '-1 0', and the answer is not the
same as in the unsigned case.

The second problem was that we did not have patterns (at all) for the unsigned
comparisons select_cc nodes for i1 comparison operands. This was the specific
cause of PR24552. These had to be added (and a missing Altivec promotion added
as well) to make sure these function for all types. I've added a bunch more
test cases for these patterns, and there are a few FIXMEs in the test case
regarding code-quality.

Fixes PR24552.

llvm-svn: 246400
2015-08-30 22:12:50 +00:00
Elena Demikhovsky 63a7ca9948 NFC: Code style in VectorUtils.cpp
Differential Revision:	http://reviews.llvm.org/D12478

llvm-svn: 246381
2015-08-30 13:48:02 +00:00
Renato Golin 3b1d3b0d84 Revert "Revert "New interface function is added to VectorUtils Value *getSplatValue(Value *Val);""
This reverts commit r246379. It seems that the commit was not the culprit,
and the bot will be investigated for instability.

llvm-svn: 246380
2015-08-30 10:49:04 +00:00
Renato Golin c7be31736c Revert "New interface function is added to VectorUtils Value *getSplatValue(Value *Val);"
This reverts commit r246371, as it cause a rather obscure bug in AArch64
test-suite paq8p (time outs, seg-faults). I'll investigate it before
reapplying.

llvm-svn: 246379
2015-08-30 10:05:30 +00:00
Chandler Carruth 5543fbc9b2 Stop calling the flat out insane ARM target parsing code unless the
architecture string is something quite weird. Similarly delay calling
the BPF parsing code, although that is more reasonable.

To understand why I was motivated to make this change, it cuts the time
for running the ADT TripleTest unittests by a factor of two in
non-optimized builds (the developer default) and reduces my 'check-llvm'
time by a full 15 seconds. The implementation of parseARMArch is *that*
slow. I tried to fix it in the prior series of commits, but frankly,
I have no idea how to finish fixing it. The entire premise of the
function (to allow 'v7a-unknown-linux' or some such to parse as an
'arm-unknown-linux' triple) seems completely insane to me, but I'll let
the ARM folks sort that out. At least it is now out of the critical path
of every developer working on LLVM. It also will likely make some other
folks' code significantly faster as I've heard reports of 2% of time
spent in triple parsing even in optimized builds!

I'm not done making this code faster, but I am done trying to improve
the ARM target parsing code.

llvm-svn: 246378
2015-08-30 09:54:34 +00:00
Chandler Carruth 822d54a22c Remove a linear walk to find the default FPU for a given CPU by directly
expanding the .def file within a StringSwitch.

llvm-svn: 246377
2015-08-30 09:01:38 +00:00
Hal Finkel 982e8d48f8 [MIR Serialization] static -> static const in getSerializable*MachineOperandTargetFlags
Make the arrays 'static const' instead of just 'static'. Post-commit review
comment from Roman Divacky on IRC. NFC.

llvm-svn: 246376
2015-08-30 08:07:29 +00:00
Chandler Carruth 3309ef6f02 Teach the target parsing framework to directly compute the length of all
of its strings when expanding the string literals from the macros, and
push all of the APIs to be StringRef instead of C-string APIs.

This (remarkably) removes a very non-trivial number of strlen calls. It
even deletes code and complexity from one of the primary users -- Clang.

llvm-svn: 246374
2015-08-30 07:51:04 +00:00
Hal Finkel 2d55698ed7 [PowerPC/MIR Serialization] Target flags serialization support
Add support for MIR serialization of PowerPC-specific operand target flags
(based on the generic infrastructure added in r244185 and r245383).

I won't even pretend that this is good test coverage, but this includes the
regression test associated with r246372. Adding an MIR test for that fix is far
superior to adding an IR-level test because particular instruction-scheduling
decisions are necessary in order to expose the bug, and using an MIR test we
can start the pipeline post-scheduling.

llvm-svn: 246373
2015-08-30 07:50:35 +00:00
Hal Finkel d2fd9becf4 [PowerPC] Don't assume ADDISdtprelHA's source is r3
Even through ADDISdtprelHA generally has r3 as its source register, it is
possible for the instruction scheduler to move things around such that some
other register is the source. We need to print the actual source register, not
always r3. Fixes PR24394.

The test case will come in a follow-up commit because it depends on MIR
target-flags parsing.

llvm-svn: 246372
2015-08-30 07:44:05 +00:00
Elena Demikhovsky a59fcfa56b New interface function is added to VectorUtils
Value *getSplatValue(Value *Val);

It complements the CreateVectorSplat(), which creates 2 instructions - insertelement and shuffle with all-zero mask.

The new function recognizes the pattern - insertelement+shuffle and returns the splat value (or nullptr).
It also returns a splat value form ConstantDataVector, for completeness.

Differential Revision:	http://reviews.llvm.org/D11124

llvm-svn: 246371
2015-08-30 07:28:18 +00:00
Chandler Carruth 799e880e95 Refactor the ARM target parsing to use a def file with macros to expand
the necessary tables.

This will allow me to restructure the code and structures using this to
be significantly more efficient. It also removes the duplication of the
list of several enumerators. It also enshrines that the order of
enumerators match the order of the entries in the tables, something the
implementation code actually uses.

No functionality changed (yet).

llvm-svn: 246370
2015-08-30 05:27:31 +00:00
Chandler Carruth 4fc3a9862c [Triple] Use clang-format to normalize the formatting of the ARM target
parsing logic prior to making substantial changes to it.

This parsing logic is incredibly wasteful, so I'm planning to rewrite
it. Just unittesting the triple parsing logic spends well over 80% of
its time in the ARM parsing logic, and others have measured significant
time spent here in real production compiles.

Stay tuned...

llvm-svn: 246369
2015-08-30 02:17:15 +00:00
Chandler Carruth bb47b9a367 [Triple] Stop abusing a class to have only static methods and just use
the namespace that we are already using for the enums that are produced
by the parsing.

llvm-svn: 246367
2015-08-30 02:09:48 +00:00
Fiona Glaser 934765c1df SelectionDAG: add missing ComputeSignBits case for SELECT_CC
Identical to SELECT, just with different operand numbers.

llvm-svn: 246366
2015-08-29 23:04:38 +00:00
Peter Collingbourne 79bf113dca Fix shared library build.
llvm-svn: 246365
2015-08-29 22:34:34 +00:00
James Molloy 45ee9898ec [ARM] Hoist fabs/fneg above a conversion to float.
This is especially visible in softfp mode, for example in the implementation of libm fabs/fneg functions. If we have:

%1 = vmovdrr r0, r1
%2 = fabs %1

then move the fabs before the vmovdrr:

%1 = and r1, #0x7FFFFFFF
%2 = vmovdrr r0, r1

This is never a lose, and could be a serious win because the vmovdrr may be followed by a vmovrrd, which would enable us to remove the conversion into FPRs completely.

We already do this for f32, but not for f64. Tests are added for both.

llvm-svn: 246360
2015-08-29 10:49:11 +00:00
Matt Arsenault e4d0c142e8 AMDGPU: Add sdst operand to VOP2b instructions
The VOP3 encoding of these allows any SGPR pair for the i1
output, but this was forced before to always use vcc.
This doesn't yet try to use this, but does add the operand
to the definitions so the main change is adding vcc to the
output of the VOP2 encoding.

llvm-svn: 246358
2015-08-29 07:16:50 +00:00
Matt Arsenault 9a32cd3d3b AMDGPU: Set mem operands for spill instructions
llvm-svn: 246357
2015-08-29 06:48:57 +00:00
Matt Arsenault 5c004a7c61 AMDGPU: Fix dropping mem operands when moving to VALU
Without a memory operand, mayLoad or mayStore instructions
are treated as hasUnorderedMemRef, which results in much worse
scheduling.

We really should have a verifier check that any
non-side effecting mayLoad or mayStore has a memory operand.
There are a few instructions (interp and images) which I'm
not sure what / where to add these.

llvm-svn: 246356
2015-08-29 06:48:46 +00:00
Tom Stellard eea72ccbf2 AMDGPU/SI: Fix some invaild assumptions when folding 64-bit immediates
Summary:
We were assuming tha if the use operand had a sub-register that
the immediate was 64-bits, but this was breaking the case of
folding a 64-bit immediate into another 64-bit instruction.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D12255

llvm-svn: 246354
2015-08-29 01:58:21 +00:00
Tom Stellard b8ce14c4c3 AMDGPU/SI: Factor operand folding code into its own function
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D12254

llvm-svn: 246353
2015-08-28 23:45:19 +00:00
Duncan P. N. Exon Smith b09eb9f1c2 DI: Set DILexicalBlock columns >= 65536 to 0/unknown
This fixes PR24621 and matches what we do for `DILocation`.  Although
the limit seems somewhat artificial, there are places in the backend
that also assume 16-bit columns, so we may as well just be consistent
about the limits.

llvm-svn: 246349
2015-08-28 22:58:50 +00:00
Vedant Kumar 44fccb7b50 [X86] NFC: Clean up and clang-format a few lines
llvm-svn: 246340
2015-08-28 21:59:00 +00:00
Duncan P. N. Exon Smith b56b5af4c3 DI: Add Function::getSubprogram()
Add `Function::setSubprogram()` and `Function::getSubprogram()`,
convenience methods to forward to `setMetadata()` and `getMetadata()`,
respectively, and deal in `DISubprogram` instead of `MDNode`.

Also add a verifier check to enforce that `!dbg` attachments are always
subprograms.

Originally (when I had the llvm-dev discussion back in April) I thought
I'd store a pointer directly on `llvm::Function` for these attachments
-- we frequently have debug info, and that's much cheaper than using map
in the context if there are no other function-level attachments -- but
for now I'm just using the generic infrastructure.  Let's add the extra
complexity only if this shows up in a profile.

llvm-svn: 246339
2015-08-28 21:55:35 +00:00
Duncan P. N. Exon Smith 0660bcda53 AsmPrinter: Allow null subroutine type
Currently the DWARF backend requires that subprograms have a type, and
the type is ignored if it has an empty type array.  The long term
direction here -- see PR23079 -- is instead to skip the type entirely if
there's no valid type.

It turns out we have cases in tree of missing types on subprograms, but
since they're not referenced by compile units, the backend never crashes
on them.  One option would be to add a Verifier check that subprograms
have types, and fix the bitrot.  However, this is a fair bit of churn
(20-30 testcases) that would be reversed anyway by PR23079.

I found this inconsistency because of a WIP patch and upgrade script for
PR23367 that started crashing on test/DebugInfo/2010-10-01-crash.ll.
This commit updates the testcase to reference the subprogram from the
compile unit, and fixes the resulting crash (in line with the direction
of PR23079).  This also updates `DIBuilder` to stop assuming a non-null
pointer for the subroutine types.

llvm-svn: 246333
2015-08-28 21:38:24 +00:00
David Majnemer 0a92f86fe6 Revert r246232 and r246304.
This reverts isSafeToSpeculativelyExecute's use of ReadNone until we
split ReadNone into two pieces: one attribute which reasons about how
the function reasons about memory and another attribute which determines
how it may be speculated, CSE'd, trap, etc.

llvm-svn: 246331
2015-08-28 21:13:39 +00:00
Duncan P. N. Exon Smith 814b8e91c7 DI: Require subprogram definitions to be distinct
As a follow-up to r246098, require `DISubprogram` definitions
(`isDefinition: true`) to be 'distinct'.  Specifically, add an assembler
check, a verifier check, and bitcode upgrading logic to combat testcase
bitrot after the `DIBuilder` change.

While working on the testcases, I realized that
test/Linker/subprogram-linkonce-weak-odr.ll isn't relevant anymore.  Its
purpose was to check for a corner case in PR22792 where two subprogram
definitions match exactly and share the same metadata node.  The new
verifier check, requiring that subprogram definitions are 'distinct',
precludes that possibility.

I updated almost all the IR with the following script:

    git grep -l -E -e '= !DISubprogram\(.* isDefinition: true' |
    grep -v test/Bitcode |
    xargs sed -i '' -e 's/= \(!DISubprogram(.*, isDefinition: true\)/= distinct \1/'

Likely some variant of would work for out-of-tree testcases.

llvm-svn: 246327
2015-08-28 20:26:49 +00:00
Sanjoy Das 6f5dca70ed [InstCombine] Fix PR24605.
PR24605 is caused due to an incorrect insert point in instcombine's IR
builder.  When simplifying

  %t = add X Y
  ...
  %m = icmp ... %t

the replacement for %t should be placed before %t, not before %m, as
there could be a use of %t between %t and %m.

llvm-svn: 246315
2015-08-28 19:09:31 +00:00
Chad Rosier dc65532fd9 Optimize memcmp(x,y,n)==0 for small n and suitably aligned x/y.
http://reviews.llvm.org/D6952
PR20673

llvm-svn: 246313
2015-08-28 18:30:18 +00:00
Petar Jovanovic 207a191a98 [mips64][mcjit] Add N64R6 relocations tests and fix N64R2 tests
This patch adds a test for MIPS64R6 relocations, it corrects check
expressions for R_MIPS_26 and R_MIPS_PC16 relocations in MIPS64R2 test, and
it adds run for big endian in MIPS64R2 test.

Patch by Vladimir Radosavljevic.

Differential Revision: http://reviews.llvm.org/D11217

llvm-svn: 246311
2015-08-28 18:02:53 +00:00
Petar Jovanovic 28e2b717fc [mips] Remove incorrect DebugLoc entries from prologue
This has been causing the prologue_end to be incorrectly positioned.

Patch by Vladimir Radosavljevic.

Differential Revision: http://reviews.llvm.org/D11293

llvm-svn: 246309
2015-08-28 17:53:26 +00:00
Matt Arsenault d9c830154f Make MergeConsecutiveStores look at other stores on same chain
When combiner AA is enabled, look at stores on the same chain.
Non-aliasing stores are moved to the same chain so the existing
code fails because it expects to find an adajcent store on a consecutive
chain.

Because of how DAGCombiner tries these store combines,
MergeConsecutiveStores doesn't see the correct set of stores on the chain
when it visits the other stores. Each store individually has its chain
fixed before trying to merge consecutive stores, and then tries to merge
stores from that point before the other stores have been processed to
have their chains fixed. To fix this, attempt to use FindBetterChain
on any possibly neighboring stores in visitSTORE.

Suppose you have 4 32-bit stores that should be merged into 1 vector
store. One store would be visited first, fixing the chain. What happens is
because not all of the store chains have yet been fixed, 2 of the stores
are merged. The other 2 stores later have their chains fixed,
but because the other stores were already merged, they have different
memory types and merging the two different sized stores is not
supported and would be more difficult to handle.

llvm-svn: 246307
2015-08-28 17:31:28 +00:00
JF Bastien f5aa1ca655 Remove Merge Functions pointer comparisons
Summary:
This patch removes two remaining places where pointer value comparisons
are used to order functions: comparing range annotation metadata, and comparing
block address constants. (These are both rare cases, and so no actual
non-determinism was observed from either case).

The fix for range metadata is simple: the annotation always consists of a pair
of integers, so we just order by those integers.

The fix for block addresses is more subtle. Two constants are the same if they
are the same basic block in the same function, or if they refer to corresponding
basic blocks in each respective function. Note that in the first case, merging
is trivially correct. In the second, the correctness of merging relies on the
fact that the the values of block addresses cannot be compared. This change is
actually an enhancement, as these functions could not previously be merged (see
merge-block-address.ll).

There is still a problem with cross function block addresses, in that constants
pointing to a basic block in a merged function is not updated.

This also more robustly compares floating point constants by all fields of their
semantics, and fixes a dyn_cast/cast mixup.

Author: jrkoenig
Reviewers: dschuff, nlewycky, jfb
Subscribers llvm-commits
Differential revision: http://reviews.llvm.org/D12376

llvm-svn: 246305
2015-08-28 16:49:09 +00:00
David Majnemer a787de3227 [CodeGen] isInTailCallPosition didn't consider readnone tailcalls
A readnone tailcall may still have a chain of computation which follows
it that would invalidate a tailcall lowering.  Don't skip the analysis
in such cases.

This fixes PR24613.

llvm-svn: 246304
2015-08-28 16:44:09 +00:00
Sanjay Patel 7c912898a5 [x86] enable machine combiner reassociations for scalar 'and' insts
llvm-svn: 246300
2015-08-28 14:09:48 +00:00
Chandler Carruth 4b682f6f24 [SROA] Fix PR24463, a crash I introduced in SROA by allowing it to
handle more allocas with loads past the end of the alloca.

I suspect there are some related crashers with slightly different
patterns, but I'll fix those and add test cases as I find them.

Thanks to David Majnemer for the excellent test case reduction here.
Made this super simple to debug and fix.

llvm-svn: 246289
2015-08-28 09:03:52 +00:00
Rui Ueyama 71ba9bdd23 Re-apply r246276 - Object: Teach llvm-ar to create symbol table for COFF short import files
This patch includes a fix for a llvm-readobj test. With this patch, 
the tool does no longer print out COFF headers for the short import
file, but that's probably desirable because the header for the short
import file is dummy.

llvm-svn: 246283
2015-08-28 07:40:30 +00:00
Steven Wu 61db34d12e Revert r246244 and r246243
These two commits cause clang/llvm bootstrap to hang.

llvm-svn: 246279
2015-08-28 06:52:00 +00:00
Rui Ueyama 8cff17469f Rollback r246276 - Object: Teach llvm-ar to create symbol table for COFF short import files
This change caused a test for llvm-readobj to fail.

llvm-svn: 246277
2015-08-28 06:03:01 +00:00
Rui Ueyama 22b1b7aad2 Object: Teach llvm-ar to create symbol table for COFF short import files.
COFF short import files are special kind of files that contains only
DLL-exported symbol names. That's different from object files because
it has no data except symbol names.

This change implements a SymbolicFile interface for the short import
files so that symbol names can be accessed through that interface.
llvm-ar is now able to read the file and create symbol table entries
for short import files.

llvm-svn: 246276
2015-08-28 05:47:46 +00:00
NAKAMURA Takumi bc3af7b031 LLVMCodeGen: Update libdeps corresponding to r246236.
llvm-svn: 246274
2015-08-28 05:38:49 +00:00
Ahmed Bougacha f9c19da03a [CodeGen] Support (and default to) expanding READCYCLECOUNTER to 0.
For targets that didn't support this, this will let us respect the
langref instead of failing to select.

Note that we don't need to change the 32-bit x86/PPC lowerings (to
account for the result type/# difference) because they're both
custom and bypass type legalization.

llvm-svn: 246258
2015-08-28 01:49:59 +00:00
Joseph Tremoulet ec18285b91 [WinEH] Update coloring to handle nested cases cleanly
Summary:
Change the coloring algorithm in WinEHPrepare to visit a funclet's exits
in its parents' contexts and so properly classify the continuations of
nested funclets.

Also change the placement of cloned blocks to be deterministic and to
maintain the relative order of each funclet's blocks.

Add a lit test showing various patterns that require cloning, the last
several of which don't have CHECKs yet because they require cloning
entire funclets which is NYI.


Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12353

llvm-svn: 246245
2015-08-28 01:12:35 +00:00
Piotr Padlewski 3f81ec1e38 Constant propagation after hitting assume(cmp) bugfix
Last time code run into assertion `BBE.isSingleEdge()` in
lib/IR/Dominators.cpp:200.

http://reviews.llvm.org/D12170

llvm-svn: 246244
2015-08-28 01:02:00 +00:00
Piotr Padlewski 63cc5d4627 Constant propagation after hiting llvm.assume
After hitting @llvm.assume(X) we can:
- propagate equality that X == true
- if X is icmp/fcmp (with eq operation), and one of operand
  is constant we can change all variables with constants in the same BasicBlock

http://reviews.llvm.org/D11918

llvm-svn: 246243
2015-08-28 01:01:57 +00:00
George Burgess IV 68b36e01da Fix: CFLAA -- Mark no-args returns as unknown
Prior to this patch, we hadn't been marking StratifiedSets with the
appropriate StratifiedAttrs when handling the result of no-args call
instructions. This caused us to report NoAlias when handed, for
example, an escaped alloca and a result from an opaque function. Now we
properly mark the return value of said functions.

Thanks again to Chandler, Richard, and Nick for pinging me about this.

Differential review: http://reviews.llvm.org/D12408

llvm-svn: 246240
2015-08-28 00:16:18 +00:00
Quentin Colombet fa4ecb4b9a [AArch64][CollectLOH] Fix a regression that prevented us to detect chains of
more than 2 instructions.

I introduced this regression a while back and did not noticed it because I
somehow forgot to push the initial test cases for the pass!

Fix that as well!

llvm-svn: 246239
2015-08-27 23:47:10 +00:00
Peter Collingbourne c269ed5115 CodeGen: Introduce splitCodeGen and teach LTOCodeGenerator to use it.
llvm::splitCodeGen is a function that implements the core of parallel LTO
code generation. It uses llvm::SplitModule to split the module into linkable
partitions and spawning one code generation thread per partition. The function
produces multiple object files which can be linked in the usual way.

This has been threaded through to LTOCodeGenerator (and llvm-lto for testing
purposes). Separate patches will add parallel LTO support to the gold plugin
and lld.

Differential Revision: http://reviews.llvm.org/D12260

llvm-svn: 246236
2015-08-27 23:37:36 +00:00
Reid Kleckner 0e2882345d [WinEH] Add some support for code generating catchpad
We can now run 32-bit programs with empty catch bodies.  The next step
is to change PEI so that we get funclet prologues and epilogues.

llvm-svn: 246235
2015-08-27 23:27:47 +00:00
David Majnemer 0293704be2 [ValueTracking] readnone CallInsts are fair game for speculation
Any call which is side effect free is trivially OK to speculate.  We
already had similar logic in EarlyCSE and GVN but we were missing it
from isSafeToSpeculativelyExecute.

This fixes PR24601.

llvm-svn: 246232
2015-08-27 23:03:01 +00:00
Ahmed Bougacha 87166905c8 [CodeGen] Check FoldConstantArithmetic result before using it.
Fixes PR24602: r245689 introduced an unguarded use of
SelectionDAG::FoldConstantArithmetic, which returns 0 when it fails
because of opaque (hoisted) constants.

llvm-svn: 246217
2015-08-27 21:46:04 +00:00
Erik Schnetter 5e93e28d8b Enable constant propagation for more math functions
Constant propagation for single precision math functions (such as
tanf) is already working, but was not enabled. This patch enables
these for many single-precision functions, and adds respective test
cases.

Newly handled functions: acosf asinf atanf atan2f ceilf coshf expf
exp2f fabsf floorf fmodf logf log10f powf sinhf tanf tanhf

llvm-svn: 246194
2015-08-27 19:56:57 +00:00
Erik Schnetter ed6eab32b3 Revert 246186; still breaks on some systems
llvm-svn: 246191
2015-08-27 19:34:14 +00:00
Tyler Nowicki 5eaa5a9d26 Improve vectorization diagnostic messages and extend vectorize(enable) pragma.
This patch changes the analysis diagnostics produced when loops with
floating-point recurrences or memory operations are identified. The new messages 
say "cannot prove it is safe to reorder * operations; allow reordering by
specifying #pragma clang loop vectorize(enable)". Depending on the type of 
diagnostic the message will include additional options such as ffast-math or
__restrict__.

This patch also allows the vectorize(enable) pragma to override the low pointer
memory check threshold. When the hint is given a higher threshold is used.

See the clang patch for the options produced for each diagnostic.

llvm-svn: 246187
2015-08-27 18:56:49 +00:00
Erik Schnetter 05845d31c9 Enable constant propagation for more math functions
Constant propagation for single precision math functions (such as
tanf) is already working, but was not enabled. This patch enables
these for many single-precision functions, and adds respective test
cases.

Newly handled functions: acosf asinf atanf atan2f ceilf coshf expf
exp2f fabsf floorf fmodf logf log10f powf sinhf tanf tanhf

llvm-svn: 246186
2015-08-27 18:56:23 +00:00
Erik Schnetter a23672626d Revert r246158 since it breaks LLVM.Transforms/ConstProp.calls.ll
llvm-svn: 246166
2015-08-27 17:24:01 +00:00
Erik Schnetter 694bf5c9b5 Enable constant propagation for more math functions
Constant propagation for single precision math functions (such as
tanf) is already working, but was not enabled. This patch enables
these for many single-precision functions, and adds respective test
cases.

Newly handled functions: acosf asinf atanf atan2f ceilf coshf expf
exp2f fabsf floorf fmodf logf log10f powf sinhf tanf tanhf

llvm-svn: 246158
2015-08-27 16:36:37 +00:00
Chad Rosier c94f8e2906 [LoopVectorize] Add Support for Small Size Reductions.
Unlike scalar operations, we can perform vector operations on element types that
are smaller than the native integer types. We type-promote scalar operations if
they are smaller than a native type (e.g., i8 arithmetic is promoted to i32
arithmetic on Arm targets). This patch detects and removes type-promotions
within the reduction detection framework, enabling the vectorization of small
size reductions.

In the legality phase, we look through the ANDs and extensions that InstCombine
creates during promotion, keeping track of the smaller type. In the
profitability phase, we use the smaller type and ignore the ANDs and extensions
in the cost model. Finally, in the code generation phase, we truncate the result
of the reduction to allow InstCombine to rewrite the entire expression in the
smaller type.

This fixes PR21369.
http://reviews.llvm.org/D12202

Patch by Matt Simpson <mssimpso@codeaurora.org>!

llvm-svn: 246149
2015-08-27 14:12:17 +00:00
James Molloy 1bbf15c57c [LoopVectorize] Extract InductionInfo into a helper class...
... and move it into LoopUtils where it can be used by other passes, just like ReductionDescriptor. The API is very similar to ReductionDescriptor - that is, not very nice at all. Sorting these both out will come in a followup.

NFC

llvm-svn: 246145
2015-08-27 09:53:00 +00:00
Alex Rosenberg a0a19c1c91 Whoops, remove trailing whitespace.
llvm-svn: 246141
2015-08-27 05:37:12 +00:00
Pete Cooper 6b716218fa isKnownNonNull needs to consider globals in non-zero address spaces.
Globals in address spaces other than one may have 0 as a valid address,
so we should not assume that they can be null.

Reviewed by Philip Reames.

llvm-svn: 246137
2015-08-27 03:16:29 +00:00
Philip Reames dfd890dd3a Allow value forwarding past release fences in EarlyCSE
A release fence acts as a publication barrier for stores within the current thread to become visible to other threads which might observe the release fence. It does not require the current thread to observe stores performed on other threads. As a result, we can allow store-load and load-store forwarding across a release fence.

We do need to make sure that stores before the fence can't be eliminated even if there's another store to the same location after the fence. In theory, we could reorder the second store above the fence and *then* eliminate the former, but we can't do this if the stores are on opposite sides of the fence.

Note: While more aggressive then what's there, this patch is still implementing a really conservative ordering.  In particular, I'm not trying to exploit undefined behavior via races, or the fact that the LangRef says only 'atomic' accesses are ordered w.r.t. fences.

Differential Revision: http://reviews.llvm.org/D11434

llvm-svn: 246134
2015-08-27 01:32:33 +00:00
Philip Reames abcdc5e3a8 [RewriteStatepointsForGC] Reduce the number of new instructions for base pointers
When computing base pointers, we introduce new instructions to propagate the base of existing instructions which might not be bases. However, the algorithm doesn't make any effort to recognize when the new instruction to be inserted is the same as an existing one already in the IR. Since this is happening immediately before rewriting, we don't really have a chance to fix it after the pass runs without teaching loop passes about statepoints.

I'm really not thrilled with this patch. I've rewritten it 4 different ways now, but this is the best I've come up with. The case where the new instruction is just the original base defining value could be merged into the existing algorithm with some complexity. The problem is that we might have something like an extractelement from a phi of two vectors. It may be trivially obvious that the base of the 0th element is an existing instruction, but I can't see how to make the algorithm itself figure that out. Thus, I resort to the call to SimplifyInstruction instead.

Note that we can only adjust the instructions we've inserted ourselves. The live sets are still being tracked in side structures at this point in the code. We can't easily muck with instructions which might be in them. Long term, I'm really thinking we need to materialize the live pointer sets explicitly in the IR somehow rather than using side structures to track them.

Differential Revision: http://reviews.llvm.org/D12004

llvm-svn: 246133
2015-08-27 01:02:28 +00:00
Tyler Nowicki e0f400feaa Improved printing of analysis diagnostics in the loop vectorizer.
This patch ensures that every analysis diagnostic produced by the vectorizer
will be printed if the loop has a vectorization hint on it. The condition has
also been improved to prevent printing when a disabling hint is specified.

llvm-svn: 246132
2015-08-27 01:02:04 +00:00
Cong Hou 08cb4fc688 Fixed a bug that edge weights are not assigned correctly when lowering switch statement.
This is a one-line-change patch that moves the update to UnhandledWeights to the correct position: it should be updated for all clusters instead of just range clusters.

Differential Revision: http://reviews.llvm.org/D12391

llvm-svn: 246129
2015-08-27 00:37:40 +00:00
Philip Reames 98a2dabc08 [SimplifyCFG] Prune code from a provably unreachable switch default
As Sanjoy pointed out over in http://reviews.llvm.org/D11819, a switch on an icmp should always be able to become a branch instruction. This patch generalizes that notion slightly to prove that the default case of a switch is unreachable if the cases completely cover all possible bit patterns in the condition. Once that's done, the switch to branch conversion kicks in just fine.

Note: Duplicate case values are disallowed by the LangRef and verifier.

Differential Revision: http://reviews.llvm.org/D11995

llvm-svn: 246125
2015-08-26 23:56:46 +00:00
Hal Finkel 7ffe55ae9d [PowerPC] Remove unnecessary braces in PPCVSXFMAMutate
Address Eric's post-commit review of r245741. NFC.

llvm-svn: 246121
2015-08-26 23:41:53 +00:00
Bjarke Hammersholt Roune 6c64738e87 [NVPTX] Let NVPTX backend detect integer min and max patterns.
Summary:
Let NVPTX backend detect integer min and max patterns during isel and emit intrinsics that enable hardware support.


Reviewers: jholewinski, meheff, jingyue

Subscribers: arsenm, llvm-commits, meheff, jingyue, eliben, jholewinski

Differential Revision: http://reviews.llvm.org/D12377

llvm-svn: 246107
2015-08-26 23:22:02 +00:00
Cong Hou b5ef475e5c [ARM] Use BranchProbability::scale() to scale an integer with a probability in ARMBaseInstrInfo.cpp,
Previously in isProfitableToIfCvt() in ARMBaseInstrInfo.cpp, the multiplication between an integer and a branch probability is done manually in an unsafe way that may lead to overflow. This patch corrects those cases by using BranchProbability's member function scale() to avoid overflow (which stores the intermediate result in int64).

Differential Revision: http://reviews.llvm.org/D12295

llvm-svn: 246106
2015-08-26 23:17:52 +00:00
Cong Hou 03127700d5 Assign weights to edges to jump table / bit test header when lowering switch statement.
Currently, when lowering switch statement and a new basic block is built for jump table / bit test header, the edge to this new block is not assigned with a correct weight. This patch collects the edge weight from all its successors and assign this sum of weights to the edge (and also the other fall-through edge). Test cases are adjusted accordingly.

Differential Revision: http://reviews.llvm.org/D12166#fae6eca7

llvm-svn: 246104
2015-08-26 23:15:32 +00:00
JF Bastien b1b61ebb21 WebAssembly: NFC comment update
llvm-svn: 246101
2015-08-26 23:03:07 +00:00
Duncan P. N. Exon Smith b2df64721c DI: Make Subprogram definitions 'distinct'
Change `DIBuilder` always to produce 'distinct' nodes when creating
`DISubprogram` definitions.  I measured a ~5% memory improvement in the
link step (of ld64) when using `-flto -g`.

`DISubprogram`s are used in two ways in the debug info graph.

Some are definitions, point at actual functions, and can't really be
shared between compile units.  With full debug info, these point down at
their variables, forming uniquing cycles.  These uniquing cycles are
expensive to link between modules, since all unique nodes that reference
them transitively need to be duplicated (see commit message for r244181
for more details).

Others are declarations, primarily used for member functions in the type
hierarchy.  Definitions never show up there; instead, a definition
points at its corresponding declaration node.

I started by making all subprograms 'distinct'.  However, that was too
big a hammer: memory usage *increased* ~5% (net increase vs. this patch
of ~10%) because the 'distinct' declarations undermine LTO type
uniquing.  This is a targeted fix for the definitions (where uniquing is
an observable problem).

A couple of notes:

  - There's an accompanying commit to update IRGen testcases in clang.
  - ^ That's what I'm using to test this commit.
  - In a follow-up, I'll change the verifier to require 'distinct' on
    definitions and add an upgrade to `BitcodeReader`.

llvm-svn: 246098
2015-08-26 22:50:16 +00:00
JF Bastien 45479f627a WebAssembly: handle private/internal globals.
Things of note:
 - Other linkage types aren't handled yet. We'll figure it out with dynamic linking.
 - Special LLVM globals are either ignored, or error out for now.
 - TLS isn't supported yet (WebAssembly will have threads later).
 - There currently isn't a syntax for alignment, I left it in a comment so it's easy to hook up.
 - Undef is convereted to whatever the type's appropriate null value is.
 - assert versus report_fatal_error: follow what other AsmPrinters do, and assert only on what should have been caught elsewhere.

llvm-svn: 246092
2015-08-26 22:09:54 +00:00
Reid Kleckner c2b9254426 [ms-inline-asm] Relax assertion around funky identifiers slightly
A corresponding clang change will make it so that clang can consume part
of an assembler token. The assembler treats '.' as an identifier
character while clang does not, so it's view of the token stream is a
little different.

llvm-svn: 246089
2015-08-26 21:57:25 +00:00
Kostya Serebryany 06c199ac9d [libFuzzer] fix minor inefficiency, PR24584
llvm-svn: 246087
2015-08-26 21:55:19 +00:00
Mehdi Amini 0ab4b5b52e Fix LLVM C API for DataLayout
We removed access to the DataLayout on the TargetMachine and
deprecated the C API function LLVMGetTargetMachineData() in r243114.
However the way I tried to be backward compatible was broken: I
changed the wrapper of the TargetMachine to be a structure that
includes the DataLayout as well. However the TargetMachine is also
wrapped by the ExecutionEngine, in the more classic way. A client
using the TargetMachine wrapped by the ExecutionEngine and trying
to get the DataLayout would break.

It seems tricky to solve the problem completely in the C API
implementation. This patch tries to address this backward
compatibility in a more lighter way in the C++ API. The C API is
restored in its original state and the removed C++ API is
reintroduced, but privately. The C API is friended to the
TargetMachine and should be the only consumer for this API.

Reviewers: ributzka

Differential Revision: http://reviews.llvm.org/D12263

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 246082
2015-08-26 21:16:29 +00:00
Matt Arsenault 8a067121f8 AMDGPU: Delete dead code
There is no context where s_mov_b64 is emitted
and could potentially be moved to the VALU.
It is currently only emitted for materializing
immediates, which can't be dependent on vector sources.

The immediate splitting is already done when selecting
constants. I'm not sure what contexts if any the register
splitting would have been used before.

Also clean up using s_mov_b64 in place of v_mov_b64_pseudo,
although this isn't required and just skips the extra step
of eliminating the copy from the SReg_64.

llvm-svn: 246080
2015-08-26 20:48:08 +00:00
Matt Arsenault 5e7f95e567 AMDGPU: Don't reprocess instructions when splitting i64 bcnt
llvm-svn: 246079
2015-08-26 20:48:04 +00:00
Matt Arsenault 445833cc91 AMDGPU: Fix not moving users of s_bfe_i64 to VALU
This wouldn't propagate to users of the original BFE
and would hit a verifier error.

llvm-svn: 246078
2015-08-26 20:47:58 +00:00
Matt Arsenault f003c38e1e AMDGPU: Don't create intermediate SALU instructions
When splitting 64-bit operations, create the correct
VALU instructions immediately.

This was splitting things like s_or_b64 into the two
s_or_b32s and then pushing the new instructions
onto the worklist. There's no reason we need
to do this intermediate step.

llvm-svn: 246077
2015-08-26 20:47:50 +00:00
Matthias Braun 4e7ded834f SelectionDAGBuilder: Fix SPDescriptor not resetting GuardReg
This was causing problems when some functions use a GuardReg and some
don't as can happen when mixing SelectionDAG and FastISel generated
functions.

llvm-svn: 246075
2015-08-26 20:46:52 +00:00
Matthias Braun 4816b18d86 FastISel: Avoid adding a successor block twice for degenerate IR.
This fixes http://llvm.org/PR24581

Differential Revision: http://reviews.llvm.org/D12350

llvm-svn: 246074
2015-08-26 20:46:49 +00:00
Andrew Kaylor af083d4cf9 Expose hasLiveCondCodeDef as a member function of the X86InstrInfo class. NFC
This takes the existing static function hasLiveCondCodeDef and makes it a member function of the X86InstrInfo class. This is a useful utility function that an upcoming change would like to use. NFC.

Patch by: Kevin B. Smith
Differential Revision: http://reviews.llvm.org/D12371

llvm-svn: 246073
2015-08-26 20:36:52 +00:00
Diego Novillo 7732ae4a4f Fix memory leak in sample profile pass.
The problem here were the function analyses invoked by the function pass
manager from the new IPO pass. I looked at other IPO passes needing
dominance information and the only one that requires it (partial
inliner) does not use the standard dependency mechanism.

This patch mimics what the partial inliner does to compute dominance,
post-dominance and loop info. One thing I like about this approach is
that I can delay the computation of all this until I actually need it.

This should bring the ASAN buildbot back to green. If there's a better
way to fix this, I'll do it in a follow-up patch.

llvm-svn: 246066
2015-08-26 20:00:27 +00:00
Mehdi Amini 31ebf03c09 Revert "Fix LLVM C API for DataLayout"
This reverts commit r246052.
Third attempt, still unpleasant for some bots.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 246057
2015-08-26 19:24:59 +00:00
Matt Arsenault 602a16d3db AMDGPU/SI: Report SIFixSGPRLiveRanges changed function
llvm-svn: 246056
2015-08-26 19:12:03 +00:00
Mehdi Amini 9d692b6805 Fix LLVM C API for DataLayout
We removed access to the DataLayout on the TargetMachine and
deprecated the C API function LLVMGetTargetMachineData() in r243114.
However the way I tried to be backward compatible was broken: I
changed the wrapper of the TargetMachine to be a structure that
includes the DataLayout as well. However the TargetMachine is also
wrapped by the ExecutionEngine, in the more classic way. A client
using the TargetMachine wrapped by the ExecutionEngine and trying
to get the DataLayout would break.

It seems tricky to solve the problem completely in the C API
implementation. This patch tries to address this backward
compatibility in a more lighter way in the C++ API. The C API is
restored in its original state and the removed C++ API is
reintroduced, but privately. The C API is friended to the
TargetMachine and should be the only consumer for this API.

Reviewers: ributzka

Differential Revision: http://reviews.llvm.org/D12263

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 246052
2015-08-26 18:56:01 +00:00
Matt Arsenault bd66061db7 AMDGPU: Make sure to reserve super registers
I think this could potentially have broken if
one of the super registers were allocated
that contain v254/v255.

llvm-svn: 246051
2015-08-26 18:54:50 +00:00
Mehdi Amini 8b3dda3f71 Revert "Fix LLVM C API for DataLayout"
This reverts commit r246044.
Build broken, still. It builds for me...

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 246049
2015-08-26 18:37:59 +00:00
Matt Arsenault 19c5488015 AMDGPU: Produce error on dynamic_stackalloc
llvm-svn: 246048
2015-08-26 18:37:13 +00:00
David Majnemer 3354fe473f [SimplifyLibCalls] Fix a typo
cbrt(sqrt(x)) calculates the sixth root, not the ninth root.
cbrt(cbrt(x)) calculates the ninth root.

llvm-svn: 246046
2015-08-26 18:30:16 +00:00
Mehdi Amini b5d8b27fc8 Fix LLVM C API for DataLayout
We removed access to the DataLayout on the TargetMachine and
deprecated the C API function LLVMGetTargetMachineData() in r243114.
However the way I tried to be backward compatible was broken: I
changed the wrapper of the TargetMachine to be a structure that
includes the DataLayout as well. However the TargetMachine is also
wrapped by the ExecutionEngine, in the more classic way. A client
using the TargetMachine wrapped by the ExecutionEngine and trying
to get the DataLayout would break.

It seems tricky to solve the problem completely in the C API
implementation. This patch tries to address this backward
compatibility in a more lighter way in the C++ API. The C API is
restored in its original state and the removed C++ API is
reintroduced, but privately. The C API is friended to the
TargetMachine and should be the only consumer for this API.

Reviewers: ributzka

Differential Revision: http://reviews.llvm.org/D12263

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 246044
2015-08-26 18:22:34 +00:00
James Y Knight 3602286937 [SPARC] Fix stupid oversight in stack realignment support.
If you're going to realign %sp to get object alignment properly (which
the code does), and stack offsets and alignments are calculated going
down from %fp (which they are), then the total stack size had better
be a multiple of the alignment. LLVM did indeed ensure that.

And then, after aligning, the sparc frame code added 96 (for sparcv8)
to the frame size, making any requested alignment of 64-bytes or
higher *guaranteed* to be misaligned. The test case added with r245668
even tests this exact scenario, and asserted the incorrect behavior,
which I somehow failed to notice. D'oh.

This change fixes the frame lowering code to align the stack size
*after* adding the spill area, instead.

Differential Revision: http://reviews.llvm.org/D12349

llvm-svn: 246042
2015-08-26 17:57:51 +00:00
Vedant Kumar bf891b12b4 [llvm-mc] Ignore opcode size prefix in 64-bit CALL disassembly
This is a fix for disassembling unusual instruction sequences in 64-bit
mode w.r.t the CALL rel16 instruction. It might be desirable to move the
check somewhere else, but it essentially mimics the special case
handling with JCXZ in 16-bit mode.

The current behavior accepts the opcode size prefix and causes the
call's immediate to stop disassembling after 2 bytes. When debugging
sequences of instructions with this pattern, the disassembler output
becomes extremely unreliable and essentially useless (if you jump midway
into what lldb thinks is a unified instruction, you'll lose %rip). So we
ignore the prefix and consume all 4 bytes when disassembling a 64-bit
mode binary.

Note: in Vol. 2A 3-99 the Intel spec states that CALL rel16 is N.S. N.S.
is defined as:

    Indicates an instruction syntax that requires an address override
    prefix in 64-bit mode and is not supported. Using an address
    override prefix in 64-bit mode may result in model-specific
    execution behavior. (Vol. 2A 3-7)

Since 0x66 is an operand override prefix we should be OK (although we
may want to warn about 0x67 prefixes to 0xe8). On the CPUs I tested
with, they all ignore the 0x66 prefix in 64-bit mode.

Patch by Matthew Barney!

Differential Revision: http://reviews.llvm.org/D9573

llvm-svn: 246038
2015-08-26 16:20:29 +00:00
Chad Rosier 9f4709b261 [AArch64] Remove a use-after-free when collecting stats.
The call to mergePairedInsns() deletes MI, so the later use by isUnscaledLdSt()
is referencing freed memory.

llvm-svn: 246033
2015-08-26 13:39:48 +00:00
Silviu Baranga db1ddb32ce [AArch64] Unify the integer min/max vector selection patterns with the intrinsic ones
Summary:
This change lowers the aarch64 integer vector min/max intrinsic nodes to
generic min/max nodes and replaces the intrinsic selection patterns with
the generic ones.

There should already be testing in place for this, so no further tests
were added.

Reviewers: jmolloy

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D12276

llvm-svn: 246030
2015-08-26 11:11:14 +00:00
Chandler Carruth 748d095ff0 [SROA] Rip out all support for SSAUpdater in SROA.
This was only added to preserve the old ScalarRepl's use of SSAUpdater
which was originally to avoid use of dominance frontiers. Now, we only
need a domtree, and we'll need a domtree right after this pass as well
and so it makes perfect sense to always and only use the dom-tree
powered mem2reg. This was flag-flipper earlier and has stuck reasonably
so I wanted to gut the now-dead code out of SROA before we waste more
time with it. Among other things, this will make passmanager porting
easier.

llvm-svn: 246028
2015-08-26 09:09:29 +00:00
Alex Rosenberg 81cfed21ca Modernize with range-based for loops.
llvm-svn: 246018
2015-08-26 06:11:41 +00:00
Alex Rosenberg 99805ed45a Reduce code duplication.
llvm-svn: 246017
2015-08-26 06:11:38 +00:00
Alex Rosenberg 5b3404a03e Trailing whitespace
llvm-svn: 246016
2015-08-26 06:11:36 +00:00
Frederic Riss 74b9882ec3 [MC] Split the layout part of MCAssembler::finish() into its own method. NFC.
Split a MCAssembler::layout() method out of MCAssembler::finish(). This allows
running the MCSections layout separately from the streaming of the output
file. This way if a client wants to use MC to generate section contents, but
emit something different than the standard relocatable object files it is
possible (llvm-dsymutil is such a client).

llvm-svn: 246008
2015-08-26 05:09:49 +00:00
Frederic Riss 75c0c7050a [MC/MachO] Make some MachObjectWriter methods more generic. NFC.
Hardcode less values in some mach-o header writing routines and pass them
as argument. Doing so will allow reusing this code in llvm-dsymutil.

llvm-svn: 246007
2015-08-26 05:09:46 +00:00
JF Bastien 9dc042a0b6 Comparing operands should not require the same ValueID
Summary: When comparing basic blocks, there is an additional check that two Value*'s should have the same ID, which interferes with merging equivalent constants of different kinds (such as a ConstantInt and a ConstantPointerNull in the included testcase). The cmpValues function already ensures that the two values in each function are the same, so removing this check should not cause incorrect merging.

Also, the type comparison is redundant, based on reviewing the code and testing on the test suite and several large LTO bitcodes.

Author: jrkoenig
Reviewers: nlewycky, jfb, dschuff
Subscribers: llvm-commits
Differential revision: http://reviews.llvm.org/D12302

llvm-svn: 246001
2015-08-26 03:02:58 +00:00
JF Bastien a1d3c24ccf Expose more properties of llvm::fltSemantics
Summary: Adds accessor functions for all the fields in llvm::fltSemantics. This will be used in MergeFunctions to order two APFloats with different semanatics.

Author: jrkoenig
Reviewers: jfb
Subscribers: dschuff, llvm-commits
Differential revision: http://reviews.llvm.org/D12253

llvm-svn: 245999
2015-08-26 02:32:45 +00:00
Matthias Braun ccfc9c8d6d FastISel: Use finishCondBranch() for ARM,Mips,PowerPC FastISel
Note that after this change branch probabilities are preserved now.

llvm-svn: 245998
2015-08-26 01:55:47 +00:00
Matthias Braun 17af607796 FastISel: Factor out common code; NFC intended
This should be no functional change but for the record: For three cases
in X86FastISel this will change the order in which the FalseMBB and
TrueMBB of a conditional branch is addedd to the successor/predecessor
lists.

llvm-svn: 245997
2015-08-26 01:38:00 +00:00
JF Bastien 1a4aa1589b WebAssembly: add small FIXME for AsmPrinter.
Suggested by @sunfish as a follow-up to r245982.

llvm-svn: 245996
2015-08-26 00:50:49 +00:00
Charles Davis 119525914c Make variable argument intrinsics behave correctly in a Win64 CC function.
Summary:
This change makes the variable argument intrinsics, `llvm.va_start` and
`llvm.va_copy`, and the `va_arg` instruction behave as they do on Windows
inside a `CallingConv::X86_64_Win64` function. It's needed for a Clang patch
I have to add support for GCC's `__builtin_ms_va_list` constructs.

Reviewers: nadav, asl, eugenis

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D1622

llvm-svn: 245990
2015-08-25 23:27:41 +00:00
JF Bastien 54be3b1f03 WebAssembly: assert that there aren't any constant pools
WebAssembly will either use globals or immediates, since it's a virtual ISA.

llvm-svn: 245989
2015-08-25 23:19:49 +00:00
JF Bastien b6091dfe0f WebAssembly: emit `(func (param t) (result t))` s-expressions
Summary: Match spec format: https://github.com/WebAssembly/spec/blob/master/ml-proto/test/fac.wasm

Reviewers: sunfish

Subscribers: llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D12307

llvm-svn: 245986
2015-08-25 22:58:05 +00:00
JF Bastien 289287060b WebAssembly: comment out .globl when printing textual assembly
Do the same for .weak (not implemented for now, but may as well to it). Update comment string to two semicolons.

llvm-svn: 245982
2015-08-25 22:23:15 +00:00
Evgeniy Stepanov d04d07e65e [msan] Precise instrumentation for icmp sgt %x, -1.
Extend signed relational comparison instrumentation with a special
case for comparisons with -1. This fixes an MSan false positive when
such comparison is used as a sign bit test.

https://llvm.org/bugs/show_bug.cgi?id=24561

llvm-svn: 245980
2015-08-25 22:19:11 +00:00
Matthias Braun 130bd90e17 MachineBasicBlock: Use MCPhysReg instead of unsigned in livein API
This is friendlier to the readers as it makes it clear that the API is
not meant for vregs but just for physregs.

llvm-svn: 245977
2015-08-25 22:05:55 +00:00
Cong Hou cd59591396 Remove the final bit test during lowering switch statement if all cases in bit test cover a contiguous range.
When lowering switch statement, if bit tests are used then LLVM will always generates a jump to the default statement in the last bit test. However, this is not necessary when all cases in bit tests cover a contiguous range. This is because when generating the bit tests header MBB, there is a range check that guarantees cases in bit tests won't go outside of [low, high], where low and high are minimum and maximum case values in the bit tests. This patch checks if this is the case and then doesn't emit jump to default statement and hence saves a bit test and a branch.

Differential Revision: http://reviews.llvm.org/D12249

llvm-svn: 245976
2015-08-25 21:34:38 +00:00
Davide Italiano 68961bba06 [MachO] Move trivial accessors to header.
Requested by: Jim Grosbach.

llvm-svn: 245963
2015-08-25 18:27:59 +00:00
NAKAMURA Takumi c57a09821f Update libdeps in LLVMipo and LLVMScalarOpts, corresponding to r245940.
llvm-svn: 245957
2015-08-25 17:11:17 +00:00
Matthias Braun a7fc3856f1 Fix dependencies/shared library build
llvm-svn: 245955
2015-08-25 17:07:40 +00:00
David Blaikie d486000387 Fix dropped conditional in cleanup in r245752
Code review feedback by Charlie Turner.

llvm-svn: 245954
2015-08-25 17:01:36 +00:00
Wei Mi edae87d819 The patch replace the overflow check in loop vectorization with the minimum loop iterations check.
The loop minimum iterations check below ensures the loop has enough trip count so the generated
vector loop will likely be executed, and it covers the overflow check.

Differential Revision: http://reviews.llvm.org/D12107.

llvm-svn: 245952
2015-08-25 16:43:47 +00:00
Sanjay Patel deb8f826a5 make fast unaligned memory accesses implicit with SSE4.2 or SSE4a
This is a follow-on from the discussion in http://reviews.llvm.org/D12154.

This change allows memset/memcpy to use SSE or AVX memory accesses for any chip that has
generally fast unaligned memory ops.

A motivating use case for this change is a clang invocation that doesn't explicitly set
the CPU, but does target a feature that we know only exists on a CPU that supports fast
unaligned memops. For example:
$ clang -O1 foo.c -mavx

This resolves a difference in lowering noted in PR24449:
https://llvm.org/bugs/show_bug.cgi?id=24449

Before this patch, we used different store types depending on whether the example can be
lowered as a memset or not.

Differential Revision: http://reviews.llvm.org/D12288

llvm-svn: 245950
2015-08-25 16:29:21 +00:00
Diego Novillo 4d71113cdb Convert SampleProfile pass into a Module pass.
Eventually, we will need sample profiles to be incorporated into the
inliner's cost models.  To do this, we need the sample profile pass to
be a module pass.

This patch makes no functional changes beyond the mechanical adjustments
needed to run SampleProfile as a module pass.

llvm-svn: 245940
2015-08-25 15:25:11 +00:00
Davide Italiano 933e230738 [MachO] Introduce MinVersion API.
While introducing support for MinVersionLoadCommand in llvm-readobj I noticed there's
no API to extract Major/Minor/Update components conveniently. Currently consumers
do the bit twiddling on their own, but this will change from now on.

I'll convert llvm-objdump (and llvm-readobj) in a later commit.

Differential Revision:	 http://reviews.llvm.org/D12282
Reviewed by:	rafael

llvm-svn: 245938
2015-08-25 15:02:23 +00:00
Michael Kuperstein 6e3fee07f7 [X86] Remove references to _ftol2
As of r245924, _ftol2 is no longer used for fptoui on MS platforms.
Remove the dead code associated with it.

llvm-svn: 245925
2015-08-25 07:58:33 +00:00
Michael Kuperstein 8515893be8 [X86] Fix fptoui conversions
This fixes two issues in x86 fptoui lowering.
1) Makes conversions from f80 go through the right path on AVX-512.
2) Implements an inline sequence for fptoui i64 instead of a library
call. This improves performance by 6X on SSE3+ and 3X otherwise.
Incidentally, it also removes the use of ftol2 for fptoui, which was
wrong to begin with, as ftol2 converts to a signed i64, producing
wrong results for values >= 2^63.

Patch by: mitch.l.bodart@intel.com
Differential Revision: http://reviews.llvm.org/D11316

llvm-svn: 245924
2015-08-25 07:42:09 +00:00
Steve King 5cdbd20cc3 Pass function attributes instead of boolean in isIntDivCheap().
llvm-svn: 245921
2015-08-25 02:31:21 +00:00
Piotr Padlewski 4e7f752bb8 Assume intrinsic handling in global opt
It doesn't solve the problem, when for example we load something, and
then assume that it is the same as some constant value, because
globalopt will fail on unknown load instruction. The proposed solution
would be to skip some instructions that we can't evaluate and they are
safe to skip (f.e. load, assume and many others) and see if they are
required to perform optimization (f.e. we don't care about ephemeral
instructions that may appear using @llvm.assume())

http://reviews.llvm.org/D12266

llvm-svn: 245919
2015-08-25 01:34:15 +00:00
Mehdi Amini f83b865448 Revert "Fix LLVM C API for DataLayout"
This reverts commit 433bfd94e4b7e3cc3f8b08f8513ce47817941b0c.
Broke some bot, have to see why it passed locally.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 245917
2015-08-25 01:21:09 +00:00
Mehdi Amini 84b2e325d3 Fix LLVM C API for DataLayout
We removed access to the DataLayout on the TargetMachine and
deprecated the C API function LLVMGetTargetMachineData() in r243114.
However the way I tried to be backward compatible was broken: I
changed the wrapper of the TargetMachine to be a structure that
includes the DataLayout as well. However the TargetMachine is also
wrapped by the ExecutionEngine, in the more classic way. A client
using the TargetMachine wrapped by the ExecutionEngine and trying
to get the DataLayout would break.

It seems tricky to solve the problem completely in the C API
implementation. This patch tries to address this backward
compatibility in a more lighter way in the C++ API. The C API is
restored in its original state and the removed C++ API is
reintroduced, but privately. The C API is friended to the
TargetMachine and should be the only consumer for this API.

Reviewers: ributzka

Differential Revision: http://reviews.llvm.org/D12263

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 245916
2015-08-25 01:07:25 +00:00
Hal Finkel 0f2ddcb83f [PowerPC] PPCVSXFMAMutate should ignore trivial-copy addends
We might end up with a trivial copy as the addend, and if so, we should ignore
the corresponding FMA instruction. The trivial copy can be coalesced away later,
so there's nothing to do here. We should not, however, assert. Fixes PR24544.

llvm-svn: 245907
2015-08-24 23:48:28 +00:00
Matthias Braun 1b50bb58a1 Try to fix buildbots
Apparently std::vector::erase(const_iterator) (as opposed to the
non-const iterator) is a part of C++11 but it seems this is not available
on all the buildbots.

llvm-svn: 245900
2015-08-24 23:30:39 +00:00
Sanjay Patel 4104337d9d fix typos; NFC
llvm-svn: 245899
2015-08-24 23:20:16 +00:00
Matthias Braun 7a8b1150bf Let's try to fix GNU libstdc++ buildbots
llvm-svn: 245898
2015-08-24 23:19:39 +00:00
Sanjay Patel 942b46a011 fix typo; NFC
llvm-svn: 245896
2015-08-24 23:18:44 +00:00
Matthias Braun b2b7ef1de8 MachineBasicBlock: Add liveins() method returning an iterator_range
llvm-svn: 245895
2015-08-24 22:59:52 +00:00
Dan Gohman 2683a5534e [WebAssembly] DYNAMIC_STACKALLOC returns a pointer.
llvm-svn: 245893
2015-08-24 22:31:52 +00:00
Peter Collingbourne 9c8909dbd1 LTO: Simplify merged module ownership.
This change moves LTOCodeGenerator's ownership of the merged module to a
field of type std::unique_ptr<Module>. This helps simplify parts of the code
and clears the way for the module to be consumed by LLVM CodeGen (see D12132
review comments).

Differential Revision: http://reviews.llvm.org/D12205

llvm-svn: 245891
2015-08-24 22:22:53 +00:00
JF Bastien af111db8af WebAssembly: Implement call
Summary: Support function calls.

Reviewers: sunfish, sunfishcode

Subscribers: sunfishcode, jfb, llvm-commits

Differential revision: http://reviews.llvm.org/D12219

llvm-svn: 245887
2015-08-24 22:16:48 +00:00
JF Bastien 19c2e6634d Revert two bad commits.
Summary: I forgot to squash git commits before doing an svn dcommit of D12219. Reverting, and re-submitting.

Subscribers: jfb, llvm-commits

Differential Revision: http://reviews.llvm.org/D12298

llvm-svn: 245886
2015-08-24 22:07:33 +00:00
JF Bastien 744ad106c3 Missing print.
llvm-svn: 245883
2015-08-24 22:00:04 +00:00
JF Bastien d8a9d66d50 call
llvm-svn: 245882
2015-08-24 21:59:51 +00:00
Dan Gohman 12e1997e4b [WebAssembly] Make the assembly printer indent instructions.
llvm-svn: 245875
2015-08-24 21:19:48 +00:00
Peter Collingbourne e34034c8d0 LTO: Rename mergedModule variables to MergedModule to prepare for ownership change.
Also convert a few loops to range-for loops and correct a comment.

llvm-svn: 245874
2015-08-24 21:15:35 +00:00
Dan Gohman 69c4c76396 [WebAssembly] CodeGen support for __builtin_wasm_page_size()
llvm-svn: 245872
2015-08-24 21:03:24 +00:00
Sanjay Patel 6b2765fe49 fix typo; NFC
llvm-svn: 245869
2015-08-24 20:11:14 +00:00
Bill Schmidt 32fd189de2 [PPC64LE] Fix PR24546 - Swap optimization and debug values
This patch fixes PR24546, which demonstrates a segfault during the VSX
swap removal pass.  The problem is that debug value instructions were
not excluded from the list of instructions to be analyzed for webs of
related computation.  I've added the test case from the PR as a crash
test in test/CodeGen/PowerPC.

llvm-svn: 245862
2015-08-24 19:27:27 +00:00
Dan Gohman 7b63484b99 [WebAssembly] Skeleton FastISel support
llvm-svn: 245860
2015-08-24 18:44:37 +00:00
Dan Gohman 896e53fae8 [WebAssembly] Implement floating point rounding operators.
llvm-svn: 245859
2015-08-24 18:23:13 +00:00
Dan Gohman 01612f627d [WebAssembly] Tell TargetTransformInfo about popcnt and sqrt.
llvm-svn: 245853
2015-08-24 16:51:46 +00:00
Dan Gohman e419a7c307 [WebAssembly] Use the checked form of MachineFunction::getSubtarget. NFC.
llvm-svn: 245852
2015-08-24 16:46:31 +00:00
Dan Gohman 08fc966d3c [WebAssembly] Implement the is_zero_undef forms of cttz and ctlz
llvm-svn: 245851
2015-08-24 16:39:37 +00:00
Adhemerval Zanella 4754e2d59c [sanitizers] Add DFSan support for AArch64 42-bit VMA
This patch adds support for dfsan on aarch64-linux with 42-bit VMA
(current default config for 64K pagesize kernels).  The support is
enabled by defining the SANITIZER_AARCH64_VMA to 42 at build time
for both clang/llvm and compiler-rt.  The default VMA is 39 bits.

llvm-svn: 245840
2015-08-24 13:48:10 +00:00
Michael Zuckerman 9beca2e7e2 [X86] Add support for mmword memory operand size for Intel-syntax x86 assembly
Differential Revision: http://reviews.llvm.org/D12151

llvm-svn: 245835
2015-08-24 10:26:54 +00:00
Oliver Stannard 284f2bffc9 Add DAG optimisation for FP16_TO_FP
The FP16_TO_FP node only uses the bottom 16 bits of its input, so the
following pattern can be optimised by removing the AND:

  (FP16_TO_FP (AND op, 0xffff)) -> (FP16_TO_FP op)

This is a common pattern for ARM targets when functions have __fp16
arguments, as they are passed as floats (so that they get passed in the
correct registers), but then bitcast and truncated to ignore the top 16
bits.

llvm-svn: 245832
2015-08-24 09:47:45 +00:00
Scott Douglass bdef60462d [ARM] Use AEABI helpers for i64 div and rem
Differential Revision: http://reviews.llvm.org/D12232

llvm-svn: 245830
2015-08-24 09:17:18 +00:00
Scott Douglass d2974a6afa [ARM] Refactor LowerDivRem before adding LowerREM (nfc)
Differential Revision: http://reviews.llvm.org/D12230

llvm-svn: 245829
2015-08-24 09:17:11 +00:00
Michael Zuckerman 2fe19db94f first commit to llvm
llvm-svn: 245825
2015-08-24 07:48:50 +00:00
Mehdi Amini d134a67ce9 Require Dominator Tree For SROA, improve compile-time
TL-DR: SROA is followed by EarlyCSE which requires the DominatorTree.
There is no reason not to require it up-front for SROA.

Some history is necessary to understand why we ended-up here.

r123437 switched the second (Legacy)SROA in the optimizer pipeline to
use SSAUpdater in order to avoid recomputing the costly
DominanceFrontier. The purpose was to speed-up the compile-time.

Later r123609 removed the need for the DominanceFrontier in
(Legacy)SROA.

Right after, some cleanup was made in r123724 to remove any reference
to the DominanceFrontier. SROA existed in two flavors: SROA_SSAUp and
SROA_DT (the latter replacing SROA_DF).
The second argument of `createScalarReplAggregatesPass` was renamed
from `UseDomFrontier` to `UseDomTree`.
I believe this is were a mistake was made. The pipeline was not
updated and the call site was still:
    PM->add(createScalarReplAggregatesPass(-1, false));

At that time, SROA was immediately followed in the pipeline by
EarlyCSE which required alread the DominatorTree. Not requiring
the DominatorTree in SROA didn't save anything, but unfortunately
it was lost at this point.

When the new SROA Pass was introduced in r163965, I believe the goal
was to have an exact replacement of the existing SROA, this bug
slipped through.

You can see currently:

$ echo "" | clang -x c++  -O3 -c - -mllvm -debug-pass=Structure
...
...
      FunctionPass Manager
        SROA
        Dominator Tree Construction
        Early CSE

After this patch:

$ echo "" | clang -x c++  -O3 -c - -mllvm -debug-pass=Structure
...
...
      FunctionPass Manager
        Dominator Tree Construction
        SROA
        Early CSE

This improves the compile time from 88s to 23s for PR17855.
https://llvm.org/bugs/show_bug.cgi?id=17855

And from 113s to 12s for PR16756
https://llvm.org/bugs/show_bug.cgi?id=16756

Reviewers: chandlerc

Differential Revision: http://reviews.llvm.org/D12267

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 245820
2015-08-23 22:15:49 +00:00
David Majnemer b01aa9f794 [IR] Cleanup EH instructions a little bit
Just a cosmetic change, no functionality change is intended.

llvm-svn: 245818
2015-08-23 19:22:31 +00:00
Simon Pilgrim 2a7049abe0 [DAGCombiner] Fold CONCAT_VECTORS of bitcasted EXTRACT_SUBVECTOR
Minor generalization of D12125 - peek through any bitcast to the original vector that we're extracting from.

llvm-svn: 245814
2015-08-23 15:22:14 +00:00
Frederic Riss 7bb12261a3 [dwarfdump] Do not apply relocations in mach-o files if there is no LoadedObjectInfo.
Not only do we not need to do anything to read correct values from the
object files, but the current logic actually wrongly applies twice the
section base address when there is no LoadedObjectInfo passed to the
DWARFContext creation (as the added test shows).

Simply do not apply any relocations on the mach-o debug info if there is
no load offset to apply.

llvm-svn: 245807
2015-08-23 04:44:21 +00:00
Mehdi Amini a758398833 Add missing break in AArch64DAGToDAGISel::Select() switch case
Reported by coverity.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 245800
2015-08-23 00:42:57 +00:00
Mehdi Amini 5aa7bd7d62 Do not use dyn_cast<> after isa<>
Reported by coverity.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 245799
2015-08-23 00:27:57 +00:00
Joseph Tremoulet 8220bcc570 [WinEH] Require token linkage in EH pad/ret signatures
Summary:
WinEHPrepare is going to require that cleanuppad and catchpad produce values
of token type which are consumed by any cleanupret or catchret exiting the
pad.  This change updates the signatures of those operators to require/enforce
that the type produced by the pads is token type and that the rets have an
appropriate argument.

The catchpad argument of a `CatchReturnInst` must be a `CatchPadInst` (and
similarly for `CleanupReturnInst`/`CleanupPadInst`).  To accommodate that
restriction, this change adds a notion of an operator constraint to both
LLParser and BitcodeReader, allowing appropriate sentinels to be constructed
for forward references and appropriate error messages to be emitted for
illegal inputs.

Also add a verifier rule (noted in LangRef) that a catchpad with a catchpad
predecessor must have no other predecessors; this ensures that WinEHPrepare
will see the expected linear relationship between sibling catches on the
same try.

Lastly, remove some superfluous/vestigial casts from instruction operand
setters operating on BasicBlocks.

Reviewers: rnk, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12108

llvm-svn: 245797
2015-08-23 00:26:33 +00:00
David Blaikie 3c338f3a7e Verifier: Don't crash on null entries in debug info retained types list
There was already a good error path for this. Added a test for it & made
a minor code change to ensure the error path was actually reached,
rather than crashing before we got that far.

llvm-svn: 245795
2015-08-22 22:36:40 +00:00
Jingyue Wu fcec09866a [NVPTX] Allow undef value as global initializer
Summary:
__shared__ variable may now emit undef value as initializer, do not
throw error on that.

Test Plan: test/CodeGen/NVPTX/global-addrspace.ll

Patch by Xuetian Weng

Reviewers: jholewinski, tra, jingyue

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D12242

llvm-svn: 245785
2015-08-22 05:40:26 +00:00
Peter Collingbourne c7b675f48c LTO: Maintain target triple, FeatureStr and CGOptLevel in the module or LTOCodeGenerator.
This makes it easier to create new TargetMachines on demand.

llvm-svn: 245781
2015-08-22 02:25:53 +00:00
Matt Arsenault 0a3ac1be43 AMDGPU: Allow specifying different opcode on VI for SMRD/SMEM
Although the basic s_load_* instructions happen to use the same
opcode, some of the special case SMRD instructions have
different opcodes.

llvm-svn: 245775
2015-08-22 00:54:31 +00:00
Matt Arsenault e8df879948 AMDGPU: Improve accuracy of instruction rates for some FP instructions
llvm-svn: 245774
2015-08-22 00:50:41 +00:00
Matt Arsenault 33010103b7 AMDGPU: Use DFS to avoid second loop over function
llvm-svn: 245772
2015-08-22 00:43:38 +00:00
Matt Arsenault c8d8e4ed76 AMDGPU: Make sure to run verifier after SIFixSGPRLiveRanges
llvm-svn: 245769
2015-08-22 00:19:34 +00:00
Matt Arsenault aba29d6ab1 AMDGPU: Improve debug printing in SIFixSGPRLiveRanges
llvm-svn: 245768
2015-08-22 00:19:25 +00:00
Matt Arsenault 6adf07a92e AMDGPU: Move CI instructions into CIInstructions.td
There are still a couple of CI patterns left in SIInstructions.

llvm-svn: 245767
2015-08-22 00:16:34 +00:00
Matt Arsenault f56872dc30 AMDGPU: Minor cleanups to help with f16 support
The main change is inverting the condition for the
operand class classes so that VT.Size == 16 uses VGPR_32
instead of 64.

llvm-svn: 245764
2015-08-21 23:49:51 +00:00
JF Bastien 057292a76c Improve the determinism of MergeFunctions
Summary:

Merge functions previously relied on unsigned comparisons of pointer values to
order functions. This caused observable non-determinism in the compiler for
large bitcode programs. Basically, opt -mergefuncs program.bc | md5sum produces
different hashes when run repeatedly on the same machine. Differing output was
observed on three large bitcodes, but it was less frequent on the smallest file.
It is possible that this only manifests on the large inputs, hence remaining
undetected until now.

This patch fixes this by removing (almost, see below) all places where
comparisons between pointers are used to order functions. Most of these changes
are local, but the comparison of global values requires assigning an identifier
to each local in the order it is visited. This is very similar to the way the
comparison function identifies Value*'s defined within a function. Because the
order of visiting the functions and their subparts is deterministic, the
identifiers assigned to the globals will be as well, and the order of functions
will be deterministic.

With these changes, there is no more observed non-determinism. There is also
only minor slowdowns (negligible to 4%) compared to the baseline, which is
likely a result of the fact that global comparisons involve hash lookups and not
just pointer comparisons.

The one caveat so far is that programs containing BlockAddress constants can
still be non-deterministic. It is not clear what the right solution is here. In
particular, even if the global numbers are used to order by function, we still
need a way to order the BasicBlock*'s. Unfortunately, we cannot just bail out
and fail to order the functions or consider them equal, because we require a
total order over functions. Note that programs with BlockAddress constants are
relatively rare, so the impact of leaving this in is minor as long as this pass
is opt-in.

Author: jrkoenig

Reviewers: nlewycky, jfb, dschuff

Subscribers: jevinskie, llvm-commits, chapuni

Differential revision: http://reviews.llvm.org/D12168

llvm-svn: 245762
2015-08-21 23:27:24 +00:00
Adam Nemet 4e533ef7a9 [LAA] Hold bounds via ValueHandles during SCEV expansion
SCEV expansion can invalidate previously expanded values.  For example
in SCEVExpander::ReuseOrCreateCast, if we already have the requested
cast value but it's not at the desired location, a new cast is inserted
and the old cast will be invalidated.

Therefore, when expanding the bounds for the pointers, a later entry can
invalidate the IR value for an earlier one.  The fix is to store a value
handle rather than the value itself.

The newly added test has a more detailed description of how the bug
triggers.

This bug can have a negative but potentially highly variable performance
impact in Loop Distribution.  Because one of the bound values was
invalidated and is an undef expression now, InstCombine is free to
transform the array overlap check:

   Start0 <= End1 && Start1 <= End0

into:

   Start0 <= End1

So depending on the runtime location of the arrays, we would detect a
conflict and fall back on the original loop of the versioned loop.

Also tested compile time with SPEC2006 LTO bc files.

llvm-svn: 245760
2015-08-21 23:19:57 +00:00
Tyler Nowicki 552a62fabc Standardized 'failed' to 'Failed' in LoopVectorizationRequirements.
llvm-svn: 245759
2015-08-21 23:03:24 +00:00
Peter Collingbourne 44ee84eec5 LTO: Change signature of LTOCodeGenerator::setCodePICModel() to take a Reloc::Model.
This allows us to remove a bunch of code in LTOCodeGenerator and llvm-lto
and has the side effect of improving error handling in the libLTO C API.

llvm-svn: 245756
2015-08-21 22:57:17 +00:00
Tom Stellard bd8a0856e2 AMDGPU/SI: Better handle s_wait insertion
We can wait on either VM, EXP or LGKM.
The waits are independent.

Without this patch, a wait inserted because of one of them
would also wait for all the previous others.
This patch makes s_wait only wait for the ones we need for the next
instruction.

Here's an example of subtle perf reduction this patch solves:

This is without the patch:

buffer_load_format_xyzw v[8:11], v0, s[44:47], 0 idxen
buffer_load_format_xyzw v[12:15], v0, s[48:51], 0 idxen
s_load_dwordx4 s[44:47], s[8:9], 0xc
s_waitcnt lgkmcnt(0)
buffer_load_format_xyzw v[16:19], v0, s[52:55], 0 idxen
s_load_dwordx4 s[48:51], s[8:9], 0x10
s_waitcnt vmcnt(1)
buffer_load_format_xyzw v[20:23], v0, s[44:47], 0 idxen

The s_waitcnt vmcnt(1) is useless.
The reason it is added is because the last
buffer_load_format_xyzw needs s[44:47], which was issued
by the first s_load_dwordx4. It waits for all VM
before that call to have finished.

Internally after every instruction, 3 counters (for VM, EXP and LGTM)
are updated after every instruction. For example buffer_load_format_xyzw
will
increase the VM counter, and s_load_dwordx4 the LGKM one.

Without the patch, for every defined register,
the current 3 counters are stored, and are used to know
how long to wait when an instruction needs the register.

Because of that, the s[44:47] counter includes that to use the register
you need to wait for the previous buffer_load_format_xyzw.

Instead this patch stores only the counters that matter for the
register,
and puts zero for the other ones, since we don't need any wait for them.

Patch by: Axel Davy

Differential Revision: http://reviews.llvm.org/D11883

llvm-svn: 245755
2015-08-21 22:47:27 +00:00
Sanjoy Das c86c162a58 Re-apply r245635, "[InstCombine] Transform A & (L - 1) u< L --> L != 0"
The original checkin was buggy, this change has a fix.

Original commit message:

[InstCombine] Transform A & (L - 1) u< L --> L != 0

Summary:

This transform is never a pessimization at the IR level (since it
replaces an `icmp` with another), and has potentiall payoffs:

 1. It may make the `icmp` fold away or become loop invariant.
 2. It may make the `A & (L - 1)` computation dead.

This shows up in Java, in range checks generated by array accesses of
the form `a[i & (a.length - 1)]`.

Reviewers: reames, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12210

llvm-svn: 245753
2015-08-21 22:22:37 +00:00
David Blaikie 47bf5c019d Range-for-ify some things in GlobalMerge
llvm-svn: 245752
2015-08-21 22:19:06 +00:00
David Blaikie 9ed57a9ef0 [opaque pointer types] Fix a few easy places in GlobalMerge that were accessing value types through pointee types
llvm-svn: 245746
2015-08-21 22:00:44 +00:00
Alex Lorenz c1136ef3b8 MIR Serialization: Serialize the pointer IR expression values in the machine
memory operands.

llvm-svn: 245745
2015-08-21 21:54:12 +00:00
Vedant Kumar 366dd9fd2b [ARM] Fix MachO CPU Subtype selection
Differential Revision: http://reviews.llvm.org/D12040

llvm-svn: 245744
2015-08-21 21:52:48 +00:00
Alex Lorenz 5d8b0bd9b0 MIRParser: Split the 'parseIRConstant' method into two methods. NFC.
One variant of this method can be reused when parsing the quoted IR pointer
expressions in the machine memory operands.

llvm-svn: 245743
2015-08-21 21:48:22 +00:00
David Blaikie d583b19569 [opaque pointer types] Push the passing of value types up from Function/GlobalVariable to GlobalObject
(coming next, pushing this up into GlobalValue, so it can store the
value type directly)

llvm-svn: 245742
2015-08-21 21:35:28 +00:00
Hal Finkel ff9639d6b7 [PowerPC] PPCVSXFMAMutate should not segfault on undef input registers
When PPCVSXFMAMutate would look at the input addend register, it would get its
input value number. This would fail, however, if the register was undef,
causing a segfault. Don't segfault (just skip such FMA instructions).

Fixes the test case from PR24542 (although that may have been over-reduced).

llvm-svn: 245741
2015-08-21 21:34:24 +00:00
Alex Lorenz 1de2acd3c2 AsmParser: Save and restore the parsing state for types using SlotMapping.
This commit extends the 'SlotMapping' structure and includes mappings for named
and numbered types in it. The LLParser is extended accordingly to fill out
those mappings at the end of module parsing.

This information is useful when we want to parse standalone constant values
at a later stage using the 'parseConstantValue' method. The constant values
can be constant expressions, which can contain references to types. In order
to parse such constant values, we have to restore the internal named and
numbered mappings for the types in LLParser, otherwise the parser will report
a parsing error. Therefore, this commit also introduces a new method called
'restoreParsingState' to LLParser, which uses the slot mappings to restore
some of its internal parsing state.

This commit is required to serialize constant value pointers in the machine
memory operands for the MIR format.

Reviewers: Duncan P. N. Exon Smith
llvm-svn: 245740
2015-08-21 21:32:39 +00:00
Bruno Cardoso Lopes 7a1483e7d1 [LVI] Use a SmallVector instead of SmallPtrSet. NFC
llvm-svn: 245739
2015-08-21 21:18:26 +00:00
Alex Lorenz f22ca8ad35 MIR Serialization: Print MCSymbol operands.
This commit allows the MIR printer to print the MCSymbol machine operands.
Unfortunately they can't be parsed at this time. I will create a bug that will
track the fact that the MCSymbol operands can't be parsed yet.

llvm-svn: 245737
2015-08-21 21:12:44 +00:00
Sanjay Patel f0bc07f7a5 [x86] enable machine combiner reassociations for 256-bit vector min/max
llvm-svn: 245735
2015-08-21 21:04:21 +00:00
Sanjay Patel dddad10241 remove 'FeatureSlowUAMem' from AMD CPUs based on 10H micro-arch or later
See discussion in D12154 ( http://reviews.llvm.org/D12154 ), AMD Software
Optimization Guides for 10H/12H/15H/16H, and Agner Fog's experimental data.

llvm-svn: 245733
2015-08-21 20:39:17 +00:00
David Blaikie 51973e1088 Add comment as follow up to r245712
llvm-svn: 245730
2015-08-21 20:18:39 +00:00
Sanjay Patel 9e916dc48d [x86] invert logic for attribute 'FeatureFastUAMem'
This is a 'no functional change intended' patch. It removes one FIXME, but adds several more.

Motivation: the FeatureFastUAMem attribute may be too general. It is used to determine if any
sized misaligned memory access under 32-bytes is 'fast'. From the added FIXME comments, however,
you can see that we're not consistent about this. Changing the name of the attribute makes it
clearer to see the logic holes.

Changing this to a 'slow' attribute also means we don't have to add an explicit 'fast' attribute
to new chips; fast unaligned accesses have been standard for several generations of CPUs now.

Differential Revision: http://reviews.llvm.org/D12154

llvm-svn: 245729
2015-08-21 20:17:26 +00:00
David Blaikie 88208840b5 [opaque pointer type]: Pass explicit pointee type when building a constant GEP.
Gets a bit tricky in the ValueMapper, of course - not sure if we should
just expose a list of explicit types for each Value so that the
ValueMapper can be neutral to these special cases (it's OK for things
like load, where the explicit type is the result type - but when that's
not the case, it means plumbing through another "special" type... )

llvm-svn: 245728
2015-08-21 20:16:51 +00:00
Sanjay Patel cf942fa905 [x86] enable machine combiner reassociations for 128-bit vector min/max
llvm-svn: 245715
2015-08-21 18:06:49 +00:00
David Blaikie 401bb64b31 Remove an unnecessary use of pointee types introduced in r194220
David Majnemer (the original author) believes this to be an impossible
condition to reach anyway, and no test cases cover this so we'll go with
that.

llvm-svn: 245712
2015-08-21 17:37:41 +00:00
Yaron Keren 528d8d6092 Disable Visual C++ 2013 Debug mode assert on null pointer in some STL algorithms,
such as std::equal on the third argument. This reverts previous workarounds.

Predefining _DEBUG_POINTER_IMPL disables Visual C++ 2013 headers from defining
it to a function performing the null pointer check. In practice, it's not that
bad since any function actually using the nullptr will seg fault. The other
iterator sanity checks remain enabled in the headers.

Reviewed by Aaron Ballmanþ and Duncan P. N. Exon Smith.

llvm-svn: 245711
2015-08-21 17:31:03 +00:00
Benjamin Kramer 103fc94d2d [APFloat] Remove else after return and replace loop with std::equal. NFC.
llvm-svn: 245707
2015-08-21 16:44:52 +00:00
Eric Christopher e5e302f7e0 Fix typo - symetric -> symmetric.
llvm-svn: 245705
2015-08-21 16:23:39 +00:00
John Brawn eab960c46f [DAGCombiner] Fold together mul and shl when both are by a constant
This is intended to improve code generation for GEPs, as the index value is
shifted by the element size and in GEPs of multi-dimensional arrays the index
of higher dimensions is multiplied by the lower dimension size.

Differential Revision: http://reviews.llvm.org/D12197

llvm-svn: 245689
2015-08-21 10:48:17 +00:00
NAKAMURA Takumi 6a6232818d Revert r245635, "[InstCombine] Transform A & (L - 1) u< L --> L != 0"
It caused miscompilation in clang.

llvm-svn: 245678
2015-08-21 07:46:07 +00:00
Peter Collingbourne 5cd1e8d3ab Linker: Remove empty destructor.
llvm-svn: 245672
2015-08-21 04:51:24 +00:00
Peter Collingbourne ec43d0f356 LTO: Simplify ownership of LTOCodeGenerator::TargetMach.
llvm-svn: 245671
2015-08-21 04:45:57 +00:00
Peter Collingbourne 2257512f87 LTO: Simplify ownership of LTOCodeGenerator::CodegenOptions.
llvm-svn: 245670
2015-08-21 04:45:55 +00:00
James Y Knight 667395f334 [Sparc] Support user-specified stack object overalignment.
Note: I do not implement a base pointer, so it's still impossible to
have dynamic realignment AND dynamic alloca in the same function.

This also moves the code for determining the frame index reference
into getFrameIndexReference, where it belongs, instead of inline in
eliminateFrameIndex.

[Begin long-winded screed]

Now, stack realignment for Sparc is actually a silly thing to support,
because the Sparc ABI has no need for it -- unlike the situation on
x86, the stack is ALWAYS aligned to the required alignment for the CPU
instructions: 8 bytes on sparcv8, and 16 bytes on sparcv9.

However, LLVM unfortunately implements user-specified overalignment
using stack realignment support, so for now, I'm going to go along
with that tradition. GCC instead treats objects which have alignment
specification greater than the maximum CPU-required alignment for the
target as a separate block of stack memory, with their own virtual
base pointer (which gets aligned). Doing it that way avoids needing to
implement per-target support for stack realignment, except for the
targets which *actually* have an ABI-specified stack alignment which
is too small for the CPU's requirements.

Further unfortunately in LLVM, the default canRealignStack for all
targets effectively returns true, despite that implementing that is
something a target needs to do specifically. So, the previous behavior
on Sparc was to silently ignore the user's specified stack
alignment. Ugh.

Yet MORE unfortunate, if a target actually does return false from
canRealignStack, that also causes the user-specified alignment to be
*silently ignored*, rather than emitting an error.

(I started looking into fixing that last, but it broke a bunch of
tests, because LLVM actually *depends* on having it silently ignored:
some architectures (e.g. non-linux i386) have smaller stack alignment
than spilled-register alignment. But, the fact that a register needs
spilling is not known until within the register allocator. And by that
point, the decision to not reserve the frame pointer has been frozen
in place. And without a frame pointer, stack realignment is not
possible. So, canRealignStack() returns false, and
needsStackRealignment() then returns false, assuming everyone can just
go on their merry way assuming the alignment requirements were
probably just suggestions after-all. Sigh...)

Differential Revision: http://reviews.llvm.org/D12208

llvm-svn: 245668
2015-08-21 04:17:56 +00:00
Peter Collingbourne 1dc6a8d179 TransformUtils: Introduce module splitter.
The module splitter splits a module into linkable partitions. It will
be used to implement parallel LTO code generation.

This initial version of the splitter does not attempt to deal with the
somewhat subtle symbol visibility issues around module splitting. These
will be dealt with in a future change.

Differential Revision: http://reviews.llvm.org/D12132

llvm-svn: 245662
2015-08-21 02:48:20 +00:00
NAKAMURA Takumi cf61aae163 SparcAsmParser.cpp: Appease msc x86.
llvm-svn: 245661
2015-08-21 01:12:19 +00:00
Matthias Braun 46e5639806 AArch64: Fix cmp;ccmp ordering
When producing conditional compare sequences for or operations we need
to negate the operands and the finally tested flags. The thing is if we negate
the finally tested flags this equals a logical negation of all previously
emitted expressions. There was a case missing where we have to order OR
expressions so they get emitted first.

This fixes http://llvm.org/PR24459

llvm-svn: 245641
2015-08-20 23:33:34 +00:00
Matthias Braun 266204b7dc AArch64: Do not create CCMP on multiple users.
Create CMP;CCMP sequences from and/or trees does not gain us anything if
the and/or tree is materialized to a GP register anyway. While most of
the code already checked for hasOneUse() there was one important case
missing.

llvm-svn: 245640
2015-08-20 23:33:31 +00:00
David Majnemer 2df38cd0c4 [InstSimplify] add nuw %x, C2 must be at least C2
Use the fact that add nuw always creates a larger bit pattern when
trying to simplify comparisons.

llvm-svn: 245638
2015-08-20 23:01:41 +00:00
Dan Gohman 32907a6b21 [WebAssembly] Mark more operators as Expand.
llvm-svn: 245636
2015-08-20 22:57:13 +00:00
Sanjoy Das e472d8a57a [InstCombine] Transform A & (L - 1) u< L --> L != 0
Summary:
This transform is never a pessimization at the IR level (since it
replaces an `icmp` with another), and has potentiall payoffs:

 1. It may make the `icmp` fold away or become loop invariant.
 2. It may make the `A & (L - 1)` computation dead.

This shows up in Java, in range checks generated by array accesses of
the form `a[i & (a.length - 1)]`.

Reviewers: reames, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12210

llvm-svn: 245635
2015-08-20 22:31:55 +00:00
Michael Zolotukhin 51b00e6d82 [SLP] Propagate 'nontemporal' attribute into vectorized instructions.
llvm-svn: 245633
2015-08-20 22:28:15 +00:00
Michael Zolotukhin 2a3d99fedf [LoopVectorize] Propagate 'nontemporal' attribute into vectorized instructions.
llvm-svn: 245632
2015-08-20 22:27:38 +00:00
Adrian Prantl cbdfdb74d3 Rename Instruction::dropUnknownMetadata() to dropUnknownNonDebugMetadata()
and make it always preserve debug locations, since all callers wanted this
behavior anyway.

This is addressing a post-commit review feedback for r245589.

NFC (inside the LLVM tree).

llvm-svn: 245622
2015-08-20 22:00:30 +00:00
Ahmed Bougacha 0cdc7719f0 [X86] Look for scalar through one bitcast when lowering to VBROADCAST.
Fixes PR23464: one way to use the broadcast intrinsics is:

  _mm256_broadcastw_epi16(_mm_cvtsi32_si128(*(int*)src));

We don't currently fold this, but now that we use native IR for
the intrinsics (r245605), we can look through one bitcast to find
the broadcast scalar.

Differential Revision: http://reviews.llvm.org/D10557

llvm-svn: 245613
2015-08-20 21:02:39 +00:00
Jingyue Wu ca3ef11a9b [NVPTX] truncating 64-bit to 32-bit is free
Summary:
Add an LSR test that exercises isTruncateFree. Without this change, LSR creates
another indvar representing the truncated value.

Reviewers: jholewinski, eliben

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D12058

llvm-svn: 245611
2015-08-20 20:59:02 +00:00
Ahmed Bougacha 1a498705e4 [X86] Replace avx2 broadcast intrinsics with native IR.
Since r245605, the clang headers don't use these anymore.
r245165 updated some of the tests already; update the others, add
an autoupgrade, remove the intrinsics, and cleanup the definitions.

Differential Revision: http://reviews.llvm.org/D10555

llvm-svn: 245606
2015-08-20 20:36:19 +00:00
Adhemerval Zanella e00b497242 [asan] Add ASAN support for AArch64 42-bit VMA
This patch adds support for asan on aarch64-linux with 42-bit VMA
(current default config for 64K pagesize kernels).  The support is
enabled by defining the SANITIZER_AARCH64_VMA to 42 at build time
for both clang/llvm and compiler-rt.  The default VMA is 39 bits.

llvm-svn: 245594
2015-08-20 18:30:40 +00:00
Jingyue Wu 10fcea5d4b [ValueTracking] computeOverflowForSignedAdd and isKnownNonNegative
Summary:
Refactor, NFC

Extracts computeOverflowForSignedAdd and isKnownNonNegative from NaryReassociate to ValueTracking in case
others need it.

Reviewers: reames

Subscribers: majnemer, llvm-commits

Differential Revision: http://reviews.llvm.org/D11313

llvm-svn: 245591
2015-08-20 18:27:04 +00:00
Bruno Cardoso Lopes ed6b9bfeab [LVI] Avoid iterator invalidation in LazyValueInfoCache::threadEdge
Do that by copying out the elements to another SmallPtrSet.
Follow up from r245309.

llvm-svn: 245590
2015-08-20 18:24:54 +00:00
Adrian Prantl baf90fc265 Fix a bug that caused SimplifyCFG to drop DebugLocs.
Instruction::dropUnknownMetadata(KnownSet) is supposed to preserve all
metadata in KnownSet, but the condition for DebugLocs was inverted.

Most users of dropUnknownMetadata() actually worked around this by not
adding LLVMContext::MD_dbg to their list of KnowIDs.
This is now made explicit.

llvm-svn: 245589
2015-08-20 18:24:02 +00:00
Adrian Prantl a317cd2583 Fix a debug location handling bug in GVN.
Caught by the famous "DebugLoc describes the currect SubProgram" assertion.

When GVN is removing a nonlocal load it updates the debug location of the
SSA value it replaced the load with with the one of the load. In the
testcase this actually overwrites a valid debug location with an empty one.

In reality GVN has to make an arbitrary choice between two equally valid
debug locations. This patch changes to behavior to only update the
location if the value doesn't already have a debug location.

llvm-svn: 245588
2015-08-20 18:23:56 +00:00
Adam Nemet e48134093d [LVer] Fix FIXME: hide addPHINodes, NFC
Since Ashutosh made findDefsUsedOutsideOfLoop public, we can clean this
up.

Now clients that don't compute DefsUsedOutsideOfLoop can just call
versionLoop() and computing DefsUsedOutsideOfLoop will happen
implicitly.  With that there is no reason to expose addPHINodes anymore.

Ashutosh, you can now drop the calls to findDefsUsedOutsideOfLoop and
addPHINodes in LVerLICM and things should just work.

llvm-svn: 245579
2015-08-20 17:22:29 +00:00
James Molloy bf17009a97 [ARM] Don't try and custom lower a vNi64 SETCC.
It won't go well. We've already marked 64-bit SETCCs as non-Custom, but it's just possible that a SETCC has a legal result type but an illegal operand type. If this happens, bail out before we create unselectable nodes.

Fixes PR24292. I tried to create a testcase but in 99% of cases we can't trigger this - not surprising that this bug has been latent since 2009.

llvm-svn: 245577
2015-08-20 16:33:44 +00:00
Rafael Espindola c30c7c493f Fix symbol value computation when part of the expression is weak.
This matches the behaviour of the gnu assembler and is part of
fixing pr24486.

llvm-svn: 245576
2015-08-20 16:18:30 +00:00
Douglas Katzman 58195a2d74 [Sparc]: correct the 'set' synthetic instruction
Differential Revision: http://reviews.llvm.org/D12194

llvm-svn: 245575
2015-08-20 16:16:16 +00:00
Balaram Makam ccf59731e3 Optimize bitwise even/odd test (-x&1 -> x&1) to not use negation.
Summary: We know that -x & 1 is equivalent to x & 1, avoid using negation for testing if a negative integer is even or odd.

Reviewers: majnemer

Subscribers: junbuml, mssimpso, gberry, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D12156

llvm-svn: 245569
2015-08-20 15:35:00 +00:00
Marina Yatsina bce1ab67a5 [X86] Fix FBLD and FBSTP
FBLD and FBSTP should receive TBYTE because it is defined as
FBLD m80
FBSTP m80

Differential Revision: http://reviews.llvm.org/D11748

llvm-svn: 245553
2015-08-20 11:51:24 +00:00
Marina Yatsina 7a4e1ba737 [X86] Fix bug in COMISD and COMISS definition in td files
COMISD should receive QWORD because it is defined as
 (V)COMISD xmm1, xmm2/m64

COMISS should receive DWORD because it is defined as
 (V)COMISS xmm1, xmm2/m32

Differential Revision: http://reviews.llvm.org/D11712

llvm-svn: 245551
2015-08-20 11:21:36 +00:00
Benjamin Kramer fcdb1c14ac Make helper functions static. NFC.
llvm-svn: 245549
2015-08-20 09:57:22 +00:00
David Majnemer cfc1df553e [X86] Fix the (shl (and (setcc_c), c1), c2) -> (and setcc_c, (c1 << c2)) fold
We didn't check for the necessary preconditions before folding a
mask/shift into a single mask.

This fixes PR24516.

llvm-svn: 245544
2015-08-20 09:00:56 +00:00
Bjorn Steinbrink 2e2f66557e Revert "[DSE] Enable removal of lifetime intrinsics in terminating blocks"
llvm-svn: 245543
2015-08-20 08:58:47 +00:00
Bjorn Steinbrink cc7e8a9705 [DSE] Enable removal of lifetime intrinsics in terminating blocks
Usually DSE is not supposed to remove lifetime intrinsics, but it's
actually ok to remove them for dead objects in terminating blocks,
because they convey no extra information there. Until we hit a lifetime
start that cannot be removed, that is. Because from that point on the
lifetime intrinsics become interesting again, e.g. for stack coloring.

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11710

llvm-svn: 245542
2015-08-20 08:25:28 +00:00
Chandler Carruth 0f792189a4 [ARC] Pull the ObjC ARC components that really serve the role of
analyses into LLVM's Analysis library rather than having them in
a Transforms library.

This is motivated by the need to have the core AliasAnalysis
infrastructure be aware of the ObjCARCAliasAnalysis. However, it also
seems like a nice and clean separation. Everything was very easy to move
and this doesn't create much clutter in the analysis library IMO.

Differential Revision: http://reviews.llvm.org/D12133

llvm-svn: 245541
2015-08-20 08:06:03 +00:00
Hal Finkel 9fdce9adee [PowerPC] Fix value type on XVCMPEQDP for v2f64 comparisons
XVCMPEQDP is used for VSX v2f64 equality comparisons, but the value type needs
to be v2i64 (as that's the corresponding SETCC type).

Fixes PR24225.

llvm-svn: 245535
2015-08-20 03:02:02 +00:00
Hal Finkel be78c25acb [PowerPC] Fix the int2fp(fp2int(x)) DAGCombine to ignore ppc_fp128
This DAGCombine was creating custom SDAG nodes with an illegal ppc_fp128
operand type because it was triggering on f64/f32 int2fp(fp2int(ppc_fp128 x)),
but shouldn't (it should only apply to f32/f64 types). The result was a crash.

llvm-svn: 245530
2015-08-20 01:18:20 +00:00
Alex Lorenz 36efd3883d MIR Serialization: Use the global value syntax for global value memory operands.
This commit modifies the serialization syntax so that the global IR values in
machine memory operands use the global value '@<name>' syntax instead of the
current '%ir.<name>' syntax.

The unnamed global IR values are handled by this commit as well, as the
existing global value parsing method can parse the unnamed globals already.

llvm-svn: 245527
2015-08-20 00:20:03 +00:00
Alex Lorenz 0d009645a1 MIR Serialization: Change syntax for the call entry pseudo source values.
The global IR values in machine memory operands should use the global value
'@<name>' syntax instead of the current '%ir.<name>' syntax.

However, the global value call entry pseudo source values use the global value
syntax already. Therefore, the syntax for the call entry pseudo source values
has to be changed so that the global values and call entry global value PSVs
can be parsed without ambiguities.

llvm-svn: 245526
2015-08-20 00:12:57 +00:00
Alex Lorenz dbd22a9a6c Fix test failure introduced by r245521.
Machine memory operands can contain pointer values that are constants, and
the 'getLocalSlot' method requires non-constant values.

The constant pointer values will have to be serialized in a different patch.

llvm-svn: 245523
2015-08-19 23:56:37 +00:00
Alex Lorenz dd13be0bcc MIR Serialization: Serialize unnamed local IR values in memory operands.
llvm-svn: 245521
2015-08-19 23:31:05 +00:00
Alex Lorenz 36593ac51b MIR Parser: parseIRValue should take in a constant pointer. NFC.
llvm-svn: 245520
2015-08-19 23:27:07 +00:00
Alex Lorenz 55dc6f8165 MIR Printer: Extract the code that prints IR slots to a separate function. NFC.
This code can be reused when printing references to unnamed local IR values.

llvm-svn: 245519
2015-08-19 23:24:37 +00:00
Sanjay Patel 9e5927fdc3 [x86] enable machine combiner reassociations for scalar double-precision min/max
llvm-svn: 245506
2015-08-19 21:27:27 +00:00
Sanjay Patel 4e3ee1e548 [x86] enable machine combiner reassociations for scalar single-precision maximums
llvm-svn: 245504
2015-08-19 21:18:46 +00:00
Simon Pilgrim 35f528262f [DAGCombiner] Added SMAX/SMIN/UMAX/UMIN constant folding
We still need to add constant folding of vector comparisons to fold the tests for targets that don't support the respective min/max nodes

I needed to update 2011-12-06-AVXVectorExtractCombine to load a vector instead of using a constant vector to prevent it folding

Differential Revision: http://reviews.llvm.org/D12118

llvm-svn: 245503
2015-08-19 21:11:58 +00:00
Juergen Ributzka b12248e9cd [AArch64][FastISel] Don't fold shifts with UB.
We are already falling back to SelectionDAG when encountering an shift with UB.
This adds the same checks for shifts with UB that get folded into arithmetic or
logical operations.

This fixes rdar://problem/22345295.

llvm-svn: 245499
2015-08-19 20:52:55 +00:00
David Majnemer f25fe64716 [X86] Emit more efficient >= comparisons against 0
We don't do a great job with >= 0 comparisons against zero when the
result is used as an i8.

Given something like:
  void f(long long LL, bool *B) {
    *B = LL >= 0;
  }

We used to generate:
  shrq    $63, %rdi
  xorb    $1, %dil
  movb    %dil, (%rsi)

Now we generate:
  testq   %rdi, %rdi
  setns   (%rsi)

Differential Revision: http://reviews.llvm.org/D12136

llvm-svn: 245498
2015-08-19 20:51:40 +00:00
Dan Gohman dde8dce6a9 [WebAssembly] Use the default alignment for SIMD types.
Previously WebAssembly's datalayout string had -v128:8:128. This had been an
attempt to declare a certain level of support for unaligned SIMD accesses.
However, clang makes its own determinations for SIMD alignment that are
independent of the datalayout string, so this wasn't actually meaningful.

llvm-svn: 245494
2015-08-19 20:30:20 +00:00
Simon Pilgrim 989cbbd2f5 [DAGCombiner] Fold CONCAT_VECTORS of EXTRACT_SUBVECTOR (or undef) to VECTOR_SHUFFLE.
Check to see if this is a CONCAT_VECTORS of a bunch of EXTRACT_SUBVECTOR operations. If so, and if the EXTRACT_SUBVECTOR vector inputs come from at most two distinct vectors the same size as the result, attempt to turn this into a legal shuffle.

Differential Revision: http://reviews.llvm.org/D12125

llvm-svn: 245490
2015-08-19 20:09:50 +00:00
David Majnemer ba275f9947 Replace some calls to isa<LandingPadInst> with isEHPad()
No functionality change is intended.

llvm-svn: 245487
2015-08-19 19:54:02 +00:00
Douglas Katzman 2362b69dd9 [Sparc]: asm-only support for the ldstub instruction.
llvm-svn: 245485
2015-08-19 19:30:57 +00:00
Alex Lorenz feb6b4395b MIR Parser: Rename 'MachineOperandWithLocation' to 'ParsedMachineOperand'. NFC.
Besides storing the operand's source range, this structure now stores other
attributes as well, so the name should reflect this fact.

llvm-svn: 245483
2015-08-19 19:19:16 +00:00
Alex Lorenz 5ef93b0c4c MIR Serialization: Serialize instruction's register ties.
This commit serializes the machine instruction's register operand ties.
The ties are printed out only when the instructon has register ties that are
different from the ties that are specified in the instruction's description.

llvm-svn: 245482
2015-08-19 19:05:34 +00:00
Nemanja Ivanovic 5f1cea4141 Temporary fix for the self-host failures introduced by rL244921.
This revision has introduced an issue that only affects bootstrapped compiler
when it is printing the ASM. I am working on resolving the issue, but in the
meantime, I'm disabling the legalization of scalar_to_vector operation for v2i64
and the associated testing until I can get this fixed.

llvm-svn: 245481
2015-08-19 19:04:47 +00:00
Alex Lorenz e66a7ccf77 MIR Serialization: Serialize defined registers that require 'def' register flag.
The defined registers are already serialized - they are represented by placing
them before the '=' in a machine instruction. However, certain instructions like
INLINEASM can have defined register operands after the '=', so this commit
introduces the 'def' register flag for such operands.

llvm-svn: 245480
2015-08-19 18:55:47 +00:00
Bruno Cardoso Lopes 27fd06922b [PeepholeOptimizer] Look through PHIs to find additional register sources
Reintroduce r245442. Remove an overly conservative assertion introduced
in r245442. We could replace the assertion to use `shareSameRegisterFile`
instead, but in that point in `insertPHI` we already lost the original
Def subreg to check against. So drop the assertion completely.

Original commit message:

- Teaches the ValueTracker in the PeepholeOptimizer to look through PHI
instructions.
- Add findNextSourceAndRewritePHI method to lookup into multiple sources
returnted by the ValueTracker and rewrite PHIs with new sources.

With these changes we can find more register sources and rewrite more
copies to allow coaslescing of bitcast instructions. Hence, we eliminate
unnecessary VR64 <-> GR64 copies in x86, but it could be extended to
other archs by marking "isBitcast" on target specific instructions. The
x86 example follows:

A:
  psllq %mm1, %mm0
  movd  %mm0, %r9
  jmp C

B:
  por %mm1, %mm0
  movd  %mm0, %r9
  jmp C

C:
  movd  %r9, %mm0
  pshufw  $238, %mm0, %mm0

Becomes:

A:
  psllq %mm1, %mm0
  jmp C

B:
  por %mm1, %mm0
  jmp C

C:
  pshufw  $238, %mm0, %mm0

Differential Revision: http://reviews.llvm.org/D11197
rdar://problem/20404526

llvm-svn: 245479
2015-08-19 18:53:36 +00:00
Douglas Katzman e5485c651e [SPARC] Enable writing to floating-point-state register.
llvm-svn: 245475
2015-08-19 18:34:48 +00:00
Ahmed Bougacha 9e00ec6195 [AArch64] Improve short-form diags on long-form Match_InvalidOperand.
Since r244955, we try to use the short-form ErrorInfo when both
tries failed, and the long-form match failed on a suffix operand.
However, this means we sometimes mix ErrorInfo and MatchResult
(one manifestation of this being PR24498). Instead, restore both.

llvm-svn: 245469
2015-08-19 17:40:19 +00:00
Hal Finkel ff08a2ecad [SCEV] Fix GCC 4.8.0 ICE in lambda function
Rewrite some code to not use a lambda function. The non-lambda code is just
about as clean as the original, and not any longer. The lambda function causes
an internal compiler error in GCC 4.8.0, and it is not worth breaking support
for that compiler over this. NFC.

llvm-svn: 245466
2015-08-19 17:26:07 +00:00
Adam Nemet cdb791cd33 [LAA] Comment how memchecks are codegened
llvm-svn: 245465
2015-08-19 17:24:36 +00:00
Renato Golin eb552e83e0 Revert "[AArch64] Simplify/refactor code to ease code review. NFC."
This reverts commit r245443, as it broke AArch64 test-suite tramp3d
with an assert "Reg && "Null register has no regunits".

llvm-svn: 245455
2015-08-19 16:29:53 +00:00
Derek Schuff 55817ee604 x32. Fixes a bug in x32 exception handling.
This patch updates the X86 lowering so that the Exception Pointer and Selector
are 64-bit wide only if Subtarget.isTarget64BitLP64.

Patch by João Porto

Reviewers: dschuff, rnk
Differential Revision: http://reviews.llvm.org/D12111

llvm-svn: 245454
2015-08-19 16:28:21 +00:00
JF Bastien 5ab87edbb4 x32. Fixes jmp %reg in x32
x32 has 32-bit pointers; x86-64 can't jmp %r32. This patch addresses this issue by explicitly zero-extending brind's target to 64-bits.

Author: jpp

Reviewers: jfb, dschuff, pavel.v.chupin

Subscribers: llvm-commits

Differential revision: http://reviews.llvm.org/D12112

llvm-svn: 245452
2015-08-19 16:17:08 +00:00
James Y Knight 3b0fd753c4 [Sparc] Rename LoadASR and StoreASR from r245360 to *ASI, as was intended.
llvm-svn: 245450
2015-08-19 15:59:49 +00:00
Bruno Cardoso Lopes 61009142b8 Revert "[PeepholeOptimizer] Look through PHIs to find additional register sources"
Revert r245442 while investigating a fix. An assertion hit in
http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/11380

llvm-svn: 245446
2015-08-19 15:10:32 +00:00
James Y Knight d966fb6fef [SPARC] Fix BooleanContents, so that select of a trunc doesn't
eliminate the trunc.

Differential Revision: http://reviews.llvm.org/D10442

llvm-svn: 245444
2015-08-19 14:47:04 +00:00
Chad Rosier 494abf1ad8 [AArch64] Simplify/refactor code to ease code review. NFC.
llvm-svn: 245443
2015-08-19 14:34:54 +00:00
Bruno Cardoso Lopes 0a1c126684 [PeepholeOptimizer] Look through PHIs to find additional register sources
Reapply r243486.

- Teaches the ValueTracker in the PeepholeOptimizer to look through PHI
instructions.
- Add findNextSourceAndRewritePHI method to lookup into multiple sources
returnted by the ValueTracker and rewrite PHIs with new sources.

With these changes we can find more register sources and rewrite more
copies to allow coaslescing of bitcast instructions. Hence, we eliminate
unnecessary VR64 <-> GR64 copies in x86, but it could be extended to
other archs by marking "isBitcast" on target specific instructions. The
x86 example follows:

A:
  psllq %mm1, %mm0
  movd  %mm0, %r9
  jmp C

B:
  por %mm1, %mm0
  movd  %mm0, %r9
  jmp C

C:
  movd  %r9, %mm0
  pshufw  $238, %mm0, %mm0

Becomes:

A:
  psllq %mm1, %mm0
  jmp C

B:
  por %mm1, %mm0
  jmp C

C:
  pshufw  $238, %mm0, %mm0

Differential Revision: http://reviews.llvm.org/D11197
rdar://problem/20404526

llvm-svn: 245442
2015-08-19 14:34:41 +00:00
Silviu Baranga ad1b19fcb7 [ARM] Add instruction selection patterns for vmin/vmax
Summary:
The mid-end was generating vector smin/smax/umin/umax nodes, but
we were using vbsl to generatate the code. This adds the vmin/vmax
patterns and a test to check that we are now generating vmin/vmax
instructions.

Reviewers: rengolin, jmolloy

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D12105

llvm-svn: 245439
2015-08-19 14:11:27 +00:00
Joerg Sonnenberger 7d180c59bb Map %fprs to %asr6 in the Sparc assembler parser.
llvm-svn: 245437
2015-08-19 13:55:14 +00:00
Daniel Sanders 1e97a0b324 Emit <regmask R1 R2 R3 ...> instead of just <regmask> in IR dumps.
Reviewers: qcolombet

Subscribers: kparzysz, qcolombet, llvm-commits

Differential Revision: http://reviews.llvm.org/D11644

llvm-svn: 245433
2015-08-19 12:03:04 +00:00
Tobias Grosser 85508e804b Revert "[X86] Widen the 'AND' mask if doing so shrinks the encoding size"
This reverts commit 245169 which miscompiles MultiSource/Applications/siod
from LNT.

llvm-svn: 245432
2015-08-19 11:35:10 +00:00
Michael Kuperstein 9fe42604aa [X86] Do not lower scalar sdiv/udiv to a shifts + mul sequence when optimizing for minsize
There are some cases where the mul sequence is smaller, but for the most part,
using a div is preferable. This does not apply to vectors, since x86 doesn't
have vector idiv, and a vector mul/shifts sequence ought to be smaller than a
scalarized division.

Differential Revision: http://reviews.llvm.org/D12082

llvm-svn: 245431
2015-08-19 11:21:43 +00:00
Michael Kuperstein dcdab4cd3a [TLI] Refactor "is integer division cheap" queries.
This removes the isPow2SDivCheap() query, as it is not currently used in
any meaningful way. isIntDivCheap() no longer relies on a state variable
(as all in-tree target set it to false), but the interface allows querying
based on the type optimization level.

NFC.

Differential Revision: http://reviews.llvm.org/D12082

llvm-svn: 245430
2015-08-19 11:17:59 +00:00
Nick Lewycky 1098e496e1 More clean up, still NFC. Remove dead variables now that the casts are gone.
llvm-svn: 245420
2015-08-19 06:25:30 +00:00
Nick Lewycky 2c852543a3 Clean up this file a little. Remove dead casts, casting Values to Values. Adjust some comments for typos and whitespace. NFC.
llvm-svn: 245419
2015-08-19 06:22:33 +00:00
Ashutosh Nema c5b7b55589 Exposed findDefsUsedOutsideOfLoop as a loop utility function
Exposed findDefsUsedOutsideOfLoop as a loop utility function by moving 
it from LoopDistribute to LoopUtils.

Reviewed By: anemet

llvm-svn: 245416
2015-08-19 05:40:42 +00:00
Chandler Carruth 44a1385c45 [LPM] Teach the legacy pass manager to support *using* an analysis
without *requiring* it.

This allows a pass indicate that it will use an analysis if available
(through getAnalysisIfAvailable). When the pass manager knows this, it
will refrain from deleting that analysis if it can. Naturally, it will
still get invalidated at the correct time. These passes are not
considered when scheduling the pass pipeline, so typically they will
require manual scheduling, but this may also allow passes with
getAnalysisIfAvailable to find the analysis more often if nothing after
them requires that analysis and it wasn't invalidated.

I don't have a particular use case with the current passes, but with my
new structure for alias analyses, this will be very useful. We want to
allow people to customize the set of AAs available by scheduling
additional passes. These's aren't ever *required* for obvious reasons.
So we need some way to mark in the legacy pass manager that they will
still be used if available.

This is essentially how analysis groups already work. But this makes the
feature generally available and more explicit. It should allow the AA
change to not impact how people trigger a custom alias analysis being
available at a certain point in compilation.

Differential Revision: http://reviews.llvm.org/D12114

llvm-svn: 245409
2015-08-19 03:02:12 +00:00
Hal Finkel 0ef2b10f16 Fix how DependenceAnalysis calls delinearization
Fix how DependenceAnalysis calls delinearization, mirroring what is done in
Delinearization.cpp (mostly by making sure to call getSCEVAtScope before
delinearizing, and by removing the unnecessary 'Pairs == 1' check).

Patch by Vaivaswatha Nagaraj!

llvm-svn: 245408
2015-08-19 02:56:36 +00:00
Eric Christopher 0efe9f60bb Revert "Fix PR24469 resulting from r245025 and re-enable dead store elimination across basicblocks."
This is causing bootstrap problems, e.g.: http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/2960

This reverts r245195.

llvm-svn: 245402
2015-08-19 02:15:13 +00:00
Hal Finkel a8d205f145 Make ScalarEvolution::isKnownPredicate a little smarter
Here we make ScalarEvolution::isKnownPredicate, indirectly, a little smarter.
Given some relational comparison operator OP, and two AddRec SCEVs, {I,+,S} OP
{J,+,T}, we can reduce this to the comparison I OP J when S == T, both AddRecs
are for the same loop, and both are known not to wrap.

As it turns out, because of the way that backedge-guard expressions can be
leveraged when computing known predicates, this allows indvars to simplify the
if-statement comparison in this loop:

  void foo (int *a, int *b, int n) {
    for (int i = 0; i < n; ++i) {
      if (i > n)
        a[i] = b[i] + 1;
    }
  }

which, somewhat surprisingly, we were not previously optimizing away.

llvm-svn: 245400
2015-08-19 01:51:51 +00:00
Alex Lorenz df9e3c6fb0 MIR Serialization: Serialize MMI's variable debug information.
llvm-svn: 245396
2015-08-19 00:13:25 +00:00
Quentin Colombet b700e357b5 [BasicAA] Revert r221876 because it can produce incorrect aliasing
information: see PR24468.

llvm-svn: 245394
2015-08-19 00:07:20 +00:00
Steve King d4c8f70ce1 Fix backward operands in call to isTruncateFree() and improve comments.
llvm-svn: 245385
2015-08-18 23:02:41 +00:00
Alex Lorenz 607efb6c7e MIR Parser: Return true on error when parsing standalone registers.
llvm-svn: 245384
2015-08-18 22:57:36 +00:00
Alex Lorenz f3630113cd MIR Serialization: Serialize the operand's bit mask target flags.
This commit adds support for bit mask target flag serialization to the MIR
printer and the MIR parser. It also adds support for the machine operand's
target flag serialization to the AArch64 target.

Reviewers: Duncan P. N. Exon Smith
llvm-svn: 245383
2015-08-18 22:52:15 +00:00
Sanjay Patel 5c55fbc5ea use TLI.allowsMemoryAccess() to check if memory accesses are fast; NFCI
This consolidates use of isUnalignedMem32Slow() in one place.
There is a slight change in logic although I'm not sure that it would ever
come up in the real world: we were assuming that an alignment of the type 
size is always fast; now, we actually check the data layout to confirm that.

llvm-svn: 245382
2015-08-18 22:48:12 +00:00
Nick Lewycky 06b0ea2e8f Fix three typos in comments; "easilly" -> "easily".
llvm-svn: 245379
2015-08-18 22:41:58 +00:00
Peter Collingbourne 4cfa086df2 Support: Clean up TSan annotations.
Remove support for Valgrind-based TSan, which hasn't been maintained for a
few years. We now use the TSan annotations only if LLVM is compiled with
-fsanitize=thread. We no longer need the weak function definitions as we
are guaranteed that our program is linked directly with the TSan runtime.

Differential Revision: http://reviews.llvm.org/D12121

llvm-svn: 245374
2015-08-18 22:31:24 +00:00
Alex Lorenz a314d81328 MIR Serialization: Serialize the frame information's stack protector index.
llvm-svn: 245372
2015-08-18 22:26:26 +00:00
Alex Lorenz dc9dadf683 MIR Parser: Extract the code that parses stack object references into a new
method.

This commit extracts the code that parses the stack object references into a
new method named 'parseStackFrameIndex', so that it can be reused when
parsing standalone stack object references.

llvm-svn: 245370
2015-08-18 22:18:52 +00:00
David Majnemer 8e335ca278 [InstSimplify] Remove unused variable
No functionality change is intended.

llvm-svn: 245369
2015-08-18 22:18:22 +00:00
David Majnemer c6bb0e2a51 [InstSimplify] Don't assume getAggregateElement will succeed
It isn't always possible to get a value from getAggregateElement.
This fixes PR24488.

llvm-svn: 245365
2015-08-18 22:07:25 +00:00
David Majnemer 5eaf08ff1f [VectorUtils] Replace 'llvm::' qualification with 'using llvm'
No funcitonal change is intended, this just makes the file look more
like the rest of LLVM.

llvm-svn: 245364
2015-08-18 22:07:20 +00:00
Joerg Sonnenberger b0ce8747c3 Load/store instructions for floating points with address space require SparcV9.
To properly handle this, define the *a instructions as separate
instruction classes by refactoring the LoadA and StoreA multiclasses.
Move the instruction tests into the sparcv9 file to test the difference.

llvm-svn: 245360
2015-08-18 21:31:46 +00:00
Matthias Braun fa3b248a66 DAGCombiner: Improve DAGCombiner select normalization
The current code normalizes select(C0, x, select(C1, x, y)) towards
select(C0|C1, x, y) if the targets prefers that form. This patch adds an
additional rule that if the select(C1, x, y) part already exists in the
function then we want to normalize into the other direction because the
effects of reusing the existing value are bigger than transforming into
the target preferred form.

This addresses regressions following r238793, see also:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150727/290272.html

Differential Revision: http://reviews.llvm.org/D11616

llvm-svn: 245350
2015-08-18 20:48:36 +00:00
Matthias Braun 2e920bd04f DAGCombiner: Optimize SELECTs first before turning them into SELECT_CC
This is part of http://reviews.llvm.org/D11616 - I just decided to split
this up into a separate commit.

llvm-svn: 245349
2015-08-18 20:48:29 +00:00
David Majnemer 0ad363eebc [WinEH] Calculate state numbers for the new EH representation
State numbers are calculated by performing a walk from the innermost
funclet to the outermost funclet.   Rudimentary support for the new EH
constructs has been added to the assembly printer, just enough to test
the new machinery.

Differential Revision: http://reviews.llvm.org/D12098

llvm-svn: 245331
2015-08-18 19:07:12 +00:00
Matthias Braun d55bcf2646 MachineRegisterInfo: Introduce isPhysRegUsed()
This method checks whether a physical regiser or any of its aliases are
used in the function.

Using this function in SIRegisterInfo::findUnusedReg() should also fix
this reported failure:

http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150803/292143.html
http://reviews.llvm.org/rL242173#inline-533

The report doesn't come with a testcase and I don't know enough about
AMDGPU to create one myself.

llvm-svn: 245329
2015-08-18 18:54:27 +00:00
Chandler Carruth 2f02ea462c [LPM] Cleanup some loops to be range based for loops before hacking on
this code. NFC.

llvm-svn: 245327
2015-08-18 18:41:53 +00:00
Chandler Carruth 7adc3a2b0e [PM/AA] Remove the last relics of the separate IPA library from LLVM,
folding the code into the main Analysis library.

There already wasn't much of a distinction between Analysis and IPA.
A number of the passes in Analysis are actually IPA passes, and there
doesn't seem to be any advantage to separating them.

Moreover, it makes it hard to have interactions between analyses that
are both local and interprocedural. In trying to make the Alias Analysis
infrastructure work with the new pass manager, it becomes particularly
awkward to navigate this split.

I've tried to find all the places where we referenced this, but I may
have missed some. I have also adjusted the C API to continue to be
equivalently functional after this change.

Differential Revision: http://reviews.llvm.org/D12075

llvm-svn: 245318
2015-08-18 17:51:53 +00:00
Alex Lorenz eb7c9be43c MIR Parser: Implicit register verifier should accept unexpected implicit
subregister operands.

llvm-svn: 245315
2015-08-18 17:17:13 +00:00
Bruno Cardoso Lopes 1846ea3c71 [LVI] Use a SmallDenseMap instead of std::map for ValueCacheEntryTy
Historically there seems to be some resistance regarding the change to DenseMap
(r147980). However, I couldn't find cases of iterator invalidation for
ValueCacheEntryTy, but only for ValueCache, which I left untouched.

This reduces 20s on an internal testcase. Follow up from r245309.

Differential Revision: http://reviews.llvm.org/D11651

rdar://problem/21320066

llvm-svn: 245314
2015-08-18 16:54:36 +00:00
Sanjay Patel 1cd6d88e4d use minSize wrapper; NFCI
These were missed when other uses were switched over:
http://llvm.org/viewvc/llvm-project?view=revision&revision=243994

llvm-svn: 245311
2015-08-18 16:44:23 +00:00
Bruno Cardoso Lopes 6ac4ea4d29 [LVI] Improve LazyValueInfo compile time performance
Changes in LoopUnroll in the past six months exposed scalability
issues in LazyValueInfo when used from JumpThreading. One internal test
that used to take 20s under -O2 now takes 6min.

This commit change the OverDefinedCache from
DenseSet<std::pair<AssertingVH<BasicBlock>, Value*>> to
DenseMap<AssertingVH<BasicBlock>, SmallPtrSet<Value *, 4>>
and reduces compile time down to 1m40s.

Differential Revision: http://reviews.llvm.org/D11651

rdar://problem/21320066

llvm-svn: 245309
2015-08-18 16:34:27 +00:00
Chad Rosier 3dd0e942b6 [AArch64] Simplify the logic for computing in bounds offset. NFC.
llvm-svn: 245307
2015-08-18 16:20:03 +00:00
Daniel Sanders 63f4a5dcad [mips] Expand JAL instructions when PIC is enabled.
Summary: This is the correct way to handle JAL instructions when PIC is enabled.

Patch by Toma Tabacu

Reviewers: seanbruno, tomatabacu

Subscribers: brooks, seanbruno, emaste, llvm-commits

Differential Revision: http://reviews.llvm.org/D6231

llvm-svn: 245305
2015-08-18 16:18:09 +00:00
Zoran Jovanovic 2fe8466f6e [mips][microMIPS] Implement DDIV, DMOD, DDIVU and DMODU instructions
Differential Revision: http://reviews.llvm.org/D10953

llvm-svn: 245297
2015-08-18 14:40:43 +00:00
Zoran Jovanovic a6593ff613 [mips][microMIPS] Implement SW and SWE instructions
Differential Revision: http://reviews.llvm.org/D10869

llvm-svn: 245293
2015-08-18 12:53:08 +00:00
Daniel Sanders a699444094 [mips] Make the MipsAsmParser capable of knowing whether PIC mode is enabled or not.
Summary:
This information is needed to decide whether we do the PIC-only JAL expansions or not. It's also needed for an upcoming patch which implements the .cprestore assembler directive (which can only be used effectively in PIC mode).

By making this information available to the MipsAsmParser, we will know when to insert the instructions mandated by the .cprestore assembler directive and we will be able to give some useful warnings when we encounter a potential misuse of this directive.

Patch by Toma Tabacu

Reviewers: dsanders, seanbruno

Subscribers: brooks, seanbruno, rafael, llvm-commits

Differential Revision: http://reviews.llvm.org/D5626

llvm-svn: 245291
2015-08-18 12:33:54 +00:00
Michael Kruse c1c9f8a0d5 [Support] On Windows, generate PDF files for graphs and open with associated viewer
Summary: Windows system rarely have good PostScript viewers installed, but PDF viewers are common. So for viewing graphs, generate PDF files and open with the associated PDF viewer using cmd.exe's start command.

Reviewers: Bigcheese, aaron.ballman

Subscribers: aaron.ballman, JakeVanAdrighem, dwiberg, llvm-commits

Differential Revision: http://reviews.llvm.org/D11877

llvm-svn: 245290
2015-08-18 12:17:37 +00:00
Michael Kruse c0a8414c1c [Support] Always wait for GraphViz before opening the viewer
Summary:
When calling DisplayGraph and a PS viewer is chosen, two programs are executed: The GraphViz generator and the PostScript viewer. Always for the generator to finish to ensure that the .ps file is written before opening the viewer for that file. DisplayGraph's wait parameter refers to whether to wait until the user closes the viewer.

This happened on Windows and if none of the options to open the .dot file directly applies, also on Linux.

Reviewers: Bigcheese, chandlerc, aaron.ballman

Subscribers: dwiberg, aaron.ballman, llvm-commits

Differential Revision: http://reviews.llvm.org/D11876

llvm-svn: 245289
2015-08-18 12:13:57 +00:00
Daniel Sanders f1ae367a99 [mips] Correct -Woverflow warning in r245208 without changing signedness of the constant.
This was supposed to have been committed as part of r245208

llvm-svn: 245285
2015-08-18 09:55:57 +00:00
Justin Bogner 9f00ebaeda Revert "Constant propagation after hiting llvm.assume"
This was also failing bootstrap:

http://lab.llvm.org:8080/green/job/clang-stage2-configure-Rlto_build

This reverts r245265.

llvm-svn: 245269
2015-08-18 07:00:34 +00:00
Piotr Padlewski 94ca3783b8 Constant propagation after hiting llvm.assume
After hitting @llvm.assume(X) we can:
- propagate equality that X == true
- if X is icmp/fcmp (with eq operation), and one of operand
  is constant we can change all variables with constants in the same BasicBlock

http://reviews.llvm.org/D11918

llvm-svn: 245265
2015-08-18 03:55:30 +00:00
Dan Gohman ab48abeafa [WebAssembly] Don't default to ELF in the triple.
WebAssembly doesn't yet have a specified binary format, and it may not
end up being ELF, so we don't want the Triple class defaulting to ELF
for it at this time.

llvm-svn: 245254
2015-08-17 22:37:56 +00:00
Guozhi Wei f66d384443 Align SP adjustment in function getSPAdjust
This commit adds a new function TargetFrameLowering::alignSPAdjust
and calls it from TargetInstrInfo::getSPAdjust. It fixes PR24142.

llvm-svn: 245253
2015-08-17 22:36:27 +00:00
Dan Gohman 4e2d799cab [WebAssembly] Make getArchTypePrefix return "wasm".
The arch prefix string isn't currently being used for anything on
WebAssembly, but if it were to be used, it makes sense to use the
same arch prefix string for wasm32 and wasm64.

llvm-svn: 245252
2015-08-17 22:35:40 +00:00
Alex Lorenz a56ba6a6dd MIR Serialization: Serialize the local offsets for the stack objects.
llvm-svn: 245249
2015-08-17 22:17:42 +00:00
Alex Lorenz eb62568625 MIR Serialization: Serialize the memory operand's range metadata node.
llvm-svn: 245247
2015-08-17 22:09:52 +00:00
Alex Lorenz 03e940d1f8 MIR Serialization: Serialize the memory operand's noalias metadata node.
llvm-svn: 245246
2015-08-17 22:08:02 +00:00
Alex Lorenz a16f624dc3 MIR Serialization: Serialize the memory operand's alias scope metadata node.
llvm-svn: 245245
2015-08-17 22:06:40 +00:00
Alex Lorenz a617c9162d MIR Serialization: Serialize the memory operand's TBAA metadata node.
llvm-svn: 245244
2015-08-17 22:05:15 +00:00
David Majnemer 83f4bb23c4 [WinEHPrepare] Replace unreasonable funclet terminators with unreachable
It is possible to be in a situation where more than one funclet token is
a valid SSA value.  If we see a terminator which exits a funclet which
doesn't use the funclet's token, replace it with unreachable.

Differential Revision: http://reviews.llvm.org/D12074

llvm-svn: 245238
2015-08-17 20:56:39 +00:00
Douglas Katzman 685a7d1a70 [SPARC]: recognize '.' as the start of an assembler expression.
llvm-svn: 245232
2015-08-17 19:55:01 +00:00
James Molloy 974838f294 [ARM] Fix crash when targetting CPU without NEON
We emulate a scalar vmin/vmax with NEON instructions as they don't exist in the VFP ISA. So only mark these as legal when NEON is available.

Found here: https://code.google.com/p/chromium/issues/detail?id=521671

llvm-svn: 245231
2015-08-17 19:37:12 +00:00
Igor Laevsky 06044f97d2 [ScalarEvolutionExpander] Reuse findExistingExpansion during expansion cost calculation for division
Primary purpose of this change is to reuse existing code inside findExistingExpansion. However it introduces very slight semantic change - findExistingExpansion now looks into exiting blocks instead of a loop latches. Originally heuristic was based on the fact that we want to look at the loop exit conditions. And since all exiting latches will be listed in the ExitingBlocks, heuristic stays roughly the same.

Differential Revision: http://reviews.llvm.org/D12008

llvm-svn: 245227
2015-08-17 16:37:04 +00:00
Silviu Baranga b322aa6f53 [CostModel][AArch64] Increase cost of vector insert element and add missing cast costs
Summary:
Increase the estimated costs for insert/extract element operations on
AArch64. This is motivated by results from benchmarking interleaved
accesses.

Add missing costs for zext/sext/trunc instructions and some integer to
floating point conversions. These costs were previously calculated
by scalarizing these operation and were affected by the cost increase of
the insert/extract element operations.

Reviewers: rengolin

Subscribers: mcrosier, aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D11939

llvm-svn: 245226
2015-08-17 16:05:09 +00:00
Silviu Baranga d5ac26937c [CostModel][ARM] Increase cost of insert/extract operations
Summary:
This change limits the minimum cost of an insert/extract
element operation to 2 in cases where this would result
in mixing of NEON and VFP code.

Reviewers: rengolin

Subscribers: mssimpso, aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D12030

llvm-svn: 245225
2015-08-17 15:57:05 +00:00
Igor Laevsky b20bda77e7 [BasicAliasAnalysis] Do not check ModRef table for intrinsics
All possible ModRef behaviours can be completely represented using existing LLVM IR attributes.

Differential Revision: http://reviews.llvm.org/D12033

llvm-svn: 245224
2015-08-17 15:56:56 +00:00
Artur Pilipenko 34d8ba84c8 Take alignment into account in isSafeToSpeculativelyExecute and isSafeToLoadUnconditionally.
Reviewed By: hfinkel, sanjoy, MatzeB

Differential Revision: http://reviews.llvm.org/D9791

llvm-svn: 245223
2015-08-17 15:54:26 +00:00
Benjamin Kramer 1ee99a8b46 Extend MCAsmLexer so that it can peek forward several tokens
This commit adds a virtual `peekTokens()` function to `MCAsmLexer`
which can peek forward an arbitrary number of tokens.

It also makes the `peekTok()` method call `peekTokens()` method, but
only requesting one token.

The idea is to better support targets which more more ambiguous
assembly syntaxes.

Patch by Dylan McKay!

llvm-svn: 245221
2015-08-17 14:35:25 +00:00
Aaron Ballman aa3d810b5f Correcting a -Woverflow warning where 0xFFFF was overflowing an implicit constant conversion.
llvm-svn: 245220
2015-08-17 14:25:57 +00:00
Joseph Tremoulet 7031c9fc2e [WinEHPrepare] Fix catchret successor phi demotion
Summary:
When demoting an SSA value that has a use on a phi and one of the phi's
predecessors terminates with catchret, the edge needs to be split and the
load inserted in the new block, else we'll still have a cross-funclet SSA
value.

Add a test for this, and for the similar case where a def to be spilled is
on and invoke and a critical edge, which was already implemented but
missing a test.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12065

llvm-svn: 245218
2015-08-17 13:51:37 +00:00
Tobias Grosser 58fdd88751 Revert "Disable targetdatalayoutcheck"
I committed by accident a local hack that should not have made it upstream.
Sorry for the noise.

llvm-svn: 245212
2015-08-17 10:58:03 +00:00
Tobias Grosser 607b8b26e9 Disable targetdatalayoutcheck
llvm-svn: 245210
2015-08-17 10:56:35 +00:00
Daniel Sanders a39ef1c68f [mips] [IAS] Add support for the DLA pseudo-instruction and fix problems with DLI
Summary: It is the same as LA, except that it can also load 64-bit addresses and it only works on 64-bit MIPS architectures.

Reviewers: tomatabacu, seanbruno, vkalintiris

Subscribers: brooks, seanbruno, emaste, llvm-commits

Differential Revision: http://reviews.llvm.org/D9524

llvm-svn: 245208
2015-08-17 10:11:55 +00:00
Michael Kuperstein adc4e9c414 [GMR] isNonEscapingGlobalNoAlias() should look through Bitcasts/GEPs when looking at loads.
This fixes yet another case from PR24288.

Differential Revision: http://reviews.llvm.org/D12064

llvm-svn: 245207
2015-08-17 10:06:08 +00:00
James Molloy 88edc8243d Remove hand-rolled matching for fmin and fmax.
SDAGBuilder now does this all for us.

llvm-svn: 245198
2015-08-17 07:13:20 +00:00
James Molloy c617be559a Rip out hand-rolled matching code for VMIN, VMAX, VMINNM and VMAXNM
This is no longer needed - SDAGBuilder will do this for us.

llvm-svn: 245197
2015-08-17 07:13:15 +00:00
James Molloy ef183397b1 Generate FMINNAN/FMINNUM/FMAXNAN/FMAXNUM from SDAGBuilder.
These only get generated if the target supports them. If one of the variants is not legal and the other is, and it is safe to do so, the other variant will be emitted.

For example on AArch32 (V8), we have scalar fminnm but not fmin.

Fix up a couple of tests while we're here - one now produces better code, and the other was just plain wrong to start with.

llvm-svn: 245196
2015-08-17 07:13:10 +00:00
Karthik Bhat 3af28945b9 Fix PR24469 resulting from r245025 and re-enable dead store elimination across basicblocks.
PR24469 resulted because DeleteDeadInstruction in handleNonLocalStoreDeletion was
deleting the next basic block iterator. Fixed the same by resetting the basic block iterator
post call to DeleteDeadInstruction.

llvm-svn: 245195
2015-08-17 05:51:39 +00:00
David Majnemer 8ed559ad22 Revert "[InstCombinePHI] Partial simplification of identity operations."
This reverts commit r244887, it caused PR24470.

llvm-svn: 245194
2015-08-17 03:11:26 +00:00
Chandler Carruth 2f1fd1658f [PM] Port ScalarEvolution to the new pass manager.
This change makes ScalarEvolution a stand-alone object and just produces
one from a pass as needed. Making this work well requires making the
object movable, using references instead of overwritten pointers in
a number of places, and other refactorings.

I've also wired it up to the new pass manager and added a RUN line to
a test to exercise it under the new pass manager. This includes basic
printing support much like with other analyses.

But there is a big and somewhat scary change here. Prior to this patch
ScalarEvolution was never *actually* invalidated!!! Re-running the pass
just re-wired up the various other analyses and didn't remove any of the
existing entries in the SCEV caches or clear out anything at all. This
might seem OK as everything in SCEV that can uses ValueHandles to track
updates to the values that serve as SCEV keys. However, this still means
that as we ran SCEV over each function in the module, we kept
accumulating more and more SCEVs into the cache. At the end, we would
have a SCEV cache with every value that we ever needed a SCEV for in the
entire module!!! Yowzers. The releaseMemory routine would dump all of
this, but that isn't realy called during normal runs of the pipeline as
far as I can see.

To make matters worse, there *is* actually a key that we don't update
with value handles -- there is a map keyed off of Loop*s. Because
LoopInfo *does* release its memory from run to run, it is entirely
possible to run SCEV over one function, then over another function, and
then lookup a Loop* from the second function but find an entry inserted
for the first function! Ouch.

To make matters still worse, there are plenty of updates that *don't*
trip a value handle. It seems incredibly unlikely that today GVN or
another pass that invalidates SCEV can update values in *just* such
a way that a subsequent run of SCEV will incorrectly find lookups in
a cache, but it is theoretically possible and would be a nightmare to
debug.

With this refactoring, I've fixed all this by actually destroying and
recreating the ScalarEvolution object from run to run. Technically, this
could increase the amount of malloc traffic we see, but then again it is
also technically correct. ;] I don't actually think we're suffering from
tons of malloc traffic from SCEV because if we were, the fact that we
never clear the memory would seem more likely to have come up as an
actual problem before now. So, I've made the simple fix here. If in fact
there are serious issues with too much allocation and deallocation,
I can work on a clever fix that preserves the allocations (while
clearing the data) between each run, but I'd prefer to do that kind of
optimization with a test case / benchmark that shows why we need such
cleverness (and that can test that we actually make it faster). It's
possible that this will make some things faster by making the SCEV
caches have higher locality (due to being significantly smaller) so
until there is a clear benchmark, I think the simple change is best.

Differential Revision: http://reviews.llvm.org/D12063

llvm-svn: 245193
2015-08-17 02:08:17 +00:00
Chandler Carruth b596ba2376 [ADT] Teach FoldingSet to be movable.
This is a very minimal move support - it leaves the moved-from object in
a zombie state that is only valid for destruction and move assignment.
This seems fine to me, and leaving it in the default constructed state
would require adding more state to the object and potentially allocating
memory (!!!) and so seems like a Bad Idea.

llvm-svn: 245192
2015-08-16 23:17:27 +00:00
Benjamin Kramer bb70d751de [SimplifyLibCalls] Drop default template args. No functional change.
llvm-svn: 245189
2015-08-16 21:16:37 +00:00
Benjamin Kramer dc1d1cbd82 [IR] Simplify code. No functionality change.
llvm-svn: 245188
2015-08-16 21:16:26 +00:00
Sanjay Patel 57fd1dc5db transform fmin/fmax calls when possible (PR24314)
If we can ignore NaNs, fmin/fmax libcalls can become compare and select
(this is what we turn std::min / std::max into).

This IR should then be optimized in the backend to whatever is best for
any given target. Eg, x86 can use minss/maxss instructions.

This should solve PR24314:
https://llvm.org/bugs/show_bug.cgi?id=24314

Differential Revision: http://reviews.llvm.org/D11866

llvm-svn: 245187
2015-08-16 20:18:19 +00:00
Sanjoy Das 94c4aecf83 [LSR][NFC] Don’t duplicate entity name at the beginning of the comment.
llvm-svn: 245183
2015-08-16 18:22:46 +00:00
Sanjoy Das 302bfd04b5 [LSR][NFC] Use camelCase for method names in Formula and RegUseTracker.
llvm-svn: 245182
2015-08-16 18:22:43 +00:00
Sanjay Patel 3ab4a73bac use SDValue bool operator; NFCI
llvm-svn: 245181
2015-08-16 17:54:28 +00:00
Yaron Keren 178c465223 Add missing include guard.
llvm-svn: 245173
2015-08-16 07:55:08 +00:00
David Majnemer e04443baff Revert "Add support for cross block dse. This patch enables dead stroe elimination across basicblocks."
This reverts commit r245025, it caused PR24469.

llvm-svn: 245172
2015-08-16 07:11:59 +00:00
David Majnemer dfa3b09541 [InstCombine] Replace an and+icmp with a trunc+icmp
Bitwise arithmetic can obscure a simple sign-test.  If replacing the
mask with a truncate is preferable if the type is legal because it
permits us to rephrase the comparison more explicitly.

llvm-svn: 245171
2015-08-16 07:09:17 +00:00
Chandler Carruth 5efd530cbc Revert r244127: [PM] Remove a failed attempt to port the CallGraph
analysis ...

It turns out that we *do* need the old CallGraph ported to the new pass
manager. There are times where this model of a call graph is really
superior to the one provided by the LazyCallGraph. For example,
GlobalsModRef very specifically needs the model provided by CallGraph.

While here, I've tried to make the move semantics actually work. =]

llvm-svn: 245170
2015-08-16 06:35:19 +00:00
David Majnemer 1a59e49f3c [X86] Widen the 'AND' mask if doing so shrinks the encoding size
We can set additional bits in a mask given that we know the other
operand of an AND already has some bits set to zero.  This can be more
efficient if doing so allows us to use an instruction which implicitly
sign extends the immediate.

This fixes PR24085.

Differential Revision: http://reviews.llvm.org/D11289

llvm-svn: 245169
2015-08-16 04:52:11 +00:00
NAKAMURA Takumi 5196275eea MergeFunc: Quick fix for r245140, Ignore second, aka Function*, in sorting.
Don't assume second would be ordered in the module.

llvm-svn: 245168
2015-08-16 02:41:23 +00:00
Yaron Keren dfb655fe17 Try to appease VS 2015 warnings from http://reviews.llvm.org/D11890
ByteSize and BitSize should not be size_t but unsigned, considering

1) They are at most 2^16 and 2^19, respectively.
2) BitSize is an argument to Type::getIntNTy which takes unsigned.

Also, use the correct utostr instead itostr and cache the string result.

Thanks to James Touton for reporting this!

llvm-svn: 245167
2015-08-15 19:06:14 +00:00
Sanjay Patel 40d4eb40f6 [x86] enable machine combiner reassociations for scalar single-precision minimums
llvm-svn: 245166
2015-08-15 17:01:54 +00:00
Yaron Keren 8b2a031cff Silence VS2015 warning.
Patch by James Touton!

http://reviews.llvm.org/D11890

llvm-svn: 245161
2015-08-15 14:54:43 +00:00
Simon Pilgrim 0750c84623 [DAGCombiner] Attempt to mask vectors before zero extension instead of after.
For cases where we TRUNCATE and then ZERO_EXTEND to a larger size (often from vector legalization), see if we can mask the source data and then ZERO_EXTEND (instead of after a ANY_EXTEND). This can help avoid having to generate a larger mask, and possibly applying it to several sub-vectors.

(zext (truncate x)) -> (zext (and(x, m))

Includes a minor patch to SystemZ to better recognise 8/16-bit zero extension patterns from RISBG bit-extraction code.

This is the first of a number of minor patches to help improve the conversion of byte masks to clear mask shuffles.

Differential Revision: http://reviews.llvm.org/D11764

llvm-svn: 245160
2015-08-15 13:27:30 +00:00
Chandler Carruth e8824e3026 [PM/AA] Delete the LibCallAliasAnalysis and all the associated
infrastructure.

This AA was never used in tree. It's infrastructure also completely
overlaps that of TargetLibraryInfo which is used heavily by BasicAA to
achieve similar goals to those stated for this analysis.

As has come up in several discussions, the use case here is still really
important, but this code isn't helping move toward that use case. Any
progress on better supporting rich AA information for runtime library
environments would likely be better off starting from scratch or
starting from TargetLibraryInfo than from this base.

Differential Revision: http://reviews.llvm.org/D12028

llvm-svn: 245155
2015-08-15 09:22:21 +00:00
Matt Arsenault 588732bd6e AMDGPU/SI: Only look at live out SGPR defs
When trying to fix SGPR live ranges, skip defs that are
killed in the same block as the def. I don't think
we need to worry about these cases as long as the
live ranges of the SGPRs in dominating blocks are
correct.

This reduces the number of elements the second
loop over the function needs to look at, and makes
it generally easier to understand. The second loop
also only considers if the live range is live
in to a block, which logically means it
must have been live out from another.

llvm-svn: 245150
2015-08-15 02:58:49 +00:00
David Majnemer 0bc0eef71c [IR] Give catchret an optional 'return value' operand
Some personality routines require funclet exit points to be clearly
marked, this is done by producing a token at the funclet pad and
consuming it at the corresponding ret instruction.  CleanupReturnInst
already had a spot for this operand but CatchReturnInst did not.
Other personality routines don't need to use this which is why it has
been made optional.

llvm-svn: 245149
2015-08-15 02:46:08 +00:00
James Y Knight 5567bafe93 Remove redundant TargetFrameLowering::getFrameIndexOffset virtual
function.

This was the same as getFrameIndexReference, but without the FrameReg
output.

Differential Revision: http://reviews.llvm.org/D12042

llvm-svn: 245148
2015-08-15 02:32:35 +00:00
JF Bastien d4698e1bac [WebAssembly] Add Relooper
This is just an initial checkin of an implementation of the Relooper algorithm, in preparation for WebAssembly codegen to utilize. It doesn't do anything yet by itself.

The Relooper algorithm takes an arbitrary control flow graph and generates structured control flow from that, utilizing a helper variable when necessary to handle irreducibility. The WebAssembly backend will be able to use this in order to generate an AST for its binary format.

Author: azakai

Reviewers: jfb, sunfish

Subscribers: jevinskie, arsenm, jroelofs, llvm-commits

Differential revision: http://reviews.llvm.org/D11691

llvm-svn: 245142
2015-08-15 01:23:28 +00:00
JF Bastien 5e4303dc14 Accelerate MergeFunctions with hashing
This patch makes the Merge Functions pass faster by calculating and comparing
a hash value which captures the essential structure of a function before
performing a full function comparison.

The hash is calculated by hashing the function signature, then walking the basic
blocks of the function in the same order as the main comparison function. The
opcode of each instruction is hashed in sequence, which means that different
functions according to the existing total order cannot have the same hash, as
the comparison requires the opcodes of the two functions to be the same order.

The hash function is a static member of the FunctionComparator class because it
is tightly coupled to the exact comparison function used. For example, functions
which are equivalent modulo a single variant callsite might be merged by a more
aggressive MergeFunctions, and the hash function would need to be insensitive to
these differences in order to exploit this.

The hashing function uses a utility class which accumulates the values into an
internal state using a standard bit-mixing function. Note that this is a different interface
than a regular hashing routine, because the values to be hashed are scattered
amongst the properties of a llvm::Function, not linear in memory. This scheme is
fast because only one word of state needs to be kept, and the mixing function is
a few instructions.

The main runOnModule function first computes the hash of each function, and only
further processes functions which do not have a unique function hash. The hash
is also used to order the sorted function set. If the hashes differ, their
values are used to order the functions, otherwise the full comparison is done.

Both of these are helpful in speeding up MergeFunctions. Together they result in
speedups of 9% for mysqld (a mostly C application with little redundancy), 46%
for libxul in Firefox, and 117% for Chromium. (These are all LTO builds.) In all
three cases, the new speed of MergeFunctions is about half that of the module
verifier, making it relatively inexpensive even for large LTO builds with
hundreds of thousands of functions. The same functions are merged, so this
change is free performance.

Author: jrkoenig

Reviewers: nlewycky, dschuff, jfb

Subscribers: llvm-commits, aemerson

Differential revision: http://reviews.llvm.org/D11923

llvm-svn: 245140
2015-08-15 01:18:18 +00:00
Matt Arsenault 427a0fd22e LoopStrengthReduce: Try to pass address space to isLegalAddressingMode
This seems to only work some of the time. In some situations,
this seems to use a nonsensical type and isn't actually aware of the
memory being accessed. e.g. if branch condition is an icmp of a pointer,
it checks the addressing mode of i1.

llvm-svn: 245137
2015-08-15 00:53:06 +00:00
Matt Arsenault 297ae311ce AMDGPU/SI: Fix printing useless info with amdhsa
The comments at the bottom would all report 0 if
amdhsa was used.

llvm-svn: 245135
2015-08-15 00:12:39 +00:00
Matt Arsenault 0259a7aa41 AMDGPU/SI: Update LiveVariables
This is simple but won't work if/when this pass
is moved to be post-SSA.

llvm-svn: 245134
2015-08-15 00:12:37 +00:00
Matt Arsenault 670ba46efe AMDGPU/SI: Update LiveIntervals during SIFixSGPRLiveRanges
Does not mark SlotIndexes as reserved, although I think
that might be OK.

LiveVariables still need to be handled.

llvm-svn: 245133
2015-08-15 00:12:35 +00:00
Matt Arsenault b75233235c AMDGPU: Remove unnecessary assert
These shouldn't ever be null. The number of successors
was already asserted to be 2.

llvm-svn: 245132
2015-08-15 00:12:32 +00:00
Matt Arsenault 4275c29a02 AMDGPU/SI: Make comments more precise.
True branch instructions do behave as expected with liveness.

Avoid the phrasing "branch decision is based on a value in an SGPR"
because this could be misleading. A VALU compare instruction's
result is still based on an SGPR, even though that condition
may be divergent.

llvm-svn: 245131
2015-08-15 00:12:30 +00:00
Nick Lewycky 8075fd22b9 Fix a crash where a utility function wasn't aware of fcmp vectors and created a value with the wrong type. Fixes PR24458!
llvm-svn: 245119
2015-08-14 22:46:49 +00:00
Bjarke Hammersholt Roune 9791ed4705 [SCEV] Apply NSW and NUW flags via poison value analysis for sub, mul and shl
Summary:
http://reviews.llvm.org/D11212 made Scalar Evolution able to propagate NSW and NUW flags from instructions to SCEVs for add instructions. This patch expands that to sub, mul and shl instructions.

This change makes LSR able to generate pointer induction variables for loops like these, where the index is 32 bit and the pointer is 64 bit:

  for (int i = 0; i < numIterations; ++i)
    sum += ptr[i - offset];

  for (int i = 0; i < numIterations; ++i)
    sum += ptr[i * stride];

  for (int i = 0; i < numIterations; ++i)
    sum += ptr[3 * (i << 7)];


Reviewers: atrick, sanjoy

Subscribers: sanjoy, majnemer, hfinkel, llvm-commits, meheff, jingyue, eliben

Differential Revision: http://reviews.llvm.org/D11860

llvm-svn: 245118
2015-08-14 22:45:26 +00:00
Pat Gavlin b399095c3f Add a target environment for CoreCLR.
Although targeting CoreCLR is similar to targeting MSVC, there are
certain important differences that the backend must be aware of
(e.g. differences in stack probes, EH, and library calls).

Differential Revision: http://reviews.llvm.org/D11012

llvm-svn: 245115
2015-08-14 22:41:43 +00:00
Ahmed Bougacha cd35787217 [AArch64] Fix FMLS scalar-indexed-from-2s-after-neg patterns.
We canonicalize V64 vectors to V128 through insert_subvector: the other
FMLA/FMLS/FMUL/FMULX patterns match that already, but this one doesn't,
so we'd fail to match fmls and generate fneg+fmla instead.
The vector equivalents are already tested and functional.

llvm-svn: 245107
2015-08-14 22:06:05 +00:00
Evgeniy Stepanov 24ac55d884 [msan] Fix handling of musttail calls.
MSan instrumentation for return values of musttail calls is not
allowed by the IR constraints, and not needed at the same time.

llvm-svn: 245106
2015-08-14 22:03:50 +00:00
Alex Lorenz 577d271a75 MIR Serialization: Serialize the '.cfi_same_value' CFI directive.
llvm-svn: 245103
2015-08-14 21:55:58 +00:00
Alex Lorenz c3ba7508f6 MIR Serialization: Serialize the external symbol call entry pseudo source
values.

llvm-svn: 245098
2015-08-14 21:14:50 +00:00
Alex Lorenz 50b826fb75 MIR Serialization: Serialize the global value call entry pseudo source values.
llvm-svn: 245097
2015-08-14 21:08:30 +00:00
Tom Stellard bef1094ee7 AMDGPU/SI: Add missing spill class
The compiler was failing to spill for some shaders.

Patch By: Axel Davy

llvm-svn: 245087
2015-08-14 19:46:05 +00:00
Renato Golin 980b6cc42b Revert "[ARM] Fix MachO CPU Subtype selection"
This reverts commit r245081, as it breaks many builds.

llvm-svn: 245086
2015-08-14 19:35:47 +00:00
Alex Lorenz 1039fd1ae5 MIR Serialization: Serialize the 'internal' register operand flag.
llvm-svn: 245085
2015-08-14 19:07:07 +00:00
Alex Lorenz f9a2b12361 MIR Serialization: Serialize the bundled machine instructions.
llvm-svn: 245082
2015-08-14 18:57:24 +00:00
Vedant Kumar 2f079be789 [ARM] Fix MachO CPU Subtype selection
This patch makes the Darwin ARM backend take advantage of TargetParser.  It
also teaches TargetParser about ARMV7K for the first time. This makes target
triple parsing more consistent across llvm.

Differential Revision: http://reviews.llvm.org/D11996

llvm-svn: 245081
2015-08-14 18:36:47 +00:00
Sanjay Patel ed502905f7 [x86] fix allowsMisalignedMemoryAccess() implementation
This patch fixes the x86 implementation of allowsMisalignedMemoryAccess() to correctly
return the 'Fast' output parameter for 32-byte accesses. To test that, an existing load
merging optimization is changed to use the TLI hook. This exposes a shortcoming in the
current logic and results in the regression test update. Changing other direct users of
the isUnalignedMem32Slow() x86 CPU attribute would be a follow-on patch.

Without the fix in allowsMisalignedMemoryAccesses(), we will infinite loop when targeting
SandyBridge because LowerINSERT_SUBVECTOR() creates 32-byte loads from two 16-byte loads
while PerformLOADCombine() splits them back into 16-byte loads.

Differential Revision: http://reviews.llvm.org/D10662

llvm-svn: 245075
2015-08-14 17:53:40 +00:00
Justin Bogner 7ae63aa85d [sancov] Fix an unused variable warning introduced in r245067
llvm-svn: 245072
2015-08-14 17:03:45 +00:00
Kit Barton ae78d53aeb Reverting patch r244235.
This patch will be redone in a different way. See
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150810/292978.html
for more details.

llvm-svn: 245071
2015-08-14 16:54:32 +00:00
Reid Kleckner a57d015154 [sancov] Leave llvm.localescape in the entry block
Summary: Similar to the change we applied to ASan. The same test case works.

Reviewers: samsonov

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11961

llvm-svn: 245067
2015-08-14 16:45:42 +00:00
Rafael Espindola dbaf0498a9 Revert "Centralize the information about which object format we are using."
This reverts commit r245047.

It was failing on the darwin bots. The problem was that when running

./bin/llc -march=msp430

llc gets to

  if (TheTriple.getTriple().empty())
    TheTriple.setTriple(sys::getDefaultTargetTriple());

Which means that we go with an arch of msp430 but a triple of
x86_64-apple-darwin14.4.0 which fails badly.

That code has to be updated to select a triple based on the value of
march, but that is not a trivial fix.

llvm-svn: 245062
2015-08-14 15:48:41 +00:00
Sanjay Patel 2e75341b7f don't repeaat function names in comments; NFC
llvm-svn: 245058
2015-08-14 15:11:42 +00:00
Rafael Espindola 90eb70c8a7 Centralize the information about which object format we are using.
Other than some places that were handling unknown as ELF, this should
have no change. The test updates are because we were detecting
arm-coff or x86_64-win64-coff as ELF targets before.

It is not clear if the enum should live on the Triple. At least now it lives
in a single location and should be easier to move somewhere else.

llvm-svn: 245047
2015-08-14 13:31:17 +00:00
James Molloy 87405c7f66 Separate out BDCE's analysis into a separate DemandedBits analysis.
This allows other areas of the compiler to use BDCE's bit-tracking.
NFCI.

llvm-svn: 245039
2015-08-14 11:09:09 +00:00
James Molloy 63be198712 [AArch64] FMINNAN/FMAXNAN on f16 is not legal.
Spotted by Ahmed - in r244594 I inadvertently marked f16 min/max as legal.

I've reverted it here, and marked min/max on scalar f16's as promote. I've also added a testcase. The test just checks that the compiler doesn't fall over - it doesn't create fmin nodes for f16 yet.

llvm-svn: 245035
2015-08-14 09:08:50 +00:00
Adam Nemet 06ccf0145f [LVer] Remove unused Pass parameter from versionLoop, NFC
llvm-svn: 245032
2015-08-14 06:30:26 +00:00
Lang Hames 7e66c6c5e3 [RuntimeDyld] Make sure code-sections aren't under-aligned.
Code-section alignment should be at least as high as the minimum
stub alignment. If the section alignment is lower it can cause
padding to be emitted resulting in alignment errors if the section
is mapped to a higher alignment on the target.

E.g. If a text section with a 4-byte alignment gets 4-bytes of
padding to guarantee 8-byte alignment for stubs but is re-mapped to
an 8-byte alignment on the target, the 4-bytes of padding will push
the stubs to 4-byte alignment causing a crash.

No test case: There is currently no way to control host section
alignment in llvm-rtdyld. This could be made testable by adding
a custom memory manager. I'll look at that in a follow-up patch.

llvm-svn: 245031
2015-08-14 06:26:42 +00:00
David Majnemer b611e3f50e [IR] Add token types
This introduces the basic functionality to support "token types".
The motivation stems from the need to perform operations on a Value
whose provenance cannot be obscured.

There are several applications for such a type but my immediate
motivation stems from WinEH.  Our personality routine enforces a
single-entry - single-exit regime for cleanups.  After several rounds of
optimizations, we may be left with a terminator whose "cleanup-entry
block" is not entirely clear because control flow has merged two
cleanups together.  We have experimented with using labels as operands
inside of instructions which are not terminators to indicate where we
came from but found that LLVM does not expect such exotic uses of
BasicBlocks.

Instead, we can use this new type to clearly associate the "entry point"
and "exit point" of our cleanup.  This is done by having the cleanuppad
yield a Token and consuming it at the cleanupret.
The token type makes it impossible to obscure or otherwise hide the
Value, making it trivial to track the relationship between the two
points.

What is the burden to the optimizer?  Well, it turns out we have already
paid down this cost by accepting that there are certain calls that we
are not permitted to duplicate, optimizations have to watch out for
such instructions anyway.  There are additional places in the optimizer
that we will probably have to update but early examination has given me
the impression that this will not be heroic.

Differential Revision: http://reviews.llvm.org/D11861

llvm-svn: 245029
2015-08-14 05:09:07 +00:00
Karthik Bhat ddc2a86a00 Add support for cross block dse.
This patch enables dead stroe elimination across basicblocks.

Example:
define void @test_02(i32 %N) {
  %1 = alloca i32
  store i32 %N, i32* %1
  store i32 10, i32* @x
  %2 = load i32, i32* %1
  %3 = icmp ne i32 %2, 0
  br i1 %3, label %4, label %5

; <label>:4
  store i32 5, i32* @x
  br label %7

; <label>:5
  %6 = load i32, i32* @x
  store i32 %6, i32* @y
  br label %7

; <label>:7
  store i32 15, i32* @x
  ret void
}
In the above example dead store "store i32 5, i32* @x" is now eliminated.

Differential Revision: http://reviews.llvm.org/D11143

llvm-svn: 245025
2015-08-14 04:17:23 +00:00
Chandler Carruth d541e7304f [PM/AA] Run clang-format over the ObjCARC Alias Analysis code to
normalize its formatting before I make more substantial changes.

llvm-svn: 245024
2015-08-14 03:57:00 +00:00
Chandler Carruth b4ebdf3d72 [PM/AA] Don't bother forward declaring Function and Value, just include
their headers.

llvm-svn: 245023
2015-08-14 03:55:36 +00:00
Saleem Abdulrasool 3e190cb098 PowerPC: remove dead initialization (NFC)
Identified by the clang static analyzer.  No functional change intended.

llvm-svn: 245022
2015-08-14 03:48:35 +00:00
Chandler Carruth 21dcff799a [PM/AA] Extract the interface for GlobalsModRef into a header along with
its creation function.

This required shifting a bunch of method definitions to be out-of-line
so that we could leave most of the implementation guts in the .cpp file.

llvm-svn: 245021
2015-08-14 03:48:20 +00:00
Chandler Carruth 1db22822b4 [PM/AA] Hoist the interface to TBAA into a dedicated header along with
its creation function. Update the relevant includes accordingly.

llvm-svn: 245019
2015-08-14 03:33:48 +00:00
Chandler Carruth f9cbd039bd [PM/AA] Run clang-format over TBAA code to normalize the formatting
before making substantial changes.

llvm-svn: 245017
2015-08-14 03:26:15 +00:00
Chandler Carruth 55eec8be9e [PM/AA] Clean up the SCEV-AA comment formatting and typos.
llvm-svn: 245015
2015-08-14 03:14:50 +00:00
Chandler Carruth 79687faee6 [PM/AA] Run clang-format over the SCEV-AA code to normalize the
formatting.

llvm-svn: 245014
2015-08-14 03:12:16 +00:00
Chandler Carruth ed23528fb2 [PM/AA] Hoist the SCEV-AA interface to its own header and pull the
creation function into that header.

llvm-svn: 245013
2015-08-14 03:11:16 +00:00
Chandler Carruth 42ff448fe4 [PM/AA] Hoist ScopedNoAliasAA's interface into a header and move the
creation function there.

Same basic refactoring as the other alias analyses. Nothing special
required this time around.

llvm-svn: 245012
2015-08-14 02:55:50 +00:00
Chandler Carruth 29109f5b1e [PM/AA] Hoist the value handle definition for CFLAA into the header to
satisfy libc++'s std::forward_list which requires the value type to be
complete.

llvm-svn: 245011
2015-08-14 02:50:34 +00:00
Chandler Carruth 496b7fb569 [PM/AA] Run clang-format over the ScopedNoAliasAA pass prior to making
substantial changes to normalize any formatting.

llvm-svn: 245010
2015-08-14 02:46:07 +00:00
Chandler Carruth 8b046a42f4 [PM/AA] Extract a minimal interface for CFLAA to its own header file.
I've used forward declarations and reorderd the source code some to make
this reasonably clean and keep as much of the code as possible in the
source file, including all the stratified set details. Just the basic AA
interface and the create function are in the header file, and the header
file is now included into the relevant locations.

llvm-svn: 245009
2015-08-14 02:42:20 +00:00
Chandler Carruth 1b179e1102 [PM/AA] Sink all the actual code from AliasAnalysisCounter back into the
.cpp file to make the header much less noisy.

Also makes it easy to use a static helper rather than a public method
for printing lines of stats.

llvm-svn: 245006
2015-08-14 02:12:12 +00:00
Chandler Carruth fafee839d9 [PM/AA] Run clang-format over this code to establish a clean baseline
for subsequent changes.

llvm-svn: 245005
2015-08-14 02:07:05 +00:00
Chandler Carruth 7a9ba04809 [PM/AA] Hoist the AA counter pass into a header to match the analysis
pattern.

Also hoist the creation routine out of the generic header and into the
pass header now that we have one.

I've worked to not make any changes, even formatting ones here. I'll
clean up the formatting and other things in a follow-up patch now that
the code is in the right place.

llvm-svn: 245004
2015-08-14 02:05:41 +00:00
Jingyue Wu 1238f341ba [SeparateConstOffsetFromGEP] sext(a)+sext(b) => sext(a+b) when a+b can't sign-overflow.
Summary:
This patch implements my promised optimization to reunites certain sexts from
operands after we extract the constant offset. See the header comment of
reuniteExts for its motivation.

One key building block that enables this optimization is Bjarke's poison value
analysis (D11212). That helps to prove "a +nsw b" can't overflow.

Reviewers: broune

Subscribers: jholewinski, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D12016

llvm-svn: 245003
2015-08-14 02:02:05 +00:00
Chandler Carruth 45cf0bf117 [PM/AA] Remove the function names and class names from doxygen comments
and generally clean up their formatting.

llvm-svn: 245002
2015-08-14 01:43:46 +00:00
Chandler Carruth 4ac50b08e3 [PM/AA] Move the LibCall AA creation routine declaration to that
analysis's header file to be more consistent with other analyses.

llvm-svn: 245001
2015-08-14 01:43:02 +00:00
Chandler Carruth cf76e57dd8 [PM/AA] Run clang-format over LibCallAliasAnalysis prior to making
substantial changes needed for the new pass manager's AA integration.

llvm-svn: 245000
2015-08-14 01:38:25 +00:00
Chandler Carruth bf143e2a20 [LIR] Re-instate r244880, reverted in r244884, factoring the handling of
AliasAnalysis in LoopIdiomRecognize.

The previous commit to LIR, r244879, exposed some scary bug in the loop
pass pipeline with an assert failure that showed up on several bots.
This patch got reverted as part of getting that revision reverted, but
they're actually independent and unrelated. This patch has no functional
change and should be completely safe. It is also useful for my current
work on the AA infrastructure.

llvm-svn: 244993
2015-08-14 00:21:10 +00:00
Alex Lorenz 5022f6bb81 MIR Serialization: Change MIR syntax - use custom syntax for MBBs.
This commit modifies the way the machine basic blocks are serialized - now the
machine basic blocks are serialized using a custom syntax instead of relying on
YAML primitives. Instead of using YAML mappings to represent the individual
machine basic blocks in a machine function's body, the new syntax uses a single
YAML block scalar which contains all of the machine basic blocks and
instructions for that function.

This is an example of a function's body that uses the old syntax:

    body:
      - id: 0
        name: entry
        instructions:
          - '%eax = MOV32r0 implicit-def %eflags'
          - 'RETQ %eax'
    ...

The same body is now written like this:

    body: |
      bb.0.entry:
        %eax = MOV32r0 implicit-def %eflags
        RETQ %eax
    ...

This syntax change is motivated by the fact that the bundled machine
instructions didn't map that well to the old syntax which was using a single
YAML sequence to store all of the machine instructions in a block. The bundled
machine instructions internally use flags like BundledPred and BundledSucc to
determine the bundles, and serializing them as MI flags using the old syntax
would have had a negative impact on the readability and the ease of editing
for MIR files. The new syntax allows me to serialize the bundled machine
instructions using a block construct without relying on the internal flags,
for example:

   BUNDLE implicit-def dead %itstate, implicit-def %s1 ... {
      t2IT 1, 24, implicit-def %itstate
      %s1 = VMOVS killed %s0, 1, killed %cpsr, implicit killed %itstate
   }

This commit also converts the MIR testcases to the new syntax. I developed
a script that can convert from the old syntax to the new one. I will post the
script on the llvm-commits mailing list in the thread for this commit.

llvm-svn: 244982
2015-08-13 23:10:16 +00:00
Sanjay Patel a75c41e5f3 don't repeat function names in comments; NFC
llvm-svn: 244977
2015-08-13 22:53:20 +00:00
David Majnemer 5c73c941c9 [IR] Cleanup indentation of EH instructions
No functional change is intended, just tidying up whitespace.

llvm-svn: 244966
2015-08-13 22:11:40 +00:00