Commit Graph

123430 Commits

Author SHA1 Message Date
Chris Bieneman 5e96fe905b [CMake] Bug 25059 - CMake libllvm.so.$MAJOR.$MINOR shared object name not compatible with ldconfig
Summary:
This change makes the CMake build system generate libraries for Linux and Darwin matching the makefile build system.

Linux libraries follow the pattern lib${name}.${MAJOR}.${MINOR}.so so that ldconfig won't pick it up incorrectly.

Darwin libraries are not versioned.

Note: On linux the non-versioned symlink is generated at install-time not build time. I plan to fix that eventually, but I expect that is good enough for the purposes of fixing this bug.

Reviewers: loladiro, tstellarAMD

Subscribers: axw, llvm-commits

Differential Revision: http://reviews.llvm.org/D13841

llvm-svn: 252093
2015-11-04 23:11:12 +00:00
Rafael Espindola 49b8548903 Slightly saner handling of thumb branches.
The generic infrastructure already did a lot of work to decide if the
fixup value is know or not. It doesn't make sense to reimplement a very
basic case: same fragment.

llvm-svn: 252090
2015-11-04 23:00:39 +00:00
Quentin Colombet 421723cdd8 [x86] Teach the shrink-wrapping hooks to do the proper thing with Win64.
Win64 has some strict requirements for the epilogue. As a result, we disable
shrink-wrapping for Win64 unless the block that gets the epilogue is already an
exit block.

Fixes PR24193.

llvm-svn: 252088
2015-11-04 22:37:28 +00:00
Eugene Zelenko ffec81ca00 Fix some Clang-tidy modernize warnings, other minor fixes.
Fixed warnings are: modernize-use-override, modernize-use-nullptr and modernize-redundant-void-arg.

Differential revision: http://reviews.llvm.org/D14312

llvm-svn: 252087
2015-11-04 22:32:32 +00:00
Justin Bogner c2b98f03db PM: Rephrase PrintLoopPass as a wrapper around a new-style pass. NFC
Splits PrintLoopPass into a new-style pass and a PrintLoopPassWrapper,
much like we already do for PrintFunctionPass and PrintModulePass.

llvm-svn: 252085
2015-11-04 22:24:08 +00:00
Cong Hou 23a3bf0147 Add new interfaces to MBB for manipulating successors with probabilities instead of weights. NFC.
This is part-1 of the patch that replaces all edge weights in MBB by
probabilities, which only adds new interfaces. No functional changes.

Differential revision: http://reviews.llvm.org/D13908

llvm-svn: 252083
2015-11-04 21:37:58 +00:00
Simon Pilgrim f669d381f9 Warning fix.
llvm-svn: 252078
2015-11-04 21:27:22 +00:00
Sanjoy Das a4bae3bb21 [IR] Add a `data_operand` abstraction
Summary:
Data operands of a call or invoke consist of the call arguments, and
the bundle operands associated with the `call` (or `invoke`)
instruction.  The motivation for this change is that we'd like to be
able to query "argument attributes" like `readonly` and `nocapture`
for bundle operands naturally.

This change also provides a conservative "implementation" for these
attributes for any bundle operand, and an extension point for future
work.

Reviewers: chandlerc, majnemer, reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14305

llvm-svn: 252077
2015-11-04 21:05:24 +00:00
Tom Stellard 18bf626cf7 llvm-config: Add --has-rtti option
Summary:
This prints NO if LLVM was built with -fno-rtti or an equivalent flag
and YES otherwise.  The reasons to add -has-rtti rather than adding -fno-rtti
to --cxxflags are:

1. Building LLVM with -fno-rtti does not always mean that client
applications need this flag.

2. Some compilers have a different flag for disabling rtti, and the
compiler being used to build LLVM may not be the compiler being used to
build the application.

Reviewers: echristo, chandlerc, beanz

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11849

llvm-svn: 252075
2015-11-04 20:57:43 +00:00
Simon Pilgrim 7e6606f4f1 [X86][SSE] Add general memory folding for (V)INSERTPS instruction
This patch improves the memory folding of the inserted float element for the (V)INSERTPS instruction.

The existing implementation occurs in the DAGCombiner and relies on the narrowing of a whole vector load into a scalar load (and then converted into a vector) to (hopefully) allow folding to occur later on. Not only has this proven problematic for debug builds, it also prevents other memory folds (notably stack reloads) from happening.

This patch removes the old implementation and moves the folding code to the X86 foldMemoryOperand handler. A new private 'special case' function - foldMemoryOperandCustom - has been added to deal with memory folding of instructions that can't just use the lookup tables - (V)INSERTPS is the first of several that could be done.

It also tweaks the memory operand folding code with an additional pointer offset that allows existing memory addresses to be modified, in this case to convert the vector address to the explicit address of the scalar element that will be inserted.

Unlike the previous implementation we now set the insertion source index to zero, although this is ignored for the (V)INSERTPSrm version, anything that relied on shuffle decodes (such as unfolding of insertps loads) was incorrectly calculating the source address - I've added a test for this at insertps-unfold-load-bug.ll

Differential Revision: http://reviews.llvm.org/D13988

llvm-svn: 252074
2015-11-04 20:48:09 +00:00
Sanjoy Das b11b440f8e [IR] Add bounds checking to paramHasAttr
Summary:
This is intended to make a later change simpler.

Note: adding this bounds checking required fixing `X86FastISel`.  As
far I can tell I've preserved original behavior but a careful review
will be appreciated.

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14304

llvm-svn: 252073
2015-11-04 20:33:45 +00:00
David Blaikie a895aa635c Orc: Streamline some lambda usage in a unit test
llvm-svn: 252070
2015-11-04 19:43:24 +00:00
Rafael Espindola 337675b891 Relax the check for ninja.
On fedora the ninja executable is called ninja-build :-(

llvm-svn: 252062
2015-11-04 19:18:11 +00:00
Andrew Kaylor e41a8c4182 Created new X86 FMA3 opcodes (FMA*_Int) that are used now for lowering of scalar FMA intrinsics.
Patch by Slava Klochkov 

The key difference between FMA* and FMA*_Int opcodes is that FMA*_Int opcodes are handled more conservatively. It is illegal to commute the 1st operand of FMA*_Int instructions as the upper bits of scalar FMA intrinsic result must be taken from the 1st operand, but such commute transformation would change those upper bits and invalidate the intrinsic's result.

Reviewers: Quentin Colombet, Elena Demikhovsky

Differential Revision: http://reviews.llvm.org/D13710

llvm-svn: 252060
2015-11-04 18:10:41 +00:00
James Molloy e7d679cf4c [ARM] Combine CMOV into BFI where possible
If we have a CMOV, OR and AND combination such as:
  if (x & CN)
    y |= CM;

And:
  * CN is a single bit;
  * All bits covered by CM are known zero in y;

Then we can convert this to a sequence of BFI instructions. This will always be a win if CM is a single bit, will always be no worse than the TST & OR sequence if CM is two bits, and for thumb will be no worse if CM is three bits (due to the extra IT instruction).

llvm-svn: 252057
2015-11-04 16:55:07 +00:00
Teresa Johnson f1b0a6e37c [ThinLTO] Always set linkage type to external when converting alias
When converting an alias to a non-alias when the aliasee is not
imported, ensure that the linkage type is set to external so that it is
a valid linkage type. Added a test case that exposed this issue.

llvm-svn: 252054
2015-11-04 16:01:16 +00:00
James Molloy 4de84ddec9 [SimplifyCFG] Merge conditional stores
We can often end up with conditional stores that cannot be speculated. They can come from fairly simple, idiomatic code:

  if (c & flag1)
    *a = x;
  if (c & flag2)
    *a = y;
  ...

There is no dominating or post-dominating store to a, so it is not legal to move the store unconditionally to the end of the sequence and cache the intermediate result in a register, as we would like to.

It is, however, legal to merge the stores together and do the store once:

  tmp = undef;
  if (c & flag1)
    tmp = x;
  if (c & flag2)
    tmp = y;
  if (c & flag1 || c & flag2)
    *a = tmp;

The real power in this optimization is that it allows arbitrary length ladders such as these to be completely and trivially if-converted. The typical code I'd expect this to trigger on often uses binary-AND with constants as the condition (as in the above example), which means the ending condition can simply be truncated into a single binary-AND too: 'if (c & (flag1|flag2))'. As in the general case there are bitwise operators here, the ladder can often be optimized further too.

This optimization involves potentially increasing register pressure. Even in the simplest case, the lifetime of the first predicate is extended. This can be elided in some cases such as using binary-AND on constants, but not in the general case. Threading 'tmp' through all branches can also increase register pressure.

The optimization as in this patch is enabled by default but kept in a very conservative mode. It will only optimize if it thinks the resultant code should be if-convertable, and additionally if it can thread 'tmp' through at least one existing PHI, so it will only ever in the worst case create one more PHI and extend the lifetime of a predicate.

This doesn't trigger much in LNT, unfortunately, but it does trigger in a big way in a third party test suite.

llvm-svn: 252051
2015-11-04 15:28:04 +00:00
Filipe Cabecinhas a2b0ac40cf Error out when faced with value names containing '\0'
Bug found with afl-fuzz.

llvm-svn: 252048
2015-11-04 14:53:36 +00:00
Aaron Ballman 5db085d688 Silence an extra semicolon warning; NFC.
llvm-svn: 252046
2015-11-04 14:40:54 +00:00
Michael Kuperstein a3b79dd783 [ELF] elfiamcu triple should imply e_machine == EM_IAMCU
Differential Revision: http://reviews.llvm.org/D14109

llvm-svn: 252043
2015-11-04 11:21:50 +00:00
Michael Kuperstein b34de72269 [X86] DAGCombine should not introduce FILD in soft-float mode
The x86 "sitofp i64 to double" dag combine, in 32-bit mode, lowers sitofp 
directly to X86ISD::FILD (or FILD_FLAG). This should not be done in soft-float mode.

llvm-svn: 252042
2015-11-04 11:17:53 +00:00
James Molloy 8bfc772da0 Revert "[PatternMatch] Switch to use ValueTracking::matchSelectPattern"
This was breaking the modules build and is being reverted while we reach consensus on the right way to solve this layering problem. This reverts commit r251785.

llvm-svn: 252040
2015-11-04 08:36:53 +00:00
Pawel Bylica 85ca2949d4 Fix unit tests on Windows: handle env vars with non-ASCII chars.
Summary: On Windows we have to take UTF16 encoded env vars and convert them to UTF8. This patch fixes CopyEnvironment helper function used by process unit tests.

Reviewers: yaron.keren

Subscribers: yaron.keren, llvm-commits

Differential Revision: http://reviews.llvm.org/D14278

llvm-svn: 252039
2015-11-04 08:25:20 +00:00
Sanjoy Das bf9d76d62e [OperandBundles] Refactor; NFCI.
Extract out a helper function `operandBundleFromBundleOpInfo`.

llvm-svn: 252038
2015-11-04 04:31:21 +00:00
Sanjoy Das 3c95d5d28e [OperandBundles] Refactor; NFCI
Intended to make later changes simpler.  Exposes
`getBundleOperandsStartIndex` and `getBundleOperandsEndIndex`, and uses
them for the computation in `getNumTotalBundleOperands`.

llvm-svn: 252037
2015-11-04 04:31:06 +00:00
Philip Reames aeefae0cc5 [LVI] Update a comment to clarify what's actually happening and why
llvm-svn: 252033
2015-11-04 01:47:04 +00:00
Philip Reames 814fb60130 [CVP] Fold return values if possible
In my previous change to CVP (251606), I made CVP much more aggressive about trying to constant fold comparisons. This patch is a reversal in direction. Rather than being agressive about every compare, we restore the non-block local restriction for most, and then try hard for compares feeding returns.

The motivation for this is two fold:
 * The more I thought about it, the less comfortable I got with the possible compile time impact of the other approach. There have been no reported issues, but after talking to a couple of folks, I've come to the conclusion the time probably isn't justified.
 * It turns out we need to know the context to leverage the full power of LVI. In particular, asking about something at the end of it's block (the use of a compare in a return) will frequently get more precise results than something in the middle of a block. This is an implementation detail, but it's also hard to get around since mid-block queries have to reason about possible throwing instructions and don't get to use most of LVI's block focused infrastructure. This will become particular important when combined with http://reviews.llvm.org/D14263.

Differential Revision: http://reviews.llvm.org/D14271

llvm-svn: 252032
2015-11-04 01:43:54 +00:00
Igor Laevsky 35fe692025 [StatepointLowering] Remove distinction between call and invoke safepoints
There is no point in having invoke safepoints handled differently than the
call safepoints. All relevant decisions could be made by looking at whether
or not gc.result and gc.relocate lay in a same basic block. This change will
 allow to lower call safepoints with relocates and results in a different 
basic blocks. See test case for example.

Differential Revision: http://reviews.llvm.org/D14158

llvm-svn: 252028
2015-11-04 01:16:10 +00:00
Alexey Samsonov 24519d98b3 Fix the test case for Windows.
llvm-svn: 252027
2015-11-04 01:09:37 +00:00
Alexey Samsonov 5365a01dc7 [LLVMSymbolize] Reduce indentation by using helper function. NFC.
llvm-svn: 252022
2015-11-04 00:30:26 +00:00
Alexey Samsonov 884adda0fb [LLVMSymbolize] Properly propagate object parsing errors from the library.
llvm-svn: 252021
2015-11-04 00:30:24 +00:00
Alexey Samsonov b0742319fc [llvm-symbolizer] Improve the test for missing input file.
llvm-svn: 252020
2015-11-04 00:30:19 +00:00
Adam Nemet 7c94c9bf07 Fix unused variable warning from r252017
llvm-svn: 252019
2015-11-04 00:10:33 +00:00
Adam Nemet e54a4fa95d LLE 6/6: Add LoopLoadElimination pass
Summary:
The goal of this pass is to perform store-to-load forwarding across the
backedge of a loop.  E.g.:

  for (i)
     A[i + 1] = A[i] + B[i]

  =>

  T = A[0]
  for (i)
     T = T + B[i]
     A[i + 1] = T

The pass relies on loop dependence analysis via LoopAccessAnalisys to
find opportunities of loop-carried dependences with a distance of one
between a store and a load.  Since it's using LoopAccessAnalysis, it was
easy to also add support for versioning away may-aliasing intervening
stores that would otherwise prevent this transformation.

This optimization is also performed by Load-PRE in GVN without the
option of multi-versioning.  As was discussed with Daniel Berlin in
http://reviews.llvm.org/D9548, this is inferior to a more loop-aware
solution applied here.  Hopefully, we will be able to remove some
complexity from GVN/MemorySSA as a consequence.

In the long run, we may want to extend this pass (or create a new one if
there is little overlap) to also eliminate loop-indepedent redundant
loads and store that *require* versioning due to may-aliasing
intervening stores/loads.  I have some motivating cases for store
elimination. My plan right now is to wait for MemorySSA to come online
first rather than using memdep for this.

The main motiviation for this pass is the 456.hmmer loop in SPECint2006
where after distributing the original loop and vectorizing the top part,
we are left with the critical path exposed in the bottom loop.  Being
able to promote the memory dependence into a register depedence (even
though the HW does perform store-to-load fowarding as well) results in a
major gain (~20%).  This gain also transfers over to x86: it's
around 8-10%.

Right now the pass is off by default and can be enabled
with -enable-loop-load-elim.  On the LNT testsuite, there are two
performance changes (negative number -> improvement):

  1. -28% in Polybench/linear-algebra/solvers/dynprog: the length of the
     critical paths is reduced
  2. +2% in Polybench/stencils/adi: Unfortunately, I couldn't reproduce this
     outside of LNT

The pass is scheduled after the loop vectorizer (which is after loop
distribution).  The rational is to try to reuse LAA state, rather than
recomputing it.  The order between LV and LLE is not critical because
normally LV does not touch scalar st->ld forwarding cases where
vectorizing would inhibit the CPU's st->ld forwarding to kick in.

LoopLoadElimination requires LAA to provide the full set of dependences
(including forward dependences).  LAA is known to omit loop-independent
dependences in certain situations.  The big comment before
removeDependencesFromMultipleStores explains why this should not occur
for the cases that we're interested in.

Reviewers: dberlin, hfinkel

Subscribers: junbuml, dberlin, mssimpso, rengolin, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D13259

llvm-svn: 252017
2015-11-03 23:50:08 +00:00
Adam Nemet 397f5829c7 [LAA] LLE 5/6: Add predicate functions Dependence::isForward/isBackward, NFC
Summary: Will be used by the LoopLoadElimination pass.

Reviewers: hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13258

llvm-svn: 252016
2015-11-03 23:50:03 +00:00
Adam Nemet ed653d6774 [LAA] LLE 4/6: APIs to access the dependent instructions for a dependence, NFC
Summary:
The functions use LAI and MemoryDepChecker classes so they need to be
defined after those definitions outside of the Dependence class.

Will be used by the LoopLoadElimination pass.

Reviewers: hfinkel

Subscribers: rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13257

llvm-svn: 252015
2015-11-03 23:49:58 +00:00
Peter Collingbourne 94d778697a CodeGen, Target: Move Mach-O-specific symbol name logic to Mach-O lowering.
A profile of an LTO link of Chrome revealed that we were spending some
~30-50% of execution time in the function Constant::getRelocationInfo(),
which is called from TargetLoweringObjectFile::getKindForGlobal() and in turn
from TargetMachine::getNameWithPrefix().

It turns out that we only need the result of getKindForGlobal() when
targeting Mach-O, so this change moves the relevant part of the logic to
TargetLoweringObjectFileMachO.

NFCI.

Differential Revision: http://reviews.llvm.org/D14168

llvm-svn: 252014
2015-11-03 23:40:03 +00:00
Matt Arsenault aac9b49325 AMDGPU: Make flat_scratch name consistent
The printed name and the parsed assembler names weren't the same.
I'm not sure which name SC prints these as, but I think it's this one.

llvm-svn: 252010
2015-11-03 22:50:34 +00:00
Matt Arsenault 967c2f5dee AMDGPU: Fix asserts on invalid register ranges
If the requested SGPR was not actually aligned, it was
accepted and rounded down instead of rejected.

Also fix an assert if the range is an invalid size.

llvm-svn: 252009
2015-11-03 22:50:32 +00:00
Matt Arsenault 3473c72aab AMDGPU: Fix off by one error in register parsing
If trying to use one past the end, this would assert.

llvm-svn: 252008
2015-11-03 22:50:27 +00:00
Derek Schuff cd9488d521 Address nit
llvm-svn: 252004
2015-11-03 22:40:45 +00:00
Derek Schuff b44d4d350e Align whitespace
llvm-svn: 252003
2015-11-03 22:40:43 +00:00
Derek Schuff 6b5c6da760 [WebAssembly] Support wasm select operator
Summary:
Add support for wasm's select operator, and lower LLVM's select DAG node
to it.

Reviewers: sunfish

Subscribers: dschuff, llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D14295

llvm-svn: 252002
2015-11-03 22:40:40 +00:00
Matt Arsenault e8ed13d946 AMDGPU: s[102:103] is unavailable on VI
llvm-svn: 252000
2015-11-03 22:39:52 +00:00
Matt Arsenault 192b282bf3 AMDGPU: Define correct number of SGPRs
There are actually 104 so 2 were missing.

More assembler tests with high register number tuples
will be included in later patches.

llvm-svn: 251999
2015-11-03 22:39:50 +00:00
Matt Arsenault 6c0674112a AMDGPU: Make findUsedSGPR more readable
Add more comments etc.

llvm-svn: 251996
2015-11-03 22:30:15 +00:00
Matt Arsenault 782c03bb7e AMDGPU: Initialize SIFixSGPRCopies so -print-after works
llvm-svn: 251995
2015-11-03 22:30:13 +00:00
Matt Arsenault d9d659aa23 AMDGPU: Alphabetize includes
llvm-svn: 251994
2015-11-03 22:30:08 +00:00
Fiona Glaser a8b653a372 InstCombine: fix sinking of convergent calls
llvm-svn: 251991
2015-11-03 22:23:39 +00:00
Simon Pilgrim 191ac7c679 [SelectionDAG] Use existing constant nodes instead of recreating them. NFC.
llvm-svn: 251990
2015-11-03 22:21:38 +00:00
Alexey Samsonov d6aa820262 [LLVMSymbolize] Factor out the logic for printing structs from DIContext. NFC.
Introduce DIPrinter which takes care of rendering DILineInfo and
friends. This allows LLVMSymbolizer class to return a structured data
instead of plain std::strings.

llvm-svn: 251989
2015-11-03 22:20:52 +00:00
Simon Pilgrim b0d860a394 [X86][AVX] Tweaked shuffle stack folding tests
To avoid alternative lowerings.

llvm-svn: 251986
2015-11-03 21:58:35 +00:00
Adam Nemet a2df750fb3 [LAA] LLE 3/6: Rename InterestingDependence to Dependences, NFC
Summary:
We now collect all types of dependences including lexically forward
deps not just "interesting" ones.

Reviewers: hfinkel

Subscribers: rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13256

llvm-svn: 251985
2015-11-03 21:39:52 +00:00
Simon Pilgrim df993479c9 [X86][AVX512] Fixed shuffle test name to match shuffle
llvm-svn: 251984
2015-11-03 21:39:30 +00:00
Alexey Samsonov 6881249895 [LLVMSymbolize] Move demangling away from printing routines. NFC.
Make printDILineInfo and friends responsible for just rendering the
contents of the structures, demangling should actually be performed
earlier, when we have the information about the originating
SymbolizableModule at hand.

llvm-svn: 251981
2015-11-03 21:36:13 +00:00
Davide Italiano c8a7913f23 [SimplifyLibCalls] Add a new transformation: pow(exp(x), y) -> exp(x*y)
This one is enabled only under -ffast-math (due to rounding/overflows)
but allows us to emit shorter code.

Before (on FreeBSD x86-64):
4007f0:       50                      push   %rax
4007f1:       f2 0f 11 0c 24          movsd  %xmm1,(%rsp)
4007f6:       e8 75 fd ff ff          callq  400570 <exp2@plt>
4007fb:       f2 0f 10 0c 24          movsd  (%rsp),%xmm1
400800:       58                      pop    %rax
400801:       e9 7a fd ff ff          jmpq   400580 <pow@plt>
400806:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
40080d:       00 00 00

After:
4007b0:       f2 0f 59 c1             mulsd  %xmm1,%xmm0
4007b4:       e9 87 fd ff ff          jmpq   400540 <exp2@plt>
4007b9:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)

Differential Revision:	http://reviews.llvm.org/D14045

llvm-svn: 251976
2015-11-03 20:32:23 +00:00
Simon Pilgrim e88dc04c48 [X86][XOP] Add support for the matching of the VPCMOV bit select instruction
XOP has the VPCMOV instruction that performs the common vector bit select operation OR( AND( SRC1, SRC3 ), AND( SRC2, ~SRC3 ) )

This patch adds tablegen pattern matching for this instruction.

Differential Revision: http://reviews.llvm.org/D8841

llvm-svn: 251975
2015-11-03 20:27:01 +00:00
Rui Ueyama f4acad30ec llmv-pdbdump: Make BuiltinDumper shorter. NFC.
llvm-svn: 251974
2015-11-03 20:16:18 +00:00
Adam Nemet d7037c56d3 [LAA] LLE 2/6: Fix a NoDep case that should be a Forward dependence
Summary:
When the dependence distance in zero then we have a loop-independent
dependence from the earlier to the later access.

No current client of LAA uses forward dependences so other than
potentially hitting the MaxDependences threshold earlier, this change
shouldn't affect anything right now.

This and the previous patch were tested together for compile-time
regression.  None found in LNT/SPEC.

Reviewers: hfinkel

Subscribers: rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13255

llvm-svn: 251973
2015-11-03 20:13:43 +00:00
Adam Nemet b45516e875 [LAA] LLE 1/6: Expose Forward dependences
Summary:
Before this change, we didn't use to collect forward dependences since
none of the current clients (LV, LDist) required them.

The motivation to also collect forward dependences is a new pass
LoopLoadElimination (LLE) which discovers store-to-load forwarding
opportunities across the loop's backedge.  The pass uses both lexically
forward or backward loop-carried dependences to detect these
opportunities.

The new pass also analyzes loop-independent (forward) dependences since
they can conflict with the loop-carried dependences in terms of how the
data flows through memory.

The newly added test only covers loop-carried forward dependences
because loop-independent ones are currently categorized as NoDep.  The
next patch will fix this.

The two patches were tested together for compile-time regression.  None
found in LNT/SPEC.

Note that with this change LAA provides all dependences rather than just
"interesting" ones.  A subsequent NFC patch will remove the now trivial
isInterestingDependence and rename the APIs.

Reviewers: hfinkel

Subscribers: jmolloy, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13254

llvm-svn: 251972
2015-11-03 20:13:23 +00:00
Rafael Espindola 2b27b2f5a7 Don't create empty sections just to look like gas.
We are long past the time when this much bug for bug compatibility was
useful.

llvm-svn: 251970
2015-11-03 20:02:22 +00:00
Rafael Espindola 56e0aafa2b Relax a few more overspecified tests.
llvm-svn: 251967
2015-11-03 19:38:19 +00:00
Teresa Johnson 255787a969 Revert "Move metadata linking after lazy global materialization/linking."
This reverts commit r251926. I believe this is causing an LTO
bootstrapping bot failure
(http://lab.llvm.org:8080/green/job/llvm-stage2-cmake-RgLTO_build/3669/).

Haven't been able to repro it yet, but after looking at the metadata I
am pretty sure I know what is going on.

llvm-svn: 251965
2015-11-03 19:36:04 +00:00
Rafael Espindola 41302723de Remove unnecessary dependency on section and string positions.
llvm-svn: 251964
2015-11-03 19:24:17 +00:00
Kostya Serebryany 856b7afe60 [libFuzzer] make -test_single_input more reliable: make sure the input's size is equal to it's capacity
llvm-svn: 251961
2015-11-03 18:57:25 +00:00
Rafael Espindola 43e2e251ea Delete dead code.
llvm-svn: 251960
2015-11-03 18:55:58 +00:00
Rafael Espindola e7fe1a46e4 Simplify local common output.
We now create them as they are found and use higher level APIs.

This is a step in avoiding creating unnecessary sections.

llvm-svn: 251958
2015-11-03 18:50:51 +00:00
Igor Laevsky f637b4a52e [CodegenPrepare] Do not rematerialize gc.relocates across different basic blocks
Differential Revision: http://reviews.llvm.org/D14258

llvm-svn: 251957
2015-11-03 18:37:40 +00:00
Rafael Espindola e0550a80a4 Move code out of a loop and use a range loop.
llvm-svn: 251952
2015-11-03 18:04:07 +00:00
Rafael Espindola e63e0188e4 Revert "Revert "[Orc] Directly emit machine code for the x86 resolver block and trampolines.""
This reverts commit r251937.

The test was updated to the new API, bring the API back.

llvm-svn: 251944
2015-11-03 16:40:37 +00:00
Lang Hames bc78a65adc [Kaleidoscope][Orc] Fix the fully_lazy Orc Kaleidoscope example.
r251933 changed the Orc compile callbacks API, which broke this.

llvm-svn: 251942
2015-11-03 16:35:10 +00:00
Silviu Baranga 308a7c7ed4 Fix PR25372 - teach replaceCongruentPHIs to handle cases where SE evaluates a PHI to a SCEVConstant
Summary:
Since now Scalar Evolution can create non-add rec expressions for PHI
nodes, it can also create SCEVConstant expressions. This will confuse
replaceCongruentPHIs, which previously relied on the fact that SCEV
could not produce constants in this case.

We will now replace the node with a constant in these cases - or avoid
processing the Phi in case of a type mismatch.

Reviewers: sanjoy

Subscribers: llvm-commits, majnemer

Differential Revision: http://reviews.llvm.org/D14230

llvm-svn: 251938
2015-11-03 16:27:04 +00:00
Rafael Espindola 2f344637d6 Revert "[Orc] Directly emit machine code for the x86 resolver block and trampolines."
This reverts commit r251933.

It broke the build of examples/Kaleidoscope/Orc/fully_lazy/toy.cpp.

llvm-svn: 251937
2015-11-03 16:25:20 +00:00
David Blaikie 96a9d8c8e5 Kaleidoscope-ch2: Remove the dependence on LLVM by cloning make_unique into this project
llvm-svn: 251936
2015-11-03 16:23:21 +00:00
Lang Hames a4a227f7e8 [Orc] Directly emit machine code for the x86 resolver block and trampolines.
Bypassing LLVM for this has a number of benefits:

1) Laziness support becomes asm-syntax agnostic (previously lazy jitting didn't
   work on Windows as the resolver block was in Darwin asm).

2) For cross-process JITs, it allows resolver blocks and trampolines to be
   emitted directly in the target process, reducing cross process traffic.

3) It should be marginally faster.

llvm-svn: 251933
2015-11-03 16:10:18 +00:00
Teresa Johnson 07b825b01c Move metadata linking after lazy global materialization/linking.
Summary:
Currently, named metadata is linked before the LazilyLinkGlobalValues
list is walked and materialized/linked. As a result, references
from DISubprogram and DIGlobalVariable metadata to yet unmaterialized
functions and variables cause them to be added to the lazy linking
list and their definitions are materialized and linked.

This makes the llvm-link -only-needed option not have the intended
effect when debug information is present, as the otherwise unneeded
functions/variables are still linked in.

Additionally, for ThinLTO I have implemented a mechanism to only link
in debug metadata needed by imported functions. Moving named metadata
linking after lazy GV linking will facilitate applying this mechanism
to the LTO and "llvm-link -only-needed" cases as well.

Reviewers: dexonsmith, tra, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14195

llvm-svn: 251926
2015-11-03 15:11:27 +00:00
Teresa Johnson 04e5877430 Pass enum instead of bool to new linkInModule call in llvm-link
A new call I added to linkInModule from llvm-link in r251866
was still passing in a boolean for an argument that was changed to an
enum in r246561. I didn't catch this in my merge since the bool false
matched the flag value it mapped to.

llvm-svn: 251925
2015-11-03 15:10:50 +00:00
Filipe Cabecinhas 7aae2f23c8 Don't assert if materializing before seeing any function bodies
This assert was reachable from user input. A minimized test case (no
FUNCTION_BLOCK_ID record) is attached.

Bug found with afl-fuzz

llvm-svn: 251910
2015-11-03 13:48:26 +00:00
Filipe Cabecinhas f3e167af4b Don't use Twine objects after their lifetimes end.
No test, since it would depend on what the compiler can optimize/reuse.
My next commit made this bug visible on Linux Release compiles with some
versions of gcc.

llvm-svn: 251909
2015-11-03 13:48:21 +00:00
Elena Demikhovsky 2b06b0fe2a LoopVectorizer - skip 'bitcast' between GEP and load.
Skipping 'bitcast' in this case allows to vectorize load:

  %arrayidx = getelementptr inbounds double*, double** %in, i64 %indvars.iv
  %tmp53 = bitcast double** %arrayidx to i64*
  %tmp54 = load i64, i64* %tmp53, align 8

Differential Revision http://reviews.llvm.org/D14112

llvm-svn: 251907
2015-11-03 10:29:34 +00:00
Michael Kuperstein 73dc85293f [X86] Generate .cfi_adjust_cfa_offset correctly when pushing arguments
When push instructions are being used to pass function arguments on
the stack, and either EH or debugging are enabled, we need to generate
.cfi_adjust_cfa_offset directives appropriately. For (synch) EH, it is
enough for the CFA offset to be correct at every call site, while
for debugging we want to be correct after every push.

Darwin does not support this well, so don't use pushes whenever it
would be required.

Differential Revision: http://reviews.llvm.org/D13767

llvm-svn: 251904
2015-11-03 08:17:25 +00:00
Igor Breger 4ec5abffae AVX512: add encoding tests for vmovq/d instructions.
llvm-svn: 251903
2015-11-03 07:30:17 +00:00
Tobias Grosser 526d52691a Revert "[IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader"
Commit 251839 triggers miscompiles on some bots:

http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly-fast/builds/13723

(The commit is listed in 13722, but due to an existing failure introduced in
13721 and reverted in 13723 the failure is only visible in 13723)

To verify r251839 is indeed the only change that triggered the buildbot failures
and to ensure the buildbots remain green while investigating I temporarily
revert this commit. At the current state it is unclear if this commit introduced
some miscompile or if it only exposed code to Polly that is subsequently
miscompiled by Polly.

llvm-svn: 251901
2015-11-03 07:14:39 +00:00
Matthias Braun f538e133cc Fix build problme introduced in r251883
llvm-svn: 251888
2015-11-03 02:19:07 +00:00
Matthias Braun 6f4ed269b9 RegisterPressure: Improve assert message
llvm-svn: 251885
2015-11-03 01:53:36 +00:00
Matthias Braun 11859b5c8f RegisterPressure: Slightly nicer pressure diff dumping
llvm-svn: 251884
2015-11-03 01:53:33 +00:00
Matthias Braun 93563e7032 ScheduleDAGInstrs: Remove IsPostRA flag; NFC
ScheduleDAGInstrs doesn't behave differently before or after register
allocation. It was only used in a method of MachineSchedulerBase which
behaved differently in MachineScheduler/PostMachineScheduler. Change
this to let MachineScheduler/PostMachineScheduler just pass in a
parameter to that function.

The order of the LiveIntervals* and bool RemoveKillFlags paramters have
been switched to make out-of-tree code fail instead of unintentionally
passing a value intended for the IsPostRA flag to the (previously
following and default initialized) RemoveKillFlags.

Differential Revision: http://reviews.llvm.org/D14245

llvm-svn: 251883
2015-11-03 01:53:29 +00:00
Rafael Espindola 52151b3edf Don't implicitly construct a Archive::child_iterator.
llvm-svn: 251878
2015-11-03 01:32:40 +00:00
Rafael Espindola cc86d824d5 This never returns end(), simplify to use Child instead of iterator. NFC.
llvm-svn: 251876
2015-11-03 01:20:44 +00:00
Rui Ueyama fa05aacd3b llvm-pdbdump: Simplify. NFC.
llvm-svn: 251873
2015-11-03 01:04:44 +00:00
Colin LeMahieu 160f73e36f [Hexagon] Fixing mistaken case fallthrough.
llvm-svn: 251867
2015-11-03 00:21:19 +00:00
Teresa Johnson c7ed52f2ba Restore "Support for ThinLTO function importing and symbol linking."
This restores commit r251837, with the new library dependence added to
llvm-link/Makefile to address bot failures.

llvm-svn: 251866
2015-11-03 00:14:15 +00:00
Kevin Enderby 9c8905c7c8 Allow llvm-nm’s single letter command line flags to be grouped.
Which is needed if we want to replace darwin’s nm(1) with llvm-nm
as there are many uses of grouped flags.  The added test case is
one specific case that is in real use.

rdar://23337419

llvm-svn: 251864
2015-11-02 23:42:05 +00:00
Matt Arsenault f1aebbf33a AMDGPU: Stop assuming vreg for build_vector
This was causing a variety of test failures when v2i64
is added as a legal type.

SIFixSGPRCopies should correctly handle the case of vector inputs
to a scalar reg_sequence, so this isn't necessary anymore. This
was hiding some deficiencies in how reg_sequence is handled later,
but this shouldn't be a problem anymore since the register class
copy of a reg_sequence is now done before the reg_sequence.

llvm-svn: 251860
2015-11-02 23:30:48 +00:00
Derek Schuff 43e96c4feb [WebAssembly] Make WebAssemblyCodeGen depend on WebAssemblyAsmPrinter
llvm-svn: 251859
2015-11-02 23:23:16 +00:00
Matt Arsenault d48da14269 AMDGPU: Error on graphics shaders with HSA
I've found myself pointlessly debugging problems from running
graphics tests with an HSA triple a few times, so stop this from
happening again.

llvm-svn: 251858
2015-11-02 23:23:02 +00:00
Sanjay Patel 0ed9aeaa5f [CGP] widen switch condition and case constants to target's register width (2nd try)
This is a redo of r251849 except the tests have been split into arch-specific folders
to hopefully make the bots happy.

This is a follow-up from the discussion in D12965. The block-at-a-time limitation of
SelectionDAG also came up in D13297.

Without the InstCombine change from D12965, I don't expect this patch to make any
difference in the real world because InstCombine does not shrink cases like this in
visitSwitchInst(). But we need to have this CGP safety harness in place before
proceeding with any shrinkage in D12965, so we won't generate extra extends for compares.

I've opted for IR regression tests in the patch because that seems like a clearer way to
test the transform, but PowerPC CodeGen for an i16 widening test is shown below. x86
will need more work to solve: https://llvm.org/bugs/show_bug.cgi?id=22473

Before:
BB#0:
  mr 4, 3
  extsh. 3, 4
  ble 0, .LBB0_5
 BB#1:
  cmpwi  3, 99
  bgt    0, .LBB0_9
 BB#2:
  rlwinm 4, 4, 0, 16, 31      <--- 32-bit mask/extend
  li 3, 0
  cmplwi         4, 1
  beqlr 0
 BB#3:
  cmplwi         4, 10
  bne    0, .LBB0_12
 BB#4:
  li 3, 1
  blr
.LBB0_5:
  rlwinm 3, 4, 0, 16, 31      <--- 32-bit mask/extend
  cmplwi         3, 65436
  beq    0, .LBB0_13
 BB#6:
  cmplwi         3, 65526
  beq    0, .LBB0_15
 BB#7:
  cmplwi         3, 65535
  bne    0, .LBB0_12
 BB#8:
  li 3, 4
  blr
.LBB0_9:
  rlwinm 3, 4, 0, 16, 31      <--- 32-bit mask/extend
  cmplwi         3, 100
  beq    0, .LBB0_14
...

After:
BB#0:
  rlwinm 4, 3, 0, 16, 31      <--- mask/extend to 32-bit and then use that for comparisons
  cmpwi  4, 999
  ble 0, .LBB0_5
 BB#1:
  lis 3, 0
  ori 3, 3, 65525
  cmpw   4, 3
  bgt    0, .LBB0_9
 BB#2:
  cmplwi         4, 1000
  beq    0, .LBB0_14
 BB#3:
  cmplwi         4, 65436
  bne    0, .LBB0_13
 BB#4:
  li 3, 6
  blr
.LBB0_5:
  li 3, 0
  cmplwi         4, 1
  beqlr 0
 BB#6:
  cmplwi         4, 10
  beq    0, .LBB0_12
 BB#7:
  cmplwi         4, 100
  bne    0, .LBB0_13
 BB#8:
  li 3, 2
  blr
.LBB0_9:
  cmplwi         4, 65526
  beq    0, .LBB0_15
 BB#10:
  cmplwi         4, 65535
  bne    0, .LBB0_13
...


Differential Revision: http://reviews.llvm.org/D13532

llvm-svn: 251857
2015-11-02 23:22:49 +00:00
Matt Arsenault d299aa22e2 AMDGPU: Un XFAIL a test
This should probably be merged with one of the other private memory
tests, but it fails on r600.

llvm-svn: 251856
2015-11-02 23:15:46 +00:00
Matt Arsenault 0de924b76d AMDGPU: Distribute SGPR->VGPR copies of REG_SEQUENCE
Make the REG_SEQUENCE be a VGPR, and do the register class
copy first.

llvm-svn: 251855
2015-11-02 23:15:42 +00:00
David Blaikie 7f4df5432e Fix the build I just broke
llvm-svn: 251854
2015-11-02 23:10:52 +00:00
David Blaikie eff511089d Orc: Drop some else-after-return, reflow a few spots, and avoid use of pointee types
llvm-svn: 251853
2015-11-02 23:09:38 +00:00
Davide Italiano b7487e6b8d [SimplifyLibCalls] Remove variables that are not used. NFC.
llvm-svn: 251852
2015-11-02 23:07:14 +00:00
Sanjay Patel dfc825eb36 revert r251849; need to move tests to arch-specific folders
llvm-svn: 251851
2015-11-02 23:05:20 +00:00
Cong Hou cf2ed26836 Add a flag vectorizer-maximize-bandwidth in loop vectorizer to enable using larger vectorization factor.
To be able to maximize the bandwidth during vectorization, this patch provides a new flag vectorizer-maximize-bandwidth. When it is turned on, the vectorizer will determine the vectorization factor (VF) using the smallest instead of widest type in the loop. To avoid increasing register pressure too much, estimates of the register usage for different VFs are calculated so that we only choose a VF when its register usage doesn't exceed the number of available registers.

This is the second attempt to submit this patch. The first attempt got a test failure on ARM. This patch is updated to try to fix the failure (more specifically, by handling the case when VF=1).

Differential revision: http://reviews.llvm.org/D8943

llvm-svn: 251850
2015-11-02 22:53:48 +00:00
Sanjay Patel b90a078de9 [CGP] widen switch condition and case constants to target's register width
This is a follow-up from the discussion in D12965. The block-at-a-time limitation of 
SelectionDAG also came up in D13297.

Without the InstCombine change from D12965, I don't expect this patch to make any 
difference in the real world because InstCombine does not shrink cases like this in
visitSwitchInst(). But we need to have this CGP safety harness in place before
proceeding with any shrinkage in D12965, so we won't generate extra extends for compares.

I've opted for IR regression tests in the patch because that seems like a clearer way to
test the transform, but PowerPC CodeGen for an i16 widening test is shown below. x86
will need more work to solve: https://llvm.org/bugs/show_bug.cgi?id=22473

Before:
BB#0:
  mr 4, 3
  extsh. 3, 4
  ble 0, .LBB0_5
 BB#1: 
  cmpwi	 3, 99
  bgt	 0, .LBB0_9
 BB#2:            
  rlwinm 4, 4, 0, 16, 31      <--- 32-bit mask/extend
  li 3, 0
  cmplwi	 4, 1
  beqlr 0
 BB#3:            
  cmplwi	 4, 10
  bne	 0, .LBB0_12
 BB#4:                      
  li 3, 1
  blr
.LBB0_5:                             
  rlwinm 3, 4, 0, 16, 31      <--- 32-bit mask/extend
  cmplwi	 3, 65436
  beq	 0, .LBB0_13
 BB#6:                            
  cmplwi	 3, 65526
  beq	 0, .LBB0_15
 BB#7:                       
  cmplwi	 3, 65535
  bne	 0, .LBB0_12
 BB#8:                       
  li 3, 4
  blr
.LBB0_9:                       
  rlwinm 3, 4, 0, 16, 31      <--- 32-bit mask/extend
  cmplwi	 3, 100
  beq	 0, .LBB0_14
...

After:
BB#0:        
  rlwinm 4, 3, 0, 16, 31      <--- mask/extend to 32-bit and then use that for comparisons
  cmpwi	 4, 999
  ble 0, .LBB0_5
 BB#1:          
  lis 3, 0
  ori 3, 3, 65525
  cmpw	 4, 3
  bgt	 0, .LBB0_9
 BB#2:         
  cmplwi	 4, 1000
  beq	 0, .LBB0_14
 BB#3:    
  cmplwi	 4, 65436
  bne	 0, .LBB0_13
 BB#4:       
  li 3, 6
  blr
.LBB0_5:   
  li 3, 0
  cmplwi	 4, 1
  beqlr 0
 BB#6: 
  cmplwi	 4, 10
  beq	 0, .LBB0_12
 BB#7:             
  cmplwi	 4, 100
  bne	 0, .LBB0_13
 BB#8:             
  li 3, 2
  blr
.LBB0_9:       
  cmplwi	 4, 65526
  beq	 0, .LBB0_15
 BB#10:      
  cmplwi	 4, 65535
  bne	 0, .LBB0_13
...


Differential Revision: http://reviews.llvm.org/D13532

llvm-svn: 251849
2015-11-02 22:46:24 +00:00
Bill Schmidt 8ed7cec170 [PPC64LE] Properly initialize instr-info in PPCVSXSwapRemoval pass
Replace some hacky code with the proper way to get at this data.

No functional change.

llvm-svn: 251848
2015-11-02 22:43:57 +00:00
Sanjay Patel e6e841791c don't repeat function names in comments; NFC
llvm-svn: 251846
2015-11-02 22:34:55 +00:00
Davide Italiano e84d4da234 [SimplifyLibCalls] Merge two if statements. NFC.
llvm-svn: 251845
2015-11-02 22:33:26 +00:00
Teresa Johnson 227a923140 Revert "Support for ThinLTO function importing and symbol linking."
This reverts commit r251837, due to a number of bot failures of the form:

/home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.obj/tools/llvm-link/Release+Asserts/llvm-link.o:llvm-link.cpp:function
loadIndex(llvm::LLVMContext&, llvm::Module const*): error: undefined
reference to
'llvm::object::FunctionIndexObjectFile::create(llvm::MemoryBufferRef,
llvm::LLVMContext&, llvm::Module const*, bool)'
/home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.obj/tools/llvm-link/Release+Asserts/llvm-link.o:llvm-link.cpp:function
loadIndex(llvm::LLVMContext&, llvm::Module const*): error: undefined
reference to 'llvm::object::FunctionIndexObjectFile::takeIndex()'

I'm not sure why these are happening - I added Object to the requred
libraries in tools/llvm-link/LLVMBuild.txt and the LLVM_LINK_COMPONENTS
in tools/llvm-link/CMakeLists.txt. Confirmed for my build that these
symbols come out of libLLVMObject.a. What am I missing?

llvm-svn: 251841
2015-11-02 22:17:32 +00:00
Chen Li d715310162 [IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader
Summary:
This patch adds support to check if a loop has loop invariant conditions which lead to loop exits. If so, we know that if the exit path is taken, it is at the first loop iteration. If there is an induction variable used in that exit path whose value has not been updated, it will keep its initial value passing from loop preheader. We can therefore rewrite the exit value with
its initial value. This will help remove phis created by LCSSA and enable other optimizations like loop unswitch.


Reviewers: sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13974

llvm-svn: 251839
2015-11-02 22:00:15 +00:00
Teresa Johnson b1d4a39990 Support for ThinLTO function importing and symbol linking.
Summary:
Support for necessary linkage changes and symbol renaming during
ThinLTO function importing.

Also includes llvm-link support for manually importing functions
and associated llvm-link based tests.

Note that this does not include support for intelligently importing
metadata, which is currently imported duplicate times. That support will
be in the follow-on patch, and currently is ignored by the tests.

Reviewers: dexonsmith, joker.eph, davidxl

Subscribers: tobiasvk, tejohnson, llvm-commits

Differential Revision: http://reviews.llvm.org/D13515

llvm-svn: 251837
2015-11-02 21:39:10 +00:00
Tim Northover bfbfb12d38 MachO: support tvOS and watchOS version min commands in llvm-objdump
llvm-svn: 251834
2015-11-02 21:26:58 +00:00
Cong Hou b90b9e0531 In MachineBlockPlacement, filter cold blocks off the loop chain when profile data is available.
In the current BB placement algorithm, a loop chain always contains all loop blocks. This has a drawback that cold blocks in the loop may be inserted on a hot function path, hence increasing branch cost and also reducing icache locality.

Consider a simple example shown below:

A
|
B⇆C
|
D

When B->C is quite cold, the best BB-layout should be A,B,D,C. But the current implementation produces A,C,B,D.

This patch filters those cold blocks off from the loop chain by comparing the ratio:

LoopBBFreq / LoopFreq

to 20%: if it is less than 20%, we don't include this BB to the loop chain. Here LoopFreq is the frequency of the loop when we reduce the loop into a single node. In general we have more cold blocks when the loop has few iterations. And vice versa.


Differential revision: http://reviews.llvm.org/D11662

llvm-svn: 251833
2015-11-02 21:24:00 +00:00
Reid Kleckner 06665475cd [Support] Assert that reported key+data lenghts match reality
This found a bug in Clang's PTH implementation.

llvm-svn: 251829
2015-11-02 20:49:29 +00:00
Teresa Johnson 746426b487 Fix use-after-free in function index merging code.
This was flagged by ASAN when using a test case I will be committing
along with D13515.

llvm-svn: 251827
2015-11-02 20:43:33 +00:00
David Blaikie 260b65850c Revert parts accidentally included in r251823
llvm-svn: 251826
2015-11-02 20:05:54 +00:00
David Blaikie 2297a9142e StringRef-ify DiagnosticInfoSampleProfile::Filename
llvm-svn: 251823
2015-11-02 20:01:13 +00:00
Rafael Espindola dcbac6285a ELF can handle some relocations of the form -sym + constant.
Remove code that was assuming that this would never work.

Thanks to Colin LeMahie for finding and diagnosing the bug.

llvm-svn: 251818
2015-11-02 19:13:59 +00:00
Rafael Espindola 84f4cb8ba5 Convert tabs to spaces.
llvm-svn: 251817
2015-11-02 19:03:18 +00:00
James Y Knight 646c4032e7 Fix two issues in MergeConsecutiveStores:
1) PR25154. This is basically a repeat of PR18102, which was fixed in
r200201, and broken again by r234430. The latter changed which of the
store nodes was merged into from the first to the last. Thus, we now
also need to prefer merging a later store at a given address into the
target node, instead of an earlier one.

2) While investigating that, I also realized I'd introduced a bug in
r236850. There, I removed a check for alignment -- not realizing that
nothing except the alignment check was ensuring that none of the stores
were overlapping! This is a really bogus way to ensure there's no
aliased stores.

A better solution to both of these issues is likely to always use the
code added in the 'if (UseAA)' branches which rearrange the chain based
on a more principled analysis. I'll look into whether that can be used
always, but in the interest of getting things back to working, I think a
minimal change makes sense.

llvm-svn: 251816
2015-11-02 18:48:08 +00:00
Tim Northover 06fb04f1fa MachO: improve load command tests slightly
llvm-svn: 251815
2015-11-02 18:33:35 +00:00
Tim Northover 155103ec18 WatchOS: update default CPU for triple after t2dsp -> dsp rename
llvm-svn: 251814
2015-11-02 18:21:07 +00:00
Teresa Johnson f72278f051 Clang format a few prior patches (NFC)
I had clang formatted my earlier patches using the wrong style.
Reformatted with the LLVM style.

llvm-svn: 251812
2015-11-02 18:02:11 +00:00
Tim Northover 89a6eefe6f TvOS: add missing support for some libcalls.
llvm-svn: 251811
2015-11-02 18:00:00 +00:00
Artur Pilipenko 5c5011d503 Preserve load alignment and dereferenceable metadata during some transformations
Reviewed By: hfinkel

Differential Revision: http://reviews.llvm.org/D13953

llvm-svn: 251809
2015-11-02 17:53:51 +00:00
Matthias Braun 493edbbe49 lit: Add '-a' option to display commands+output of all tests
The existing -v option only displays commands and outputs for failed
tests, the newly introduced -a displays it for all executed tests.

llvm-svn: 251806
2015-11-02 16:13:46 +00:00
Silviu Baranga ac5540f3ea Add missing override statements in ScalarEvolution.h. NFC
llvm-svn: 251805
2015-11-02 15:29:49 +00:00
Pawel Bylica 7c1f36a6b7 Use static instead of anonymous namespace for helper functions. NFC.
llvm-svn: 251801
2015-11-02 14:57:24 +00:00
Silviu Baranga e3c0534b11 [SCEV][LV] Add SCEV Predicates and use them to re-implement stride versioning
Summary:
SCEV Predicates represent conditions that typically cannot be derived from
static analysis, but can be used to reduce SCEV expressions to forms which are
usable for different optimizers.

ScalarEvolution now has the rewriteUsingPredicate method which can simplify a
SCEV expression using a SCEVPredicateSet. The normal workflow of a pass using
SCEVPredicates would be to hold a SCEVPredicateSet and every time assumptions
need to be made a new SCEV Predicate would be created and added to the set.
Each time after calling getSCEV, the user will call the rewriteUsingPredicate
method.

We add two types of predicates
SCEVPredicateSet - implements a set of predicates
SCEVEqualPredicate - tests for equality between two SCEV expressions

We use the SCEVEqualPredicate to re-implement stride versioning. Every time we
version a stride, we will add a SCEVEqualPredicate to the context.
Instead of adding specific stride checks, LoopVectorize now adds a more
generic SCEV check.

We only need to add support for this in the LoopVectorizer since this is the
only pass that will do stride versioning.

Reviewers: mzolotukhin, anemet, hfinkel, sanjoy

Subscribers: sanjoy, hfinkel, rengolin, jmolloy, llvm-commits

Differential Revision: http://reviews.llvm.org/D13595

llvm-svn: 251800
2015-11-02 14:41:02 +00:00
Nemanja Ivanovic be5f0c04f1 Fix for bootstrap bug introduced in r244921
This revision has introduced an issue that only affects bootstrapped compiler
when it is printing the ASM. It turns out that the new code path taken due to
legalizing a scalar_to_vector of i64 -> v2i64 exposes a missing check in a
micro optimization to change a load followed by a scalar_to_vector into a
load and splat instruction on PPC.

llvm-svn: 251798
2015-11-02 14:01:11 +00:00
Rafael Espindola 54d263c61c This doesn't need a object::Archive::child_iterator.
llvm-svn: 251796
2015-11-02 13:30:46 +00:00
Rafael Espindola f82425851a Avoid implicitly constructing a Archive::child_iterator.
llvm-svn: 251794
2015-11-02 13:17:11 +00:00
James Molloy 9d8bbada54 [PatternMatch] Switch to use ValueTracking::matchSelectPattern
Instead of rolling our own min/max matching code (which is notoriously
hard to get completely right), use ValueTracking's instead.

llvm-svn: 251785
2015-11-02 09:54:00 +00:00
Pawel Bylica 0e97e5cb19 [Support] Extend sys::path with user_cache_directory function.
Summary:
The new function sys::path::user_cache_directory tries to discover
a directory suitable for cache storage for current system user.

On Windows and Darwin it returns a path to system-specific user cache directory.

On Linux it follows XDG Base Directory Specification, what is:
- use non-empty $XDG_CACHE_HOME env var,
- use $HOME/.cache.

Reviewers: chapuni, aaron.ballman, rafael

Subscribers: rafael, aaron.ballman, llvm-commits

Differential Revision: http://reviews.llvm.org/D13801

llvm-svn: 251784
2015-11-02 09:49:17 +00:00
Igor Breger fa798a9dbb AVX512: Implemented encoding and intrinsics for VBROADCASTI32x2 and VBROADCASTF32x2 instructions.
Differential Revision: http://reviews.llvm.org/D14216

llvm-svn: 251781
2015-11-02 07:39:36 +00:00
Craig Topper 45e83b8ba7 [X86] Remove assertions that check for valid scale values on scatter/gather intrinsics. Nothing upstream prevented illegal values from getting here.
llvm-svn: 251780
2015-11-02 07:24:40 +00:00
Craig Topper cc56e3a1fc [X86] Don't pass a scale value of 0 to scatter/gather intrinsics. This causes the code emitter to throw an assertion if we try to encode it. Need to add a check to fail isel for this, but for now avoid testing it.
llvm-svn: 251779
2015-11-02 07:24:37 +00:00
Craig Topper e69eb78510 [X86] Fold 'if' followed by just an llvm_unreachable into an assert.
llvm-svn: 251778
2015-11-02 07:24:34 +00:00
Craig Topper aebab7c03f [X86] Use isa instead of dyn_cast in a bool context. NFC
llvm-svn: 251777
2015-11-02 07:24:32 +00:00
Craig Topper c70af642a2 [X86] Remove some llvm_unreachables after switches that already have an unreachable in their default case.
llvm-svn: 251776
2015-11-02 07:24:30 +00:00
Craig Topper d6a77ca4bb [X86] Remove a 'break' after an llvm_unreachable.
llvm-svn: 251775
2015-11-02 07:24:27 +00:00
Craig Topper d49a41793c [X86] Use cast instead of dyn_cast and a null check marked unreachable.
llvm-svn: 251774
2015-11-02 07:24:25 +00:00
Craig Topper a8d0be0cd6 Fix a -Wpessimizing-move warning.
llvm-svn: 251773
2015-11-02 05:24:28 +00:00
Craig Topper 95ceb5a60a [X86] Use MVT instead of EVT when the type is known to be simple. NFC
llvm-svn: 251772
2015-11-02 05:24:22 +00:00
Xinliang David Li 2004f003b6 [PGO] Value profiling (index format) code cleanup and testing
1. Added a set of public interfaces in InstrProfRecord
    class to access (read/write) value profile data.
 2. Changed IndexedProfile reader and writer code to 
    use the newly defined interfaces and hide implementation
    details.
 3. Added a couple of unittests for value profiling:
   - Test new interfaces to get and set value profile data
   - Test value profile data merging with various scenarios.

 No functional change is expected. The new interfaces will also
 make it possible to change on-disk format of value prof data
 to be more compact (to be submitted). 

llvm-svn: 251771
2015-11-02 05:08:23 +00:00
Sanjoy Das 52bfa0faa4 [SCEV] Fix PR25369
Have `getConstantEvolutionLoopExitValue` work correctly with multiple
entry loops.

As far as I can tell, `getConstantEvolutionLoopExitValue` never did the
right thing for multiple entry loops; and before r249712 it would
silently return an incorrect answer.  r249712 changed SCEV to fail an
assert on a multiple entry loop, and this change fixes the underlying
issue.

llvm-svn: 251770
2015-11-02 02:06:01 +00:00
NAKAMURA Takumi 50df0c2037 Untabify.
llvm-svn: 251769
2015-11-02 01:38:12 +00:00
Davide Italiano 83b3481601 [LibraryInfo] Point to FreeBSD HEAD repo and not to a dolphin branch.
The latter might go away (anytime soon).

llvm-svn: 251765
2015-11-01 17:00:13 +00:00
Elena Demikhovsky db738d9cc3 AVX-512: Optimized SIMD truncate operations for AVX512F set.
Optimized <8 x i32> to <8 x i16>
<4 x i64> to < 4 x i32>
<16 x i16> to <16 x i8>
All these oprtrations use now AVX512F set (KNL). Before this change it was implemented with AVX2 set.


Differential Revision: http://reviews.llvm.org/D14108

llvm-svn: 251764
2015-11-01 11:45:47 +00:00
Saleem Abdulrasool 50406b92ec RuntimeDyld: add COFF i386 support
This adds support for COFF I386.  This is sufficient for code execution in a
32-bit JIT, though, imported symbols need to custom lowered for the redirection.

llvm-svn: 251761
2015-11-01 01:26:15 +00:00