Commit Graph

120837 Commits

Author SHA1 Message Date
Guozhi Wei f66d384443 Align SP adjustment in function getSPAdjust
This commit adds a new function TargetFrameLowering::alignSPAdjust
and calls it from TargetInstrInfo::getSPAdjust. It fixes PR24142.

llvm-svn: 245253
2015-08-17 22:36:27 +00:00
Dan Gohman 4e2d799cab [WebAssembly] Make getArchTypePrefix return "wasm".
The arch prefix string isn't currently being used for anything on
WebAssembly, but if it were to be used, it makes sense to use the
same arch prefix string for wasm32 and wasm64.

llvm-svn: 245252
2015-08-17 22:35:40 +00:00
Alex Lorenz a56ba6a6dd MIR Serialization: Serialize the local offsets for the stack objects.
llvm-svn: 245249
2015-08-17 22:17:42 +00:00
Alex Lorenz eb62568625 MIR Serialization: Serialize the memory operand's range metadata node.
llvm-svn: 245247
2015-08-17 22:09:52 +00:00
Alex Lorenz 03e940d1f8 MIR Serialization: Serialize the memory operand's noalias metadata node.
llvm-svn: 245246
2015-08-17 22:08:02 +00:00
Alex Lorenz a16f624dc3 MIR Serialization: Serialize the memory operand's alias scope metadata node.
llvm-svn: 245245
2015-08-17 22:06:40 +00:00
Alex Lorenz a617c9162d MIR Serialization: Serialize the memory operand's TBAA metadata node.
llvm-svn: 245244
2015-08-17 22:05:15 +00:00
David Majnemer 83f4bb23c4 [WinEHPrepare] Replace unreasonable funclet terminators with unreachable
It is possible to be in a situation where more than one funclet token is
a valid SSA value.  If we see a terminator which exits a funclet which
doesn't use the funclet's token, replace it with unreachable.

Differential Revision: http://reviews.llvm.org/D12074

llvm-svn: 245238
2015-08-17 20:56:39 +00:00
Douglas Katzman 685a7d1a70 [SPARC]: recognize '.' as the start of an assembler expression.
llvm-svn: 245232
2015-08-17 19:55:01 +00:00
James Molloy 974838f294 [ARM] Fix crash when targetting CPU without NEON
We emulate a scalar vmin/vmax with NEON instructions as they don't exist in the VFP ISA. So only mark these as legal when NEON is available.

Found here: https://code.google.com/p/chromium/issues/detail?id=521671

llvm-svn: 245231
2015-08-17 19:37:12 +00:00
Igor Laevsky 06044f97d2 [ScalarEvolutionExpander] Reuse findExistingExpansion during expansion cost calculation for division
Primary purpose of this change is to reuse existing code inside findExistingExpansion. However it introduces very slight semantic change - findExistingExpansion now looks into exiting blocks instead of a loop latches. Originally heuristic was based on the fact that we want to look at the loop exit conditions. And since all exiting latches will be listed in the ExitingBlocks, heuristic stays roughly the same.

Differential Revision: http://reviews.llvm.org/D12008

llvm-svn: 245227
2015-08-17 16:37:04 +00:00
Silviu Baranga b322aa6f53 [CostModel][AArch64] Increase cost of vector insert element and add missing cast costs
Summary:
Increase the estimated costs for insert/extract element operations on
AArch64. This is motivated by results from benchmarking interleaved
accesses.

Add missing costs for zext/sext/trunc instructions and some integer to
floating point conversions. These costs were previously calculated
by scalarizing these operation and were affected by the cost increase of
the insert/extract element operations.

Reviewers: rengolin

Subscribers: mcrosier, aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D11939

llvm-svn: 245226
2015-08-17 16:05:09 +00:00
Silviu Baranga d5ac26937c [CostModel][ARM] Increase cost of insert/extract operations
Summary:
This change limits the minimum cost of an insert/extract
element operation to 2 in cases where this would result
in mixing of NEON and VFP code.

Reviewers: rengolin

Subscribers: mssimpso, aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D12030

llvm-svn: 245225
2015-08-17 15:57:05 +00:00
Igor Laevsky b20bda77e7 [BasicAliasAnalysis] Do not check ModRef table for intrinsics
All possible ModRef behaviours can be completely represented using existing LLVM IR attributes.

Differential Revision: http://reviews.llvm.org/D12033

llvm-svn: 245224
2015-08-17 15:56:56 +00:00
Artur Pilipenko 34d8ba84c8 Take alignment into account in isSafeToSpeculativelyExecute and isSafeToLoadUnconditionally.
Reviewed By: hfinkel, sanjoy, MatzeB

Differential Revision: http://reviews.llvm.org/D9791

llvm-svn: 245223
2015-08-17 15:54:26 +00:00
Benjamin Kramer 1ee99a8b46 Extend MCAsmLexer so that it can peek forward several tokens
This commit adds a virtual `peekTokens()` function to `MCAsmLexer`
which can peek forward an arbitrary number of tokens.

It also makes the `peekTok()` method call `peekTokens()` method, but
only requesting one token.

The idea is to better support targets which more more ambiguous
assembly syntaxes.

Patch by Dylan McKay!

llvm-svn: 245221
2015-08-17 14:35:25 +00:00
Aaron Ballman aa3d810b5f Correcting a -Woverflow warning where 0xFFFF was overflowing an implicit constant conversion.
llvm-svn: 245220
2015-08-17 14:25:57 +00:00
Joseph Tremoulet 7031c9fc2e [WinEHPrepare] Fix catchret successor phi demotion
Summary:
When demoting an SSA value that has a use on a phi and one of the phi's
predecessors terminates with catchret, the edge needs to be split and the
load inserted in the new block, else we'll still have a cross-funclet SSA
value.

Add a test for this, and for the similar case where a def to be spilled is
on and invoke and a critical edge, which was already implemented but
missing a test.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12065

llvm-svn: 245218
2015-08-17 13:51:37 +00:00
Tobias Grosser 58fdd88751 Revert "Disable targetdatalayoutcheck"
I committed by accident a local hack that should not have made it upstream.
Sorry for the noise.

llvm-svn: 245212
2015-08-17 10:58:03 +00:00
Tobias Grosser 607b8b26e9 Disable targetdatalayoutcheck
llvm-svn: 245210
2015-08-17 10:56:35 +00:00
Daniel Sanders a39ef1c68f [mips] [IAS] Add support for the DLA pseudo-instruction and fix problems with DLI
Summary: It is the same as LA, except that it can also load 64-bit addresses and it only works on 64-bit MIPS architectures.

Reviewers: tomatabacu, seanbruno, vkalintiris

Subscribers: brooks, seanbruno, emaste, llvm-commits

Differential Revision: http://reviews.llvm.org/D9524

llvm-svn: 245208
2015-08-17 10:11:55 +00:00
Michael Kuperstein adc4e9c414 [GMR] isNonEscapingGlobalNoAlias() should look through Bitcasts/GEPs when looking at loads.
This fixes yet another case from PR24288.

Differential Revision: http://reviews.llvm.org/D12064

llvm-svn: 245207
2015-08-17 10:06:08 +00:00
James Molloy 88edc8243d Remove hand-rolled matching for fmin and fmax.
SDAGBuilder now does this all for us.

llvm-svn: 245198
2015-08-17 07:13:20 +00:00
James Molloy c617be559a Rip out hand-rolled matching code for VMIN, VMAX, VMINNM and VMAXNM
This is no longer needed - SDAGBuilder will do this for us.

llvm-svn: 245197
2015-08-17 07:13:15 +00:00
James Molloy ef183397b1 Generate FMINNAN/FMINNUM/FMAXNAN/FMAXNUM from SDAGBuilder.
These only get generated if the target supports them. If one of the variants is not legal and the other is, and it is safe to do so, the other variant will be emitted.

For example on AArch32 (V8), we have scalar fminnm but not fmin.

Fix up a couple of tests while we're here - one now produces better code, and the other was just plain wrong to start with.

llvm-svn: 245196
2015-08-17 07:13:10 +00:00
Karthik Bhat 3af28945b9 Fix PR24469 resulting from r245025 and re-enable dead store elimination across basicblocks.
PR24469 resulted because DeleteDeadInstruction in handleNonLocalStoreDeletion was
deleting the next basic block iterator. Fixed the same by resetting the basic block iterator
post call to DeleteDeadInstruction.

llvm-svn: 245195
2015-08-17 05:51:39 +00:00
David Majnemer 8ed559ad22 Revert "[InstCombinePHI] Partial simplification of identity operations."
This reverts commit r244887, it caused PR24470.

llvm-svn: 245194
2015-08-17 03:11:26 +00:00
Chandler Carruth 2f1fd1658f [PM] Port ScalarEvolution to the new pass manager.
This change makes ScalarEvolution a stand-alone object and just produces
one from a pass as needed. Making this work well requires making the
object movable, using references instead of overwritten pointers in
a number of places, and other refactorings.

I've also wired it up to the new pass manager and added a RUN line to
a test to exercise it under the new pass manager. This includes basic
printing support much like with other analyses.

But there is a big and somewhat scary change here. Prior to this patch
ScalarEvolution was never *actually* invalidated!!! Re-running the pass
just re-wired up the various other analyses and didn't remove any of the
existing entries in the SCEV caches or clear out anything at all. This
might seem OK as everything in SCEV that can uses ValueHandles to track
updates to the values that serve as SCEV keys. However, this still means
that as we ran SCEV over each function in the module, we kept
accumulating more and more SCEVs into the cache. At the end, we would
have a SCEV cache with every value that we ever needed a SCEV for in the
entire module!!! Yowzers. The releaseMemory routine would dump all of
this, but that isn't realy called during normal runs of the pipeline as
far as I can see.

To make matters worse, there *is* actually a key that we don't update
with value handles -- there is a map keyed off of Loop*s. Because
LoopInfo *does* release its memory from run to run, it is entirely
possible to run SCEV over one function, then over another function, and
then lookup a Loop* from the second function but find an entry inserted
for the first function! Ouch.

To make matters still worse, there are plenty of updates that *don't*
trip a value handle. It seems incredibly unlikely that today GVN or
another pass that invalidates SCEV can update values in *just* such
a way that a subsequent run of SCEV will incorrectly find lookups in
a cache, but it is theoretically possible and would be a nightmare to
debug.

With this refactoring, I've fixed all this by actually destroying and
recreating the ScalarEvolution object from run to run. Technically, this
could increase the amount of malloc traffic we see, but then again it is
also technically correct. ;] I don't actually think we're suffering from
tons of malloc traffic from SCEV because if we were, the fact that we
never clear the memory would seem more likely to have come up as an
actual problem before now. So, I've made the simple fix here. If in fact
there are serious issues with too much allocation and deallocation,
I can work on a clever fix that preserves the allocations (while
clearing the data) between each run, but I'd prefer to do that kind of
optimization with a test case / benchmark that shows why we need such
cleverness (and that can test that we actually make it faster). It's
possible that this will make some things faster by making the SCEV
caches have higher locality (due to being significantly smaller) so
until there is a clear benchmark, I think the simple change is best.

Differential Revision: http://reviews.llvm.org/D12063

llvm-svn: 245193
2015-08-17 02:08:17 +00:00
Chandler Carruth b596ba2376 [ADT] Teach FoldingSet to be movable.
This is a very minimal move support - it leaves the moved-from object in
a zombie state that is only valid for destruction and move assignment.
This seems fine to me, and leaving it in the default constructed state
would require adding more state to the object and potentially allocating
memory (!!!) and so seems like a Bad Idea.

llvm-svn: 245192
2015-08-16 23:17:27 +00:00
Craig Topper 802d3d397c [TableGen] Use range-based for loop.
llvm-svn: 245191
2015-08-16 21:27:10 +00:00
Craig Topper c4de7ee73d [TableGen] Move the ConversionRow vector into the ConversionTable instead of copying.
llvm-svn: 245190
2015-08-16 21:27:08 +00:00
Benjamin Kramer bb70d751de [SimplifyLibCalls] Drop default template args. No functional change.
llvm-svn: 245189
2015-08-16 21:16:37 +00:00
Benjamin Kramer dc1d1cbd82 [IR] Simplify code. No functionality change.
llvm-svn: 245188
2015-08-16 21:16:26 +00:00
Sanjay Patel 57fd1dc5db transform fmin/fmax calls when possible (PR24314)
If we can ignore NaNs, fmin/fmax libcalls can become compare and select
(this is what we turn std::min / std::max into).

This IR should then be optimized in the backend to whatever is best for
any given target. Eg, x86 can use minss/maxss instructions.

This should solve PR24314:
https://llvm.org/bugs/show_bug.cgi?id=24314

Differential Revision: http://reviews.llvm.org/D11866

llvm-svn: 245187
2015-08-16 20:18:19 +00:00
Sanjoy Das 94c4aecf83 [LSR][NFC] Don’t duplicate entity name at the beginning of the comment.
llvm-svn: 245183
2015-08-16 18:22:46 +00:00
Sanjoy Das 302bfd04b5 [LSR][NFC] Use camelCase for method names in Formula and RegUseTracker.
llvm-svn: 245182
2015-08-16 18:22:43 +00:00
Sanjay Patel 3ab4a73bac use SDValue bool operator; NFCI
llvm-svn: 245181
2015-08-16 17:54:28 +00:00
Yaron Keren 178c465223 Add missing include guard.
llvm-svn: 245173
2015-08-16 07:55:08 +00:00
David Majnemer e04443baff Revert "Add support for cross block dse. This patch enables dead stroe elimination across basicblocks."
This reverts commit r245025, it caused PR24469.

llvm-svn: 245172
2015-08-16 07:11:59 +00:00
David Majnemer dfa3b09541 [InstCombine] Replace an and+icmp with a trunc+icmp
Bitwise arithmetic can obscure a simple sign-test.  If replacing the
mask with a truncate is preferable if the type is legal because it
permits us to rephrase the comparison more explicitly.

llvm-svn: 245171
2015-08-16 07:09:17 +00:00
Chandler Carruth 5efd530cbc Revert r244127: [PM] Remove a failed attempt to port the CallGraph
analysis ...

It turns out that we *do* need the old CallGraph ported to the new pass
manager. There are times where this model of a call graph is really
superior to the one provided by the LazyCallGraph. For example,
GlobalsModRef very specifically needs the model provided by CallGraph.

While here, I've tried to make the move semantics actually work. =]

llvm-svn: 245170
2015-08-16 06:35:19 +00:00
David Majnemer 1a59e49f3c [X86] Widen the 'AND' mask if doing so shrinks the encoding size
We can set additional bits in a mask given that we know the other
operand of an AND already has some bits set to zero.  This can be more
efficient if doing so allows us to use an instruction which implicitly
sign extends the immediate.

This fixes PR24085.

Differential Revision: http://reviews.llvm.org/D11289

llvm-svn: 245169
2015-08-16 04:52:11 +00:00
NAKAMURA Takumi 5196275eea MergeFunc: Quick fix for r245140, Ignore second, aka Function*, in sorting.
Don't assume second would be ordered in the module.

llvm-svn: 245168
2015-08-16 02:41:23 +00:00
Yaron Keren dfb655fe17 Try to appease VS 2015 warnings from http://reviews.llvm.org/D11890
ByteSize and BitSize should not be size_t but unsigned, considering

1) They are at most 2^16 and 2^19, respectively.
2) BitSize is an argument to Type::getIntNTy which takes unsigned.

Also, use the correct utostr instead itostr and cache the string result.

Thanks to James Touton for reporting this!

llvm-svn: 245167
2015-08-15 19:06:14 +00:00
Sanjay Patel 40d4eb40f6 [x86] enable machine combiner reassociations for scalar single-precision minimums
llvm-svn: 245166
2015-08-15 17:01:54 +00:00
Simon Pilgrim d65ace84c7 Updated broadcast stack folding test to avoid use of broadcast intrinsics.
llvm-svn: 245165
2015-08-15 16:54:18 +00:00
Sanjay Patel 3b7e3677e3 fix typos; NFC
llvm-svn: 245164
2015-08-15 16:53:08 +00:00
Sanjay Patel 9f6c7dddd2 add test case to show current codegen
llvm-svn: 245163
2015-08-15 16:49:50 +00:00
Yaron Keren 8b2a031cff Silence VS2015 warning.
Patch by James Touton!

http://reviews.llvm.org/D11890

llvm-svn: 245161
2015-08-15 14:54:43 +00:00
Simon Pilgrim 0750c84623 [DAGCombiner] Attempt to mask vectors before zero extension instead of after.
For cases where we TRUNCATE and then ZERO_EXTEND to a larger size (often from vector legalization), see if we can mask the source data and then ZERO_EXTEND (instead of after a ANY_EXTEND). This can help avoid having to generate a larger mask, and possibly applying it to several sub-vectors.

(zext (truncate x)) -> (zext (and(x, m))

Includes a minor patch to SystemZ to better recognise 8/16-bit zero extension patterns from RISBG bit-extraction code.

This is the first of a number of minor patches to help improve the conversion of byte masks to clear mask shuffles.

Differential Revision: http://reviews.llvm.org/D11764

llvm-svn: 245160
2015-08-15 13:27:30 +00:00
Chandler Carruth e8824e3026 [PM/AA] Delete the LibCallAliasAnalysis and all the associated
infrastructure.

This AA was never used in tree. It's infrastructure also completely
overlaps that of TargetLibraryInfo which is used heavily by BasicAA to
achieve similar goals to those stated for this analysis.

As has come up in several discussions, the use case here is still really
important, but this code isn't helping move toward that use case. Any
progress on better supporting rich AA information for runtime library
environments would likely be better off starting from scratch or
starting from TargetLibraryInfo than from this base.

Differential Revision: http://reviews.llvm.org/D12028

llvm-svn: 245155
2015-08-15 09:22:21 +00:00
David Majnemer ad28aaa131 [IR] Update CreateCatchRet to take a return value
llvm-svn: 245152
2015-08-15 03:19:29 +00:00
Matt Arsenault 588732bd6e AMDGPU/SI: Only look at live out SGPR defs
When trying to fix SGPR live ranges, skip defs that are
killed in the same block as the def. I don't think
we need to worry about these cases as long as the
live ranges of the SGPRs in dominating blocks are
correct.

This reduces the number of elements the second
loop over the function needs to look at, and makes
it generally easier to understand. The second loop
also only considers if the live range is live
in to a block, which logically means it
must have been live out from another.

llvm-svn: 245150
2015-08-15 02:58:49 +00:00
David Majnemer 0bc0eef71c [IR] Give catchret an optional 'return value' operand
Some personality routines require funclet exit points to be clearly
marked, this is done by producing a token at the funclet pad and
consuming it at the corresponding ret instruction.  CleanupReturnInst
already had a spot for this operand but CatchReturnInst did not.
Other personality routines don't need to use this which is why it has
been made optional.

llvm-svn: 245149
2015-08-15 02:46:08 +00:00
James Y Knight 5567bafe93 Remove redundant TargetFrameLowering::getFrameIndexOffset virtual
function.

This was the same as getFrameIndexReference, but without the FrameReg
output.

Differential Revision: http://reviews.llvm.org/D12042

llvm-svn: 245148
2015-08-15 02:32:35 +00:00
JF Bastien d4698e1bac [WebAssembly] Add Relooper
This is just an initial checkin of an implementation of the Relooper algorithm, in preparation for WebAssembly codegen to utilize. It doesn't do anything yet by itself.

The Relooper algorithm takes an arbitrary control flow graph and generates structured control flow from that, utilizing a helper variable when necessary to handle irreducibility. The WebAssembly backend will be able to use this in order to generate an AST for its binary format.

Author: azakai

Reviewers: jfb, sunfish

Subscribers: jevinskie, arsenm, jroelofs, llvm-commits

Differential revision: http://reviews.llvm.org/D11691

llvm-svn: 245142
2015-08-15 01:23:28 +00:00
JF Bastien 5e4303dc14 Accelerate MergeFunctions with hashing
This patch makes the Merge Functions pass faster by calculating and comparing
a hash value which captures the essential structure of a function before
performing a full function comparison.

The hash is calculated by hashing the function signature, then walking the basic
blocks of the function in the same order as the main comparison function. The
opcode of each instruction is hashed in sequence, which means that different
functions according to the existing total order cannot have the same hash, as
the comparison requires the opcodes of the two functions to be the same order.

The hash function is a static member of the FunctionComparator class because it
is tightly coupled to the exact comparison function used. For example, functions
which are equivalent modulo a single variant callsite might be merged by a more
aggressive MergeFunctions, and the hash function would need to be insensitive to
these differences in order to exploit this.

The hashing function uses a utility class which accumulates the values into an
internal state using a standard bit-mixing function. Note that this is a different interface
than a regular hashing routine, because the values to be hashed are scattered
amongst the properties of a llvm::Function, not linear in memory. This scheme is
fast because only one word of state needs to be kept, and the mixing function is
a few instructions.

The main runOnModule function first computes the hash of each function, and only
further processes functions which do not have a unique function hash. The hash
is also used to order the sorted function set. If the hashes differ, their
values are used to order the functions, otherwise the full comparison is done.

Both of these are helpful in speeding up MergeFunctions. Together they result in
speedups of 9% for mysqld (a mostly C application with little redundancy), 46%
for libxul in Firefox, and 117% for Chromium. (These are all LTO builds.) In all
three cases, the new speed of MergeFunctions is about half that of the module
verifier, making it relatively inexpensive even for large LTO builds with
hundreds of thousands of functions. The same functions are merged, so this
change is free performance.

Author: jrkoenig

Reviewers: nlewycky, dschuff, jfb

Subscribers: llvm-commits, aemerson

Differential revision: http://reviews.llvm.org/D11923

llvm-svn: 245140
2015-08-15 01:18:18 +00:00
Alex Lorenz 3a4a60cba5 MIRLangRef: Describe the syntax that is used to represent machine basic blocks.
llvm-svn: 245138
2015-08-15 01:06:06 +00:00
Matt Arsenault 427a0fd22e LoopStrengthReduce: Try to pass address space to isLegalAddressingMode
This seems to only work some of the time. In some situations,
this seems to use a nonsensical type and isn't actually aware of the
memory being accessed. e.g. if branch condition is an icmp of a pointer,
it checks the addressing mode of i1.

llvm-svn: 245137
2015-08-15 00:53:06 +00:00
Matt Arsenault 297ae311ce AMDGPU/SI: Fix printing useless info with amdhsa
The comments at the bottom would all report 0 if
amdhsa was used.

llvm-svn: 245135
2015-08-15 00:12:39 +00:00
Matt Arsenault 0259a7aa41 AMDGPU/SI: Update LiveVariables
This is simple but won't work if/when this pass
is moved to be post-SSA.

llvm-svn: 245134
2015-08-15 00:12:37 +00:00
Matt Arsenault 670ba46efe AMDGPU/SI: Update LiveIntervals during SIFixSGPRLiveRanges
Does not mark SlotIndexes as reserved, although I think
that might be OK.

LiveVariables still need to be handled.

llvm-svn: 245133
2015-08-15 00:12:35 +00:00
Matt Arsenault b75233235c AMDGPU: Remove unnecessary assert
These shouldn't ever be null. The number of successors
was already asserted to be 2.

llvm-svn: 245132
2015-08-15 00:12:32 +00:00
Matt Arsenault 4275c29a02 AMDGPU/SI: Make comments more precise.
True branch instructions do behave as expected with liveness.

Avoid the phrasing "branch decision is based on a value in an SGPR"
because this could be misleading. A VALU compare instruction's
result is still based on an SGPR, even though that condition
may be divergent.

llvm-svn: 245131
2015-08-15 00:12:30 +00:00
Sanjay Patel 7332e0455f make current codegen visible in the checks, so we can decide if it's right
llvm-svn: 245120
2015-08-14 23:03:01 +00:00
Nick Lewycky 8075fd22b9 Fix a crash where a utility function wasn't aware of fcmp vectors and created a value with the wrong type. Fixes PR24458!
llvm-svn: 245119
2015-08-14 22:46:49 +00:00
Bjarke Hammersholt Roune 9791ed4705 [SCEV] Apply NSW and NUW flags via poison value analysis for sub, mul and shl
Summary:
http://reviews.llvm.org/D11212 made Scalar Evolution able to propagate NSW and NUW flags from instructions to SCEVs for add instructions. This patch expands that to sub, mul and shl instructions.

This change makes LSR able to generate pointer induction variables for loops like these, where the index is 32 bit and the pointer is 64 bit:

  for (int i = 0; i < numIterations; ++i)
    sum += ptr[i - offset];

  for (int i = 0; i < numIterations; ++i)
    sum += ptr[i * stride];

  for (int i = 0; i < numIterations; ++i)
    sum += ptr[3 * (i << 7)];


Reviewers: atrick, sanjoy

Subscribers: sanjoy, majnemer, hfinkel, llvm-commits, meheff, jingyue, eliben

Differential Revision: http://reviews.llvm.org/D11860

llvm-svn: 245118
2015-08-14 22:45:26 +00:00
Pat Gavlin b399095c3f Add a target environment for CoreCLR.
Although targeting CoreCLR is similar to targeting MSVC, there are
certain important differences that the backend must be aware of
(e.g. differences in stack probes, EH, and library calls).

Differential Revision: http://reviews.llvm.org/D11012

llvm-svn: 245115
2015-08-14 22:41:43 +00:00
Sanjay Patel dd175bc6c4 make current codegen visible in the checks, so we can decide if it's right
llvm-svn: 245108
2015-08-14 22:10:59 +00:00
Ahmed Bougacha cd35787217 [AArch64] Fix FMLS scalar-indexed-from-2s-after-neg patterns.
We canonicalize V64 vectors to V128 through insert_subvector: the other
FMLA/FMLS/FMUL/FMULX patterns match that already, but this one doesn't,
so we'd fail to match fmls and generate fneg+fmla instead.
The vector equivalents are already tested and functional.

llvm-svn: 245107
2015-08-14 22:06:05 +00:00
Evgeniy Stepanov 24ac55d884 [msan] Fix handling of musttail calls.
MSan instrumentation for return values of musttail calls is not
allowed by the IR constraints, and not needed at the same time.

llvm-svn: 245106
2015-08-14 22:03:50 +00:00
Alexei Starovoitov cb6b408da4 [bpf] add documentation and instruction set description
llvm-svn: 245105
2015-08-14 22:00:45 +00:00
Alex Lorenz 577d271a75 MIR Serialization: Serialize the '.cfi_same_value' CFI directive.
llvm-svn: 245103
2015-08-14 21:55:58 +00:00
Alex Lorenz c3ba7508f6 MIR Serialization: Serialize the external symbol call entry pseudo source
values.

llvm-svn: 245098
2015-08-14 21:14:50 +00:00
Alex Lorenz 50b826fb75 MIR Serialization: Serialize the global value call entry pseudo source values.
llvm-svn: 245097
2015-08-14 21:08:30 +00:00
Michael Kruse 78a2e4720d [RegionInfo] Remove unused and broken function splitBlock
Summary:
It always makes NewBB the entry of the region instead of OldBB. This breaks if there are edges from inside the region to OldBB. OldBB is moved out of the region and hence there are exiting edges to OldBB and the region's exit block, contradicting the single-exit condition for regions.

The only use from Polly is going to be removed, hence I propose to remove the function completely.

Reviewers: grosser

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11873

llvm-svn: 245092
2015-08-14 20:20:00 +00:00
Tom Stellard bef1094ee7 AMDGPU/SI: Add missing spill class
The compiler was failing to spill for some shaders.

Patch By: Axel Davy

llvm-svn: 245087
2015-08-14 19:46:05 +00:00
Renato Golin 980b6cc42b Revert "[ARM] Fix MachO CPU Subtype selection"
This reverts commit r245081, as it breaks many builds.

llvm-svn: 245086
2015-08-14 19:35:47 +00:00
Alex Lorenz 1039fd1ae5 MIR Serialization: Serialize the 'internal' register operand flag.
llvm-svn: 245085
2015-08-14 19:07:07 +00:00
Alex Lorenz f9a2b12361 MIR Serialization: Serialize the bundled machine instructions.
llvm-svn: 245082
2015-08-14 18:57:24 +00:00
Vedant Kumar 2f079be789 [ARM] Fix MachO CPU Subtype selection
This patch makes the Darwin ARM backend take advantage of TargetParser.  It
also teaches TargetParser about ARMV7K for the first time. This makes target
triple parsing more consistent across llvm.

Differential Revision: http://reviews.llvm.org/D11996

llvm-svn: 245081
2015-08-14 18:36:47 +00:00
Sanjay Patel ed502905f7 [x86] fix allowsMisalignedMemoryAccess() implementation
This patch fixes the x86 implementation of allowsMisalignedMemoryAccess() to correctly
return the 'Fast' output parameter for 32-byte accesses. To test that, an existing load
merging optimization is changed to use the TLI hook. This exposes a shortcoming in the
current logic and results in the regression test update. Changing other direct users of
the isUnalignedMem32Slow() x86 CPU attribute would be a follow-on patch.

Without the fix in allowsMisalignedMemoryAccesses(), we will infinite loop when targeting
SandyBridge because LowerINSERT_SUBVECTOR() creates 32-byte loads from two 16-byte loads
while PerformLOADCombine() splits them back into 16-byte loads.

Differential Revision: http://reviews.llvm.org/D10662

llvm-svn: 245075
2015-08-14 17:53:40 +00:00
Vedant Kumar 06f0678010 [test] Testing write access to llvm
llvm-svn: 245074
2015-08-14 17:42:50 +00:00
Justin Bogner 7ae63aa85d [sancov] Fix an unused variable warning introduced in r245067
llvm-svn: 245072
2015-08-14 17:03:45 +00:00
Kit Barton ae78d53aeb Reverting patch r244235.
This patch will be redone in a different way. See
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150810/292978.html
for more details.

llvm-svn: 245071
2015-08-14 16:54:32 +00:00
Reid Kleckner d7a568fc4f [cmake] Start adding support for LLVM_USE_SANITIZER=Address on Windows
Pass "-fsanitize=address" to the compiler and "-debug" to the linker.

llvm-svn: 245070
2015-08-14 16:48:34 +00:00
Reid Kleckner a57d015154 [sancov] Leave llvm.localescape in the entry block
Summary: Similar to the change we applied to ASan. The same test case works.

Reviewers: samsonov

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11961

llvm-svn: 245067
2015-08-14 16:45:42 +00:00
Chad Rosier 67dca908fe Cleanup test whitespace or lack thereof. NFC.
llvm-svn: 245065
2015-08-14 16:34:15 +00:00
Chris Bieneman 9df5d7c272 [CMake] Fix PR14200, llvm-config output misses -fno-rtti
This change adds RTTI and Exception flags to llvm-config's cxxflags. This solution is a minimal patch to solve the issue, and is recommended for the 3.7 release branch. Tom Stellard's outstanding work is the longer term solution.

Patch By: David Wiberg

llvm-svn: 245064
2015-08-14 16:20:31 +00:00
Rafael Espindola dbaf0498a9 Revert "Centralize the information about which object format we are using."
This reverts commit r245047.

It was failing on the darwin bots. The problem was that when running

./bin/llc -march=msp430

llc gets to

  if (TheTriple.getTriple().empty())
    TheTriple.setTriple(sys::getDefaultTargetTriple());

Which means that we go with an arch of msp430 but a triple of
x86_64-apple-darwin14.4.0 which fails badly.

That code has to be updated to select a triple based on the value of
march, but that is not a trivial fix.

llvm-svn: 245062
2015-08-14 15:48:41 +00:00
Davide Italiano c0ea1e6d63 Convert tests under MC/ELF from macho-dump to llvm-readobj.
Yet another step towards deprecating macho-dump.

llvm-svn: 245059
2015-08-14 15:16:37 +00:00
Sanjay Patel 2e75341b7f don't repeaat function names in comments; NFC
llvm-svn: 245058
2015-08-14 15:11:42 +00:00
Rafael Espindola 90eb70c8a7 Centralize the information about which object format we are using.
Other than some places that were handling unknown as ELF, this should
have no change. The test updates are because we were detecting
arm-coff or x86_64-win64-coff as ELF targets before.

It is not clear if the enum should live on the Triple. At least now it lives
in a single location and should be easier to move somewhere else.

llvm-svn: 245047
2015-08-14 13:31:17 +00:00
James Molloy 87405c7f66 Separate out BDCE's analysis into a separate DemandedBits analysis.
This allows other areas of the compiler to use BDCE's bit-tracking.
NFCI.

llvm-svn: 245039
2015-08-14 11:09:09 +00:00
Simon Pilgrim 1fdc177ccf Renamed min tests (typo)
llvm-svn: 245038
2015-08-14 11:03:31 +00:00
James Molloy 63be198712 [AArch64] FMINNAN/FMAXNAN on f16 is not legal.
Spotted by Ahmed - in r244594 I inadvertently marked f16 min/max as legal.

I've reverted it here, and marked min/max on scalar f16's as promote. I've also added a testcase. The test just checks that the compiler doesn't fall over - it doesn't create fmin nodes for f16 yet.

llvm-svn: 245035
2015-08-14 09:08:50 +00:00
Chandler Carruth bae9e88e06 [PM/AA] Remove two no-op overridden functions that just delegated to the
base class anyways.

llvm-svn: 245034
2015-08-14 08:39:32 +00:00
Adam Nemet 06ccf0145f [LVer] Remove unused Pass parameter from versionLoop, NFC
llvm-svn: 245032
2015-08-14 06:30:26 +00:00
Lang Hames 7e66c6c5e3 [RuntimeDyld] Make sure code-sections aren't under-aligned.
Code-section alignment should be at least as high as the minimum
stub alignment. If the section alignment is lower it can cause
padding to be emitted resulting in alignment errors if the section
is mapped to a higher alignment on the target.

E.g. If a text section with a 4-byte alignment gets 4-bytes of
padding to guarantee 8-byte alignment for stubs but is re-mapped to
an 8-byte alignment on the target, the 4-bytes of padding will push
the stubs to 4-byte alignment causing a crash.

No test case: There is currently no way to control host section
alignment in llvm-rtdyld. This could be made testable by adding
a custom memory manager. I'll look at that in a follow-up patch.

llvm-svn: 245031
2015-08-14 06:26:42 +00:00
David Majnemer b611e3f50e [IR] Add token types
This introduces the basic functionality to support "token types".
The motivation stems from the need to perform operations on a Value
whose provenance cannot be obscured.

There are several applications for such a type but my immediate
motivation stems from WinEH.  Our personality routine enforces a
single-entry - single-exit regime for cleanups.  After several rounds of
optimizations, we may be left with a terminator whose "cleanup-entry
block" is not entirely clear because control flow has merged two
cleanups together.  We have experimented with using labels as operands
inside of instructions which are not terminators to indicate where we
came from but found that LLVM does not expect such exotic uses of
BasicBlocks.

Instead, we can use this new type to clearly associate the "entry point"
and "exit point" of our cleanup.  This is done by having the cleanuppad
yield a Token and consuming it at the cleanupret.
The token type makes it impossible to obscure or otherwise hide the
Value, making it trivial to track the relationship between the two
points.

What is the burden to the optimizer?  Well, it turns out we have already
paid down this cost by accepting that there are certain calls that we
are not permitted to duplicate, optimizations have to watch out for
such instructions anyway.  There are additional places in the optimizer
that we will probably have to update but early examination has given me
the impression that this will not be heroic.

Differential Revision: http://reviews.llvm.org/D11861

llvm-svn: 245029
2015-08-14 05:09:07 +00:00
Karthik Bhat ddc2a86a00 Add support for cross block dse.
This patch enables dead stroe elimination across basicblocks.

Example:
define void @test_02(i32 %N) {
  %1 = alloca i32
  store i32 %N, i32* %1
  store i32 10, i32* @x
  %2 = load i32, i32* %1
  %3 = icmp ne i32 %2, 0
  br i1 %3, label %4, label %5

; <label>:4
  store i32 5, i32* @x
  br label %7

; <label>:5
  %6 = load i32, i32* @x
  store i32 %6, i32* @y
  br label %7

; <label>:7
  store i32 15, i32* @x
  ret void
}
In the above example dead store "store i32 5, i32* @x" is now eliminated.

Differential Revision: http://reviews.llvm.org/D11143

llvm-svn: 245025
2015-08-14 04:17:23 +00:00
Chandler Carruth d541e7304f [PM/AA] Run clang-format over the ObjCARC Alias Analysis code to
normalize its formatting before I make more substantial changes.

llvm-svn: 245024
2015-08-14 03:57:00 +00:00
Chandler Carruth b4ebdf3d72 [PM/AA] Don't bother forward declaring Function and Value, just include
their headers.

llvm-svn: 245023
2015-08-14 03:55:36 +00:00
Saleem Abdulrasool 3e190cb098 PowerPC: remove dead initialization (NFC)
Identified by the clang static analyzer.  No functional change intended.

llvm-svn: 245022
2015-08-14 03:48:35 +00:00
Chandler Carruth 21dcff799a [PM/AA] Extract the interface for GlobalsModRef into a header along with
its creation function.

This required shifting a bunch of method definitions to be out-of-line
so that we could leave most of the implementation guts in the .cpp file.

llvm-svn: 245021
2015-08-14 03:48:20 +00:00
Chandler Carruth 1db22822b4 [PM/AA] Hoist the interface to TBAA into a dedicated header along with
its creation function. Update the relevant includes accordingly.

llvm-svn: 245019
2015-08-14 03:33:48 +00:00
Chandler Carruth f9cbd039bd [PM/AA] Run clang-format over TBAA code to normalize the formatting
before making substantial changes.

llvm-svn: 245017
2015-08-14 03:26:15 +00:00
Chandler Carruth d30e45c35c [PM/AA] Remove a stray #include that snuck in via copy/paste when
creating this header.

llvm-svn: 245016
2015-08-14 03:16:11 +00:00
Chandler Carruth 55eec8be9e [PM/AA] Clean up the SCEV-AA comment formatting and typos.
llvm-svn: 245015
2015-08-14 03:14:50 +00:00
Chandler Carruth 79687faee6 [PM/AA] Run clang-format over the SCEV-AA code to normalize the
formatting.

llvm-svn: 245014
2015-08-14 03:12:16 +00:00
Chandler Carruth ed23528fb2 [PM/AA] Hoist the SCEV-AA interface to its own header and pull the
creation function into that header.

llvm-svn: 245013
2015-08-14 03:11:16 +00:00
Chandler Carruth 42ff448fe4 [PM/AA] Hoist ScopedNoAliasAA's interface into a header and move the
creation function there.

Same basic refactoring as the other alias analyses. Nothing special
required this time around.

llvm-svn: 245012
2015-08-14 02:55:50 +00:00
Chandler Carruth 29109f5b1e [PM/AA] Hoist the value handle definition for CFLAA into the header to
satisfy libc++'s std::forward_list which requires the value type to be
complete.

llvm-svn: 245011
2015-08-14 02:50:34 +00:00
Chandler Carruth 496b7fb569 [PM/AA] Run clang-format over the ScopedNoAliasAA pass prior to making
substantial changes to normalize any formatting.

llvm-svn: 245010
2015-08-14 02:46:07 +00:00
Chandler Carruth 8b046a42f4 [PM/AA] Extract a minimal interface for CFLAA to its own header file.
I've used forward declarations and reorderd the source code some to make
this reasonably clean and keep as much of the code as possible in the
source file, including all the stratified set details. Just the basic AA
interface and the create function are in the header file, and the header
file is now included into the relevant locations.

llvm-svn: 245009
2015-08-14 02:42:20 +00:00
Chandler Carruth f4616a4084 [PM/AA] Delete two pointlessly overridden methods on the AA interface by
the AA counter pass.

For pointsToConstantMemory, I think this is a "bug fix" as I think the
code as written will actually infloop if ever reached. For the
getModRefInfo, this is a no-op change but with a significantly simpler
form.

llvm-svn: 245007
2015-08-14 02:16:12 +00:00
Chandler Carruth 1b179e1102 [PM/AA] Sink all the actual code from AliasAnalysisCounter back into the
.cpp file to make the header much less noisy.

Also makes it easy to use a static helper rather than a public method
for printing lines of stats.

llvm-svn: 245006
2015-08-14 02:12:12 +00:00
Chandler Carruth fafee839d9 [PM/AA] Run clang-format over this code to establish a clean baseline
for subsequent changes.

llvm-svn: 245005
2015-08-14 02:07:05 +00:00
Chandler Carruth 7a9ba04809 [PM/AA] Hoist the AA counter pass into a header to match the analysis
pattern.

Also hoist the creation routine out of the generic header and into the
pass header now that we have one.

I've worked to not make any changes, even formatting ones here. I'll
clean up the formatting and other things in a follow-up patch now that
the code is in the right place.

llvm-svn: 245004
2015-08-14 02:05:41 +00:00
Jingyue Wu 1238f341ba [SeparateConstOffsetFromGEP] sext(a)+sext(b) => sext(a+b) when a+b can't sign-overflow.
Summary:
This patch implements my promised optimization to reunites certain sexts from
operands after we extract the constant offset. See the header comment of
reuniteExts for its motivation.

One key building block that enables this optimization is Bjarke's poison value
analysis (D11212). That helps to prove "a +nsw b" can't overflow.

Reviewers: broune

Subscribers: jholewinski, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D12016

llvm-svn: 245003
2015-08-14 02:02:05 +00:00
Chandler Carruth 45cf0bf117 [PM/AA] Remove the function names and class names from doxygen comments
and generally clean up their formatting.

llvm-svn: 245002
2015-08-14 01:43:46 +00:00
Chandler Carruth 4ac50b08e3 [PM/AA] Move the LibCall AA creation routine declaration to that
analysis's header file to be more consistent with other analyses.

llvm-svn: 245001
2015-08-14 01:43:02 +00:00
Chandler Carruth cf76e57dd8 [PM/AA] Run clang-format over LibCallAliasAnalysis prior to making
substantial changes needed for the new pass manager's AA integration.

llvm-svn: 245000
2015-08-14 01:38:25 +00:00
David Blaikie 08964cade1 Update ExceptionDemo for exception handling API changes (personality function call->function move)
The ExceptionDemo now compiles, but doesn't link... undefined type
references to various typeinfo.

llvm-svn: 244997
2015-08-14 00:37:16 +00:00
Alex Lorenz 9846167816 Update MIRLangRef for MIR syntax change from r244982.
llvm-svn: 244996
2015-08-14 00:36:10 +00:00
David Blaikie f4057fec55 Fix -Wformat warnings in ExceptionDemo
llvm-svn: 244995
2015-08-14 00:31:49 +00:00
David Blaikie 3b20b04072 Fix up the ExceptionDemo for some API changes over the past <time>
This still doesn't build -Werror clean, but other than that it should at
least build.

llvm-svn: 244994
2015-08-14 00:24:56 +00:00
Chandler Carruth bf143e2a20 [LIR] Re-instate r244880, reverted in r244884, factoring the handling of
AliasAnalysis in LoopIdiomRecognize.

The previous commit to LIR, r244879, exposed some scary bug in the loop
pass pipeline with an assert failure that showed up on several bots.
This patch got reverted as part of getting that revision reverted, but
they're actually independent and unrelated. This patch has no functional
change and should be completely safe. It is also useful for my current
work on the AA infrastructure.

llvm-svn: 244993
2015-08-14 00:21:10 +00:00
Alex Lorenz 5022f6bb81 MIR Serialization: Change MIR syntax - use custom syntax for MBBs.
This commit modifies the way the machine basic blocks are serialized - now the
machine basic blocks are serialized using a custom syntax instead of relying on
YAML primitives. Instead of using YAML mappings to represent the individual
machine basic blocks in a machine function's body, the new syntax uses a single
YAML block scalar which contains all of the machine basic blocks and
instructions for that function.

This is an example of a function's body that uses the old syntax:

    body:
      - id: 0
        name: entry
        instructions:
          - '%eax = MOV32r0 implicit-def %eflags'
          - 'RETQ %eax'
    ...

The same body is now written like this:

    body: |
      bb.0.entry:
        %eax = MOV32r0 implicit-def %eflags
        RETQ %eax
    ...

This syntax change is motivated by the fact that the bundled machine
instructions didn't map that well to the old syntax which was using a single
YAML sequence to store all of the machine instructions in a block. The bundled
machine instructions internally use flags like BundledPred and BundledSucc to
determine the bundles, and serializing them as MI flags using the old syntax
would have had a negative impact on the readability and the ease of editing
for MIR files. The new syntax allows me to serialize the bundled machine
instructions using a block construct without relying on the internal flags,
for example:

   BUNDLE implicit-def dead %itstate, implicit-def %s1 ... {
      t2IT 1, 24, implicit-def %itstate
      %s1 = VMOVS killed %s0, 1, killed %cpsr, implicit killed %itstate
   }

This commit also converts the MIR testcases to the new syntax. I developed
a script that can convert from the old syntax to the new one. I will post the
script on the llvm-commits mailing list in the thread for this commit.

llvm-svn: 244982
2015-08-13 23:10:16 +00:00
Sanjay Patel a75c41e5f3 don't repeat function names in comments; NFC
llvm-svn: 244977
2015-08-13 22:53:20 +00:00
David Majnemer 5c73c941c9 [IR] Cleanup indentation of EH instructions
No functional change is intended, just tidying up whitespace.

llvm-svn: 244966
2015-08-13 22:11:40 +00:00
Simon Pilgrim 7218251861 [AMDGPU] Use the general SMAX/SMIN/UMAX/UMIN pattern matching and remove the AMDGPU implementation
D9746 added general SMAX/SMIN/UMAX/UMIN pattern matching to SelectionDAGBuilder::visitSelect.

Differential Revision: http://reviews.llvm.org/D12007

llvm-svn: 244960
2015-08-13 21:40:02 +00:00
Ahmed Bougacha 80e4ac802a [AArch64] Provide "too few operands" diags on short-form NEON also.
We used to just say "invalid type suffix for instruction", which is
misleading. This is because we fallback to the long-form matcher if the
short-form matcher failed, losing the error information on the way.

Save it, so that we can provide a little better diagnostics when the
long-form matcher thinks a suffix is the cause of the error.

llvm-svn: 244955
2015-08-13 21:09:13 +00:00
Alex Lorenz 6866104073 MIR Parser: Don't allow negative alignments for memory operands.
llvm-svn: 244953
2015-08-13 20:55:01 +00:00
Simon Pilgrim 4a8d6b3b9e [X86][SSE] Use the general SMAX/SMIN/UMAX/UMIN pattern matching and remove the X86 implementation
Follow up to D10947 - D9746 added general SMAX/SMIN/UMAX/UMIN pattern matching to SelectionDAGBuilder::visitSelect.

This patch removes the X86 implementation and improves the AVX1/AVX2 support to correctly lower 256-bit integer vectors.

Differential Revision: http://reviews.llvm.org/D12006

llvm-svn: 244949
2015-08-13 20:45:55 +00:00
Davide Italiano a195386ca1 [SimplifyLibCalls] Correctly set the is_zero_undef flag for llvm.cttz
If <src> is non-zero we can safely set the flag to true, and this
results in less code generated for, e.g. ffs(x) + 1 on FreeBSD.
Thanks to majnemer for suggesting the fix and reviewing.

Code generated before the patch was applied:


 0:   0f bc c7                bsf    %edi,%eax
 3:   b9 20 00 00 00          mov    $0x20,%ecx
 8:   0f 45 c8                cmovne %eax,%ecx
 b:   83 c1 02                add    $0x2,%ecx
 e:   b8 01 00 00 00          mov    $0x1,%eax
13:   85 ff                   test   %edi,%edi
15:   0f 45 c1                cmovne %ecx,%eax
18:   c3                      retq

Code generated after the patch was applied:

 0:   0f bc cf                bsf    %edi,%ecx
 3:   83 c1 02                add    $0x2,%ecx
 6:   85 ff                   test   %edi,%edi
 8:   b8 01 00 00 00          mov    $0x1,%eax
 d:   0f 45 c1                cmovne %ecx,%eax
10:   c3                      retq

It seems we can still use cmove and save another 'test' instruction, but
that can be tackled separately.

Differential Revision:  http://reviews.llvm.org/D11989	

llvm-svn: 244947
2015-08-13 20:34:26 +00:00
Alex Lorenz 620f89145b MIR Parser: Extract the code that parses the alignment into a new method. NFC.
This commit extracts the code that parses the memory operand's alignment into
a new method named 'parseAlignment' so that it can be reused when parsing the
basic block's alignment attribute.

llvm-svn: 244945
2015-08-13 20:33:33 +00:00
Simon Pilgrim 829091e310 [X86][SSE] Tests for SMAX/SMIN/UMAX/UMIN vector instructions
llvm-svn: 244944
2015-08-13 20:31:03 +00:00
Alex Lorenz 9b62cf6143 MIR Parser: Rename the method 'diagFromLLVMAssemblyDiag'. NFC.
This commit renames the method 'diagFromLLVMAssemblyDiag' to
'diagFromBlockStringDiag'. This method will be used when converting diagnostics
from other YAML block strings, and not just the LLVM module block string, so
the new name should reflect that.

llvm-svn: 244943
2015-08-13 20:30:11 +00:00
Jingyue Wu 13a80eaceb [SeparateConstOffsetFromGEP] strengthen the inbounds attribute
We used to be over-conservative about preserving inbounds. Actually, the second
GEP (which applies the constant offset) can inherit the inbounds attribute of
the original GEP, because the resultant pointer is equivalent to that of the
original GEP. For example,

  x  = GEP inbounds a, i+5
    =>
  y = GEP a, i               // inbounds removed
  x = GEP inbounds y, 5      // inbounds preserved

llvm-svn: 244937
2015-08-13 18:48:49 +00:00
David Majnemer d7fafa0ffc [llvm-cxxdump] Correctly process relocations when given multiple files
Archive files wouldn't lead to us reprocessing the section relocations
for the new object files.

llvm-svn: 244932
2015-08-13 18:31:43 +00:00
Yaron Keren 556b21aa10 Remove and forbid raw_svector_ostream::flush() calls.
After r244870 flush() will only compare two null pointers and return,
doing nothing but wasting run time. The call is not required any more
as the stream and its SmallString are always in sync.

Thanks to David Blaikie for reviewing.

llvm-svn: 244928
2015-08-13 18:12:56 +00:00
Nick Lewycky e2f6fb5d0a Fix GCC warning: extra `;' [-Wpedantic].
llvm-svn: 244924
2015-08-13 18:10:19 +00:00
Nemanja Ivanovic 1c39ca6501 Scalar to vector conversions using direct moves
This patch corresponds to review:
http://reviews.llvm.org/D11471

It improves the code generated for converting a scalar to a vector value. With
direct moves from GPRs to VSRs, we no longer require expensive stack operations
for this. Subsequent patches will handle the reverse case and more general
operations between vectors and their scalar elements.

llvm-svn: 244921
2015-08-13 17:40:44 +00:00
Igor Laevsky 30143aee11 Emit argmemonly attribute for intrinsics.
Differential Revision: http://reviews.llvm.org/D11352

llvm-svn: 244920
2015-08-13 17:40:04 +00:00
James Molloy 31117875c2 [ARM] FMINNAN/FMAXNAN of f64 are not legal.
This was my error. We've got f32 marked as legal because they're simulated using a v2f32 instruction, but there's no equivalent for f64.

This will get test coverage imminently when D12015 lands.

llvm-svn: 244916
2015-08-13 17:28:26 +00:00
James Molloy c71f78f49f [ARM] Allow vmin/vmax of scalars to be emitted without UseNEONForFP.
This overrides the default to more closely resemble the hand-crafted matching logic in ISelLowering. It makes sense, as there is no VFP equivalent of vmin or vmax, to use them when they're available even if in general VFP ops should be preferred.

This should be NFC.

llvm-svn: 244915
2015-08-13 17:28:20 +00:00
James Molloy 6f0a9d7e1b [ARM] Rejig vmax tests a bit
They rely on global fast-math options, but soon ISel will rely only on fast-math flags on the instructions themselves. Rip the fast checks out into their own file so we can mark their instructions as fast.

llvm-svn: 244914
2015-08-13 17:28:16 +00:00
James Molloy e7695bca0f [AArch64] Small rejig of fmax tests, NFCI.
These tests relied on -enable-no-nans-fp-math, whereas soon they'll take their no-nans hint
from the FCMP instruction itself, so split the no-nans stuff out into its own test.

Also do a slight rejig of instruction order. The old FMIN/MAX backend matching had to deal with looking through casts, which it never did particularly well. Now, instcombine will recognize such patterns and canonicalize the cast outside the select. So modify the test inputs to assume that instcombine has already run.

llvm-svn: 244913
2015-08-13 17:28:10 +00:00
Erik Eckstein 11fc8175d9 [DeadStoreElimination] remove a redundant store even if the load is in a different block.
DeadStoreElimination does eliminate a store if it stores a value which was loaded from the same memory location.
So far this worked only if the store is in the same block as the load.
Now we can also handle stores which are in a different block than the load.
Example:

define i32 @test(i1, i32*) {
entry:
  %l2 = load i32, i32* %1, align 4
  br i1 %0, label %bb1, label %bb2
bb1:
  br label %bb3
bb2:
  ; This store is redundant
  store i32 %l2, i32* %1, align 4
  br label %bb3
bb3:
  ret i32 0
}

Differential Revision: http://reviews.llvm.org/D11854

llvm-svn: 244901
2015-08-13 15:36:11 +00:00
Petar Jovanovic d22164dc3b [mips][mcjit] Calculate correct addend for HI16 and PCHI16 reloc
Previously, for O32 ABI we did not calculate correct addend for R_MIPS_HI16
and R_MIPS_PCHI16 relocations. This patch fixes that.

Patch by Vladimir Radosavljevic.

Differential Revision: http://reviews.llvm.org/D11186

llvm-svn: 244897
2015-08-13 15:12:49 +00:00
Joseph Tremoulet c9ff914ced [WinEHPrepare] Update demotion logic
Summary:
Update the demotion logic in WinEHPrepare to avoid creating new cleanups by
walking predecessors as necessary to insert stores for EH-pad PHIs.

Also avoid creating stores for EH-pad PHIs that have no uses.

The store/load placement is still pretty naive.  Likely future improvements
(at least for optimized compiles) include:
 - Share loads for related uses as possible
 - Coalesce non-interfering use/def-related PHIs
 - Store at definition point rather than each PHI pred for non-interfering
   lifetimes.


Reviewers: rnk, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11955

llvm-svn: 244894
2015-08-13 14:30:10 +00:00
Ulrich Weigand a887f06214 [SystemZ] Support large LLVM IR struct return values
Recent mesa/llvmpipe crashes on SystemZ due to a failed assertion when
attempting to compile a routine with a return type of
  { <4 x float>, <4 x float>, <4 x float>, <4 x float> }
on a system without vector instruction support.

This is because after legalizing the vector type, we get a return value
consisting of 16 floats, which cannot all be returned in registers.

Usually, what should happen in this case is that the target's CanLowerReturn
routine rejects the return type, in which case SelectionDAG falls back to
implementing a structure return in memory via implicit reference.

However, the SystemZ target never actually implemented any CanLowerReturn
routine, and thus would accept any struct return type.

This patch fixes the crash by implementing CanLowerReturn.  As a side effect,
this also handles fp128 return values, fixing a todo that was noted in
SystemZCallingConv.td.

llvm-svn: 244889
2015-08-13 13:37:06 +00:00
Yaron Keren a3668a3fcd Remove raw_svector_ostream::resync and users. It's no-op after r244870.
llvm-svn: 244888
2015-08-13 12:42:25 +00:00
Charlie Turner 6153698f26 [InstCombinePHI] Partial simplification of identity operations.
Consider this code:

BB:
  %i = phi i32 [ 0, %if.then ], [ %c, %if.else ]
  %add = add nsw i32 %i, %b
  ...

In this common case the add can be moved to the %if.else basic block, because
adding zero is an identity operation. If we go though %if.then branch it's
always a win, because add is not executed; if not, the number of instructions
stays the same.

This pattern applies also to other instructions like sub, shl, shr, ashr | 0,
mul, sdiv, div | 1.

Patch by Jakub Kuderski!

llvm-svn: 244887
2015-08-13 12:38:58 +00:00
Renato Golin 655348f0b2 Revert "[LIR] Start leveraging the fundamental guarantees of a loop..."
This reverts commit r244879, as it broke the test-suite on
SingleSource/Regression/C/2004-03-15-IndirectGoto in AArch64.

llvm-svn: 244885
2015-08-13 11:25:38 +00:00
Renato Golin 4d57906b0e Revert "[LIR] Handle access to AliasAnalysis the same way as the other analysis in LoopIdiomRecognize."
This reverts commit r244880, as it broke the test-suite on
SingleSource/Regression/C/2004-03-15-IndirectGoto in AArch64.

llvm-svn: 244884
2015-08-13 11:25:35 +00:00
Ashutosh Nema 47802628f7 Test Commit.
llvm-svn: 244883
2015-08-13 11:18:35 +00:00
John Brawn 68acdcb435 [ARM] Reorganise and simplify thumb-1 load/store selection
Other than PC-relative loads/store the patterns that match the various
load/store addressing modes have the same complexity, so the order that they
are matched is the order that they appear in the .td file.

Rearrange the instruction definitions in ARMInstrThumb.td, and make use of
AddedComplexity for PC-relative loads, so that the instruction matching order
is the order that results in the simplest selection logic. This also makes
register-offset load/store be selected when it should, as previously it was
only selected for too-large immediate offsets.

Differential Revision: http://reviews.llvm.org/D11800

llvm-svn: 244882
2015-08-13 10:48:22 +00:00
Chandler Carruth c2af09823f [LIR] Handle access to AliasAnalysis the same way as the other analysis
in LoopIdiomRecognize. This is what started me staring at this code. Now
migrating it with the new AA stuff will be trivial.

llvm-svn: 244880
2015-08-13 10:00:53 +00:00
Chandler Carruth 8ae7b81559 [LIR] Start leveraging the fundamental guarantees of a loop in
simplified form to remove redundant checks and simplify the code for
popcount recognition. We don't actually need to handle all of these
cases.

I've left a FIXME for one in particular until I finish inspecting to
make sure we don't actually *rely* on the predicate in any way.

llvm-svn: 244879
2015-08-13 09:56:20 +00:00
Chandler Carruth 18c2669aca [LIR] Handle the LoopInfo the same as all the other analyses. No utility
really in breaking pattern just for this analysis.

llvm-svn: 244878
2015-08-13 09:27:01 +00:00
Simon Pilgrim becd5e8abd [InstCombine] SSE/AVX vector shifts demanded shift amount bits
Most SSE/AVX (non-constant) vector shift instructions only use the lower 64-bits of the 128-bit shift amount vector operand, this patch calls SimplifyDemandedVectorElts to optimize for this.

I had to refactor some of my recent InstCombiner work on the vector shifts to avoid quite a bit of duplicate code, it means that SimplifyX86immshift now (re)decodes the type of shift.

Differential Revision: http://reviews.llvm.org/D11938

llvm-svn: 244872
2015-08-13 07:39:03 +00:00
Yaron Keren 3d1173ba1a Modify raw_svector_ostream to use its SmallString without additional buffering.
This is faster and avoids the stream and SmallString state synchronization issue.
resync() is a no-op and may be safely deleted.  I'll do so in a follow-up commit.

Reviewed by Rafael Espindola.

llvm-svn: 244870
2015-08-13 06:19:52 +00:00
Chen Li f458c6f313 [LoopUnswitch] Check OptimizeForSize before traversing over all basic blocks in current loop
Summary: This patch moves the check of OptimizeForSize before traversing over all basic blocks in current loop. If OptimizeForSize is set to true, no non-trivial unswitch is ever allowed. Therefore, the early exit will help reduce compilation time. This patch should be NFC. 

Reviewers: reames, weimingz, broune

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11997

llvm-svn: 244868
2015-08-13 05:24:29 +00:00
Ahmed Bougacha a196661bb0 [CodeGen] Mark the promoted FCOPYSIGN result FP_ROUND as TRUNCating.
Now that we can properly promote mismatched FCOPYSIGNs (r244858), we
can mark the FP_ROUND on the result as truncating, to expose folding.

FCOPYSIGN doesn't change anything but the sign bit, so
  (fp_round (fcopysign (fpext a), b))
is equivalent to (modulo the sign bit):
  (fp_round (fpext a))
which is a no-op.

llvm-svn: 244862
2015-08-13 01:32:30 +00:00
Ahmed Bougacha b2a9ed910e [AArch64] Cleanup vector-fcopysign.ll test. NFC.
llvm-svn: 244861
2015-08-13 01:20:38 +00:00
Ahmed Bougacha 2a97b1bcf8 [AArch64] Also custom-lowering mismatched vector/f16 FCOPYSIGN.
We can lower them using our cool tricks if we fpext/fptrunc the second
input, like we do for f32/f64.

Follow-up to r243924, r243926, and r244858.

llvm-svn: 244860
2015-08-13 01:13:56 +00:00
Ahmed Bougacha b5b0cfdff7 [CodeGen] Assert on getNode(FP_EXTEND) with a smaller dst type.
This would have caught the problem in r244858.

llvm-svn: 244859
2015-08-13 01:10:29 +00:00
Ahmed Bougacha 40ded502ff [CodeGen] When Promoting, don't extend the 2nd FCOPYSIGN operand.
We don't care about its type, and there's even a combine that'll fold
away the FP_EXTEND if we let it run. However, until it does, we'll have
something broken like:
  (f32 (fp_extend (f64 v)))

Scalar f16 follow-up to r243924.

llvm-svn: 244858
2015-08-13 01:09:43 +00:00
Ahmed Bougacha 31e0d9a2b1 [CodeGen] Simplify getNode(*EXT/TRUNC) type size assert. NFC.
We already check that vectors have the same number of elements, we
don't need to use the scalar types explicitly: comparing the size of
the whole vector is enough.

llvm-svn: 244857
2015-08-13 01:08:48 +00:00
Rafael Espindola b82455d262 There is only one saver of strings.
llvm-svn: 244854
2015-08-13 01:07:02 +00:00
Chandler Carruth dc298329cc [LIR] Make the LoopIdiomRecognize pass get analyses essentially the same
way as every other pass. This simplifies the code quite a bit and is
also more idiomatic! <ba-dum!>

llvm-svn: 244853
2015-08-13 01:03:26 +00:00
Chandler Carruth 8219a501da [LIR] Remove the dedicated class for popcount recognition and sink the
code into methods on LoopIdiomRecognize.

This simplifies the code somewhat and also makes it much easier to move
the analyses around. Ultimately, the separate class wasn't providing
significant value over methods -- it contained the precondition basic
block and the current loop. The current loop is already available and
the precondition block wasn't needed everywhere and is easy to pass
around.

In several cases I just moved things to be static functions because they
already accepted most of their inputs as arguments.

This doesn't fix the way we manage analyses yet, that will be the next
patch, but it already makes the code over 50 lines shorter.

No functionality changed.

llvm-svn: 244851
2015-08-13 00:44:29 +00:00
Rafael Espindola 169284a67b Return ErrorOr from FileOutputBuffer::create. NFC.
llvm-svn: 244848
2015-08-13 00:31:39 +00:00
Dan Gohman ed8aa2745a [WebAssembly] Declare the llvm.wasm.page.size() intrinsic.
llvm-svn: 244847
2015-08-13 00:26:04 +00:00
Chandler Carruth d9c6070c98 [LIR] Move all the helpers to be private and re-order the methods in
a way that groups things logically. No functionality changed.

llvm-svn: 244845
2015-08-13 00:10:03 +00:00
Steve King a20803a70f Test Commit - Corrected spelling in README.txt.
llvm-svn: 244842
2015-08-12 23:56:50 +00:00
Chandler Carruth be158b17db [LIR] Remove the 'LIRUtils' abstraction which was unnecessary and adding
complexity.

There is only one function that was called from multiple locations, and
that was 'getBranch' which has a reasonable one-line spelling already:
dyn_cast<BranchInst>(BB->getTerminator). We could make this shorter, but
it doesn't seem to add much value. Instead, we should avoid calling it
so many times on the same basic blocks, but that will be in a subsequent
patch.

The other functions are only called in one location, so inline them
there, and take advantage of this to use direct early exit and reduce
indentation. This makes it much more clear what is being tested for, and
in fact makes it clear now to me that there are simpler ways to do this
work. However, this patch just does the mechanical inlining. I'll clean
up the functionality of the code to leverage loop simplified form more
effectively in a follow-up.

Despite lots of early line breaks due to early-exit, this is still
shorter than it was before.

llvm-svn: 244841
2015-08-12 23:55:56 +00:00
David Blaikie b600718a35 Simplify PackedVector by removing user-defined special members that aren't any different than the defaults
This causes the other special members (like move and copy construction,
and move assignment) to come through for free. Some code in clang was
depending on the (deprecated, in the original code) copy ctor. Now that
there's no user-defined special members, they're all available without
any deprecation concerns.

llvm-svn: 244835
2015-08-12 23:26:12 +00:00
David Blaikie 8c20b9e1c9 IRBuilder: Use move semantics for the IRBuilderInserter parameter
Just drive by cleanup while fixing -Wdeprecated warnings.

llvm-svn: 244832
2015-08-12 23:18:49 +00:00
Chandler Carruth bad690e8f7 [LIR] Run clang-format over LoopIdiomRecognize in preparation for
a significant code cleanup here.

The handling of analyses in this pass is overly complex and can be
simplified significantly, but the right way to do that is to simplify
all of the code not just the analyses, and that'll require pretty
extensive edits that would be noisy with formatting changes mixed into
them.

llvm-svn: 244828
2015-08-12 23:06:37 +00:00
Chandler Carruth 295282e0ab [PM/AA] Remove the AliasDebugger pass.
This debugger was designed to catch places where the old update API was
failing to be used correctly. As I've removed the update API, it no
longer serves any purpose. We can introduce new debugging aid passes
around any future work w.r.t. updating AAs.

Note that I've updated the documentation here, but really I need to
rewrite the documentation to carefully spell out the ideas around
stateful AA and how things are changing in the AA world. However, I'm
hoping to do that as a follow-up to the refactoring of the AA
infrastructure to work in both old and new pass managers so that I can
write the documentation specific to that world.

Differential Revision: http://reviews.llvm.org/D11984

llvm-svn: 244825
2015-08-12 22:54:47 +00:00
David Majnemer f49de4814c Add myself as the InstCombine owner.
llvm-svn: 244823
2015-08-12 22:30:45 +00:00
Philip Reames 971dc3a82a [RewriteStatepointsForGC] Avoid using unrelocated pointers after safepoints
To be clear: this is an *optimization* not a correctness change.

CodeGenPrep likes to duplicate icmps feeding branch instructions to take advantage of x86's ability to fuze many comparison/branch patterns into a single micro-op and to reduce the need for materializing i1s into general registers. PlaceSafepoints likes to place safepoint polls right at the end of basic blocks (immediately before terminators) when inserting entry and backedge safepoints. These two heuristics interact in a somewhat unfortunate way where the branch terminating the original block will be controlled by a condition driven by unrelocated pointers. This forces the register allocator to keep both the relocated and unrelocated values of the pointers feeding the icmp alive over the safepoint poll.

One simple fix would have been to just adjust PlaceSafepoints to move one back in the basic block, but you can reach similar cases as a result of LICM or other hoisting passes. As a result, doing a post insertion fixup seems to be more robust.

I considered doing this in CodeGenPrep itself, but having to update the live sets of already rewritten safepoints gets complicated fast. In particular, you can't just use def/use information because by moving the icmp, we're extending the live range of it's inputs potentially.

Instead, this patch teaches RewriteStatepointsForGC to make the required adjustments before making the relocations explicit in the IR. This change really highlights the fact that RSForGC is a CodeGenPrep-like pass which is performing target specific lowering. In the long run, we may even want to combine the two though this would require a lot more smarts to be integrated into RSForGC first. We currently rely on being able to run a set of cleanup passes post rewriting because the IR RSForGC generates is pretty damn ugly.

Differential Revision: http://reviews.llvm.org/D11819

llvm-svn: 244821
2015-08-12 22:11:45 +00:00
Alex Lorenz 2791dcca60 MIR Parser: Allow the MI IR references to reference global values.
This commit fixes a bug where MI parser couldn't resolve the named IR
references that referenced named global values.

llvm-svn: 244817
2015-08-12 21:27:16 +00:00
Alex Lorenz 0cc671bf79 MIR Serialization: Serialize the fixed stack pseudo source values.
llvm-svn: 244816
2015-08-12 21:23:17 +00:00
Cong Hou 2a02c1cb1a NFC. Convert comments in MachineBasicBlock.cpp into new style.
llvm-svn: 244815
2015-08-12 21:18:54 +00:00
Alex Lorenz ea88212b41 MIR Parser: Move the parsing of fixed stack object indices into new method. NFC
This commit moves the code that parses the frame indices for the fixed stack
objects from the method 'parseFixedStackObjectOperand' to a new method named
'parseFixedStackFrameIndex', so that it can be reused when parsing fixed stack
pseudo source values.

llvm-svn: 244814
2015-08-12 21:17:02 +00:00
Alex Lorenz 4be56e9370 MIR Serialization: Serialize the jump table pseudo source values.
llvm-svn: 244813
2015-08-12 21:11:08 +00:00
Aaron Ballman c25c9fdf08 Move the object being used to move-initialize when calling the base class' constructor from the ctor-initializer. This should have no effect given the triviality of the class, but it allows for easier maintenance should the semantics of the base class change. NFC intended.
llvm-svn: 244812
2015-08-12 21:10:41 +00:00
Alex Lorenz d858f874fa MIR Serialization: Serialize the GOT pseudo source values.
llvm-svn: 244809
2015-08-12 21:00:22 +00:00
Philip Reames 9ac4e38a16 [RewriteStatepointsForGC] Handle extractelement fully in the base pointer algorithm
When rewriting the IR such that base pointers are available for every live pointer, we potentially need to duplicate instructions to propagate the base. The original code had only handled PHI and Select under the belief those were the only instructions which would need duplicated. When I added support for vector instructions, I'd added a collection of hacks for ExtractElement which caught most of the common cases. Of course, I then found the one test case my hacks couldn't cover. :)

This change removes all of the early hacks for extract element. By defining extractelement as a BDV (rather than trying to look through it), we can extend the rewriting algorithm to duplicate the extract as needed.  Note that a couple of peephole optimizations were left in for the moment, because while we now handle extractelement as a first class citizen, we're not yet handling insertelement.  That change will follow in the near future.  

llvm-svn: 244808
2015-08-12 21:00:20 +00:00
Alex Lorenz 46e9558ac6 MIR Serialization: Serialize the stack pseudo source values.
llvm-svn: 244806
2015-08-12 20:44:16 +00:00
Sanjay Patel e24c60eb54 fix typo; NFC
llvm-svn: 244805
2015-08-12 20:36:18 +00:00
Dan Gohman 70e51cb19d Update a comment; Emscripten no longer uses le32 and le64. NFC.
llvm-svn: 244804
2015-08-12 20:34:40 +00:00
Alex Lorenz 91097a3ffa MIR Serialization: Serialize the constant pool pseudo source values.
llvm-svn: 244803
2015-08-12 20:33:26 +00:00
Lenny Maiorani 1230a54970 Fix missing space in libfuzzer's help text.
llvm-svn: 244800
2015-08-12 20:00:10 +00:00
Hans Wennborg 9aaf0a90e9 Docs: keep copyright years up-to-date.
llvm-svn: 244789
2015-08-12 18:27:23 +00:00
Chandler Carruth 19ac7d5b29 [PM/AA] Add missing static dependency edges from DSE and memdep to TLI.
I forgot to add these in r244780 and r244778. Sorry about that.

Also order the static dependencies in a lexicographical order.

llvm-svn: 244787
2015-08-12 18:10:45 +00:00