Commit Graph

18755 Commits

Author SHA1 Message Date
Aaron Ballman c681c3d890 Silencing a spurious -Wreturn-type warning; NFC.
llvm-svn: 238099
2015-05-23 14:46:49 +00:00
Duncan P. N. Exon Smith 68b3f30778 AsmPrinter: Remove the vtable-entry from DIEValue
Remove all virtual functions from `DIEValue`, dropping the vtable
pointer from its layout.  Instead, create "impl" functions on the
subclasses, and use the `DIEValue::Type` to implement the dynamic
dispatch.

This is necessary -- obviously not sufficient -- for passing `DIEValue`s
around by value.  However, this change stands on its own: we make tons
of these.  I measured a drop in memory usage from 888 MB down to 860 MB,
or around 3.2%.

(I'm looking at `llc` memory usage on `verify-uselistorder.lto.opt.bc`;
see r236629 for details.)

llvm-svn: 238084
2015-05-23 01:45:07 +00:00
Duncan P. N. Exon Smith d5aa33525c CodeGen: Remove redundant DIETypeSignature::dump(), NFC
We already have this in `DIEValue`; no reason to shadow it.

llvm-svn: 238082
2015-05-23 01:26:26 +00:00
Akira Hatanaka ddf76aa36f Stop resetting NoFramePointerElim in TargetMachine::resetTargetOptions.
This is part of the work to remove TargetMachine::resetTargetOptions.

In this patch, instead of updating global variable NoFramePointerElim in
resetTargetOptions, its use in DisableFramePointerElim is replaced with a call
to TargetFrameLowering::noFramePointerElim. This function determines on a
per-function basis if frame pointer elimination should be disabled.

There is no change in functionality except that cl:opt option "disable-fp-elim"
can now override function attribute "no-frame-pointer-elim". 

llvm-svn: 238080
2015-05-23 01:14:08 +00:00
Akira Hatanaka bd881834c5 Simplify and rename function overrideFunctionAttributes. NFC.
This is in preparation to making changes needed to stop resetting
NoFramePointerElim in resetTargetOptions.

llvm-svn: 238079
2015-05-23 01:12:26 +00:00
Ahmed Bougacha 236f9040d0 [AArch64][CGP] Sink zext feeding stxr/stlxr into the same block.
The usual CodeGenPrepare trickery, on a target-specific intrinsic.
Without this, the expansion of atomics will usually have the zext
be hoisted out of the loop, defeating the various patterns we have
to catch this precise case.

Differential Revision: http://reviews.llvm.org/D9930

llvm-svn: 238054
2015-05-22 21:37:17 +00:00
Puyan Lotfi bb457b973d Compile time improvements to VirtRegRewriter.
This change to VirtRegRewriter::addMBBLiveIns adds live-in registers for each
MachineBasicBlock's LiveIns set without isLiveIn checks as they are being added
because doing so is expensive. After all live-in registers are added, the LiveIn
vectors are sorted and uniqued.

llvm-svn: 238008
2015-05-22 08:11:26 +00:00
NAKAMURA Takumi 263b27997d Revert r237954, "Resubmit r237708 (MIR Serialization: print and parse LLVM IR using MIR format)."
It brought cyclic dependencies between LLVMCodeGen and LLVMMIR.

llvm-svn: 238007
2015-05-22 07:17:07 +00:00
Duncan P. N. Exon Smith 0c54197d31 SDAG: Give SDDbgValues their own allocator (and reset it)
Previously `SDDbgValue`s used the general allocator that lives for all
of `SelectionDAG`.  Instead, give them their own allocator, and reset it
whenever `SDDbgInfo::clear()` is called, plugging a spiritual leak.

This drops `SelectionDAGBuilder::visitIntrinsicCall()` off of my heap
profile (was at around 2% of `llc` for codegen of `-flto -g`).  Thanks
to Pete Cooper for spotting the problem and suggesting the fix.

llvm-svn: 237998
2015-05-22 05:45:19 +00:00
Duncan P. N. Exon Smith 1f0c1c4f47 SDAG: Cleanup initialization of SDDbgValue, NFC
Cleanup how `SDDbgValue` is initialized, and rearrange the fields to
save two pointers in the struct layout.  No real functionality change
though (and I doubt the memory savings would show up in a profile).

llvm-svn: 237997
2015-05-22 05:35:53 +00:00
Quentin Colombet 7b73bfa67a [InlineSpiller] Fix rematerialization for bundles.
Prior to this patch, we could update the operand of another MI in the same
bundle.

Longer version:
Before InlineSpiller rematerializes a vreg, it iterates over operands of each MI
in a bundle, collecting all (MI, OpNo) pairs that reference that vreg.

Then if it does rematerialize, it goes through the pair list and replaces the
operands with the new (rematerialized) vreg.  The problem is, it tries to
replace all of these operands in the main MI ! This works fine for single MIs.
However, if we are processing a bundle of MIs and the list contains multiple
pairs - the rematerialization will either crash trying to access a non-existing
operand of the main MI, or silently corrupt one of the existing ones. It will
also ignore other MIs in the bundle.

The obvious fix is to use the MI pointers saved in collected (MI, OpNo) pairs.
This must have been the original intent of the pair list but somehow these
pointers got lost.

Patch by Dmitri Shtilman <dshtilman@icloud.com>!

Differential revision: http://reviews.llvm.org/D9904

<rdar://problem/21002163>

llvm-svn: 237964
2015-05-21 21:41:55 +00:00
Sanjay Patel f911484051 fix typo in comment; NFC
llvm-svn: 237962
2015-05-21 21:29:13 +00:00
Alex Lorenz c37baf82a9 Resubmit r237708 (MIR Serialization: print and parse LLVM IR using MIR format).
This commit is a 2nd attempt at committing the initial MIR serialization patch.
The first commit (r237708) made the incremental buildbots unstable and was 
reverted in r237730. The original commit didn't add a terminating null 
character to the LLVM IR source which was passed to LLParser, and this 
sometimes caused the test 'llvmIR.mir' to fail with a parsing error because 
the LLVM IR source didn't have a null character immediately after the end 
and thus LLLexer encountered some garbage characters that ultimately caused 
the error.

This commit also includes the other test fixes I committed in
r237712 (llc path fix) and r237723 (remove target triple) which
also got reverted in r237730.

--Original Commit Message--

MIR Serialization: print and parse LLVM IR using MIR format.

This commit is the initial commit for the MIR serialization project.
It creates a new library under CodeGen called 'MIR'. This new
library adds a new machine function pass that prints out the LLVM IR 
using the MIR format. This pass is then added as a last pass when a 
'stop-after' option is used in llc. The new library adds the initial 
functionality for parsing of MIR files as well. This commit also 
extends the llc tool so that it can recognize and parse MIR input files.

Reviewers: Duncan P. N. Exon Smith, Matthias Braun, Philip Reames

Differential Revision: http://reviews.llvm.org/D9616

llvm-svn: 237954
2015-05-21 20:54:45 +00:00
Rafael Espindola 0709a7bd1a Move alignment from MCSectionData to MCSection.
This starts merging MCSection and MCSectionData.

There are a few issues with the current split between MCSection and
MCSectionData.

* It optimizes the the not as important case. We want the production
of .o files to be really fast, but the split puts the information used
for .o emission in a separate data structure.

* The ELF/COFF/MachO hierarchy is not represented in MCSectionData,
leading to some ad-hoc ways to represent the various flags.

* It makes it harder to remember where each item is.

The attached patch starts merging the two by moving the alignment from
MCSectionData to MCSection.

Most of the patch is actually just dropping 'const', since
MCSectionData is mutable, but MCSection was not.

llvm-svn: 237936
2015-05-21 19:20:38 +00:00
Sanjay Patel f69f4e42ce use range-based for-loops; NFCI
llvm-svn: 237918
2015-05-21 17:43:26 +00:00
Sanjay Patel 99b3aa3505 use range-based for-loops; NFCI
llvm-svn: 237917
2015-05-21 17:22:45 +00:00
Sanjay Patel f8c028c0b0 use range-based for-loop
llvm-svn: 237914
2015-05-21 17:04:17 +00:00
Sanjay Patel 490aca92be use range-based for-loop; NFCI
llvm-svn: 237908
2015-05-21 16:00:50 +00:00
Manuel Klimek b00d42c10c std::sort must be called with a strict weak ordering.
Found by a debug enabled stl.

llvm-svn: 237906
2015-05-21 15:38:25 +00:00
Simon Pilgrim e054199354 [X86][SSE] Improve support for 128-bit vector sign extension
This patch improves support for sign extension of the lower lanes of vectors of integers by making use of the SSE41 pmovsx* sign extension instructions where possible, and optimizing the sign extension by shifts on pre-SSE41 targets (avoiding the use of i64 arithmetic shifts which require scalarization).

It converts SIGN_EXTEND nodes to SIGN_EXTEND_VECTOR_INREG where necessary, that more closely matches the pmovsx* instruction than the default approach of using SIGN_EXTEND_INREG which splits the operation (into an ANY_EXTEND lowered to a shuffle followed by shifts) making instruction matching difficult during lowering. Necessary support for SIGN_EXTEND_VECTOR_INREG has been added to the DAGCombiner.

Differential Revision: http://reviews.llvm.org/D9848

llvm-svn: 237885
2015-05-21 10:05:03 +00:00
Duncan P. N. Exon Smith 0b73d71abb AsmPrinter: Compute absolute label difference directly
Create a low-overhead path for `EmitLabelDifference()` that emits a
emits an absolute number when (1) the output is an object stream and (2)
the two symbols are in the same data fragment.

This drops memory usage on Mach-O from 975 MB down to 919 MB (5.8%).
The only call is when `!doesDwarfUseRelocationsAcrossSections()` --
i.e., on Mach-O -- since otherwise an absolute offset from the start of
the section needs a relocation.  (`EmitLabelDifference()` is cheaper on
ELF anyway, since it creates 1 fewer temp symbol, and it gets called far
less often.  It's not clear to me if this is even a bottleneck there.)

(I'm looking at `llc` memory usage on `verify-uselistorder.lto.opt.bc`;
see r236629 for details.)

llvm-svn: 237876
2015-05-21 02:41:23 +00:00
Andrew Kaylor cafb89df1e Fix build error
llvm-svn: 237859
2015-05-20 23:58:44 +00:00
Andrew Kaylor 69fc4418ab Fix build warning
llvm-svn: 237855
2015-05-20 23:28:03 +00:00
Andrew Kaylor a6c5b9682e [WinEH] C++ EH state numbering fixes
Differential Revision: http://reviews.llvm.org/D9787

llvm-svn: 237854
2015-05-20 23:22:24 +00:00
Pete Cooper a05c082866 Don't generate comments in the DebugLocStream unless required. NFC.
The ByteStreamer here wasn't taking account of whether the asm streamer was text based and verbose.  Only with that combination should we emit comments.

This change makes sure that we only actually convert a Twine to a string using Twine::str() if we need the comment.  This saves about 10000 small allocations on a test case involving the verify-use_list-order bitcode going through llc with debug info.

Note, this is NFC as the comments would ultimately never be emitted unless required.

Reviewed by Duncan Exon Smith and David Blaikie.

llvm-svn: 237851
2015-05-20 22:51:27 +00:00
Pete Cooper 477300d333 Revert "Add bool to DebugLocDwarfExpression to control emitting comments."
This reverts commit 0037b6bcbc874aa1b93d7ce3ad8dba3753ee2d9d (r237827).

David Blaikie suggested some alternatives to this which are better.  Reverting to apply a better solution later.

llvm-svn: 237849
2015-05-20 22:37:48 +00:00
Pete Cooper 35522001fa Add bool to DebugLocDwarfExpression to control emitting comments.
DebugLocDwarfExpression::EmitOp was creating temporary strings by concatenating Twine's.

When emitting to object files, these comments are thrown away.

This commit adds a boolean to the constructor of the DwarfExpression to control whether it will actually emit
any comments.  This prevents it from even generating the temporary comments which would have been thrown away anyway.

llvm-svn: 237827
2015-05-20 19:50:03 +00:00
Matthias Braun 56a781495a DAGCombiner: Continue combining if FoldConstantArithmetic() fails.
DAG.FoldConstantArithmetic() can fail even though both operands are
Constants if OpaqueConstants are involved. Continue trying other combine
possibilities in tis case.

Differential Revision: http://reviews.llvm.org/D6946

Somewhat related to PR21801 / rdar://19211454

llvm-svn: 237822
2015-05-20 18:54:02 +00:00
Pawel Bylica 8011da9628 Fix icmp lowering
Summary:
During icmp lowering it can happen that a constant value can be larger than expected (see the code around the change).
APInt::getMinSignedBits() must be checked again as the shift before can change the constant sign to positive.
I'm not sure it is the best fix possible though.

Test Plan: Regression test included.

Reviewers: resistor, chandlerc, spatel, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, llvm-commits

Differential Revision: http://reviews.llvm.org/D9147

llvm-svn: 237812
2015-05-20 17:21:09 +00:00
Pete Cooper 9e1d335697 Change Function::getIntrinsicID() to return an Intrinsic::ID. NFC.
Now that Intrinsic::ID is a typed enum, we can forward declare it and so return it from this method.

This updates all users which were either using an unsigned to store it, or had a now unnecessary cast.

llvm-svn: 237810
2015-05-20 17:16:39 +00:00
Daniel Sanders 69c6008e49 Revert r237789 - [mips] The naming convention for private labels is ABI dependant.
It works, but I've noticed that I missed several callers of createMCAsmInfo()
and many don't have a TargetMachine to provide.

llvm-svn: 237792
2015-05-20 14:18:59 +00:00
Daniel Sanders b718eca643 [mips] The naming convention for private labels is ABI dependant.
Summary:
For N32/N64, private labels begin with '.L' but for O32 they begin with '$'.

MCAsmInfo now has an initializer function which can be used to provide information from the TargetMachine to control the assembly syntax.

Reviewers: vkalintiris

Reviewed By: vkalintiris

Subscribers: jfb, sandeep, llvm-commits, rafael

Differential Revision: http://reviews.llvm.org/D9821

llvm-svn: 237789
2015-05-20 13:16:42 +00:00
Igor Laevsky 423bc9ec4c [StatepointLowering] Support of the gc.relocates for invoke statepoints.
This change implements support for lowering of the gc.relocates tied to the invoke statepoint.
This is acomplished by storing frame indices of the lowered values in "StatepointRelocatedValues" map inside FunctionLoweringInfo instead of storing them in per-basic block structure StatepointLowering.
After this change StatepointLowering is used only during "LowerStatepoint" call and it is not necessary to store it as a field in SelectionDAGBuilder anymore.

Differential Revision: http://reviews.llvm.org/D7798

llvm-svn: 237786
2015-05-20 11:37:25 +00:00
Swaroop Sridhar 665bc9c936 Add a GCStrategy for CoreCLR
This change adds a new GC strategy for supporting the CoreCLR runtime.

This strategy is currently identical to Statepoint-example GC, 
but is necessary for several upcoming changes specific to CoreCLR, such as:

1. Base-pointers not explicitly reported for interior pointers
2. Different format for stack-map encoding
3. Location of Safe-point polls: polls are only needed before loop-back edges and before tail-calls (not needed at function-entry)
4. Runtime specific handshake between calls to managed/unmanaged functions.

llvm-svn: 237753
2015-05-20 01:07:23 +00:00
Philip Reames 7738dd68cf Remove a stale comment
The todo was implemented a while ago; I just forgot to remove the comment.  

llvm-svn: 237736
2015-05-19 22:26:33 +00:00
Alex Lorenz de1970fe66 Revert r237708 (MIR serialization) - incremental buildbots became unstable.
The incremental buildbots entered a pass-fail cycle where during the fail
cycle one of the tests from this commit fails for an unknown reason. I
have reverted this commit and will investigate the cause of this problem.

llvm-svn: 237730
2015-05-19 21:41:28 +00:00
Matthias Braun 07066cca20 MachineInstr: Remove unused parameter.
llvm-svn: 237726
2015-05-19 21:22:20 +00:00
Sanjay Patel 03abbb48a4 use 'auto *' for pointers; clearer usage, no deep copying
llvm-svn: 237719
2015-05-19 20:10:16 +00:00
Sanjay Patel ad11415962 tidy up
1. remove duplicate local variable
2. add local variable with name to match comment
3. remove useless comment

llvm-svn: 237715
2015-05-19 19:10:57 +00:00
Sanjay Patel 64a6da947a use range-based for-loop
llvm-svn: 237711
2015-05-19 18:24:33 +00:00
Alex Lorenz c5e0d4d146 MIR Serialization: print and parse LLVM IR using MIR format.
This commit is the initial commit for the MIR serialization project.
It creates a new library under CodeGen called 'MIR'. This new
library adds a new machine function pass that prints out the LLVM IR 
using the MIR format. This pass is then added as a last pass when a 
'stop-after' option is used in llc. The new library adds the initial 
functionality for parsing of MIR files as well. This commit also 
extends the llc tool so that it can recognize and parse MIR input files.

Reviewers: Duncan P. N. Exon Smith, Matthias Braun, Philip Reames

Differential Revision: http://reviews.llvm.org/D9616

llvm-svn: 237708
2015-05-19 18:17:39 +00:00
Matthias Braun 7e10e53f14 RegisterCoalescer: Improve a comment.
Explain the relation of the example to the variables in the code,
explain what bad behaviour the code avoids in this case.

llvm-svn: 237706
2015-05-19 17:52:32 +00:00
Sanjay Patel 3c9e370ec0 use range-based for loop
llvm-svn: 237705
2015-05-19 17:49:14 +00:00
Matthias Braun 20683efd47 SelectionDAG: Cleanup and simplify FoldConstantArithmetic
This cleans up the FoldConstantArithmetic code by factoring out the case
of two ConstantSDNodes into an own function. This avoids unnecessary
complexity for many callers who already have ConstantSDNode arguments.

This also avoids an intermeidate SmallVector datastructure and a loop
over that datastructure.

llvm-svn: 237651
2015-05-19 01:40:21 +00:00
Matthias Braun 887fdfb759 DAGCombiner: Factor common pattern into isOneConstant() function. NFC
llvm-svn: 237645
2015-05-19 00:25:21 +00:00
Matthias Braun 033121981d DAGCombiner: Factor common pattern into isAllOnesConstant() function. NFC
llvm-svn: 237644
2015-05-19 00:25:20 +00:00
Matthias Braun 0542b5d1db DAGCombiner: Use isNullConstant() where possible
llvm-svn: 237643
2015-05-19 00:25:17 +00:00
Matthias Braun c545234772 Revert accidental change in r237633
llvm-svn: 237635
2015-05-18 23:18:13 +00:00
Matthias Braun 1505efb0bb DAGCombiner: Factor common pattern into isNullConstant() function. NFC
llvm-svn: 237633
2015-05-18 23:07:27 +00:00
David Blaikie ff6409d096 Simplify IRBuilder::CreateCall* by using ArrayRef+initializer_list/braced init only
llvm-svn: 237624
2015-05-18 22:13:54 +00:00
Matthias Braun fa3872e7ad MachineInstr: Change return value of getOpcode() to unsigned.
This was previously returning int. However there are no negative opcode
numbers and more importantly this was needlessly different from
MCInstrDesc::getOpcode() (which even is the value returned here) and
SDValue::getOpcode()/SDNode::getOpcode().

llvm-svn: 237611
2015-05-18 20:27:55 +00:00
Jim Grosbach 6f482000e9 MC: Clean up method names in MCContext.
The naming was a mish-mash of old and new style. Update to be consistent
with the new. NFC.

llvm-svn: 237594
2015-05-18 18:43:14 +00:00
Hal Finkel 44b81ee40b Preserve the order of READ_REGISTER and WRITE_REGISTER
At the present time, we don't have a way to represent general dependency
relationships, so everything is represented using memory dependency. In order
to preserve the data dependency of a READ_REGISTER on WRITE_REGISTER, we need
to model WRITE_REGISTER as writing (which we had been doing) and model
READ_REGISTER as reading (which we had not been doing). Fix this, and also the
way that the chain operands were generated at the SDAG level.

Patch by Nicholas Paul Johnson, thanks! Test case by me.

llvm-svn: 237584
2015-05-18 16:42:10 +00:00
Oliver Stannard 6cb23465e0 Revert r237579, as it broke windows buildbots
llvm-svn: 237583
2015-05-18 16:39:16 +00:00
Oliver Stannard 0c553afe6a [LLVM - ARM/AArch64] Add ACLE special register intrinsics
This patch implements LLVM support for the ACLE special register intrinsics in
section 10.1, __arm_{w,r}sr{,p,64}.

This patch is intended to lower the read/write_register instrinsics, used to
implement the special register intrinsics in the clang patch for special
register intrinsics (see http://reviews.llvm.org/D9697), to ARM specific
instructions MRC,MCR,MSR etc. to allow reading an writing of coprocessor
registers in AArch32 and AArch64. This is done by inspecting the register
string passed to the intrinsic and then lowering to the appropriate
instruction.

Patch by Luke Cheeseman.

Differential Revision: http://reviews.llvm.org/D9699

llvm-svn: 237579
2015-05-18 16:23:33 +00:00
Hal Finkel a60e633fdd [DAGCombine] Be more pedantic about use iteration in CombineToPreIndexedLoadStore
In CombineToPreIndexedLoadStore, when the offset is a constant, we have code
that looks for other uses of the pointer which are constant offset computations
so that they can be rewritten in terms of the updated pointer so that we don't
need to keep a copy of the base pointer to compute these constant offsets.

Unfortunately, when it iterated over the uses, it did so by SDNodes, and so we
could confuse ourselves if the base pointer was produced by a node that had
multiple results (because we would not immediately exclude uses of the other
node results). This was reported as PR22755. Unfortunately, we don't have a
test case (and I've also been unable to produce one thus far), but at least the
mistake is clear. The right way to fix this problem is to make use of the information
contained in the use iterators to filter out any uses of other results of the
node producing the base pointer.

This should be mostly NFC, but should also fix PR22755 (for which,
unfortunately, we have no in-tree test case).

llvm-svn: 237576
2015-05-18 15:46:02 +00:00
Andrew Trick 569dc65a60 MachineScheduler debug output clarity.
llvm-svn: 237545
2015-05-17 23:40:31 +00:00
Andrew Trick e02d5da8a7 RegisterPressureTracker: reword stale comments.
llvm-svn: 237544
2015-05-17 23:40:27 +00:00
Benjamin Kramer a48e0656b6 [WinEH] Push unique_ptr through the Action interface.
This was the source of many leaks in the past, this should fix them once and
for all.

llvm-svn: 237524
2015-05-16 15:40:03 +00:00
Craig Topper 9a9d58a238 Correct indentation. NFC
llvm-svn: 237512
2015-05-16 05:42:08 +00:00
Matthias Braun 352b89c460 MachineSink: Collect registers before clearing their killflags.
Currently whenever we sink any instruction, we do clearKillFlags for
every use of every use operand for that instruction, apparently there
are a lot of duplication, therefore compile time penalties.

This patch collect all the interested registers first, do clearKillFlags
for it all together at once at the end, so we only need to do
clearKillFlags once for one register, duplication is avoided.

Patch by Lawrence Hu!

Differential Revision: http://reviews.llvm.org/D9719

llvm-svn: 237510
2015-05-16 03:11:07 +00:00
James Molloy 7307cd57c5 [SDAGBuilder] Make the AArch64 builder happier.
I intended this loop to only unwrap SplitVector actions, but it
was more broad than that, such as unwrapping WidenVector actions,
which makes operations seem legal when they're not.

llvm-svn: 237457
2015-05-15 17:41:29 +00:00
James Molloy 7e9776b559 Add SDNodes for umin, umax, smin and smax.
This adds new SDNodes for signed/unsigned min/max. These nodes are built from
select/icmp pairs matched at SDAGBuilder stage.

This patch adds the nodes, as well as legalization support and sets them to
be "expand" for all targets.

NFC for now; this will be tested when I switch AArch64 to using these new
nodes.

llvm-svn: 237423
2015-05-15 09:03:15 +00:00
Akira Hatanaka ff86773f51 Stop resetting SanitizeAddress in TargetMachine::resetTargetOptions. NFC.
Instead of doing that, create a temporary copy of MCTargetOptions and reset its
SanitizeAddress field based on the function's attribute every time an InlineAsm
instruction is emitted in AsmPrinter::EmitInlineAsm. 

This is part of the work to remove TargetMachine::resetTargetOptions (the FIXME
added to TargetMachine.cpp in r236009 explains why this function has to be
removed).

Differential Revision: http://reviews.llvm.org/D9570

llvm-svn: 237412
2015-05-15 00:20:44 +00:00
Matthias Braun 7a247f709b Turn effective assert(0) into llvm_unreachable
llvm-svn: 237379
2015-05-14 18:33:29 +00:00
Matthias Braun 42e1e66e55 TargetSchedule: factor out common code; NFC
llvm-svn: 237376
2015-05-14 18:01:13 +00:00
Matthias Braun bff3a7eb3d Remove MCInstrItineraries includes in parts that don't use them anymore
llvm-svn: 237375
2015-05-14 18:01:11 +00:00
Ahmed Bougacha 6402ad27c0 [CodeGen] Use standard -not gnueabi- naming for f16 libcalls on Darwin.
Other targets probably should as well.  Since r237161, compiler-rt has
both, but I don't see why anything other than gnueabi would use a
gnueabi naming scheme.

llvm-svn: 237324
2015-05-14 01:00:51 +00:00
Nick Lewycky 37a175007b Revert r237046. See the testcase on the thread where r237046 was committed.
llvm-svn: 237317
2015-05-13 23:41:47 +00:00
Sergey Dmitrouk 46c4f02848 [DebugInfo] Debug locations for constant SD nodes
Several updates for [DebugInfo] Add debug locations to constant SD nodes (r235989).
Includes:

 *  re-enabling the change (disabled recently);
 *  missing change for FP constants;
 *  resetting debug location of constant node if it's used more than at one place
    to prevent emission of wrong locations in case of coalesced constants;
 *  a couple of additional tests.

Now all look ups in CSEMap are wrapped by additional method.

Comment in D9084 suggests that debug locations aren't useful for "target constants",
so there might be one more change related to this API (namely, dropping debug
locations for getTarget*Constant methods).

Differential Revision: http://reviews.llvm.org/D9604

llvm-svn: 237237
2015-05-13 08:58:03 +00:00
Sanjoy Das a1d39ba940 [Statepoints] Support for "patchable" statepoints.
Summary:
This change adds two new parameters to the statepoint intrinsic, `i64 id`
and `i32 num_patch_bytes`.  `id` gets propagated to the ID field
in the generated StackMap section.  If the `num_patch_bytes` is
non-zero then the statepoint is lowered to `num_patch_bytes` bytes of
nops instead of a call (the spill and reload code remains unchanged).
A non-zero `num_patch_bytes` is useful in situations where a language
runtime requires complete control over how a call is lowered.

This change brings statepoints one step closer to patchpoints.  With
some additional work (that is not part of this patch) it should be
possible to get rid of `TargetOpcode::STATEPOINT` altogether.

PlaceSafepoints generates `statepoint` wrappers with `id` set to
`0xABCDEF00` (the old default value for the ID reported in the stackmap)
and `num_patch_bytes` set to `0`.  This can be made more sophisticated
later.

Reviewers: reames, pgavlin, swaroop.sridhar, AndyAyers

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9546

llvm-svn: 237214
2015-05-12 23:52:24 +00:00
Saleem Abdulrasool ee13fbe848 CodeGen: ignore DEBUG_VALUE nodes in KILL tagging
DEBUG_VALUE nodes do not take part in code generation.  Ignore them when
performing KILL updates.  Addresses PR23486.

llvm-svn: 237211
2015-05-12 23:36:18 +00:00
Pat Gavlin 08d7027cc1 [Statepoints] Clean up statepoint argument accessors.
Differential Revision: http://reviews.llvm.org/D9622

llvm-svn: 237191
2015-05-12 21:33:48 +00:00
Pete Cooper 833f34d837 Convert PHI getIncomingValue() to foreach over incoming_values(). NFC.
We already had a method to iterate over all the incoming values of a PHI.  This just changes all eligible code to use it.

Ineligible code included anything which cared about the index, or was also trying to get the i'th incoming BB.

llvm-svn: 237169
2015-05-12 20:05:31 +00:00
Pat Gavlin c7dc6d6ee7 [Statepoints] Split the calling convention and statepoint flags operand to STATEPOINT into two separate operands.
Differential Revision: http://reviews.llvm.org/D9623

llvm-svn: 237166
2015-05-12 19:50:19 +00:00
Igor Laevsky 87ef5eaf46 Reverse ordering of base and derived pointer during safepoint lowering.
According to the documentation in StackMap section for the safepoint we should have:
"The first Location in each pair describes the base pointer for the object. The second is the derived pointer actually being relocated."
But before this change we emitted them in reverse order - derived pointer first, base pointer second.

llvm-svn: 237126
2015-05-12 13:12:14 +00:00
Eric Christopher 824f42f209 Migrate existing backends that care about software floating point
to use the information in the module rather than TargetOptions.

We've had and clang has used the use-soft-float attribute for some
time now so have the backends set a subtarget feature based on
a particular function now that subtargets are created based on
functions and function attributes.

For the one middle end soft float check go ahead and create
an overloadable TargetLowering::useSoftFloat function that
just checks the TargetSubtargetInfo in all cases.

Also remove the command line option that hard codes whether or
not soft-float is set by using the attribute for all of the
target specific test cases - for the generic just go ahead and
add the attribute in the one case that showed up.

llvm-svn: 237079
2015-05-12 01:26:05 +00:00
Andrew Kaylor 0ddaf2bfb9 Fixing memory leak
llvm-svn: 237072
2015-05-12 00:13:51 +00:00
Sanjoy Das 3d705e37c3 Refactoring gc_relocate related code in CodeGenPrepare.cpp
Summary:
The original code inserted new instructions by following a
Create->Remove->ReInsert flow. This patch removes the unnecessary
Remove->ReInsert part by setting up the InsertPoint correctly at the
very beginning. This change does not introduce any functionality change.

Patch by Chen Li!

Reviewers: reames, AndyAyers, sanjoy

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9687

llvm-svn: 237070
2015-05-11 23:47:30 +00:00
Andrew Kaylor cc14f387e8 [WinEH] Handle nested landing pads that return directly to the parent function.
Differential Revision: http://reviews.llvm.org/D9684

llvm-svn: 237063
2015-05-11 23:06:02 +00:00
Sanjay Patel 5b202966f5 propagate IR-level fast-math-flags to DAG nodes; 2nd try; NFC
This is a less ambitious version of:
http://reviews.llvm.org/rL236546

because that was reverted in:
http://reviews.llvm.org/rL236600

because it caused memory corruption that wasn't related to FMF
but was actually due to making nodes with 2 operands derive from a
plain SDNode rather than a BinarySDNode. 

This patch adds the minimum plumbing necessary to use IR-level
fast-math-flags (FMF) in the backend without actually using
them for anything yet. This is a follow-on to:
http://reviews.llvm.org/rL235997

...which split the existing nsw / nuw / exact flags and FMF
into their own struct.
 

llvm-svn: 237046
2015-05-11 21:07:09 +00:00
Andrew Kaylor ce6f907e2f Fixing build warnings
llvm-svn: 237042
2015-05-11 20:45:11 +00:00
Andrew Kaylor 762a6bea1f [WinEH] Update exception numbering to give handlers their own base state.
Differential Revision: http://reviews.llvm.org/D9512

llvm-svn: 237014
2015-05-11 19:41:19 +00:00
Sanjoy Das 89c5491a72 [RewriteStatepointsForGC] Fix a bug on creating gc_relocate for pointer to vector of pointers
Summary:
In RewriteStatepointsForGC pass, we create a gc_relocate intrinsic for
each relocated pointer, and the gc_relocate has the same type with the
pointer. During the creation of gc_relocate intrinsic, llvm requires to
mangle its type. However, llvm does not support mangling of all possible
types. RewriteStatepointsForGC will hit an assertion failure when it
tries to create a gc_relocate for pointer to vector of pointers because
mangling for vector of pointers is not supported.

This patch changes the way RewriteStatepointsForGC pass creates
gc_relocate. For each relocated pointer, we erase the type of pointers
and create an unified gc_relocate of type i8 addrspace(1)*. Then a
bitcast is inserted to convert the gc_relocate to the correct type. In
this way, gc_relocate does not need to deal with different types of
pointers and the unsupported type mangling is no longer a problem. This
change would also ease further merge when LLVM erases types of pointers
and introduces an unified pointer type.

Some minor changes are also introduced to gc_relocate related part in
InstCombineCalls, CodeGenPrepare, and Verifier accordingly.

Patch by Chen Li!

Reviewers: reames, AndyAyers, sanjoy

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9592

llvm-svn: 237009
2015-05-11 18:49:34 +00:00
Matthias Braun 5391754288 LiveRangeCalc: Improve error messages on malformed IR
llvm-svn: 237008
2015-05-11 18:47:47 +00:00
Simon Pilgrim e09584ca95 [SelectionDAG] Fixed constant folding issue when legalised types are smaller then the folded type.
Found when testing with llvm-stress on i686 targets.

llvm-svn: 236954
2015-05-10 14:14:51 +00:00
James Y Knight fca02be3c1 Fix MergeConsecutiveStore for non-byte-sized memory accesses.
The bug showed up as a compile-time assertion failure:
  Assertion `NumBits >= MIN_INT_BITS && "bitwidth too small"' failed
when building msan tests on x86-64.

Prior to r236850, this bug was masked due to a bogus alignment check,
which also accidentally rejected non-byte-sized accesses. Afterwards,
an invalid ElementSizeBytes == 0 got further into the function, and
triggered the assertion failure.

It would probably be a good idea to allow it to handle merging stores
of unusual widths as well, but for now, to un-break it, I'm just
making the minimal fix.

Differential Revision: http://reviews.llvm.org/D9626

llvm-svn: 236927
2015-05-09 03:13:37 +00:00
Tom Stellard f01af29f01 MachineCSE: Add a target query for the LookAheadLimit heurisitic
This is used to determine whether or not to CSE physical register
defs.

Differential Revision: http://reviews.llvm.org/D9472

llvm-svn: 236923
2015-05-09 00:56:07 +00:00
Pete Cooper d54fb89901 [Fast-ISel] Don't mark the first use of a remat constant as killed.
When emitting something like 'add x, 1000' if we remat the 1000 then we should be able to
mark the vreg containing 1000 as killed.  Given that we go bottom up in fast-isel, a later
use of 1000 will be higher up in the BB and won't kill it, or be impacted by the lower kill.

However, rematerialised constant expressions aren't generated bottom up.  The local value save area
grows downwards.  This means that if you remat 2 constant expressions which both use 1000 then the
first will kill it, then the second, which is *lower* in the BB will read a killed register.

This is the case in the attached test where the 2 GEPs both need to generate 'add x, 6680' for the constant offset.

Note that this commit only makes kill flag generation conservative.  There's nothing else obviously wrong with
the local value save area growing downwards, and in fact it needs to for handling arbitrarily complex constant expressions.

However, it would be nice if there was a solution which would let us generate more accurate kill flags, or just kill flags completely.

llvm-svn: 236922
2015-05-09 00:51:03 +00:00
Arnold Schwaighofer f54b73d681 ScheduleDAGInstrs: In functions with tail calls PseudoSourceValues are not non-aliasing distinct objects
The code that builds the dependence graph assumes that two PseudoSourceValues
don't alias. In a tail calling function two FixedStackObjects might refer to the
same location. Worse 'immutable' fixed stack objects like function arguments are
not immutable and will be clobbered.

Change this so that a load from a FixedStackObject is not invariant in a tail
calling function and don't return a PseudoSourceValue for an instruction in tail
calling functions when building the dependence graph so that we handle function
arguments conservatively.

Fix for PR23459.

rdar://20740035

llvm-svn: 236916
2015-05-08 23:52:00 +00:00
Hans Wennborg ae0254dabc Switch lowering: cluster adjacent fall-through cases even at -O0
It's cheap to do, and codegen is much faster if cases can be merged
into clusters.

llvm-svn: 236905
2015-05-08 21:23:39 +00:00
Pete Cooper e4bb07ecff [Fast-ISel] Clear kill flags on registers replaced by updateValueMap.
When selecting an extract instruction, we don't actually generate code but instead work out which register we are reading, and rewrite uses of the extract def to the source register.  This is done via updateValueMap,.

However, its possible that the source register we are rewriting *to* to also have uses.  If those uses are after a kill of the value we are rewriting *from* then we have uses after a kill and the verifier fails.

This code checks for the case where the to register is also used, and if so it clears all kill on the from register.  This is conservative, but better that always clearing kills on the from register.

llvm-svn: 236897
2015-05-08 20:46:54 +00:00
Pat Gavlin cc0431d1c0 Extend the statepoint intrinsic to allow statepoints to be marked as transitions from GC-aware code to code that is not GC-aware.
This changes the shape of the statepoint intrinsic from:

  @llvm.experimental.gc.statepoint(anyptr target, i32 # call args, i32 unused, ...call args, i32 # deopt args, ...deopt args, ...gc args)

to:

  @llvm.experimental.gc.statepoint(anyptr target, i32 # call args, i32 flags, ...call args, i32 # transition args, ...transition args, i32 # deopt args, ...deopt args, ...gc args)

This extension offers the backend the opportunity to insert (somewhat) arbitrary code to manage the transition from GC-aware code to code that is not GC-aware and back.

In order to support the injection of transition code, this extension wraps the STATEPOINT ISD node generated by the usual lowering lowering with two additional nodes: GC_TRANSITION_START and GC_TRANSITION_END. The transition arguments that were passed passed to the intrinsic (if any) are lowered and provided as operands to these nodes and may be used by the backend during code generation.

Eventually, the lowering of the GC_TRANSITION_{START,END} nodes should be informed by the GC strategy in use for the function containing the intrinsic call; for now, these nodes are instead replaced with no-ops.

Differential Revision: http://reviews.llvm.org/D9501

llvm-svn: 236888
2015-05-08 18:07:42 +00:00
Pete Cooper 85b1c48b20 Clear kill flags on all used registers when sinking instructions.
The test here was sinking the AND here to a lower BB:

	%vreg7<def> = ANDWri %vreg8, 0; GPR32common:%vreg7,%vreg8
	TBNZW %vreg8<kill>, 0, <BB#1>; GPR32common:%vreg8

which meant that vreg8 was read after it was killed.

This commit changes the code from clearing kill flags on the AND to clearing flags on all registers used by the AND.

llvm-svn: 236886
2015-05-08 17:54:32 +00:00
Pete Cooper ff5064a188 80 cols fix since i'm looking at this function anyway. NFC
llvm-svn: 236885
2015-05-08 17:54:29 +00:00
James Y Knight 284e7b3d6c Fix alignment checks in MergeConsecutiveStores.
1) check whether the alignment of the memory is sufficient for the
*merged* store or load to be efficient.

Not doing so can result in some ridiculously poor code generation, if
merging creates a vector operation which must be aligned but isn't.

2) DON'T check that the alignment of each load/store is equal. If
you're merging 2 4-byte stores, the first *might* have 8-byte
alignment, but the second certainly will have 4-byte alignment. We do
want to allow those to be merged.

llvm-svn: 236850
2015-05-08 13:47:01 +00:00
Igor Laevsky 9d3932bf96 Fix coding standart based on post submit comments.
Differential Revision: http://reviews.llvm.org/D7760

llvm-svn: 236849
2015-05-08 13:17:22 +00:00
Pete Cooper ba593ad3f3 Clear kill flags in tail duplication.
If we duplicate an instruction then we must also clear kill flags on any uses we rewrite.
Otherwise we might be killing a register which was used in other BBs.

For example, here the entry BB ended up with these instructions, the ADD having been tail duplicated.

	%vreg24<def> = t2ADDri %vreg10<kill>, 1, pred:14, pred:%noreg, opt:%noreg; GPRnopc:%vreg24 rGPR:%vreg10
	%vreg22<def> = COPY %vreg10; GPR:%vreg22 rGPR:%vreg10

	The copy here is inserted after the add and so needs vreg10 to be live.

llvm-svn: 236782
2015-05-07 21:48:26 +00:00
Hans Wennborg 44faaa7aa4 Switch lowering: handle zero-weight branch probabilities
After r236617, branch probabilities are no longer guaranteed to be >= 1. This
patch makes the swich lowering code handle that correctly, without bumping the
branch weights by 1 which might cause overflow and skews the probabilities.

Covered by @zero_weight_tree in test/CodeGen/X86/switch.ll.

llvm-svn: 236739
2015-05-07 15:47:15 +00:00
Pete Cooper 27483915e8 Handle dead defs in the if converter.
We had code such as this:
  r2 = ...
  t2Bcc

label1:
  ldr ... r2

label2;
  return r2<dead, def>

The if converter was transforming this to
   r2<def> = ...
   return [pred] r2<dead,def>
   ldr <r2, kill>
   return

which fails the machine verifier because the ldr now reads from a dead def.

The fix here detects dead defs in stepForward and passes them back to the caller in the clobbers list.  The caller then clears the dead flag from the def is the value is live.

llvm-svn: 236660
2015-05-06 22:51:04 +00:00
Quentin Colombet 0ddd315db0 [RegisterCoalescer] Make sure each live-range has only one component, as
demanded by the machine verifier.
After shrinking a live-range to its uses, it is possible to create several
smaller live-ranges. When this happens, shrinkToUses returns true and we need to
split the different components into their own live-ranges.

The problem does not reproduce on any in-tree target but Jonas Paulsson
<jonas.paulsson@ericsson.com>, who reported the problem, checked that this patch
fixes the issue.

llvm-svn: 236658
2015-05-06 22:41:50 +00:00
Pete Cooper 54085cdc7b Fix incorrect kill flags in fastisel.
If called twice in the same BB on the same constant, FastISel::fastEmit_ri_ was marking the materialized vreg as killed on each use, instead of only the last use.

Change this to only mark the last use as killed by making earlier uses check if the vreg is already used elsewhere.

llvm-svn: 236650
2015-05-06 22:09:29 +00:00
Duncan P. N. Exon Smith c177fec93f MC: Skip names of temporary symbols in object streamer
Don't create names for temporary symbols when using an object streamer.
The names never make it to the output anyway.  From the starting point
of r236629, my heap profile says this drops peak memory usage from 1100
MB to 1058 MB for CodeGen of `verify-uselistorder`, a savings of almost
4% on peak memory, and removes `StringMap<bool, BumpPtrAllocator...>`
from the profile entirely.

(I'm looking at `llc` memory usage on `verify-uselistorder.lto.opt.bc`;
see r236629 for details.)

llvm-svn: 236642
2015-05-06 21:34:34 +00:00
Tim Northover e4310fe946 CodeGen: move over-zealous assert into actual if statement.
It's quite possible to encounter an insertvalue instruction that's more deeply
nested than the value we're looking for, but when that happens we really
mustn't compare beyond the end of the index array.

Since I couldn't see any guarantees about what comparisons std::equal makes, we
probably need to directly check the size beforehand. In practice, I suspect
most std::equal implementations would probably bail early, which would be OK.
But just in case...

rdar://20834485

llvm-svn: 236635
2015-05-06 20:07:38 +00:00
Duncan P. N. Exon Smith 653c1099b4 DwarfDebug: Emit number of bytes in .debug_loc entry directly
Emit the number of bytes in a `.debug_loc` entry directly.  The old code
created temp labels (expensive), emitted the difference between them,
and then emitted one on each side of the relevant bytes.

(I'm looking at `llc` memory usage on `verify-uselistorder.lto.opt.bc`
(the optimized version of ld64's `-save-temps` when linking the
`verify-uselistorder` executable in an LTO bootstrap).  I've hacked
`MCContext::Allocate()` to just call `malloc()` instead of using the
`BumpPtrAllocator` so that the heap profile is easier to read.  As far
as peak memory is concerned, `MCContext::Allocate()` is equivalent to a
leak, since it only gets freed at process teardown.

In my heap profile, this patch drops memory usage of
`DwarfDebug::emitDebugLoc()` from 132.56 MB (11.4%) down to 29.86 MB
(2.7%) at peak memory.  Some of that must be noise from `SmallVector`
(or other) allocations -- peak memory only dropped from 1160 MB down to
1100 MB -- but this nevertheless shaves 5% off the top.)

llvm-svn: 236629
2015-05-06 19:11:20 +00:00
Reid Kleckner d1b38c4b0b [WinEH] Improve fatal error message about failed demotion
llvm-svn: 236626
2015-05-06 18:45:24 +00:00
Sanjoy Das 6c0fe24bd1 [SelectionDAG] Delete SelectionDAGBuilder::removeValue. NFC.
SelectionDAGBuilder::removeValue is dead now, after rL236563.

llvm-svn: 236618
2015-05-06 18:02:10 +00:00
Diego Novillo 14f94de1ee Allow 0-weight branches in BranchProbabilityInfo.
Summary:
When computing branch weights in BPI, we used to disallow branches with
weight 0. This is a minor nuisance, because a branch with weight 0 is
different to "don't have information". In the context of
instrumentation, it may mean "never executed", in the context of
sampling, it means "never or seldom executed".

In allowing 0 weight branches, I ran into issues with the switch
expansion code in selection DAG. It is currently hardwired to not handle
branches with weight 0. To maintain the current behaviour, I changed it
to use 1 when it finds 0, but perhaps the algorithm needs changes to
tolerate branches with weight zero.

Reviewers: hansw

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9533

llvm-svn: 236617
2015-05-06 17:55:11 +00:00
Matt Arsenault 633dba4f41 Add ChangeTo* to MachineOperand for symbols
llvm-svn: 236612
2015-05-06 17:05:54 +00:00
NAKAMURA Takumi e452998b4b Reformat.
llvm-svn: 236601
2015-05-06 14:03:22 +00:00
NAKAMURA Takumi d7c0be9c42 Revert r236546, "propagate IR-level fast-math-flags to DAG nodes (NFC)"
It caused undefined behavior.

llvm-svn: 236600
2015-05-06 14:03:12 +00:00
Pawel Bylica 9f1fb9d1ef SelectionDAG: Handle out-of-bounds index in extract vector element
Summary: This patch correctly handles undef case of EXTRACT_VECTOR_ELT node where the element index is constant and not less than vector size.

Test Plan:
CodeGen for X86 test included.
Also one incorrect regression test fixed.

Reviewers: qcolombet, chandlerc, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, llvm-commits

Differential Revision: http://reviews.llvm.org/D9250

llvm-svn: 236584
2015-05-06 10:19:14 +00:00
Sanjoy Das 4bfb472072 [Statepoint] Clean up StatepointLowering: symbolic constants.
For accessors in the `Statepoint` class, use symbolic constants for
offsets into the argument vector instead of literals.  This makes the
code intent clearer and simpler to change.

llvm-svn: 236566
2015-05-06 02:36:31 +00:00
Sanjoy Das 499d703f52 [Statepoint] Clean up Statepoint.h: accessor names.
Use getFoo() as accessors consistently and some other naming changes.

llvm-svn: 236564
2015-05-06 02:36:26 +00:00
Sanjoy Das c6bf3e9f12 [StatepointLowering] Don't create temporary instructions. NFCI.
Summary:
Instead of creating a temporary call instruction and lowering that, use
SelectionDAGBuilder::lowerCallOperands.

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9480

llvm-svn: 236563
2015-05-06 02:36:20 +00:00
Ahmed Bougacha ed363c5dcb [WinEH] Reset WinEHPrepare::SEHExceptionCodeSlot when we're done.
This caused a use-after-free on test/CodeGen/X86/win32-eh.ll
No functional change intended.

llvm-svn: 236561
2015-05-06 01:28:58 +00:00
Sanjoy Das 1194d1e799 [SelectionDAG] Make an argument optional in RFV::getCopyToRegs. NFC.
Summary:
We default the value argument to nullptr.  The only use of the value is
in diagnosePossiblyInvalidConstraint and that seems to be resilient to
it being nullptr.

Reviewers: atrick, reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9479

llvm-svn: 236555
2015-05-05 23:06:57 +00:00
Sanjoy Das 3936a97f11 [SelectionDAG] Move RegsForValue into SelectionDAGBuilder.h. NFC.
Summary:
The exported class will be used in later change, in
StatepointLowering.cpp.  It is still internal to SelectionDAG (not
exported via include/).

Reviewers: reames, atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9478

llvm-svn: 236554
2015-05-05 23:06:54 +00:00
Sanjoy Das 84153c450a [SelectionDAG] Pass explicit type to lowerCallOperands. NFC.
Summary:
Currently this does not change anything, but change will be used in a
later change to StatepointLowering.cpp

Reviewers: reames, atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9477

llvm-svn: 236553
2015-05-05 23:06:52 +00:00
Sanjoy Das 3fb91c0a0d [StatepointLowering] Rename variable, NFC.
Rename LoweredArgs to LoweredMetaArgs to clarify intent.

llvm-svn: 236552
2015-05-05 23:06:49 +00:00
Pete Cooper ce9ad757c7 Fix IfConverter to handle regmask machine operands.
Note, this is a recommit of r236515 after fixing an error in r236514.  The buildbot ran fast enough that it picked up r236514 prior to r236515 and threw an error.  r236515 itself ran 'make check' without errors.

Original commit message follows:

A regmask (typically seen on a call) clobbers the set of registers it lists.  The IfConverter, in UpdatePredRedefs, was handling register defs, but not regmasks.

These are slightly different to a def in that we need to add both an implicit use and def to appease the machine verifier.  Otherwise, uses after the if converted call could think they are reading an undefined register.

Reviewed by Matthias Braun and Quentin Colombet.

llvm-svn: 236550
2015-05-05 22:09:41 +00:00
Sanjay Patel 801caff64d propagate IR-level fast-math-flags to DAG nodes (NFC)
This patch adds the minimum plumbing necessary to use IR-level
fast-math-flags (FMF) in the backend without actually using
them for anything yet. This is a follow-on to:
http://reviews.llvm.org/rL235997

...which split the existing nsw / nuw / exact flags and FMF
into their own struct.

There are 2 structural changes here:

1. The main diff is that we're preparing to extend the optimization
flags to affect more than just binary SDNodes. Eg, IR intrinsics 
( https://llvm.org/bugs/show_bug.cgi?id=21290 ) or non-binop nodes
that don't even exist in IR such as FMA, FNEG, etc.

2. The other change is that we're actually copying the FP fast-math-flags
from the IR instructions to SDNodes. 

Differential Revision: http://reviews.llvm.org/D8900

llvm-svn: 236546
2015-05-05 21:40:38 +00:00
Pete Cooper 7605e37a63 Refactor UpdatePredRedefs and StepForward to avoid duplication. NFC
Note, this is a reapplication of r236515 with a fix to not assert on non-register operands, but instead only handle them until the subsequent commit.  Original commit message follows.

The code was basically the same here already.  Just added an out parameter for a vector of seen defs so that UpdatePredRedefs can call StepForward first, then do its own post processing on the seen defs.

Will be used in the next commit to also handle regmasks.

llvm-svn: 236538
2015-05-05 20:14:22 +00:00
Ulrich Weigand 9958c489bb [DAGCombiner] Account for getVectorIdxTy() when narrowing vector load
This patch makes ReplaceExtractVectorEltOfLoadWithNarrowedLoad convert
the element number from getVectorIdxTy() to PtrTy before doing pointer
arithmetic on it.  This is needed on z, where element numbers are i32
but pointers are i64.

Original patch by Richard Sandiford.

llvm-svn: 236530
2015-05-05 19:34:10 +00:00
Ulrich Weigand af2c618e2b [DAGCombiner] Fix ReplaceExtractVectorEltOfLoadWithNarrowedLoad for BE
For little-endian, the function would convert (extract_vector_elt (load X), Y)
to X + Y*sizeof(elt).  For big-endian it would instead use
X + sizeof(vec) - Y*sizeof(elt).  The big-endian case wasn't right since
vector index order always follows memory/array order, even for big-endian.
(Note that the current handling has to be wrong for Y==0 since it would
access beyond the end of the vector.)

Original patch by Richard Sandiford.

llvm-svn: 236529
2015-05-05 19:33:37 +00:00
Ulrich Weigand 2693c0a491 [LegalizeVectorTypes] Allow single loads and stores for more short vectors
When lowering a load or store for TypeWidenVector, the type legalizer
would use a single load or store if the associated integer type was legal.
E.g. it would load a v4i8 as an i32 if i32 was legal.

This patch extends that behavior to promoted integers as well as legal ones.
If the integer type for the full vector width is TypePromoteInteger,
the element type is going to be TypePromoteInteger too, and it's still
better to use a single promoting load or truncating store rather than N
individual promoting loads or truncating stores.  E.g. if you have a v2i8
on a target where i16 is promoted to i32, it's better to load the v2i8 as
an i16 rather than load both i8s individually.

Original patch by Richard Sandiford.

llvm-svn: 236528
2015-05-05 19:32:57 +00:00
Pete Cooper 336d90b61b Revert "Refactor UpdatePredRedefs and StepForward to avoid duplication. NFC"
This reverts commit 963cdbccf6e5578822836fd9b2ebece0ba9a60b7 (ie r236514)

This is to get the bots green while i investigate.

llvm-svn: 236518
2015-05-05 18:49:08 +00:00
Pete Cooper 05b84d4168 Revert "Fix IfConverter to handle regmask machine operands."
This reverts commit b27413cbfd78d959c18e713bfa271fb69e6b3303 (ie r236515).

This is to get the bots green while i investigate the failures.

llvm-svn: 236517
2015-05-05 18:49:05 +00:00
Pete Cooper 6ebc207703 Fix IfConverter to handle regmask machine operands.
A regmask (typically seen on a call) clobbers the set of registers it lists.  The IfConverter, in UpdatePredRedefs, was handling register defs, but not regmasks.

These are slightly different to a def in that we need to add both an implicit use and def to appease the machine verifier.  Otherwise, uses after the if converted call could think they are reading an undefined register.

Reviewed by Matthias Braun and Quentin Colombet.

llvm-svn: 236515
2015-05-05 18:31:36 +00:00
Pete Cooper bbd1c727d1 Refactor UpdatePredRedefs and StepForward to avoid duplication. NFC
The code was basically the same here already.  Just added an out parameter for a vector of seen defs so that UpdatePredRedefs can call StepForward first, then do its own post processing on the seen defs.

Will be used in the next commit to also handle regmasks.

llvm-svn: 236514
2015-05-05 18:31:31 +00:00
Reid Kleckner 0738a9c02e Re-land "[WinEH] Add an EH registration and state insertion pass for 32-bit x86"
This reverts commit r236360.

This change exposed a bug in WinEHPrepare by opting win32 code into EH
preparation. We already knew that WinEHPrepare has bugs, and is the
status quo for x64, so I don't think that's a reason to hold off on this
change. I disabled exceptions in the sanitizer tests in r236505 and an
earlier revision.

llvm-svn: 236508
2015-05-05 17:44:16 +00:00
Quentin Colombet 61b305edfd [ShrinkWrap] Add (a simplified version) of shrink-wrapping.
This patch introduces a new pass that computes the safe point to insert the
prologue and epilogue of the function.
The interest is to find safe points that are cheaper than the entry and exits
blocks.

As an example and to avoid regressions to be introduce, this patch also
implements the required bits to enable the shrink-wrapping pass for AArch64.


** Context **

Currently we insert the prologue and epilogue of the method/function in the
entry and exits blocks. Although this is correct, we can do a better job when
those are not immediately required and insert them at less frequently executed
places.
The job of the shrink-wrapping pass is to identify such places.


** Motivating example **

Let us consider the following function that perform a call only in one branch of
a if:
define i32 @f(i32 %a, i32 %b)  {
 %tmp = alloca i32, align 4
 %tmp2 = icmp slt i32 %a, %b
 br i1 %tmp2, label %true, label %false

true:
 store i32 %a, i32* %tmp, align 4
 %tmp4 = call i32 @doSomething(i32 0, i32* %tmp)
 br label %false

false:
 %tmp.0 = phi i32 [ %tmp4, %true ], [ %a, %0 ]
 ret i32 %tmp.0
}

On AArch64 this code generates (removing the cfi directives to ease
readabilities):
_f:                                     ; @f
; BB#0:
  stp x29, x30, [sp, #-16]!
  mov  x29, sp
  sub sp, sp, #16             ; =16
  cmp  w0, w1
  b.ge  LBB0_2
; BB#1:                                 ; %true
  stur  w0, [x29, #-4]
  sub x1, x29, #4             ; =4
  mov  w0, wzr
  bl  _doSomething
LBB0_2:                                 ; %false
  mov  sp, x29
  ldp x29, x30, [sp], #16
  ret

With shrink-wrapping we could generate:
_f:                                     ; @f
; BB#0:
  cmp  w0, w1
  b.ge  LBB0_2
; BB#1:                                 ; %true
  stp x29, x30, [sp, #-16]!
  mov  x29, sp
  sub sp, sp, #16             ; =16
  stur  w0, [x29, #-4]
  sub x1, x29, #4             ; =4
  mov  w0, wzr
  bl  _doSomething
  add sp, x29, #16            ; =16
  ldp x29, x30, [sp], #16
LBB0_2:                                 ; %false
  ret

Therefore, we would pay the overhead of setting up/destroying the frame only if
we actually do the call.


** Proposed Solution **

This patch introduces a new machine pass that perform the shrink-wrapping
analysis (See the comments at the beginning of ShrinkWrap.cpp for more details).
It then stores the safe save and restore point into the MachineFrameInfo
attached to the MachineFunction.
This information is then used by the PrologEpilogInserter (PEI) to place the
related code at the right place. This pass runs right before the PEI.

Unlike the original paper of Chow from PLDI’88, this implementation of
shrink-wrapping does not use expensive data-flow analysis and does not need hack
to properly avoid frequently executed point. Instead, it relies on dominance and
loop properties.

The pass is off by default and each target can opt-in by setting the
EnableShrinkWrap boolean to true in their derived class of TargetPassConfig.
This setting can also be overwritten on the command line by using
-enable-shrink-wrap.

Before you try out the pass for your target, make sure you properly fix your
emitProlog/emitEpilog/adjustForXXX method to cope with basic blocks that are not
necessarily the entry block.


** Design Decisions **

1. ShrinkWrap is its own pass right now. It could frankly be merged into PEI but
for debugging and clarity I thought it was best to have its own file.
2. Right now, we only support one save point and one restore point. At some
point we can expand this to several save point and restore point, the impacted
component would then be:
- The pass itself: New algorithm needed.
- MachineFrameInfo: Hold a list or set of Save/Restore point instead of one
  pointer.
- PEI: Should loop over the save point and restore point.
Anyhow, at least for this first iteration, I do not believe this is interesting
to support the complex cases. We should revisit that when we motivating
examples.

Differential Revision: http://reviews.llvm.org/D9210

<rdar://problem/3201744>

llvm-svn: 236507
2015-05-05 17:38:16 +00:00
Tim Northover 851ff69b42 CodeGen: match up correct insertvalue indices when assessing tail calls.
When deciding whether a value comes from the aggregate or inserted value of an
insertvalue instruction, we compare the indices against those of the location
we're interested in. One of the lists needs reversing because the input data is
backwards (so that modifications take place at the end of the SmallVector), but
we were reversing both before leading to incorrect results.

Should fix PR23408

llvm-svn: 236457
2015-05-04 20:41:51 +00:00
Pete Cooper 300069a019 ScheduleDAGInstrs should toggle kill flags on bundled instrs.
ScheduleDAGInstrs wasn't setting or clearing the kill flags on instructions inside bundles.  This led to code such as this

%R3<def> = t2ANDrr %R0
BUNDLE %ITSTATE<imp-def,dead>, %R0<imp-use,kill>
  t2IT 1, 24, %ITSTATE<imp-def>
  R6<def,tied6> = t2ORRrr %R0<kill>, ...

being transformed to

BUNDLE %ITSTATE<imp-def,dead>, %R0<imp-use>
  t2IT 1, 24, %ITSTATE<imp-def>
  R6<def,tied6> = t2ORRrr %R0<kill>, ...
%R3<def> = t2ANDrr %R0<kill>

where the kill flag was removed from the BUNDLE instruction, but not the t2ORRrr inside it.  The verifier then thought that
R0 was undefined when read by the AND.

This change make the toggleKillFlags method also check for bundles and toggle flags on bundled instructions.
Setting the kill flag is special cased as we only want to set the kill flag on the last instruction in the bundle.

llvm-svn: 236428
2015-05-04 16:52:06 +00:00
Elena Demikhovsky 1b60ed7069 Masked gather and scatter intrinsics - enabled codegen for KNL.
llvm-svn: 236394
2015-05-03 07:12:25 +00:00
Simon Pilgrim 017ca19384 [DAGCombiner] Enabled vector float/double -> int constant folding
llvm-svn: 236387
2015-05-02 13:04:07 +00:00
David Blaikie 72d03efa6d DebugInfo: Use low_pc relative debug_ranges under fission when the CU has a low_pc
Seems we were setting the base address on the wrong DwarfCompileUnit
object so it wasn't being used when generating the ranges.

llvm-svn: 236377
2015-05-02 02:31:49 +00:00
Jim Grosbach bfe3a9c318 Fix spelling.
llvm-svn: 236367
2015-05-02 00:44:07 +00:00
Reid Kleckner 83d89fa546 Revert "[WinEH] Add an EH registration and state insertion pass for 32-bit x86"
This reverts commit r236359. Things are still broken despite testing. :(

llvm-svn: 236360
2015-05-01 22:50:14 +00:00
Reid Kleckner 51476acd77 Re-land "[WinEH] Add an EH registration and state insertion pass for 32-bit x86"
This reverts commit r236340.

llvm-svn: 236359
2015-05-01 22:40:25 +00:00
Reid Kleckner 2747d3d55a Revert "[WinEH] Add an EH registration and state insertion pass for 32-bit x86"
This reverts commit r236339, it breaks the win32 clang-cl self-host.

llvm-svn: 236340
2015-05-01 20:14:04 +00:00
Reid Kleckner 4856fc61b4 [WinEH] Add an EH registration and state insertion pass for 32-bit x86
This pass is responsible for constructing the EH registration object
that gets linked into fs:00, which is all it does in this change. In the
future, it will also insert stores to update the EH state number.

I considered keeping this functionality in WinEHPrepare, but it's pretty
separable and X86 specific. It has conceptually very little to do with
the task of WinEHPrepare, which is currently outlining.  WinEHPrepare is
also in theory useful on ARM, but this logic is pretty x86 specific.

Reviewers: andrew.w.kaylor, majnemer

Differential Revision: http://reviews.llvm.org/D9422

llvm-svn: 236339
2015-05-01 20:04:54 +00:00
Simon Pilgrim 9fb06bca67 [SelectionDAG] Unary vector constant folding integer legality fixes
This patch fixes issues with vector constant folding not correctly handling scalar input operands if they require implicit truncation - this was tested with llvm-stress as recommended by Patrik H Hagglund.

The patch ensures that integer input scalars from a build vector are correctly truncated before folding, and that constant integer scalar results are promoted to a legal type before inclusion in the new folded build vector.

I have added another crash test case and also a test for UINT_TO_FP / SINT_TO_FP using an non-truncated scalar input, which was failing before this patch.

Differential Revision: http://reviews.llvm.org/D9282

llvm-svn: 236308
2015-05-01 08:20:04 +00:00
Matt Arsenault 59d2ca1cba Fix typo
llvm-svn: 236283
2015-04-30 23:20:56 +00:00
Pete Cooper 451755d370 Commute the internal flag on MachineOperands.
When commuting a thumb instruction in the size reduction pass, thumb
instructions are represented as a bundle and so some operands may be marked
as internal.  The internal flag has to move with the operand when commuting.

This test is sensitive to register allocation so can't specifically check that
this error was happening, but so long as it continues to pass with -verify then
hopefully its still ok.

rdar://problem/20752113

llvm-svn: 236282
2015-04-30 23:14:14 +00:00
Andrea Di Biagio c84b5bdd69 Fix for PR23103. Correctly propagate the 'IsUndef' flag to the register operands of a commuted instruction.
Revision 220239 exposed a latent bug in method
'TargetInstrInfo::commuteInstruction'. When commuting the operands of a machine
instruction, method 'commuteInstruction' didn't correctly propagate the
'IsUndef' flag to the register operands of the new (commuted) instruction.

Before this patch, the following instruction:
  %vreg4<def> = VADDSDrr  %vreg14, %vreg5<undef>; FR64:%vreg4,%vreg14,%vreg5

was wrongly converted by method 'commuteInstruction' into:
  %vreg4<def> = VADDSDrr  %vreg5, %vreg14<undef>; FR64:%vreg4,%vreg5,%vreg14

The correct instruction should have been:
  %vreg4<def> = VADDSDrr  %vreg5<undef>, %vreg14; FR64:%vreg4,%vreg5,%vreg14

This patch fixes the problem in method 'TargetInstrInfo::commuteInstruction'.
When swapping the operands of a machine instruction, we now make sure that
'IsUndef' flags are correctly set.
Added test case 'pr23103.ll'.

Differential Revision: http://reviews.llvm.org/D9406

llvm-svn: 236258
2015-04-30 21:03:29 +00:00
Matt Arsenault ee5c2ab734 MachineVerifier: Don't crash if MachineOperand has no parent
If you somehow added a MachineOperand to an instruction
that did not have the parent set, the verifier would
crash since it attempts to use the operand's parent.

llvm-svn: 236249
2015-04-30 19:35:41 +00:00
Pete Cooper 4d8d2ec3eb Don't rewrite jumps to empty BBs to landing pads.
In the test case here, the 'unreachable' BB was removed by BranchFolding because its empty.

It then rewrote the jump from 'entry' to jump to its fallthrough, which was a landing pad.

This results in 'entry' jumping to 2 different landing pads, which fails the machine verifier.

rdar://problem/20750162

llvm-svn: 236248
2015-04-30 18:58:23 +00:00
Reid Kleckner 582786b6cc Add a note about permitting default member initializers
Use them in WinEHPrepare so that we can spot any toolchain bugs that
come up.

llvm-svn: 236244
2015-04-30 18:17:12 +00:00
Jan Vesely 808fff585b Reinstate revisions r234755, r234759, r234760
changes:
  Don't apply on hexagon and NVPTX since they no longer claim to support UADDO/USUBO
  Add location to getConstant
  Drop comment about the ops being turned into expand

llvm-svn: 236240
2015-04-30 17:15:56 +00:00
Daniel Jasper 0366cd23ac Inline local variable to silence unused warning.
llvm-svn: 236212
2015-04-30 08:51:13 +00:00
Elena Demikhovsky e1eda8a9e6 Masked gather and scatter - added DAGCombine visitors
and AVX-512 instruction selection patterns.
All other patches, including tests will follow.

http://reviews.llvm.org/D7665

llvm-svn: 236211
2015-04-30 08:38:48 +00:00
Owen Anderson d8a029c81b Semantically revert r236031, which is not a good idea for in-order targets.
At the least it should be guarded by some kind of target hook.
It also introduced catastrophic compile time and code quality
regressions on some out of tree targets (test case still being
reduced/sanitized).

Sanjay agreed with reverting this patch until these issues can be
resolved.

llvm-svn: 236199
2015-04-30 04:06:32 +00:00
Hans Wennborg 4b828d35fd Switch lowering: use profile info to build weight-balanced binary search trees
This will cause hot nodes to appear closer to the root.

The literature says building the tree like this makes it a near-optimal (in
terms of search time given key frequencies) binary search tree. In LLVM's case,
we can do up to 3 comparisons in each leaf node, so it might be better to opt
for lower tree height in some cases; that's something to look into in the
future.

Differential Revision: http://reviews.llvm.org/D9318

llvm-svn: 236192
2015-04-30 00:57:37 +00:00
Reid Kleckner bcda1cd45a [WinEH] Start EH preparation for 32-bit x86, it uses no arguments
32-bit x86 MSVC-style exceptions are functionaly similar to 64-bit, but
they take no arguments. Instead, they implicitly use the value of EBP
passed in by the caller as a pointer to the parent's frame. In LLVM, we
can represent this as llvm.frameaddress(1), and feed that into all of
our calls to llvm.framerecover.

The next steps are:
- Add an alloca to the fs:00 linked list of handlers
- Add something like llvm.sjlj.lsda or generalize it to store in the
  alloca
- Move state number calculation to WinEHPrepare, arrange for
  FunctionLoweringInfo to call it
- Use the state numbers to insert explicit loads and stores in the IR

llvm-svn: 236172
2015-04-29 22:49:54 +00:00
Sanjay Patel 04b0e92766 generalize binop reassociation; NFC
Move the fold introduced in r236031:
http://reviews.llvm.org/rL236031

to its own helper function, so we can use it for other binops.

This is a preliminary step before partially solving:
https://llvm.org/bugs/show_bug.cgi?id=21768
https://llvm.org/bugs/show_bug.cgi?id=23116

llvm-svn: 236171
2015-04-29 22:30:02 +00:00
Pat Gavlin 022c5acad8 Run StatepointLowering.{cpp,h} through clang-format.
llvm-svn: 236166
2015-04-29 21:52:45 +00:00
David Blaikie f64246be72 [opaque pointer type] Pass GlobalAlias the actual pointer type rather than decomposing it into pointee type + address space
Many of the callers already have the pointer type anyway, and for the
couple of callers that don't it's pretty easy to call PointerType::get
on the pointee type and address space.

This avoids LLParser from using PointerType::getElementType when parsing
GlobalAliases from IR.

llvm-svn: 236160
2015-04-29 21:22:39 +00:00
Sanjay Patel caf5180ff7 tidy up; NFC
llvm-svn: 236156
2015-04-29 21:01:41 +00:00
Sanjay Patel ee6678119d too much space again; NFC
llvm-svn: 236150
2015-04-29 20:38:02 +00:00
Sanjay Patel 435efaadff too much space; NFC
llvm-svn: 236147
2015-04-29 20:32:57 +00:00
Andrew Kaylor a33f159056 [WinEH] Fix minor bug in begincatch block splitting
llvm-svn: 236129
2015-04-29 17:21:26 +00:00
Duncan P. N. Exon Smith a9308c49ef IR: Give 'DI' prefix to debug info metadata
Finish off PR23080 by renaming the debug info IR constructs from `MD*`
to `DI*`.  The last of the `DIDescriptor` classes were deleted in
r235356, and the last of the related typedefs removed in r235413, so
this has all baked for about a week.

Note: If you have out-of-tree code (like a frontend), I recommend that
you get everything compiling and tests passing with the *previous*
commit before updating to this one.  It'll be easier to keep track of
what code is using the `DIDescriptor` hierarchy and what you've already
updated, and I think you're extremely unlikely to insert bugs.  YMMV of
course.

Back to *this* commit: I did this using the rename-md-di-nodes.sh
upgrade script I've attached to PR23080 (both code and testcases) and
filtered through clang-format-diff.py.  I edited the tests for
test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns
were off-by-three.  It should work on your out-of-tree testcases (and
code, if you've followed the advice in the previous paragraph).

Some of the tests are in badly named files now (e.g.,
test/Assembler/invalid-mdcompositetype-missing-tag.ll should be
'dicompositetype'); I'll come back and move the files in a follow-up
commit.

llvm-svn: 236120
2015-04-29 16:38:44 +00:00
Jan Vesely 7539548738 CodeGen: Default overflow operations to expand so we don't have to assume targets are lying
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: ab
Differential Revision: http://reviews.llvm.org/D9265

llvm-svn: 236119
2015-04-29 16:30:46 +00:00
Elena Demikhovsky ac969012ef Fixed masked gather/scatter switch-case
llvm-svn: 236092
2015-04-29 08:38:53 +00:00
Elena Demikhovsky 744fe0de33 fixed comments, blanks, nullptr; NFC
llvm-svn: 236086
2015-04-29 06:49:50 +00:00
Matthias Braun 5295793bca RegisterCoalescer: hide terminal rule option by default
llvm-svn: 236062
2015-04-28 23:55:11 +00:00
Andrew Kaylor 91307434f4 Style updates
llvm-svn: 236048
2015-04-28 22:01:51 +00:00
Andrew Kaylor 046f7b42f2 [WinEH] Split blocks at calls to llvm.eh.begincatch
Differential Revision: http://reviews.llvm.org/D9311

llvm-svn: 236046
2015-04-28 21:54:14 +00:00
Sanjay Patel 2fbc4e5c49 transform fadd chains to increase parallelism
This is a compromise: with this simple patch, we should always handle a chain of exactly 3
operations optimally, but we're not generating the optimal balanced binary tree for a longer
sequence.

In general, this transform will reduce the dependency chain for a sequence of instructions
using N operands from a worst case N-1 dependent operations to N/2 dependent operations. 
The optimal balanced binary tree would reduce the chain to log2(N).

The trade-off for not dealing with longer sequences is: (1) we have less complexity in the
compiler, (2) we avoid unknown compile-time blowup calculating a balanced tree, and (3) we
don't need to worry about the increased register pressure required to parallelize longer
sequences. It also seems unlikely that we would ever encounter really long strings of
dependent ops like that in the wild, but I'm not sure how to verify that speculation.
FWIW, I see no perf difference for test-suite running on btver2 (x86-64) with -ffast-math
and this patch.

We can extend this patch to cover other associative operations such as fmul, fmax, fmin, 
integer add, integer mul.

This is a partial fix for:
https://llvm.org/bugs/show_bug.cgi?id=17305

and if extended:
https://llvm.org/bugs/show_bug.cgi?id=21768
https://llvm.org/bugs/show_bug.cgi?id=23116

The issue also came up in:
http://reviews.llvm.org/D8941

Differential Revision: http://reviews.llvm.org/D9232

llvm-svn: 236031
2015-04-28 21:03:22 +00:00
Sanjay Patel ba55804ea3 move IR-level optimization flags into their own struct
This is a preliminary step to using the IR-level floating-point fast-math-flags in the SDAG (D8900).

In this patch, we introduce the optimization flags as their own struct. As noted in the TODO comment, 
we should eventually share this data between the IR passes and the backend.

We also switch the existing nsw / nuw / exact bit functionality of the BinaryWithFlagsSDNode class to
use the new struct.

The tradeoff is that instead of using the free but limited space of SDNode's SubclassData, we add a
data member to the subclass. This means we don't have to repeat all of the get/set methods per flag,
but we're potentially adding size to all nodes of this subclassi type.

In practice on 64-bit systems (measured on Linux and MacOS X), there is no size difference between an
SDNode and BinaryWithFlagsSDNode after this change: they're both 80 bytes. This means that we had at
least one free byte to play with due to struct alignment.

Differential Revision: http://reviews.llvm.org/D9325

llvm-svn: 235997
2015-04-28 16:39:12 +00:00
Sergey Dmitrouk 842a51bad8 Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes"
[DebugInfo] Add debug locations to constant SD nodes

This adds debug location to constant nodes of Selection DAG and updates
all places that create constants to pass debug locations
(see PR13269).

Can't guarantee that all locations are correct, but in a lot of cases choice
is obvious, so most of them should be. At least all tests pass.

Tests for these changes do not cover everything, instead just check it for
SDNodes, ARM and AArch64 where it's easy to get incorrect locations on
constants.

This is not complete fix as FastISel contains workaround for wrong debug
locations, which drops locations from instructions on processing constants,
but there isn't currently a way to use debug locations from constants there
as llvm::Constant doesn't cache it (yet). Although this is a bit different
issue, not directly related to these changes.

Differential Revision: http://reviews.llvm.org/D9084

llvm-svn: 235989
2015-04-28 14:05:47 +00:00
Daniel Jasper 48e93f7181 Revert "[DebugInfo] Add debug locations to constant SD nodes"
This breaks a test:
http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870

llvm-svn: 235987
2015-04-28 13:38:35 +00:00
Sergey Dmitrouk adb4c69d5c [DebugInfo] Add debug locations to constant SD nodes
This adds debug location to constant nodes of Selection DAG and updates
all places that create constants to pass debug locations
(see PR13269).

Can't guarantee that all locations are correct, but in a lot of cases choice
is obvious, so most of them should be. At least all tests pass.

Tests for these changes do not cover everything, instead just check it for
SDNodes, ARM and AArch64 where it's easy to get incorrect locations on
constants.

This is not complete fix as FastISel contains workaround for wrong debug
locations, which drops locations from instructions on processing constants,
but there isn't currently a way to use debug locations from constants there
as llvm::Constant doesn't cache it (yet). Although this is a bit different
issue, not directly related to these changes.

Differential Revision: http://reviews.llvm.org/D9084

llvm-svn: 235977
2015-04-28 11:56:37 +00:00
Elena Demikhovsky 584ce378ab Masked gather and scatter: Added code for SelectionDAG.
All other patches, including tests will follow.

http://reviews.llvm.org/D7665

llvm-svn: 235970
2015-04-28 07:57:37 +00:00
Hans Wennborg 7bf4d4eee0 Switch lowering: use uint32_t for weights everywhere
I previously thought switch clusters would need to use uint64_t in case
the weights of multiple cases overflowed a 32-bit int. It turns
out that the weights on a terminator instruction are capped to allow for
being added together, so using a uint32_t should be safe.

llvm-svn: 235945
2015-04-27 23:52:19 +00:00
Hans Wennborg 67c03759e4 Switch lowering: Take branch weight into account when ordering for fall-through
Previously, the code would try to put a fall-through case last,
even if that meant moving a case with much higher branch weight
further down the chain.

Ordering by branch weight is most important, putting a fall-through
block last is secondary.

llvm-svn: 235942
2015-04-27 23:35:22 +00:00
Hans Wennborg ba6d2568f9 Switch lowering: order bit tests by branch weight.
llvm-svn: 235912
2015-04-27 20:21:17 +00:00
Philip Reames 20c24f1da5 Make the message associated with a fatal error slightly more helpful
Looking into 23095, my best guess is that the CodeGen library itself isn't getting linked and initialized properly.  To make this slightly more obvious to consumers of LLVM, emit a different error message if we can tell that the registry is empty vs you've simply happened to name a collector which hasn't been registered.  

llvm-svn: 235824
2015-04-26 22:00:34 +00:00
Andrew Kaylor 8c384bbb35 Fix build error from accidental change
llvm-svn: 235792
2015-04-24 23:34:46 +00:00
Andrew Kaylor 8c79411203 [WinEH] Find correct cloned entry block for outlined handler functions.
llvm-svn: 235791
2015-04-24 23:27:32 +00:00
Andrew Kaylor 5dacfd8b8a [WinEH] Find correct cloned entry block for outlined handler functions.
llvm-svn: 235789
2015-04-24 23:10:38 +00:00
Quentin Colombet 8229145961 [DAGCombiner] Fix the type used in canFoldInAddressingMode to account for the
right scaling.

In the function canFoldInAddressingMode, VT is computed as the type of the
destination/source of a LOAD/STORE operations, instead of the memory type of the
operation.
On targets with a scaling factor on the offset of the LOAD/STORE operations, the
function may return false for actually valid cases. This may then prevent the
selection of profitable pre or post indexed load/store operations, and instead
select pre or post indexed load/store for unprofitable cases.

Patch by Francois de Ferriere <francois.de-ferriere@st.com>!

Differential Revision: http://reviews.llvm.org/D9146

llvm-svn: 235780
2015-04-24 21:28:00 +00:00
Kaelyn Takata 5e5524bc25 Remove an unused variable to prevent -Werror build failures.
llvm-svn: 235773
2015-04-24 21:02:18 +00:00
Reid Kleckner cfbfe6f29c [SEH] Implement GetExceptionCode in __except blocks
This introduces an intrinsic called llvm.eh.exceptioncode. It is lowered
by copying the EAX value live into whatever basic block it is called
from. Obviously, this only works if you insert it late during codegen,
because otherwise mid-level passes might reschedule it.

llvm-svn: 235768
2015-04-24 20:25:05 +00:00
Lang Hames 9ff69c8f4d [AsmPrinter] Make AsmPrinter's OutStreamer member a unique_ptr.
AsmPrinter owns the OutStreamer, so an owning pointer makes sense here. Using a
reference for this is crufty.

llvm-svn: 235752
2015-04-24 19:11:51 +00:00
Hans Wennborg ec679a8b3b Switch lowering: fix APInt overflow causing infinite loop / OOM
llvm-svn: 235729
2015-04-24 16:53:55 +00:00
Reid Kleckner 2c3ccaacb7 [WinEH] Split the landingpad BB instead of cloning it
This means we don't have to RAUW the landingpad instruction and
landingpad BB, which is a nice win.

llvm-svn: 235725
2015-04-24 16:22:19 +00:00
Matthias Braun f2a08dcaf6 RegisterCoalescer: implicit phsreg uses are fine when rematerializing
The target hooks should have already checked them. This change is
necessary to enable the remateriailzation on R600.

llvm-svn: 235673
2015-04-24 00:01:37 +00:00
Matthias Braun 43fb8a157b RegisterCoalescer: Avoid unnecessary register class widening for some rematerializations
I couldn't provide a testcase as none of the public targets has wide
register classes with alot of subregisters and at the same time an
instruction which "ReMaterializable" and "AsCheapAsAMove" (could
probably be added for R600).

llvm-svn: 235668
2015-04-23 23:24:36 +00:00
Reid Kleckner 5c5facc2ce Re-commit "[SEH] Remove the old __C_specific_handler code now that WinEHPrepare works"
This reverts commit r235617.

r235649 should have addressed the problems.

llvm-svn: 235667
2015-04-23 23:22:33 +00:00
Andrew Kaylor 20ae2a311f [WinEH] Ignore filter clauses while mapping landing pad blocks.
llvm-svn: 235656
2015-04-23 22:38:36 +00:00
Reid Kleckner 1ac225219c Remove trivial assert to fix NDEBUG Werror builds
llvm-svn: 235652
2015-04-23 21:36:32 +00:00
Reid Kleckner e3af86e9d9 [WinEH] Replace more lpad value uses with undef
We were asserting on code like this:
  extern "C" unsigned long _exception_code();
  void might_crash(unsigned long);
  void foo() {
    __try {
      might_crash(0);
    } __except(1) {
      might_crash(_exception_code());
    }
  }

Gtest and many other libraries get the exception code from the __except
block. What's supposed to happen here is that EAX is live into the
__except block, and it contains the exception code. Eventually we'll
represent that as a use of the landingpad ehptr value, but for now we
can replace it with undef.

llvm-svn: 235649
2015-04-23 21:22:30 +00:00
Quentin Colombet 796d906e06 [MachineCopyPropagation] Handle undef flags conservatively so that we do not
remove copies that are useful after breaking some hardware dependencies.
In other words, handle this kind of situations conservatively by assuming reg2
is redefined by the undef flag.
reg1 = copy reg2
= inst reg2<undef>
reg2 = copy reg1
Copy propagation used to remove the last copy.
This is incorrect because the undef flag on reg2 in inst, allows next
passes to put whatever trashed value in reg2 that may help.
In practice we end up with this code:
reg1 = copy reg2
reg2 = 0
= inst reg2<undef>
reg2 = copy reg1

This fixes PR21743.

llvm-svn: 235647
2015-04-23 21:17:39 +00:00
Andrew Kaylor 5f715522f1 [WinEH] Handle stubs for outlined functions that have only unreached terminators.
llvm-svn: 235618
2015-04-23 18:37:39 +00:00
Reid Kleckner 909ea7e6b8 Revert "[SEH] Remove the old __C_specific_handler code now that WinEHPrepare works"
We still have some "uses remain after removal" issues in -O0 builds.

This reverts commit r235557.

llvm-svn: 235617
2015-04-23 18:34:01 +00:00
Hans Wennborg 0867b151c9 Re-commit r235560: Switch lowering: extract jump tables and bit tests before building binary tree (PR22262)
Third time's the charm. The previous commit was reverted as a
reverse for-loop in SelectionDAGBuilder::lowerWorkItem did 'I--'
on an iterator at the beginning of a vector, causing asserts
when using debugging iterators. This commit fixes that.

llvm-svn: 235608
2015-04-23 16:45:24 +00:00
Aaron Ballman 0be238cebd Revert r235560; this commit was causing several failed assertions in Debug builds using MSVC's STL. The iterator is being used outside of its valid range.
llvm-svn: 235597
2015-04-23 13:41:59 +00:00
Simon Pilgrim 86b034bae9 [DAGCombiner] Remove extra bitcasts surrounding vector shuffles
Patch to remove extra bitcasts from shuffles, this is often a legacy of XformToShuffleWithZero being used to combine bitmaskings (of float vectors bitcast to integer vectors) into shuffles: bitcast(shuffle(bitcast(s0),bitcast(s1))) -> shuffle(s0,s1)

Differential Revision: http://reviews.llvm.org/D9097

llvm-svn: 235578
2015-04-23 08:43:13 +00:00
Andrew Kaylor 43e1d76278 [WinEH] Don't skip landing pads that end with an unreachable instruction.
llvm-svn: 235563
2015-04-23 00:20:44 +00:00
Hans Wennborg 15823d49b6 Switch lowering: extract jump tables and bit tests before building binary tree (PR22262)
This is a re-commit of r235101, which also fixes the problems with the previous patch:

- Switches with only a default case and non-fallthrough were handled incorrectly

- The previous patch tickled a bug in PowerPC Early-Return Creation which is fixed here.

> This is a major rewrite of the SelectionDAG switch lowering. The previous code
> would lower switches as a binary tre, discovering clusters of cases
> suitable for lowering by jump tables or bit tests as it went along. To increase
> the likelihood of finding jump tables, the binary tree pivot was selected to
> maximize case density on both sides of the pivot.
>
> By not selecting the pivot in the middle, the binary trees would not always
> be balanced, leading to performance problems in the generated code.
>
> This patch rewrites the lowering to search for clusters of cases
> suitable for jump tables or bit tests first, and then builds the binary
> tree around those clusters. This way, the binary tree will always be balanced.
>
> This has the added benefit of decoupling the different aspects of the lowering:
> tree building and jump table or bit tests finding are now easier to tweak
> separately.
>
> For example, this will enable us to balance the tree based on profile info
> in the future.
>
> The algorithm for finding jump tables is quadratic, whereas the previous algorithm
> was O(n log n) for common cases, and quadratic only in the worst-case. This
> doesn't seem to be major problem in practice, e.g. compiling a file consisting
> of a 10k-case switch was only 30% slower, and such large switches should be rare
> in practice. Compiling e.g. gcc.c showed no compile-time difference.  If this
> does turn out to be a problem, we could limit the search space of the algorithm.
>
> This commit also disables all optimizations during switch lowering in -O0.
>
> Differential Revision: http://reviews.llvm.org/D8649

llvm-svn: 235560
2015-04-22 23:14:56 +00:00
Reid Kleckner 64a2a6a473 [SEH] Remove the old __C_specific_handler code now that WinEHPrepare works
This removes the -sehprepare flag and makes __C_specific_handler
functions always to use WinEHPrepare.

This was tested by building all of chromium_builder_tests and running a
few tests that use SEH, but if something breaks, we can revert this.

llvm-svn: 235557
2015-04-22 22:13:09 +00:00
Reid Kleckner fd7df284b8 [WinEH] Demote values and phis live across exception handlers up front
In particular, this handles SSA values that are live *out* of a handler.
The existing code only handles values that are live *in* to a handler.

It also handles phi nodes in the block where normal control should
resume after the end of a catch handler.  When EH return points have phi
nodes, we need to split the return edge. It is impossible for phi
elimination to emit copies in the previous block if that block gets
outlined. The indirectbr that we leave in the function is only notional,
and is eliminated from the MachineFunction CFG early on.

Reviewers: majnemer, andrew.w.kaylor

Differential Revision: http://reviews.llvm.org/D9158

llvm-svn: 235545
2015-04-22 21:05:21 +00:00
Luqman Aden c76f470c2d Test commit: fix typo in comment.
llvm-svn: 235526
2015-04-22 17:42:37 +00:00
Olivier Sallenave c587bee405 Fixed logic to enable complex FMA formation.
llvm-svn: 235508
2015-04-22 14:07:26 +00:00
Hal Finkel 0d49cf2645 [DAGCombine] Disable select(c, load,load) for indexed loads
This turned up after r235333, but was a pre-existing bug. The optimization
which transforms select(c, load, load) into a load of a select of the addresses
does not handle indexed loads (pre/post inc/dec). However, it did not check for
them either, leading to a crash if it tried to transform one of them.

llvm-svn: 235497
2015-04-22 11:32:25 +00:00
Lang Hames 65613a634a [patchpoint] Add support for symbolic patchpoint targets to SelectionDAG and the
X86 backend.

The code generated for symbolic targets is identical to the code generated for
constant targets, except that a relocation is emitted to fix up the actual
target address at link-time. This allows IR and object files containing
patchpoints to be cached across JIT-invocations where the target address may
change.

llvm-svn: 235483
2015-04-22 06:02:31 +00:00
Reid Kleckner f14787dad8 [WinEH] Correctly handle inlined __finally blocks with captures
We should also teach the inliner to collapse framerecover of
frameaddress of the current frame down to an alloca, but that can happen
later.

llvm-svn: 235459
2015-04-22 00:07:52 +00:00
Duncan P. N. Exon Smith aa861aa483 DebugInfo: Remove DIArray and DITypeArray typedefs
Remove the `DIArray` and `DITypeArray` typedefs, preferring the
underlying types (`DebugNodeArray` and `MDTypeRefArray`, respectively).

llvm-svn: 235413
2015-04-21 20:07:38 +00:00
Duncan P. N. Exon Smith 60635e39b6 DebugInfo: Drop rest of DIDescriptor subclasses
Delete the remaining subclasses of (the already deleted) `DIDescriptor`.
Part of PR23080.

llvm-svn: 235404
2015-04-21 18:44:06 +00:00
Duncan P. N. Exon Smith d4a19a396d DebugInfo: Assert dbg.declare/value insts are valid
Remove early returns for when `getVariable()` is null, and just assert
that it never happens.  The Verifier already confirms that there's a
valid variable on these intrinsics, so we should assume the debug info
isn't broken.  I also updated a check for a `!dbg` attachment, which the
Verifier similarly guarantees.

llvm-svn: 235400
2015-04-21 18:24:23 +00:00
Reid Kleckner d2a1a51996 Re-land r235154-r235156 under the existing -sehprepare flag
Keep the old SEH fan-in lowering on by default for now, since projects
rely on it.  This will make it easy to test this change with a simple
flag flip.

llvm-svn: 235399
2015-04-21 18:23:57 +00:00
Simon Pilgrim 860f08779c CONCAT_VECTOR of BUILD_VECTOR - minor fix
Fixed issue with the combine of CONCAT_VECTOR of 2 BUILD_VECTOR nodes - the optimisation wasn't ensuring that the scalar operands of both nodes were the same type/size for implicit truncation.

Test case spotted by Patrik Hagglund

llvm-svn: 235371
2015-04-21 08:05:43 +00:00
Pawel Bylica 57c2f7c756 Fix generic shift expansion when shift amount is 0
Summary:
This fixes http://llvm.org/bugs/show_bug.cgi?id=16439. 

This is one possible way to approach this. The other would be to split InL>>(nbits-Amt) into (InL>>(nbits-1-Amt))>>1, which is also valid since since we only need to care about Amt up nbits-1. It's hard to tell which one is better since the shift might be expensive if this stage of expansion is not yet a legal machine integer, whereas comparisons with zero are relatively cheap at all sizes, but more expensive than a shift if the shift is on a legal machine type. 

Patch by Keno Fischer!

Test Plan: regression test from http://reviews.llvm.org/D7752

Reviewers: chfast, resistor

Reviewed By: chfast, resistor

Subscribers: sanjoy, resistor, chfast, llvm-commits

Differential Revision: http://reviews.llvm.org/D4978

llvm-svn: 235370
2015-04-21 06:28:36 +00:00
Andrew Kaylor 00e5d9ee5f [WinEH] Fix problem with landing pad return values used in PHI nodes during outlining.
llvm-svn: 235358
2015-04-20 22:53:42 +00:00
Duncan P. N. Exon Smith 2fbe13540a DebugInfo: Delete subclasses of DIScope
Delete subclasses of (the already defunct) `DIScope`, updating users to
use the raw pointers from the `Metadata` hierarchy directly.

llvm-svn: 235356
2015-04-20 22:10:08 +00:00
Andrew Kaylor 41758517bf [WinEH] Fix problem with mapping shared empty handler blocks.
Differential Revision: http://reviews.llvm.org/D9125

llvm-svn: 235354
2015-04-20 22:04:09 +00:00
Duncan P. N. Exon Smith c62468859a DebugInfo: Delete old subclasses of DIType
Delete subclasses of (the already deleted) `DIType` in favour of
directly using pointers from the `Metadata` hierarchy.

While `DICompositeType` wraps `MDCompositeTypeBase` and `DIDerivedType`
wraps `MDDerivedTypeBase`, most uses of each really meant the more
specific `MDCompositeType` and `MDDerivedType`.

llvm-svn: 235351
2015-04-20 21:17:32 +00:00
Duncan P. N. Exon Smith 698df36ab7 DwarfUnit: Split MDSubroutineType version of constructTypeDIE()
The version of `constructTypeDIE()` for `MDSubroutineType` is unrelated
to (and has different callers than) the `MDCompositeType`.  Split the
two in half.

This simplifies an upcoming patch to delete `DICompositeType`.  There
shouldn't be any real functionality change here.  `createTypeDIE()` is
`cast<>`'ing where it didn't need to before, but that function in turn
is only called for true `MDCompositeType`s.

llvm-svn: 235349
2015-04-20 21:04:33 +00:00
Duncan P. N. Exon Smith d89ef16aa9 DwarfUnit: Cleanup comments
Update comment style in `DwarfUnit`.

  - Drop duplicated comments at definition, and update the comments at
    the declaration where the definition comments looked newer or more
    complete.
  - Drop the `functionName -` prefix.
  - Add `\brief` in a few places.
  - Remove a few comments entirely that weren't adding value (just
    turned the function name and arguments into a sentence).

llvm-svn: 235345
2015-04-20 20:29:51 +00:00
Olivier Sallenave b99c2eb0f0 Refactoring and enhancement to FMA combine.
llvm-svn: 235344
2015-04-20 20:29:40 +00:00
Tom Stellard 69a7b91e95 DAGCombine: Remove redundant NaN checks around ISD::FSQRT
This folds:

(select (setcc x, -0.0, *lt), NaN, (fsqrt x)) -> ( fsqrt x)

llvm-svn: 235333
2015-04-20 19:38:27 +00:00
Duncan P. N. Exon Smith 9928a909c6 DebugInfo: Remove DIType
This is the last major parent class, so I'll probably start deleting
classes in batches now.  Looks like many of the references to the DI*
hierarchy were updated organically along the way.

llvm-svn: 235331
2015-04-20 18:52:06 +00:00
Andrew Kaylor f18771bdfd [WinEH] Fix memory leak with catch-all mapping.
llvm-svn: 235328
2015-04-20 18:48:45 +00:00
Duncan P. N. Exon Smith be9e4fe768 DebugInfo: Remove DIScope
Replace uses of `DIScope` with `MDScope*`.  There was one spot where
I've left an `MDScope*` uninitialized (where `DIScope` would have been
default-initialized to `nullptr`) -- this is intentional, since the
if/else that follows should unconditional assign it to a value.

llvm-svn: 235327
2015-04-20 18:32:29 +00:00
Duncan P. N. Exon Smith 848af387d8 DebugInfo: Remove typedefs for DITypeRef, etc.
Remove typedefs for type refs:

  - DITypeRef => MDTypeRef
  - DIScopeRef => MDScopeRef
  - DIDescriptorRef => DebugNodeRef

llvm-svn: 235323
2015-04-20 18:20:03 +00:00
Hal Finkel 1e5733bbed [InlineAsm] Remove EarlyClobber on registers that are also inputs
When an inline asm call has an output register marked as early-clobber, but
that same register is also an input operand, what should we do? GCC accepts
this, and is documented to accept this for read/write operands saying,
"Furthermore, if the earlyclobber operand is also a read/write operand, then
that operand is written only after it's used." For write-only operands, the
situation seems less clear, but I have at least one existing codebase that
assumes this will work, in part because it has syscall macros like this:

({                                                                         \
  register uint64_t r0 __asm__ ("r0") = (__NR_ ## name);                   \
  register uint64_t r3 __asm__ ("r3") = ((uint64_t) (arg0));               \
  register uint64_t r4 __asm__ ("r4") = ((uint64_t) (arg1));               \
  register uint64_t r5 __asm__ ("r5") = ((uint64_t) (arg2));               \
  __asm__ __volatile__                                                     \
  ("sc"                                                                    \
   : "=&r"(r0),"=&r"(r3),"=&r"(r4),"=&r"(r5)                               \
   :   "0"(r0),  "1"(r3),  "2"(r4),  "3"(r5)                               \
   : "r6","r7","r8","r9","r10","r11","r12","cr0","memory");                \
  r3;                                                                      \
})

Furthermore, with register aliases and subregister relationships that only the
backend knows about, rejecting this in the frontend seems like a difficult
proposition (if we wanted to do so). However, keeping the early-clobber flag on
the INLINEASM MI does not work for us, because it will cause the register's
live interval to end to soon (so it will not appear defined to be used as an
input).

Fortunately, fixing this does not seem hard: When forming the INLINEASM MI,
check to see if any of the early-clobber outputs are also inputs, and if so,
remove the early-clobber flag.

llvm-svn: 235283
2015-04-20 00:01:30 +00:00
Eric Christopher d2e3ddad14 Remove CFIFuncName from TargetOptions as it is currently unused.
llvm-svn: 235268
2015-04-19 03:21:04 +00:00
Eric Christopher 78804ab2df Remove the CFIEnforcing flag from TargetOptions as it is unused.
llvm-svn: 235267
2015-04-19 03:20:59 +00:00
Ahmed Bougacha 279e3ee954 [GlobalMerge] Look at uses to create smaller global sets.
Instead of merging everything together, look at the users of
GlobalVariables, and try to group them by function, to create
sets of globals used "together".

Using that information, a less-aggressive alternative is to keep merging
everything together *except* globals that are only ever used alone, that
is, those for which it's clearly non-profitable to merge with others.

In my testing, grouping by Function is too aggressive, but grouping by
BasicBlock is too conservative.  Anything in-between isn't trivially
available, so stick with Function grouping for now.

cl::opts are added for testing; both enabled by default.

A few of the testcases aren't testing the merging proper, but just
various edge cases when merging does occur.  Update them to use the
previous grouping behavior. Also, one of the tests is unrelated to
GlobalMerge; change it accordingly.
While there, switch to r234666' flags rather than the brutal -O3.

Differential Revision: http://reviews.llvm.org/D8070

llvm-svn: 235249
2015-04-18 01:21:58 +00:00
Duncan P. N. Exon Smith 7c60f20e49 DebugInfo: Delete DIDescriptor (but not its subclasses)
Delete `DIDescriptor` and update the remaining users.  I'll follow-up by
deleting subclasses in manageable groups (top-down).

llvm-svn: 235248
2015-04-18 00:35:36 +00:00
Andrew Kaylor 761fb44efe Fix build wanrings and line endings
llvm-svn: 235241
2015-04-17 23:20:24 +00:00
Duncan P. N. Exon Smith ed557b55ee DebugInfo: Remove DIDescriptor from the DebugInfo API
Stop using `DIDescriptor` and its subclasses in the `DebugInfoFinder`
API, as well as the rest of the API hanging around in `DebugInfo.h`.

llvm-svn: 235240
2015-04-17 23:20:10 +00:00
Andrew Kaylor ea8df61d4d [WinEH] Fixes for a few cppeh failures.
Differential Review: http://reviews.llvm.org/D9065

llvm-svn: 235239
2015-04-17 23:05:43 +00:00
Duncan P. N. Exon Smith 364a3005f2 AsmPrinter: Create a unified .debug_loc stream
This commit removes `DebugLocList` and replaces it with
`DebugLocStream`.

  - `DebugLocEntry` no longer contains its byte/comment streams.
  - The `DebugLocEntry` list for a variable/inlined-at pair is allocated
    on the stack, and released right after `DebugLocEntry::finalize()`
    (possible because of the refactoring in r231023).  Now, only one
    list is in memory at a time now.
  - There's a single unified stream for the `.debug_loc` section that
    persists, stored in the new `DebugLocStream` data structure.

The last point is important: this collapses the nested `SmallVector<>`s
from `DebugLocList` into unified streams.  We previously had something
like the following:

    vec<tuple<Label, CU,
              vec<tuple<BeginSym, EndSym,
                        vec<Value>,
                        vec<char>,
                        vec<string>>>>>

A `SmallVector` can avoid allocations, but is statically fairly large
for a vector: three pointers plus the size of the small storage, which
is the number of elements in small mode times the element size).
Nesting these is expensive, since an inner vector's size contributes to
the element size of an outer one.  (Nesting any vector is expensive...)

In the old data structure, the outer vector's *element* size was 632B,
excluding allocation costs for when the middle and inner vectors
exceeded their small sizes.  312B of this was for the "three" pointers
in the vector-tree beneath it.  If you assume 1M functions with an
average of 10 variable/inlined-at pairs each (in an LTO scenario),
that's almost 6GB (besides inner allocations), with almost 3GB for the
"three" pointers.

This came up in a heap profile a little while ago of a `clang -flto -g`
bootstrap, with `DwarfDebug::collectVariableInfo()` using something like
10-15% of the total memory.

With this commit, we have:

    tuple<vec<tuple<Label, CU, Offset>>,
          vec<tuple<BeginSym, EndSym, Offset, Offset>>,
          vec<char>,
          vec<string>>

The offsets are used to create `ArrayRef` slices of adjacent
`SmallVector`s.  This reduces the number of vectors to four (unrelated
to the number of variable/inlined-at pairs), and caps the number of
allocations at the same number.

Besides saving memory and limiting allocations, this is NFC.

I don't know my way around this code very well yet, but I wonder if we
could go further: why stream to a side-table, instead of directly to the
output stream?

llvm-svn: 235229
2015-04-17 21:34:47 +00:00
Duncan P. N. Exon Smith 237662429d Remove dead code, NFC
llvm-svn: 235225
2015-04-17 21:06:49 +00:00
David Majnemer dcd89368cb [WinEH] Reusing HandlerType entries leads to small CatchHigh values
CatchHigh may be smaller than TryHigh if we reuse an outlined catch
handler for two different invokes with different EH states.  We have no
evidence which shows that CatchHigh must be greater than TryHigh or
TryLow.  We can revisit this if we turn out to be wrong.

llvm-svn: 235223
2015-04-17 20:12:09 +00:00
Pirama Arumuga Nainar 50604a69e9 Fix build errors introduced by r235215
Summary:
- Handle TypePromoteFloat in switch statements
- Move an expression into an assert to avoid unused variable in
  non-assert builds.

Reviewers: srhines, ab

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9086

llvm-svn: 235220
2015-04-17 19:51:44 +00:00
Pirama Arumuga Nainar db7c07e2bf Add support to promote f16 to f32
Summary:
This patch adds legalization support to operate on FP16 as a load/store type
and do operations on it as floats.

Tests for ARM are added to test/CodeGen/ARM/fp16-promote.ll

Reviewers: srhines, t.p.northover

Differential Revision: http://reviews.llvm.org/D8755

llvm-svn: 235215
2015-04-17 18:36:25 +00:00
David Majnemer 2be05eef31 [WinEH] Allow CatchHigh to be equal to TryHigh
Catch blocks which are empty may be in the same state as their try
blocks.  It is not meaningful to give the catch block its own state
number in this case because it can't do anything exceptional.

llvm-svn: 235212
2015-04-17 17:20:30 +00:00
Duncan P. N. Exon Smith c0f7dd72b7 AsmPrinter: Store MDExpression directly instead of MDNode, NFC
Clean up `DebugLocEntry::Value::Expression`'s type while I'm messing
around in here anyway.

llvm-svn: 235203
2015-04-17 16:36:10 +00:00
Duncan P. N. Exon Smith 546c8be967 AsmPrinter: Stop storing MDLocalVariable in DebugLocEntry
Stop storing the `MDLocalVariable` in the `DebugLocEntry::Value`s.  We
generate the list of `DebugLocEntry`s separately for each
variable/inlined-at pair, so the variable never actually changes here.

This is effectively NFC (aside from saving some memory and CPU time).

llvm-svn: 235202
2015-04-17 16:33:37 +00:00
Duncan P. N. Exon Smith fba25d6e9b AsmPrinter: Calculate type upfront for location lists, NFC
We can calculate the variable type up front before calling
`DebugLocEntry::finalize()`.  In fact, since we only care about the type
if it's an `MDBasicType`, don't even bother resolving it using the type
identifier map.

llvm-svn: 235201
2015-04-17 16:28:58 +00:00
James Molloy a4ff7b2713 Fix TRUNCATE splitting helper logic.
This is a followon to r233681 - I'd misunderstood the semantics of FTRUNC,
and had confused it with (FP_ROUND ..., 0).

Thanks for Ahmed Bougacha for his post-commit review!

llvm-svn: 235191
2015-04-17 13:51:40 +00:00
Nico Weber a762fa6c98 Revert r235154-r235156, they cause asserts when building win64 code (http://crbug.com/477988)
llvm-svn: 235170
2015-04-17 09:10:43 +00:00
Reid Kleckner 69afb1f8ef Fix unused variable warning
llvm-svn: 235155
2015-04-17 01:03:30 +00:00
Reid Kleckner d4523e3c51 [SEH] Reimplement x64 SEH using WinEHPrepare
This now emits simple, unoptimized xdata tables for __C_specific_handler
based on the handlers listed in @llvm.eh.actions calls produced by
WinEHPrepare.

This adds support for running __finally blocks when exceptions are
thrown, and removes the old landingpad fan-in codepath.

I ran some manual execution tests on small basic test cases with and
without optimization, as well as on Chrome base_unittests, which uses a
small amount of SEH.  I'm sure there are bugs, and we may need to
revert.

llvm-svn: 235154
2015-04-17 01:01:27 +00:00
Duncan P. N. Exon Smith 7bb480dbc2 DebugInfo: Fix UserValue::match() in LiveDebugVariables after r235050
r235050 dropped the inlined-at field from `MDLocalVariable`, deferring
to the `!dbg` attachments.  Fix `UserValue` to take the `!dbg` into
account when differentiating between variables.

llvm-svn: 235140
2015-04-16 22:27:54 +00:00
Duncan P. N. Exon Smith 9f25633170 AsmPrinter: Remove dead code, NFC
llvm-svn: 235139
2015-04-16 22:14:20 +00:00