Commit Graph

20318 Commits

Author SHA1 Message Date
Sanjoy Das ac53dc7520 [StatepointLowering] Don't do two DenseMap lookups; nfci
llvm-svn: 264130
2016-03-23 02:24:15 +00:00
Sanjoy Das 7edbef316b [StatepointLowering] Minor NFC cleanups
- Use auto
 - Name variables in LLVM style
 - Use llvm::find instead of std::find
 - Blank lines between declarations

llvm-svn: 264129
2016-03-23 02:24:13 +00:00
Sanjoy Das 4cd746ebe0 [StatepointLowering] Minor nfc refactoring
Now that StatepointLoweringInfo represents base pointers, derived
pointers and gc relocates as SmallVectors and not ArrayRefs, we no
longer need to allocate "backing storage" on stack in LowerStatepoint.
So elide the backing storage, and inline the trivial body of
getIncomingStatepointGCValues.

llvm-svn: 264128
2016-03-23 02:24:10 +00:00
Sanjoy Das e58ca59cf4 [StatepointLowering] Schedule gc relocates before uniqueing them
Otherwise we can see an "unexpected" gc.relocate that we uniqued away.

llvm-svn: 264127
2016-03-23 02:24:07 +00:00
George Burgess IV d4febd1612 Keep CodeGenPrepare from preserving the domtree.
CGP modifies the domtree in some cases, so saying that it preserves the
domtree is a lie. We'll be able to selectively preserve it with the new
pass manager.

Differential Revision: http://reviews.llvm.org/D16893

llvm-svn: 264099
2016-03-22 21:25:08 +00:00
Simon Pilgrim c6f5fe3d69 [SelectionDAG] Ensure constant folded legalized vector element types are compatible with the BUILD_VECTOR type
Found during fuzz testing - 32-bit x86 targets were legalizing a <2 x i1> compare result to <2 x i32> when <2 x i64> was expected.

llvm-svn: 264085
2016-03-22 19:59:53 +00:00
Tim Northover b49a8a9dbb CodeGen: check return types match when emitting tail call to builtin.
We were just completely ignoring the types when determining whether we could
safely emit a libcall as a tail call. This is clearly wrong.

Theoretically, we could dig deeper looking for incidental matches (much like
the generic code in Analysis.cpp does), but it's probably not worth it for the
few libcalls that exist.

llvm-svn: 264084
2016-03-22 19:14:38 +00:00
Sanjoy Das eb5037cadc Allow lowering call sites with both funclets and deopt state
Lowering funclets is a no-op, so we can just go ahead and lower the
deopt state.

llvm-svn: 264078
2016-03-22 18:10:39 +00:00
Sanjoy Das 6b535630a1 Add a hasOperandBundlesOtherThan helper, and use it; NFC
llvm-svn: 264072
2016-03-22 17:51:25 +00:00
Simon Pilgrim 25fb4177fb [X86][SSE] Reapplied: Simplify vector LOAD + EXTEND on pre-SSE41 hardware
Improve vector extension of vectors on hardware without dedicated VSEXT/VZEXT instructions.

We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG but can further improve this by using the legalizer instead of prematurely splitting into legal vectors in the combine as this only properly helps for lowering to VSEXT/VZEXT.

Removes a lot of unnecessary any_extend + mask pattern - (Fix for PR25718).

Reapplied with a fix for PR26953 (missing vector widening legalization).

Differential Revision: http://reviews.llvm.org/D17932

llvm-svn: 264062
2016-03-22 16:22:08 +00:00
Sanjoy Das 38bfc22161 Add "first class" lowering for deopt operand bundles
Summary:
After this change, deopt operand bundles can be lowered directly by
SelectionDAG into STATEPOINT instructions (which are then lowered to a
call or sequence of nop, with an associated __llvm_stackmaps entry0.
This obviates the need to round-trip deoptimization state through
gc.statepoint via RewriteStatepointsForGC.

Reviewers: reames, atrick, majnemer, JosephTremoulet, pgavlin

Subscribers: sanjoy, mcrosier, majnemer, llvm-commits

Differential Revision: http://reviews.llvm.org/D18257

llvm-svn: 264015
2016-03-22 00:59:13 +00:00
Silviu Baranga 46030585b3 [DAGCombine] Catch the case where extract_vector_elt can cause an any_ext while processing AND SDNodes
Summary:
extract_vector_elt can cause an implicit any_ext if the types don't
match. When processing the following pattern:

  (and (extract_vector_elt (load ([non_ext|any_ext|zero_ext] V))), c)

DAGCombine was ignoring the possible extend, and sometimes removing
the AND even though it was required to maintain some of the bits
in the result to 0, resulting in a miscompile.

This change fixes the issue by limiting the transformation only to
cases where the extract_vector_elt doesn't perform the implicit
extend.

Reviewers: t.p.northover, jmolloy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18247

llvm-svn: 263935
2016-03-21 11:43:46 +00:00
Craig Topper ea87eae4ca Suppress a -Wunused-variable warning in release builds.
llvm-svn: 263892
2016-03-20 01:17:54 +00:00
Saleem Abdulrasool 2854666263 CodeGen: use range based for loop
Convert a loop to use a range based style loop.  NFC.

llvm-svn: 263884
2016-03-19 16:35:32 +00:00
Manman Ren 2828c57b6f [CXX_FAST_TLS] fix issues with O0 on ARM, AArch64 and X86.
Since at O0, explicit copies via SplitCSR may not be removed even if
they are unnecessary, we choose not to use SplitCSR at O0.

llvm-svn: 263855
2016-03-18 23:38:49 +00:00
Matthias Braun 0d208fc9f6 MILexer: Add ErrorCallbackType typedef; NFC
llvm-svn: 263829
2016-03-18 20:41:11 +00:00
Reid Kleckner fbd7787d7e [codeview] Only emit function ids for inlined functions
We aren't referencing any other kind of function currently.
Should save a bit on our debug info size.

llvm-svn: 263817
2016-03-18 18:54:32 +00:00
Peter Collingbourne a1f8625662 DebugInfo: Add ability to not emit DW_AT_vtable_elem_location for virtual functions.
A virtual index of -1u indicates that the subprogram's virtual index is
unrepresentable (for example, when using the relative vtable ABI), so do
not emit a DW_AT_vtable_elem_location attribute for it.

Differential Revision: http://reviews.llvm.org/D18236

llvm-svn: 263765
2016-03-17 23:58:03 +00:00
Sanjoy Das 3a02019fbc [SelectionDAG] Remove visitStatepoint; NFC
This way we have a single entry point into StatepointLowering.  The
method was a direct dispatch to LowerStatepoint anyway.

llvm-svn: 263682
2016-03-17 00:47:14 +00:00
Sanjoy Das 43e33d61c6 Fix indentation; NFC
llvm-svn: 263672
2016-03-16 23:11:21 +00:00
Sanjoy Das 70697ff74d Extract out a SelectionDAGBuilder::LowerAsStatepoint; NFC
Summary:
This is a step towards implementing "direct" lowering of calls and
invokes with deopt operand bundles into STATEPOINT nodes (as opposed to
having them mandatorily pass through RewriteStatepointsForGC, which is
the case today).

This change extracts out a `SelectionDAGBuilder::LowerAsStatepoint`
helper function that is able to lower a "statepoint like thing", and
uses it to lower `gc.statepoint` calls.  This is an NFC now, but in a
later change we will use `LowerAsStatepoint` to directly lower calls and
invokes with operand bundles without going through an intermediate
`gc.statepoint` IR representation.

FYI: I expect `SelectionDAGBuilder::StatepointInfo` will evolve as I add
support for lowering non gc.statepoints, right now it is fairly tightly
coupled with an IR level `gc.statepoint`.

Reviewers: reames, pgavlin, JosephTremoulet

Subscribers: sanjoy, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18106

llvm-svn: 263671
2016-03-16 23:08:00 +00:00
James Y Knight f44fc5219f Tweak some atomics functions in preparation for larger changes; NFC.
- Rename getATOMIC to getSYNC, as llvm will soon be able to emit both
  '__sync' libcalls and '__atomic' libcalls, and this function is for
  the '__sync' ones.

- getInsertFencesForAtomic() has been replaced with
  shouldInsertFencesForAtomic(Instruction), so that the decision can be
  made per-instruction. This functionality will be used soon.

- emitLeadingFence/emitTrailingFence are no longer called if
  shouldInsertFencesForAtomic returns false, and thus don't need to
  check the condition themselves.

llvm-svn: 263665
2016-03-16 22:12:04 +00:00
Sanjoy Das 19c6159833 [SelectionDAG] Extract out populateCallLoweringInfo; NFC
SelectionDAGBuilder::populateCallLoweringInfo is now used instead of
SelectionDAGBuilder::lowerCallOperands.  The populateCallLoweringInfo
interface is more composable in face of design changes like
http://reviews.llvm.org/D18106

llvm-svn: 263663
2016-03-16 20:49:31 +00:00
Simon Pilgrim b5a20f0fec Removed trailing whitespace
llvm-svn: 263650
2016-03-16 18:37:44 +00:00
Lang Hames 1b640e05ba [MachO] Add MachO alt-entry directive support.
This patch adds support for the MachO .alt_entry assembly directive, and uses
it for global aliases with non-zero GEP offsets. The alt_entry flag indicates
that a symbol should be layed out immediately after the preceding symbol.
Conceptually it introduces an alternate entry point for a function or data
structure. E.g.:

safe_foo:
  // check preconditions for foo
.alt_entry fast_foo
fast_foo:
  // body of foo, can assume preconditions.

The .alt_entry flag is also implicitly set on assembly aliases of the form:

a = b + C

where C is a non-zero constant, since these have the same effect as an
alt_entry symbol: they introduce a label that cannot be moved relative to the
preceding one. Setting the alt_entry flag on aliases of this form fixes
http://llvm.org/PR25381.

llvm-svn: 263521
2016-03-15 01:43:05 +00:00
Sanjoy Das c11460e051 [StatepointLowering] Move an assertion; NFCI
Instead of running an explicit loop over `gc.relocate` calls hanging off
of a `gc.statepoint`, assert the validity of the type of the value being
relocated in `visitRelocate`.

llvm-svn: 263516
2016-03-15 01:16:31 +00:00
Eric Christopher da8b3f1914 Temporarily Revert "[X86][SSE] Simplify vector LOAD + EXTEND on
pre-SSE41 hardware" as it seems to be causing crashes during code
generation in halide. PR forthcoming.

This reverts commit r263303.

llvm-svn: 263512
2016-03-14 23:59:57 +00:00
Amaury Sechet eae09c2c2a Factor out MachineBlockPlacement::fillWorkLists. NFC
Summary: There are places in MachineBlockPlacement where a worklist is filled in pretty much identical way. The code is duplicated. This refactor it so that the same code is used in both scenarii.

Reviewers: chandlerc, majnemer, rafael, MatzeB, escha, silvas

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18077

llvm-svn: 263495
2016-03-14 21:24:11 +00:00
Quentin Colombet 40ce25b68b [SpillPlacement] Fix a quadratic behavior in spill placement.
The bad behavior happens when we have a function with a long linear chain of
basic blocks, and have a live range spanning most of this chain, but with very
few uses.
Let say we have only 2 uses.
The Hopfield network is only seeded with two active blocks where the uses are,
and each iteration of the outer loop in `RAGreedy::growRegion()` only adds two
new nodes to the network due to the completely linear shape of the CFG.
Meanwhile, `SpillPlacer->iterate()` visits the whole set of discovered nodes,
which adds up to a quadratic algorithm.

This is an historical accident effect from r129188.

When the Hopfield network is expanding, most of the action is happening on the
frontier where new nodes are being added. The internal nodes in the network are
not likely to be flip-flopping much, or they will at least settle down very
quickly. This means that while `SpillPlacer->iterate()` is recomputing all the
nodes in the network, it is probably only the two frontier nodes that are
changing their output.

Instead of recomputing the whole network on each iteration, we can maintain a
SparseSet of nodes that need to be updated:

- `SpillPlacement::activate()` adds the node to the todo list.
- When a node changes value (i.e., `update()` returns true), its neighbors are
  added to the todo list.
- `SpillPlacement::iterate()` only updates the nodes in the list.

The result of Hopfield iterations is not necessarily exact. It should converge
to a local minimum, but there is no guarantee that it will find a global
minimum. It is possible that updating nodes in a different order will cause us
to switch to a different local minimum. In other words, this is not NFC, but
although I saw a few runtime improvements and regressions when I benchmarked
this change, those were side effects and actually the performance change is in
the noise as expected.

Huge thanks to Jakob Stoklund Olesen <stoklund@2pi.dk> for his feedbacks,
guidance and time for the review.

llvm-svn: 263460
2016-03-14 18:21:25 +00:00
Sanjay Patel 7506852709 [DAG] use !isUndef() ; NFCI
llvm-svn: 263453
2016-03-14 18:09:43 +00:00
Sanjay Patel 5719584129 [DAG] use isUndef() ; NFCI
llvm-svn: 263448
2016-03-14 17:28:46 +00:00
Benjamin Kramer 1082fa66a5 Revert "Recommitted r261633 "Supporting all entities declared in lexical scope in LLVM debug info." After fixing PR26715 at r263379."
This reverts commit r263424. Breaks self-host.

llvm-svn: 263437
2016-03-14 14:58:36 +00:00
Amjad Aboud ab0378b16c Recommitted r261633 "Supporting all entities declared in lexical scope in LLVM debug info."
After fixing PR26715 at r263379.

llvm-svn: 263424
2016-03-14 12:03:20 +00:00
David Majnemer b9456a5eb3 [CodeView] Consistently handle overly large symbol names
Overly large symbol names weren't correctly handled for leaf function
records.

llvm-svn: 263408
2016-03-14 05:15:09 +00:00
David Majnemer 1256125fb7 [CodeView] Truncate display names
Fundamentally, the length of a variable or function name is bound by the
maximum size of a record: 0xffff.  However, the name doesn't live in a
vacuum; other data is associated with the name, lowering the bound
further.

We would naively attempt to emit the name, causing us to assert because
the record would no-longer fit in 16-bits.  Instead, truncate the name
but preserve as much as we can.

While I have tested this locally, I've decided to not commit it due to
the test's size.

N.B.  While this behavior is undesirable, it is better than MSVC's
behavior.  They seem to truncate to ~4000 characters.

llvm-svn: 263378
2016-03-13 10:53:30 +00:00
Sanjoy Das ecf96c9516 Make gc relocates more strongly typed; NFC
Don't use a `Value *` where we can use a stronger `GCRelocateInst *`
type.

llvm-svn: 263327
2016-03-12 02:54:27 +00:00
Simon Pilgrim 33d57c7547 [X86][SSE] Simplify vector LOAD + EXTEND on pre-SSE41 hardware
Improve vector extension of vectors on hardware without dedicated VSEXT/VZEXT instructions.

We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG but can further improve this by using the legalizer instead of prematurely splitting into legal vectors in the combine as this only properly helps for lowering to VSEXT/VZEXT.

Removes a lot of unnecessary any_extend + mask pattern - (Fix for PR25718).

Differential Revision: http://reviews.llvm.org/D17932

llvm-svn: 263303
2016-03-11 22:18:05 +00:00
Quentin Colombet dd4b137364 [IRTranslator] Translate unconditional branches.
llvm-svn: 263265
2016-03-11 17:28:03 +00:00
Quentin Colombet f9b4934d1d [MachineIRBuilder] Rework buildInstr API to maximize code reuse.
llvm-svn: 263264
2016-03-11 17:27:58 +00:00
Quentin Colombet e225e2541b [IRTranslator] Update getOrCreateVReg API to use references.
A value that we want to keep in a virtual register cannot be null.
Reflect that in the API.

llvm-svn: 263263
2016-03-11 17:27:54 +00:00
Quentin Colombet 000b580b13 [MachineIRBuilder] Rename the setter of MF for consistency with the getter.
llvm-svn: 263262
2016-03-11 17:27:51 +00:00
Quentin Colombet 91ebd71e26 [MachineIRBuilder] Rename the setter for MBB for consistency with the getter.
llvm-svn: 263261
2016-03-11 17:27:47 +00:00
Quentin Colombet 53237a9e64 [IRTranslator] Update getOrCreateBB API to use references.
A null basic block is invalid, so just pass a reference.

llvm-svn: 263260
2016-03-11 17:27:43 +00:00
Chad Rosier ac216fd9d5 [misched] Fix a truncation issue from r263021.
The truncation was causing the sorting algorithm to behave oddly when comparing
positive and negative offsets.  Fortunately, this doesn't currently happen in
practice and was exposed by a WIP.  Thus, I can't test this change now, but the
follow on patch will.

llvm-svn: 263255
2016-03-11 16:54:07 +00:00
Junmo Park 6098cbbd2c Minor code cleanups. NFC.
llvm-svn: 263200
2016-03-11 07:05:32 +00:00
Junmo Park 4ba6cf69e4 Minor code cleanup. NFC.
llvm-svn: 263196
2016-03-11 05:07:07 +00:00
Pete Cooper adebb9379a Remove llvm::getDISubprogram in favor of Function::getSubprogram
llvm::getDISubprogram walks the instructions in a function, looking for one in the scope of the current function, so that it can find the !dbg entry for the subprogram itself.

Now that !dbg is attached to functions, this should not be necessary. This patch changes all uses to just query the subprogram directly on the function.

Ideally this should be NFC, but in reality its possible that a function:

has no !dbg (in which case there's likely a bug somewhere in an opt pass), or
that none of the instructions had a scope referencing the function, so we used to not find the !dbg on the function but now we will

Reviewed by Duncan Exon Smith.

Differential Revision: http://reviews.llvm.org/D18074

llvm-svn: 263184
2016-03-11 02:14:16 +00:00
Marianne Mailhot-Sarrasin eddc5b130e Test commit access
llvm-svn: 263165
2016-03-10 21:54:25 +00:00
Simon Pilgrim 61eb49e437 [X86][SSE] Reapplied: Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG
Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns.

Reapplied with a fix for PR26870 (avoid premature use of TargetConstant in ZERO_EXTEND_VECTOR_INREG expansion).

Differential Revision: http://reviews.llvm.org/D17691

llvm-svn: 263159
2016-03-10 20:40:26 +00:00
Chandler Carruth 61440d225b [PM] Port memdep to the new pass manager.
This is a fairly straightforward port to the new pass manager with one
exception. It removes a very questionable use of releaseMemory() in
the old pass to invalidate its caches between runs on a function.
I don't think this is really guaranteed to be safe. I've just used the
more direct port to the new PM to address this by nuking the results
object each time the pass runs. While this could cause some minor malloc
traffic increase, I don't expect the compile time performance hit to be
noticable, and it makes the correctness and other aspects of the pass
much easier to reason about. In some cases, it may make things faster by
making the sets and maps smaller with better locality. Indeed, the
measurements collected by Bruno (thanks!!!) show mostly compile time
improvements.

There is sadly very limited testing at this point as there are only two
tests of memdep, and both rely on GVN. I'll be porting GVN next and that
will exercise this heavily though.

Differential Revision: http://reviews.llvm.org/D17962

llvm-svn: 263082
2016-03-10 00:55:30 +00:00
Philip Reames ac115ed72f [CGP] Duplicate addressing computation in cold paths if required to sink addressing mode
This patch teaches CGP to duplicate addressing mode computations into cold paths (detected via explicit cold attribute on calls) if required to let addressing mode be safely sunk into the basic block containing each load and store.

In general, duplicating code into cold blocks may result in code growth, but should not effect performance. In this case, it's better to duplicate some code than to put extra pressure on the register allocator by making it keep the address through the entirely of the fast path.

This patch only handles addressing computations, but in principal, we could implement a more general cold cold scheduling heuristic which tries to reduce register pressure in the fast path by duplicating code into the cold path. Getting the profitability of the general case right seemed likely to be challenging, so I stuck to the existing case (addressing computation) we already had.

Differential Revision: http://reviews.llvm.org/D17652

llvm-svn: 263074
2016-03-09 23:13:12 +00:00
Tom Stellard 9f2e00de7b SelectionDAG: Fix a crash on inline asm when output register supports multiple types
Summary:
The code in SelectionDAG did not handle the case where the
register type and output types were different, but had the same size.

Reviewers: arsenm, echristo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17940

llvm-svn: 263022
2016-03-09 16:02:52 +00:00
Chad Rosier c27a18f39f [TII] Allow getMemOpBaseRegImmOfs() to accept negative offsets. NFC.
http://reviews.llvm.org/D17967

llvm-svn: 263021
2016-03-09 16:00:35 +00:00
Krzysztof Parzyszek cd99e364e3 Invoke DAG postprocessing in the post-RA scheduler
This was inadvertently omitted from r262774, which added the mutation
interface.

llvm-svn: 262939
2016-03-08 16:54:20 +00:00
Hans Wennborg e00b6e7249 Revert r262599 "[X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG"
This caused PR26870.

llvm-svn: 262935
2016-03-08 16:21:41 +00:00
Krzysztof Parzyszek 1a1d78b86f Add DAG mutation interface to the DFA packetizer
llvm-svn: 262930
2016-03-08 15:33:51 +00:00
Justin Bogner 671febc0f7 Re-apply "SelectionDAG: Store SDNode operands in an ArrayRecycler"
This re-applies r262886 with a fix for 32 bit platforms that have 8 byte
pointer alignment, effectively reverting r262892.

Original Message:

  Currently some SDNode operands are malloc'd, some are stored inline in
  subclasses of SDNode, and some are thrown into a BumpPtrAllocator.
  This scheme is complex, inconsistent, and makes refactoring SDNodes
  fairly difficult.

  Instead, we can allocate all of the operands using an ArrayRecycler
  that wraps a BumpPtrAllocator. This keeps the cache locality when
  iterating operands, improves locality when iterating SDNodes without
  looking at operands, and vastly simplifies the ownership semantics.

  It also means we stop overallocating SDNodes by 2-3x and will make it
  simpler to fix the rampant undefined behaviour we have in how we
  mutate SDNodes from one kind to another (See llvm.org/pr26808).

  This is NFC other than the changes in memory behaviour, and I ran some
  LNT tests to make sure this didn't hurt compile time. Not many tests
  changed: there were a couple of 1-2% regressions reported, but there
  were more improvements (of up to 4%) than regressions.

llvm-svn: 262902
2016-03-08 03:14:29 +00:00
Quentin Colombet 5e63e78ca9 [MIR] Change the token name for '<' and '>' to be consitent with the LLVM IR parser.
Thanks to Ahmed Bougacha for noticing!

llvm-svn: 262899
2016-03-08 02:00:43 +00:00
Quentin Colombet 39293d3aaa [GlobalISel] Introduce initializer method to support start/stop-after features.
llvm-svn: 262896
2016-03-08 01:38:55 +00:00
Quentin Colombet 050b211820 [MIR] Teach the parser/printer that generic virtual registers do not need a register class.
llvm-svn: 262893
2016-03-08 01:17:03 +00:00
Justin Bogner 7e6f09c28f Revert "SelectionDAG: Store SDNode operands in an ArrayRecycler"
Looks like the largest SDNode is different between 32 and 64 bit now,
so this is breaking 32 bit bots. Reverting while I figure out a fix.

This reverts r262886.

llvm-svn: 262892
2016-03-08 01:07:03 +00:00
Quentin Colombet 287c6bb571 [MIR] Teach the parser how to parse complex types of generic machine instructions.
By complex types, I mean aggregate or vector types.

llvm-svn: 262890
2016-03-08 00:57:31 +00:00
Justin Bogner 6543a9385f SelectionDAG: Store SDNode operands in an ArrayRecycler
Currently some SDNode operands are malloc'd, some are stored inline in
subclasses of SDNode, and some are thrown into a BumpPtrAllocator.
This scheme is complex, inconsistent, and makes refactoring SDNodes
fairly difficult.

Instead, we can allocate all of the operands using an ArrayRecycler
that wraps a BumpPtrAllocator. This keeps the cache locality when
iterating operands, improves locality when iterating SDNodes without
looking at operands, and vastly simplifies the ownership semantics.

It also means we stop overallocating SDNodes by 2-3x and will make it
simpler to fix the rampant undefined behaviour we have in how we
mutate SDNodes from one kind to another (See llvm.org/pr26808).

This is NFC other than the changes in memory behaviour, and I ran some
LNT tests to make sure this didn't hurt compile time. Not many tests
changed: there were a couple of 1-2% regressions reported, but there
were more improvements (of up to 4%) than regressions.

llvm-svn: 262886
2016-03-08 00:39:51 +00:00
Quentin Colombet d655483944 [MIR] Teach the printer how to print complex types for generic machine instructions.
Before this change, we would get the type definition in the middle
of the instruction.
E.g., %0(48) = G_ADD %struct_alias = type { i32, i16 } %edi, %edi

Now, we have just the expected type name:
%0(48) = G_ADD %struct_alias %edi, %edi

llvm-svn: 262885
2016-03-08 00:38:01 +00:00
Quentin Colombet 12350a8e13 [MIR] Print the type of generic machine instructions.
llvm-svn: 262880
2016-03-08 00:29:15 +00:00
Quentin Colombet 851996778f [MIR] Teach the mir parser about types on generic machine instructions.
llvm-svn: 262879
2016-03-08 00:20:48 +00:00
Quentin Colombet 41bea872dd [MachineInstr] Get rid of some GlobalISel ifdefs.
Now the type API is always available, but when global-isel is not
built the implementation does nothing.

Note: The implementation free of ifdefs is WIP and tracked here in PR26576.
llvm-svn: 262873
2016-03-07 22:47:23 +00:00
Quentin Colombet 4e14a497a3 [MIR] Teach the MIPrinter about size for generic virtual registers.
llvm-svn: 262867
2016-03-07 21:57:52 +00:00
Quentin Colombet 2a831fb826 [MIR] Teach the parser how to handle the size of generic virtual registers.
llvm-svn: 262862
2016-03-07 21:48:43 +00:00
Quentin Colombet 1bd7504ef3 [MachineRegisterInfo] Add a method to set the size of a virtual register a posteriori.
This is required for mir testing.

llvm-svn: 262861
2016-03-07 21:41:39 +00:00
Quentin Colombet 70a9670d80 [MachineRegisterInfo] Get rid of the global-isel ifdefs.
One additional pointer is not a big deal size-wise and it makes
the code much nicer!

llvm-svn: 262856
2016-03-07 21:22:09 +00:00
Matt Arsenault ceb2c06cbd DAGCombiner: Check legality before creating extract_vector_elt
Problem not hit by any in tree target.

llvm-svn: 262852
2016-03-07 21:10:09 +00:00
Craig Topper 267bdb2094 [CodeGen] Add space-optimized EmitMergeInputChains1_2 to the DAG isel matching tables. Shaves about 5100 bytes from the X86 matcher table. NFC
llvm-svn: 262815
2016-03-07 07:29:12 +00:00
Krzysztof Parzyszek 5c61d11a6d Add DAG mutation interface to the post-RA scheduler
Differential Revision: http://reviews.llvm.org/D17868

llvm-svn: 262774
2016-03-05 15:45:23 +00:00
Matthias Braun 4797ec95e4 RegisterCoalescer: Remap subregister lanemasks before exchanging operands
Rematerializing and merging into a bigger register class at the same
time, requires the subregister range lanemasks getting remapped to the
new register class.

This fixes http://llvm.org/PR26805

llvm-svn: 262768
2016-03-05 04:36:13 +00:00
Matthias Braun 8de09aa0c5 RegisterCoalescer: Need to check DstReg+SrcReg for missing undef flags
copy coalescing with enabled subregister liveness can reveal undef uses,
previously this was only checked for the SrcReg in updateRegDefsUses()
but we need to check DstReg as well.

llvm-svn: 262767
2016-03-05 04:36:10 +00:00
Matthias Braun 2cbfd9fff5 RegisterPressure: Small cleanup
llvm-svn: 262766
2016-03-05 04:36:08 +00:00
Michael Kuperstein b89f0fa2a2 [DAGCombine] Fix divrem combine not to assume div/rem type is simple.
The divrem combine assumed the type of the div/rem is simple, which isn't
necessarily true. This probably worked fine until r250825, since it only
saw legal types, but now breaks when it runs as a pre-type-legalization 
combine.

This fixes PR26835.

Differential Revision: http://reviews.llvm.org/D17878

llvm-svn: 262746
2016-03-04 21:23:29 +00:00
Renato Golin 175c6d6d95 [ARM] Merging 64-bit divmod lib calls into one
When div+rem calls on the same arguments are found, the ARM back-end merges the
two calls into one __aeabi_divmod call for up to 32-bits values. However,
for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't
merging the calls, and thus calling ldivmod twice and spilling the temporary
results, which generated pretty bad code.

This patch legalises 64-bit lib calls for divmod, so that now all the spilling
and the second call are gone. It also relaxes the DivRem combiner a bit on the
legal type check, since it was already checking for isLegalOrCustom on every
value, so the extra check for isTypeLegal was redundant.

Second attempt, creating TLI.isOperationCustom like isOperationExpand, to make
sure we only emit valid types or the ones that were explicitly marked as custom.
Now, passing check-all and test-suite on x86, ARM and AArch64.

This patch fixes PR17193 (and a long time FIXME in the tests).

llvm-svn: 262738
2016-03-04 19:19:36 +00:00
Teresa Johnson d84c7decb6 Change split code gen to use ThreadPool
Part of D15390.

llvm-svn: 262719
2016-03-04 15:39:13 +00:00
Benjamin Kramer 4dbf3371bb Make headers self-contained again.
llvm-svn: 262702
2016-03-04 10:49:30 +00:00
Simon Pilgrim 91dd0a796c [X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG
Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns.

Differential Revision: http://reviews.llvm.org/D17691

llvm-svn: 262599
2016-03-03 09:43:28 +00:00
Renato Golin 3d78271eac Revert "[ARM] Merging 64-bit divmod lib calls into one"
This reverts commit r262507, which broke some ARM buildbots.

llvm-svn: 262594
2016-03-03 08:57:44 +00:00
Junmo Park 6ba96fb431 [BranchFolding] Change function name related with merging MMOs. NFC
Summary:
Removing MMOs is not our prefer behavior any more.

Reviewers: mcrosier, reames
   
Differential Revision: http://reviews.llvm.org/D17668

llvm-svn: 262580
2016-03-03 03:57:20 +00:00
Philip Reames ae27b2380f [MBP] Renaming a confusing variable and add clarifying comments
Was discussed as part of http://reviews.llvm.org/D17830

llvm-svn: 262571
2016-03-03 00:58:43 +00:00
Philip Reames 23d933982a [MBP] Avoid placing random blocks between loop preheader and header
If we have a loop with a rarely taken path, we will prune that from the blocks which get added as part of the loop chain. The problem is that we weren't then recognizing the loop chain as schedulable when considering the preheader when forming the function chain. We'd then fall to various non-predecessors before finally scheduling the loop chain (as if the CFG was unnatural.) The net result was that there could be lots of garbage between a loop preheader and the loop, even though we could have directly fallen into the loop. It also meant we separated hot code with regions of colder code.

The particular reason for the rejection of the loop chain was that we were scanning predecessor of the header, seeing the backedge, believing that was a globally more important predecessor (true), but forgetting to account for the fact the backedge precessor was already part of the existing loop chain (oops!.

Differential Revision: http://reviews.llvm.org/D17830

llvm-svn: 262547
2016-03-03 00:01:42 +00:00
David Majnemer 1ef654024f [X86] Don't give catch objects a displacement of zero
Catch objects with a displacement of zero do not initialize a catch
object.  The displacement is relative to %rsp at the end of the
function's prologue for x86_64 targets.

If we place an object at the top-of-stack, we will end up wit a
displacement of zero resulting in our catch object remaining
uninitialized.

Address this by creating our catch objects as fixed objects.  We will
ensure that the UnwindHelp object is created after the catch objects so
that no catch object will have a displacement of zero.

Differential Revision: http://reviews.llvm.org/D17823

llvm-svn: 262546
2016-03-03 00:01:25 +00:00
Philip Reames 02e1132afb [MBP] Remove overly verbose debug output
llvm-svn: 262531
2016-03-02 22:40:51 +00:00
Philip Reames b9688f4382 [MBP] Adjust debug output to be more focused and approachable
llvm-svn: 262522
2016-03-02 21:45:13 +00:00
Renato Golin 93e42d9934 [ARM] Merging 64-bit divmod lib calls into one
When div+rem calls on the same arguments are found, the ARM back-end merges the
two calls into one __aeabi_divmod call for up to 32-bits values. However,
for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't
merging the calls, and thus calling ldivmod twice and spilling the temporary
results, which generated pretty bad code.

This patch legalises 64-bit lib calls for divmod, so that now all the spilling
and the second call are gone. It also relaxes the DivRem combiner a bit on the
legal type check, since it was already checking for isLegalOrCustom on every
value, so the extra check for isTypeLegal was redundant.

This patch fixes PR17193 (and a long time FIXME in the tests).

llvm-svn: 262507
2016-03-02 19:35:45 +00:00
Justin Bogner b2ecee9c31 SelectionDAG: Use correctly sized allocation functions for SDNodes
The placement new calls here were all calling the allocation function
in RecyclingAllocator/Recycler for SDNode, instead of the function for
the specific subclass we were constructing.

Since this particular allocator always overallocates it more or less
worked, but would hide what we're actually doing from any memory
tools. Also, if you tried to change this allocator so something like a
BumpPtrAllocator or MallocAllocator, the compiler would crash horribly
all the time.

Part of llvm.org/PR26808.

llvm-svn: 262500
2016-03-02 19:01:11 +00:00
Matt Arsenault 7d0a77b979 DAGCombiner: Make sure an integer is being truncated
llvm-svn: 262446
2016-03-02 01:36:51 +00:00
Matt Arsenault b36d462fac DAGCombiner: Turn truncate of a bitcasted vector to an extract
On AMDGPU where operations i64 operations are often bitcasted to v2i32
and back, this pattern shows up regularly where it breaks some
expected combines on i64, such as load width reducing.

This fixes some test failures in a future commit when i64 loads
are changed to promote.

llvm-svn: 262397
2016-03-01 21:31:53 +00:00
Vasileios Kalintiris 36901dd1c3 Revert "[mips] Promote the result of SETCC nodes to GPR width."
This reverts commit r262316.

It seems that my change breaks an out-of-tree chromium buildbot, so
I'm reverting this in order to investigate the situation further.

llvm-svn: 262387
2016-03-01 20:25:43 +00:00
Justin Lebar b5ca00a58d [NVPTX] Use different, convergent MIs for convergent calls.
Summary:
Calls sometimes need to be convergent.  This is already handled at the
LLVM IR level, but it also needs to be handled at the MI level.

Ideally we'd propagate convergence from instructions, down through the
selection DAG, and into MIs.  But this is Hard, and would affect
optimizations in the SDNs -- right now only SDNs with two operands have
any flags at all.

Instead, here's a much simpler hack: Add new opcodes for NVPTX for
convergent calls, and generate these when lowering convergent LLVM
calls.

Reviewers: jholewinski

Subscribers: jholewinski, chandlerc, joker.eph, jhen, tra, llvm-commits

Differential Revision: http://reviews.llvm.org/D17423

llvm-svn: 262373
2016-03-01 19:24:03 +00:00
Matt Arsenault 03dac8d8e4 DAGCombiner: Turn extract of bitcasted integer into truncate
This reduces the number of bitcast nodes and generally cleans up the
DAG when bitcasting between integers and vectors everywhere.

llvm-svn: 262358
2016-03-01 18:01:37 +00:00
Rafael Espindola 5cd721ae12 Refactor duplicated code for linking with pthread.
llvm-svn: 262344
2016-03-01 15:54:40 +00:00
Vasileios Kalintiris 3a8f7f9e31 [mips] Promote the result of SETCC nodes to GPR width.
Summary:
This patch modifies the existing comparison, branch, conditional-move
and select patterns, and adds new ones where needed. Also, the updated
SLT{u,i,iu} set of instructions generate a GPR width result.

The majority of the code changes in the Mips back-end fix the wrong
assumption that the result of SETCC nodes always produce an i32 value.
The changes in the common code path account for the fact that in 64-bit
MIPS targets, i1 is promoted to i32 instead of i64.

Reviewers: dsanders

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D10970

llvm-svn: 262316
2016-03-01 10:08:01 +00:00
Matt Arsenault a67c4916cf LegalizeDAG: Use correct ptr type when expanding unaligned load/store
This fixes regressions exposed in existing AMDGPU tests in a
future commit when all loads are custom lowered.

llvm-svn: 262299
2016-03-01 05:13:35 +00:00
David Majnemer cb305dea1c [WinEH] Allocate the registration node before the catch objects
The CatchObjOffset is relative to the end of the EH registration node
for 32-bit x86 WinEH targets.  A special sentinel value, 0, is used to
indicate that no catch object should be initialized.

This means that a catch object allocated immediately before the
registration node would be assigned a CatchObjOffset of 0, leading the
runtime to believe that a catch object should not be initialized.

To handle this, allocate the registration node prior to any other frame
object.  This will ensure that catch objects will not be allocated
before the registration node.

This fixes PR26757.

Differential Revision: http://reviews.llvm.org/D17689

llvm-svn: 262294
2016-03-01 04:30:16 +00:00
Adrian Prantl dba58fbdd9 Improve the debug output of DwarfDebug::buildLocationList().
llvm-svn: 262265
2016-02-29 22:28:22 +00:00
Adrian Prantl fb2add2be1 Fix PR26585 by improving the promotion of DBG_VALUEs to DW_AT_locations.
When a variable is described by a single DBG_VALUE instruction we can
often use a more efficient inline DW_AT_location instead of using a
location list.

This commit makes the heuristic that decides when to apply this
optimization stricter by also verifying that the DBG_VALUE is live at the
entry of the function (instead of just checking that it is valid until
the end of the function).

<rdar://problem/24611008>

llvm-svn: 262247
2016-02-29 19:49:46 +00:00
Adrian Prantl 693e8de0fa fix typo in comment
llvm-svn: 262236
2016-02-29 17:06:46 +00:00
Duncan P. N. Exon Smith ebcce78f65 CodeGen: Remove an iterator => pointer conversion, NFC
Part of PR26753.

llvm-svn: 262154
2016-02-27 20:27:44 +00:00
Duncan P. N. Exon Smith d6ebd07b8d CodeGen: Use MachineInstr& in InlineSpiller::rematerializeFor()
InlineSpiller::rematerializeFor() never uses its parameter as an
iterator, so take it by reference instead.  This removes an implicit
conversion from MachineBasicBlock::iterator to MachineInstr*.

llvm-svn: 262152
2016-02-27 20:23:14 +00:00
Duncan P. N. Exon Smith be8f8c4478 CodeGen: Update LiveIntervalAnalysis API to use MachineInstr&, NFC
These parameters aren't expected to be null, so take them by reference.

llvm-svn: 262151
2016-02-27 20:14:29 +00:00
Duncan P. N. Exon Smith fd8cc23220 CodeGen: Change MachineInstr to use MachineInstr&, NFC
Change MachineInstr API to prefer MachineInstr& over MachineInstr*
whenever the parameter is expected to be non-null.  Slowly inching
toward being able to fix PR26753.

llvm-svn: 262149
2016-02-27 20:01:33 +00:00
Matt Arsenault 982224cfb8 DAGCombiner: Don't unnecessarily swap operands in ReassociateOps
In the case where op = add, y = base_ptr, and x = offset, this
transform:

(op y, (op x, c1)) -> (op (op x, y), c1)

breaks the canonical form of add by putting the base pointer in the
second operand and the offset in the first.

This fix is important for the R600 target, because for some address
spaces the base pointer and the offset are stored in separate register
classes. The old pattern caused the ISel code for matching addressing
modes to put the base pointer and offset in the wrong register classes,
which required no-trivial code transformations to fix.

llvm-svn: 262148
2016-02-27 19:57:45 +00:00
Duncan P. N. Exon Smith d3a7467221 CodeGen: Use MachineInstr& in HashMachineInstr, NFC
Also update HashEndOfMBB to take MachineBasicBlock&.

llvm-svn: 262146
2016-02-27 19:48:01 +00:00
Duncan P. N. Exon Smith 5e6e8c7a0a CodeGen: Use MachineInstr& in AntiDepBreaker API, NFC
Take parameters as MachineInstr& instead of MachineInstr* in
AntiDepBreaker API, since these are required to be non-null.  No
functionality change intended.  Looking toward PR26753.

llvm-svn: 262145
2016-02-27 19:33:37 +00:00
Duncan P. N. Exon Smith bd529fbb4a CodeGen: Assert valid MI in AntiDepBreaker::UpdateDbgValue
This already assumes a valid MI, since it dereferences the MI in an
assertion before checking for null.  At an explicit assert.

llvm-svn: 262144
2016-02-27 19:23:34 +00:00
Duncan P. N. Exon Smith 5702287809 CodeGen: Update DFAPacketizer API to take MachineInstr&, NFC
In all but one case, change the DFAPacketizer API to take MachineInstr&
instead of MachineInstr*.  In DFAPacketizer::endPacket(), take
MachineBasicBlock::iterator.  Besides cleaning up the API, this is in
search of PR26753.

llvm-svn: 262142
2016-02-27 19:09:00 +00:00
Duncan P. N. Exon Smith f9ab416d70 WIP: CodeGen: Use MachineInstr& in MachineInstrBundle.h, NFC
Update APIs in MachineInstrBundle.h to take and return MachineInstr&
instead of MachineInstr* when the instruction cannot be null.  Besides
being a nice cleanup, this is tacking toward a fix for PR26753.

llvm-svn: 262141
2016-02-27 17:05:33 +00:00
Matt Arsenault 360d244d5b DAGCombiner: Relax sqrt NaN folding check
This is OK for +0 since compares to +/-0 give the same result.

llvm-svn: 262125
2016-02-27 09:38:05 +00:00
Duncan P. N. Exon Smith 3ac9cc6156 CodeGen: Take MachineInstr& in SlotIndexes and LiveIntervals, NFC
Take MachineInstr by reference instead of by pointer in SlotIndexes and
the SlotIndex wrappers in LiveIntervals.  The MachineInstrs here are
never null, so this cleans up the API a bit.  It also incidentally
removes a few implicit conversions from MachineInstrBundleIterator to
MachineInstr* (see PR26753).

At a couple of call sites it was convenient to convert to a range-based
for loop over MachineBasicBlock::instr_begin/instr_end, so I added
MachineBasicBlock::instrs.

llvm-svn: 262115
2016-02-27 06:40:41 +00:00
Junmo Park 272a2bc365 Minor code cleanup. NFC.
llvm-svn: 262096
2016-02-27 01:10:43 +00:00
Cong Hou e0eb8bfe37 Fix a bug in isVectorReductionOp() in SelectionDAGBuilder.cpp that may cause assertion failure on AArch64.
llvm-svn: 262091
2016-02-26 23:25:30 +00:00
Amaury Sechet b2055c53ba Fix warning in DwarfCFIException. NFC
llvm-svn: 262061
2016-02-26 20:49:07 +00:00
Amaury Sechet 7067ad3c27 Extract the method to begin and end a fragment in AsmPrinterHandler in their own method. NFC
Summary: This is extracted from D17555

Reviewers: davidxl, reames, sanjoy, MatzeB, pete

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17580

llvm-svn: 262058
2016-02-26 20:30:37 +00:00
Quentin Colombet 87e23e5733 [GlobalISel] Fix a ranlib warning about empty TOC.
Fixes PR26733

llvm-svn: 262057
2016-02-26 20:05:02 +00:00
Reid Kleckner 70c9bc71d4 [WinEH] Fix funclet return block clobber mask placement
MBB slot index intervals are half open, not closed. getMBBEndIndex()
returns the slot index of the start of the next block in layout order.
Placing a register mask there is incorrect if the successor of the
funclet return is not laid out after the return. Clang generates IR for
catch bodies before generating the following normal code, so we never
noticed this issue until the D frontend authors filed a bug about it.

Instead, we can put the clobber mask on the last instruction of the
funclet return block. We still aren't using a register mask operand on
the CATCHRET instruction because it would cause PEI to spill all CSRs,
including XMM regs, in the prologue.

Fixes PR26679.

llvm-svn: 262035
2016-02-26 16:53:19 +00:00
Matthias Braun 9dcd65f478 MachineCopyPropagation: Catch copies of the form A<-B;A<-B
Differential Revision: http://reviews.llvm.org/D17475

llvm-svn: 261966
2016-02-26 03:18:55 +00:00
Matthias Braun e39ff70685 MachineCopyPropagation: Keep scanning through instructions with regmasks
This also simplifies the code by removing the overly conservative
NoInterveningSideEffect() function. This function checked:
- That the two copies belong to the same block: We only process one
  block at a time and clear our maps in between it is impossible to find a
  copy from a different block.
- There is no terminator between the two copy instructions: This is not
  allowed anyway (the MachineVerifier would complain)
- Does not have instructions with hasUnmodeledSideEffects() or isCall()
  set: Even for those instructuction we must have all clobbers/defs of
  registers explicit as an operand. If the register is explicitely
  clobbered we would never come to the point of checking for
  NoInterveningSideEffect() anyway.

(I also checked this with a temporary build of the test-suite with all
 potentially failing conditions in NoInterveningSideEffect() turned into
 asserts)

Differential Revision: http://reviews.llvm.org/D17474

llvm-svn: 261965
2016-02-26 03:18:50 +00:00
Junmo Park 820e392601 Minor code cleanups. NFC.
llvm-svn: 261955
2016-02-26 02:07:36 +00:00
David Majnemer 08dd52dc75 [WinEH] Don't remove unannotated inline-asm calls
Inline-asm calls aren't annotated with funclet bundle operands because
they don't throw and cannot be inlined through.  We shouldn't require
them to bear an funclet bundle operand.

llvm-svn: 261942
2016-02-26 00:04:25 +00:00
Hongbin Zheng 751337faa7 Introduce DominanceFrontierAnalysis to the new PassManager to compute DominanceFrontier. NFC
Differential Revision: http://reviews.llvm.org/D17570

llvm-svn: 261903
2016-02-25 17:54:15 +00:00
Hongbin Zheng 3f97840721 Introduce analysis pass to compute PostDominators in the new pass manager. NFC
Differential Revision: http://reviews.llvm.org/D17537

llvm-svn: 261902
2016-02-25 17:54:07 +00:00
Hongbin Zheng 66b19fbc4e Revert "Introduce analysis pass to compute PostDominators in the new pass manager. NFC"
This reverts commit a3e5cc6a51ab5ad88d1760c63284294a4e34c018.

llvm-svn: 261891
2016-02-25 16:45:53 +00:00
Hongbin Zheng ad782ce3f7 Revert "Introduce DominanceFrontierAnalysis to the new PassManager to compute DominanceFrontier. NFC"
This reverts commit 109c38b2226a87b0be73fa7a0a8c1a81df20aeb2.

llvm-svn: 261890
2016-02-25 16:45:46 +00:00
Hongbin Zheng 237197ba63 Introduce DominanceFrontierAnalysis to the new PassManager to compute DominanceFrontier. NFC
Differential Revision: http://reviews.llvm.org/D17570

llvm-svn: 261883
2016-02-25 16:33:15 +00:00
Hongbin Zheng a0273a04f5 Introduce analysis pass to compute PostDominators in the new pass manager. NFC
Differential Revision: http://reviews.llvm.org/D17537

llvm-svn: 261882
2016-02-25 16:33:06 +00:00
Junmo Park 161dc1c605 [CodeGenPrepare] Remove load-based heuristic
Summary:
Both the hardware and LLVM have changed since 2012.
Now, load-based heuristic don't show big differences any more on OoO cores.

There is no notable regressons and improvements on spec2000/2006. (Cortex-A57, Core i5).

Reviewers: spatel, zansari
    
Differential Revision: http://reviews.llvm.org/D16836

llvm-svn: 261809
2016-02-25 00:23:27 +00:00
Cong Hou 4ce0280a41 Detecte vector reduction operations just before instruction selection.
(This is the second attemp to commit this patch, after fixing pr26652 & pr26653).

This patch detects vector reductions before instruction selection. Vector
reductions are vectorized reduction operations, and for such operations we have
freedom to reorganize the elements of the result as long as the reduction of them
stay unchanged. This will enable some reduction pattern recognition during
instruction combine such as SAD/dot-product on X86. A flag is added to
SDNodeFlags to mark those vector reduction nodes to be checked during instruction
combine.

To detect those vector reductions, we search def-use chains starting from the
given instruction, and check if all uses fall into two categories:

1. Reduction with another vector.
2. Reduction on all elements.

in which 2 is detected by recognizing the pattern that the loop vectorizer
generates to reduce all elements in the vector outside of the loop, which
includes several ShuffleVector and one ExtractElement instructions.


Differential revision: http://reviews.llvm.org/D15250

llvm-svn: 261804
2016-02-24 23:40:36 +00:00
Matthias Braun aca625a4fe MachineInstr: Respect register aliases in clearRegiserKills()
This fixes bugs in copy elimination code in llvm. It slightly changes the
semantics of clearRegisterKills(). This is appropriate because:
- Users in lib/CodeGen/MachineCopyPropagation.cpp and
  lib/Target/AArch64RedundantCopyElimination.cpp and
  lib/Target/SystemZ/SystemZElimCompare.cpp are incorrect without it
  (see included testcase).
- All other users in llvm are unaffected (they pass TRI==nullptr)
- (Kill flags are optional anyway so removing too many shouldn't hurt.)

Differential Revision: http://reviews.llvm.org/D17554

llvm-svn: 261763
2016-02-24 19:21:48 +00:00
Artur Pilipenko 31bcca47d3 NFC. Move isDereferenceable to Loads.h/cpp
This is a part of the refactoring to unify isSafeToLoadUnconditionally and isDereferenceablePointer functions. In subsequent change I'm going to eliminate isDerferenceableAndAlignedPointer from Loads API, leaving isSafeToLoadSpecualtively the only function to check is load instruction can be speculated.   

Reviewed By: hfinkel

Differential Revision: http://reviews.llvm.org/D16180

llvm-svn: 261736
2016-02-24 12:49:04 +00:00
Hans Wennborg d3661cd140 Revert r261633 "Supporting all entities declared in lexical scope in LLVM debug info."
This and the corresponding Clang change caused PR26715.

llvm-svn: 261671
2016-02-23 19:17:03 +00:00
Amjad Aboud fc8f296782 Supporting all entities declared in lexical scope in LLVM debug info.
Differential Revision: http://reviews.llvm.org/D15976

llvm-svn: 261633
2016-02-23 13:36:51 +00:00
David Majnemer 17525aba8a [WinEH] Visit 'unwind to caller' catchswitches nested in catchswitches
We had the right logic for the nested cleanuppad case but omitted it for
catchswitches.

llvm-svn: 261615
2016-02-23 07:18:15 +00:00
Dehao Chen f84b630044 Add prefix based function layout when profile is available.
Summary: If a function is hot, put it in text.hot section.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17532

llvm-svn: 261607
2016-02-23 03:39:24 +00:00
Duncan P. N. Exon Smith 6307eb5518 CodeGen: TII: Take MachineInstr& in predicate API, NFC
Change TargetInstrInfo API to take `MachineInstr&` instead of
`MachineInstr*` in the functions related to predicated instructions
(I'll try to come back later and get some of the rest).  All of these
functions require non-null parameters already, so references are more
clear.  As a bonus, this happens to factor away a host of implicit
iterator => pointer conversions.

No functionality change intended.

llvm-svn: 261605
2016-02-23 02:46:52 +00:00
Duncan P. N. Exon Smith b3613fce19 Revert "Add prefix based function layout when profile is available."
This reverts commit r261582, since this bot has been broken for four
hours:
http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/19399/

llvm-svn: 261604
2016-02-23 02:28:40 +00:00
Dehao Chen 25527b6c3f Include ProfileData as CodeGen's required library.
Summary: Fixing buildbot failure introduced by http://reviews.llvm.org/D17460

Reviewers: davidxl, hans

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17524

llvm-svn: 261588
2016-02-22 22:54:14 +00:00
David Majnemer 964b70d559 [X86] Create mergeable constant pool entries for AVX
We supported creating mergeable constant pool entries for smaller
constants but not for 32-byte AVX constants.

llvm-svn: 261584
2016-02-22 22:23:11 +00:00
Dehao Chen c5f76f7347 Add prefix based function layout when profile is available.
Summary: If a function is hot, put it in text.hot section.

Reviewers: davidxl

Subscribers: eraman, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D17460

llvm-svn: 261582
2016-02-22 22:14:14 +00:00
Matt Arsenault 0c6bd7b0d3 SelectionDAG: Use correct addrspace when lowering memcpy
This was causing assertions later from using the wrong pointer
size with LDS operations. getOptimalMemOpType should also have
address space arguments later.

This avoids assertions in existing tests exposed by
a future commit.

llvm-svn: 261580
2016-02-22 22:01:42 +00:00
Tim Northover d32f8e60bf ARM: sink atomic release barrier as far as possible into cmpxchg.
DMB instructions can be expensive, so it's best to avoid them if possible. In
atomicrmw operations there will always be an attempted store so a release
barrier is always needed, but in the cmpxchg case we can delay the DMB until we
know we'll definitely try to perform a store (and so need release semantics).

In the strong cmpxchg case this isn't quite free: we must duplicate the LDREX
instructions to skip the barrier on subsequent iterations. The basic outline
becomes:

        ldrex rOld, [rAddr]
        cmp rOld, rDesired
        bne Ldone
        dmb
    Lloop:
        strex rRes, rNew, [rAddr]
        cbz rRes Ldone
        ldrex rOld, [rAddr]
        cmp rOld, rDesired
        beq Lloop
    Ldone:

So we'll skip this version for strong operations in "minsize" functions.

llvm-svn: 261568
2016-02-22 20:55:50 +00:00
Duncan P. N. Exon Smith c5b668deb8 Revert "CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC"
This reverts commit r261504, since it's not obvious the new name is
better:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160222/334298.html

I'll recommit if we get consensus that it's the right direction.

llvm-svn: 261567
2016-02-22 20:49:58 +00:00
Justin Lebar 46123a8891 Revert "[ifcnv] Add comment explaining why it's OK to duplicate convergent MIs in ifcnv."
This reverts r261543.  Accidental commit (not LGTM'ed).

llvm-svn: 261547
2016-02-22 18:17:27 +00:00
Justin Lebar f62b165a04 [ifcnv] Add comment explaining why it's OK to duplicate convergent MIs in ifcnv.
Summary:
Also add a comment briefly explaining what ifcnv is.

No functional changes.

Reviewers: resistor

Subscribers: echristo, tra, llvm-commits

Differential Revision: http://reviews.llvm.org/D17430

llvm-svn: 261543
2016-02-22 17:51:30 +00:00
Justin Lebar 3a7bc57e63 [ifcnv] Use unique_ptr in IfConversion. NFC
Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17466

llvm-svn: 261541
2016-02-22 17:51:28 +00:00
Justin Lebar 5b82c9ba31 Don't tail-duplicate blocks that contain convergent instructions.
Summary:
Convergent instrs shouldn't be made control-dependent on other values,
but this is basically the whole point of tail duplication.  So just bail
if we see a convergent instruction.

Reviewers: iteratee

Subscribers: jholewinski, jhen, hfinkel, tra, jingyue, llvm-commits

Differential Revision: http://reviews.llvm.org/D17320

llvm-svn: 261540
2016-02-22 17:50:52 +00:00
Duncan P. N. Exon Smith e59c8af705 Reapply "CodeGen: Use references in MachineTraceMetrics::Trace, NFC"
This reverts commit r261510, effectively reapplying r261509.  The
original commit missed a caller in AArch64ConditionalCompares.

Original commit message:

Pass non-null arguments by reference in MachineTraceMetrics::Trace,
simplifying future work to remove implicit iterator => pointer
conversions.

llvm-svn: 261511
2016-02-22 03:33:28 +00:00
Duncan P. N. Exon Smith 0cc90a9147 Revert "CodeGen: Use references in MachineTraceMetrics::Trace, NFC"
This reverts commit r261509.  I'm not sure how this compiled locally,
but something was out of whack.

llvm-svn: 261510
2016-02-22 03:12:42 +00:00
Duncan P. N. Exon Smith 83d3476fd2 CodeGen: Use references in MachineTraceMetrics::Trace, NFC
Pass non-null arguments by reference in MachineTraceMetrics::Trace,
simplifying future work to remove implicit iterator => pointer
conversions.

llvm-svn: 261509
2016-02-22 03:07:49 +00:00
Duncan P. N. Exon Smith 395bd9cd63 CodeGen: Explicitly convert from iterator to pointer, NFC
llvm-svn: 261508
2016-02-22 02:53:42 +00:00
Duncan P. N. Exon Smith dc0848c029 CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC
Delete MachineInstr::getIterator(), since the term "iterator" is
overloaded when talking about MachineInstr.

- Downcast to ilist_node in iplist::getNextNode() and getPrevNode() so
  that ilist_node::getIterator() is still available.
- Add it back as MachineInstr::getInstrIterator().  This matches the
  naming in MachineBasicBlock.
- Add MachineInstr::getBundleIterator().  This is explicitly called
  "bundle" (not matching MachineBasicBlock) to disintinguish it clearly
  from ilist_node::getIterator().
- Update all calls.  Some of these I switched to `auto` to remove
  boiler-plate, since the new name is clear about the type.

There was one call I updated that looked fishy, but it wasn't clear what
the right answer was.  This was in X86FrameLowering::inlineStackProbe(),
added in r252578 in lib/Target/X86/X86FrameLowering.cpp.  I opted to
leave the behaviour unchanged, but I'll reply to the original commit on
the list in a moment.

llvm-svn: 261504
2016-02-21 22:58:35 +00:00
Duncan P. N. Exon Smith e9bc579c37 ADT: Remove == and != comparisons between ilist iterators and pointers
I missed == and != when I removed implicit conversions between iterators
and pointers in r252380 since they were defined outside ilist_iterator.

Since they depend on getNodePtrUnchecked(), they indirectly rely on UB.
This commit removes all uses of these operators.  (I'll delete the
operators themselves in a separate commit so that it can be easily
reverted if necessary.)

There should be NFC here.

llvm-svn: 261498
2016-02-21 20:39:50 +00:00
Duncan P. N. Exon Smith a848c47130 ADT: Stop using getNodePtrUnchecked on end() iterators
Stop using `getNodePtrUnchecked()` when building IR.  Eventually a
dereference will be required to get at the downcast node, since the
iterator will only store an `ilist_node_base` of some sort.

This should have no functionality change for now, but is a path towards
removing some more UB from ilist.

llvm-svn: 261495
2016-02-21 19:52:15 +00:00
Duncan P. N. Exon Smith 7b269642d2 CodeGen: Avoid getNodePtrUnchecked() where we need a Value, NFC
`ilist_iterator<NodeTy>::getNodePtrUnchecked()` is documented as being
for internal use only, but CodeGenPrepare was using it anyway.  This
code relies on pulling out the `Value*` pointer even after the lifetime
of the iterator is over.  But having this pointer available in
ilist_iterator depends on UB in the first place.

Instead, safely pull out the `Value*` when the iterator is alive and
stop using the internal-only API.

There should be no functionality change here.

llvm-svn: 261493
2016-02-21 19:37:45 +00:00
David Majnemer a3ea407d48 [X86] Use the correct alignment for COMDAT constant pool entries
COFF doesn't have sections with mergeable contents.  Instead, each
constant pool entry ends up in a COMDAT section.  The linker, when
choosing between COMDAT sections, doesn't choose the max alignment of
the two sections.  You just get whatever alignment was on the section.

If one constant needed a higher alignment in one object file from
another one, then we will get into trouble if the linker chooses the
lower alignment one.

Instead, lets promote the alignment of the constant pool entry to make
sure we don't use an under aligned constant with an instruction which
assumed otherwise.

This fixes PR26680.

llvm-svn: 261462
2016-02-21 01:30:30 +00:00
Dan Gohman d1c5a3aa21 Don't scan for SSA register operands to update when not in SSA form.
TailDuplicate can run on either on SSA code or non-SSA code, as indicated to
it by MRI->isSSA() ("PreRegAlloc" here). TailDuplicate does extra work to
preserve SSA invariants when it duplicates code. This patch makes it skip
some of this extra work in the case where the code is not in SSA form.

llvm-svn: 261450
2016-02-20 21:28:18 +00:00
Simon Pilgrim c5199aae82 [DAGCombiner] Use getBitcast helper when possible. NFCI.
llvm-svn: 261437
2016-02-20 15:05:29 +00:00
Matthias Braun c65e904be8 MachineCopyPropagation: Introduce Reg2MIMap typedef; NFC
llvm-svn: 261408
2016-02-20 03:56:41 +00:00
Matthias Braun bd18d751de MachineCopyPropagation: Move variables from function to pass
This avoids unnecessarily passing them around when calling helper
functions. It may also be slightly faster to call clear() on the
datastructures instead of freshly initializing them for each block.

llvm-svn: 261407
2016-02-20 03:56:39 +00:00
Matthias Braun 273575dcbe MachineCopyPropagation: Use ranged for, cleanup; NFC
llvm-svn: 261406
2016-02-20 03:56:36 +00:00
Matthias Braun 57b5f11aa7 MachineCopyPropagation: Use assert() instead of if{report_error()} for 'impossible' condition
llvm-svn: 261405
2016-02-20 03:56:33 +00:00
Quentin Colombet e611698e84 [RegAllocFast] Properly track the physical register definitions on calls.
PR26485

llvm-svn: 261384
2016-02-20 00:32:29 +00:00
Sanjoy Das ffb7bd11f7 [StatepointLowering] Minor non-semantic cleanups
Use auto, bring file up to coding standards etc.

llvm-svn: 261358
2016-02-19 19:37:07 +00:00
Sanjoy Das f6fee29ceb [StatepointLowering] Update StatepointMaxSlotsRequired correctly
Now that we don't always add an element to AllocatedStackSlots if we
don't find a pre-existing unallocated stack slot, bumping
StatepointMaxSlotsRequired to `NumSlots + 1` is not correct.  Instead
bump the statistic near the push_back, to
Builder.FuncInfo.StatepointStackSlots.size().

llvm-svn: 261348
2016-02-19 18:15:56 +00:00
Sanjoy Das e8019df552 [StatepointLowering] Fix a mistake in rL261336
The check on MFI->getObjectSize() has to be on the FrameIndex, not on
the index of the FrameIndex in AllocatedStackSlots.  Weirdly, the tests
I added in rL261336 didn't catch this.

llvm-svn: 261347
2016-02-19 18:15:53 +00:00
Sanjoy Das 171313c69a [StatepointLowering] Change AllocatedStackSlots to use SmallBitVector
NFCI.  They key motivation here is that I'd like to use
SmallBitVector::all() in a later change.  Also, using a bit vector here
seemed better in general.

The only interesting change here is that in the failure case of
allocateStackSlot, we no longer (the equivalent of) push_back(true) to
AllocatedStackSlots.  As far as I can tell, this is fine, since we'd
never re-use those slots in the same StatepointLoweringState instance.

Technically there was no need to change the operator[] type accesses to
set() and test(), but I thought it'd be nice to make it obvious that
we're using something other than a std::vector like thing.

llvm-svn: 261337
2016-02-19 17:15:26 +00:00
Sanjoy Das d2db73ba59 [StatepointLowering] Fix bug in allocateStackSlot
allocateStackSlot did not consider the size of the value to be spilled
before deciding to re-use a spill slot.  This was originally okay (since
originally we'd only ever spill pointers), but it became not okay when
we changed our scheme to directly spill vectors of pointers.

While this change fixes the bug pointed out, it has two performance
caveats:

 - It matches spill slot and spillee size exactly, while in theory we
   can spill, e.g., an 8 byte pointer into a 16 byte slot.  This is
   slightly complicated to fix since in the stackmaps section, we report
   the size of the spill slot as the size of the "indirect value"; and
   if they're no longer equivalent, we'll have to keep track of the
   (indirect) value size separately from the stack slot size.

 - It will "spuriously run out" of reusable slots, since we now have an
   second check in the search loop in addition to the availablity
   check (e.g. you had two free scalar slots, and you first ask for a
   vector slot followed by a scalar slot).  I'll fix this in a later
   commit.

llvm-svn: 261336
2016-02-19 17:15:22 +00:00
Sanjoy Das 7b2e91fb59 [StatepointLowering] Clean up allocateStackSlot
This removes the unusual loop structure in allocateStackSlot in favor of
something more straightforward.  I've also removed the cautionary
comment in the function, which I suspect is historical cruft now, and
confuses more than it enlightens.

llvm-svn: 261335
2016-02-19 17:15:17 +00:00
David Majnemer 693f13156e Shuffle header file as per the Coding Standards
llvm-svn: 261308
2016-02-19 04:46:48 +00:00
David Majnemer b61fd7fc6d [SjLjEHPrepare] Simplify/cleanup code
No functional change is intended.

llvm-svn: 261307
2016-02-19 04:46:06 +00:00
Matthias Braun 848e79c578 LegalizeDAG: Fix ExpandFCOPYSIGN assuming the same type on both inputs
llvm-svn: 261306
2016-02-19 04:44:19 +00:00
David Majnemer bd1b8c0889 [SjLjEHPrepare] Don't grab pointers to functions in doInitialization
Certain optimization passes (like globaldce) can prune function
declaration that SjLjEHPrepare assumed would exit when it'd
runOnFunction.

This fixes PR26669.

llvm-svn: 261303
2016-02-19 03:13:40 +00:00
Justin Lebar c75d566f56 When printing MIR, output to errs() rather than outs().
Summary:
Without this, this command

  $ llvm-run llc -stop-after machine-cp -o - <( echo '' )

outputs an error, because we close stdout twice -- once when closing the
file opened for "-o", and again when closing outs().

Also clarify in the outs() definition that you can't ever call it if you
want to open your own raw_fd_ostream on stdout.

Reviewers: jroelofs, tstellarAMD

Subscribers: jholewinski, qcolombet, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D17422

llvm-svn: 261286
2016-02-19 00:18:46 +00:00
Philip Reames 1960cfd323 [IR] Extend cmpxchg to allow pointer type operands
Today, we do not allow cmpxchg operations with pointer arguments. We require the frontend to insert ptrtoint casts and do the cmpxchg in integers. While correct, this is problematic from a couple of perspectives:
1) It makes the IR harder to analyse (for instance, it make capture tracking overly conservative)
2) It pushes work onto the frontend authors for no real gain

This patch implements the simplest form of IR support. As we did with floating point loads and stores, we teach AtomicExpand to convert back to the old representation. This prevents us needing to change all backends in a single lock step change. Over time, we can migrate each backend to natively selecting the pointer type. In the meantime, we get the advantages of a cleaner IR representation without waiting for the backend changes.

Differential Revision: http://reviews.llvm.org/D17413

llvm-svn: 261281
2016-02-19 00:06:41 +00:00
Richard Trieu 7a08381403 Remove uses of builtin comma operator.
Cleanup for upcoming Clang warning -Wcomma.  No functionality change intended.

llvm-svn: 261270
2016-02-18 22:09:30 +00:00
Philip Reames 367fdd990c Restrict scope of variables [NFC]
llvm-svn: 261250
2016-02-18 19:45:31 +00:00
Benjamin Kramer 3a16e2a26a Make header self-contained. NFC.
llvm-svn: 261234
2016-02-18 18:02:48 +00:00
Xinliang David Li 1153f194bd Stop creating covmap as note section on ELF
covmap needs to created as non allocatable, but not with
SHT_NOTE. The latter was needed to workaround a problem
of BFD linker with gc, which is no longer needed. (A more
proper longer term fix requires changing FE driver to force
referencing the section using linker script).

Differential Revision: http://reviews.llvm.org/D17309

llvm-svn: 261228
2016-02-18 17:20:22 +00:00
Matthias Braun ac697c5d8e Revert "LiveIntervalAnalysis: Remove LiveVariables requirement" and LiveIntervalTest
The commit breaks stage2 compilation on PowerPC. Reverting for now while
this is analyzed. I also have to revert the LiveIntervalTest for now as
that depends on this commit.

Revert "LiveIntervalAnalysis: Remove LiveVariables requirement"
This reverts commit r260806.
Revert "Remove an unnecessary std::move to fix -Wpessimizing-move warning."
This reverts commit r260931.
Revert "Fix typo in LiveIntervalTest"
This reverts commit r260907.
Revert "Add unittest for LiveIntervalAnalysis::handleMove()"
This reverts commit r260905.

llvm-svn: 261189
2016-02-18 05:21:43 +00:00
Adrian Prantl 3b89e6634c DwarfDebug: Don't drop the DIExpression just because a variable is
described by an immediate.

Found via http://reviews.llvm.org/D16867
Thanks to Paul Robinson for pointing this out.

<rdar://problem/24456528>

llvm-svn: 261168
2016-02-17 22:20:08 +00:00
Adrian Prantl 6f4746b11a DbgVariable: Add an accessor for the common case of a single expression
belonging to a single DBG_VALUE instruction.

NFC

llvm-svn: 261167
2016-02-17 22:19:59 +00:00
Nico Weber e6154ffbe0 Revert r261070, it caused PR26652 / PR26653.
llvm-svn: 261127
2016-02-17 18:47:29 +00:00
Cong Hou bbd4e3b400 Detecte vector reduction operations just before instruction selection.
This patch detects vector reductions before instruction selection. Vector
reductions are vectorized reduction operations, and for such operations we have
freedom to reorganize the elements of the result as long as the reduction of them
stay unchanged. This will enable some reduction pattern recognition during
instruction combine such as SAD/dot-product on X86. A flag is added to
SDNodeFlags to mark those vector reduction nodes to be checked during instruction
combine.

To detect those vector reductions, we search def-use chains starting from the
given instruction, and check if all uses fall into two categories:

1. Reduction with another vector.
2. Reduction on all elements.

in which 2 is detected by recognizing the pattern that the loop vectorizer
generates to reduce all elements in the vector outside of the loop, which
includes several ShuffleVector and one ExtractElement instructions.


Differential revision: http://reviews.llvm.org/D15250

llvm-svn: 261070
2016-02-17 06:37:04 +00:00
Reid Kleckner 9a593ee7d2 [codeview] Bail on a DBG_VALUE register operand with no register
This apparently comes up when the register allocator decides that a
variable will become undef along a certain path.

Also improve the error message we emit when we can't map from LLVM
register number to CV register number.

llvm-svn: 261016
2016-02-16 21:49:26 +00:00
Reid Kleckner 6e0d5f573c [codeview] Fix assertion on non-memory, non-register DBG_VALUE instructions
Eventually we should find a way to describe constant variables, but it
is not obvious how to do this at the moment.

llvm-svn: 261010
2016-02-16 21:14:51 +00:00
Quentin Colombet ba2a01645b [GlobalISel] Re-apply r260922-260923 with MSVC-friendly code.
Original message:
Get rid of the ifdefs in TargetLowering.
Introduce a new API used only by GlobalISel: CallLowering.
This API will contain target hooks dedicated to call lowering.

llvm-svn: 260998
2016-02-16 19:26:02 +00:00
Aaron Ballman c6a2f2140b A signed bitfield's range is [-1,0], so assigning 1 is technically an overflow. However, the other bitfield requires a signed value (it supports negative offsets), so it is slightly better to retain a signed 1-bit bitfield and use -1 instead of 1. Silences an MSVC warning.
llvm-svn: 260973
2016-02-16 15:35:51 +00:00
Aaron Ballman fc64ef1a15 Reverting r260922-260923; they cause link failures with MSVC.
http://lab.llvm.org:8011/builders/lldb-x86-windows-msvc2015/builds/15436/steps/build/logs/stdio
http://bb.pgr.jp/builders/msbuild-llvmclang-x64-msc18-DA/builds/961/steps/build_llvm/logs/stdio

llvm-svn: 260972
2016-02-16 15:29:06 +00:00
Quentin Colombet 1ce38545fb [GlobalISel] Get rid of the ifdefs in TargetLowering.
Introduce a new API used only by GlobalISel: CallLowering.
This API will contain target hooks dedicated to call lowering.

llvm-svn: 260922
2016-02-16 00:57:44 +00:00
Zia Ansari 30a02384f7 Implemented stack symbol table ordering/packing optimization to improve data locality and code size from SP/FP offset encoding.
Differential Revision: http://reviews.llvm.org/D15393

llvm-svn: 260917
2016-02-15 23:44:13 +00:00
Matthias Braun 4a6c728cc0 LiveIntervalAnalysis: Support moving of subregister defs in handleMove
This is an updated version which fixes a bug that happened with
uses tied to an earlyclobber operand which end at an unusual slotindex.

If two definitions write to independent subregisters then they can be
put in any order. LiveIntervalAnalysis::handleMove() did not support
this previously because it looks like moving a definition of a vreg past
another one.

This is a modified version of a patch proposed (two years ago) by
Vincent Lejeune! This version does not touch the read-undef flags and is
extended for the case of moving a subregister def behind all uses - this
can happen for subregister defs that are completely unused.

Differential Revision: http://reviews.llvm.org/D9067

llvm-svn: 260906
2016-02-15 19:25:36 +00:00
Matthias Braun b3aefc3a69 MachineVerifier: Add parameter to choose if MachineFunction::verify() aborts
The abort on error behaviour is unpractical for debugger and unittest
usage.

llvm-svn: 260904
2016-02-15 19:25:31 +00:00
Ahmed Bougacha 93cff7fb82 [CodeGen] Document and use getConstant's splat-building feature. NFC.
Differential Revision: http://reviews.llvm.org/D17229

llvm-svn: 260901
2016-02-15 18:07:29 +00:00
Jonas Paulsson 98963fec41 [ScheduleDAGInstrs] isUnsafeMemoryObject() removed
This function was basically useless, since volatile memacesses or MIs with
unmodelled sideffects become global memory objects, and the other little
checks are also done elsewhere.

Reviewed by Andy Trick
http://reviews.llvm.org/D16881

llvm-svn: 260899
2016-02-15 16:43:15 +00:00
Benjamin Kramer 7f75e9403d [AggressiveAntiDepBreaker] Skip some unnecessary BitVector copies.
llvm-svn: 260825
2016-02-13 16:39:39 +00:00
Matthias Braun bbb528f189 LiveIntervalAnalysis: Remove LiveVariables requirement
This requirement was a huge hack to keep LiveVariables alive because it
was optionally used by TwoAddressInstructionPass and PHIElimination.
However we have AnalysisUsage::addUsedIfAvailable() which we can use in
those passes.

llvm-svn: 260806
2016-02-13 04:35:31 +00:00
Pirama Arumuga Nainar 7476bc89e9 Don't combine fp_round (fp_round x) if f80 to f16 is generated
Summary:
This patch skips DAG combine of fp_round (fp_round x) if it results in
an fp_round from f80 to f16.

fp_round from f80 to f16 always generates an expensive (and as yet,
unimplemented) libcall to __truncxfhf2.  This prevents selection of
native f16 conversion instructions from f32 or f64.  Moreover, the first
(value-preserving) fp_round from f80 to either f32 or f64 may become a
NOP in platforms like x86.

Reviewers: ab

Subscribers: srhines, llvm-commits

Differential Revision: http://reviews.llvm.org/D17221

llvm-svn: 260769
2016-02-13 00:08:05 +00:00
Reid Kleckner 876330d53a [codeview] Describe local variables in registers
llvm-svn: 260746
2016-02-12 21:48:30 +00:00
Andrew Kaylor d1188ddd33 [WinEH] Prevent EH state numbering from skipping nested cleanup pads that never return
Differential Revision: http://reviews.llvm.org/D17208

llvm-svn: 260733
2016-02-12 21:10:16 +00:00
Quentin Colombet 232f447782 Get rid of some GLOBAL_ISEL ifdefs that should be harmless for code size.
More to come, but those were easy.

llvm-svn: 260723
2016-02-12 20:41:24 +00:00
Mehdi Amini 40b369cf5a GlobalISel is always built since r260566, reflect it in LLVMBuild.txt
Other component could not depends on an optional library in llvm-config

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 260701
2016-02-12 18:43:14 +00:00
Quentin Colombet ccd7725808 [IRTranslator] Use a single virtual register to represent any Value.
PR26161.

llvm-svn: 260602
2016-02-11 21:48:32 +00:00
Quentin Colombet 8fd6718700 [Target] Add a helper function to check if an opcode is invalid after isel.
llvm-svn: 260590
2016-02-11 21:16:56 +00:00
Matthias Braun c67f5a6ab1 Revert "LiveIntervalAnalysis: Support moving of subregister defs in handleMove"
This is broke a bot:

http://lab.llvm.org:8011/builders/clang-cmake-aarch64-quick/builds/4703/steps/test-suite/logs/test.log

Reverting while I investigate.

This reverts commit r260565.

llvm-svn: 260586
2016-02-11 21:07:44 +00:00
Sanjay Patel e5df1dfb14 [SelectionDAG] change getConstant() to use the input SDLoc when building splat vectors
The code change is simple enough: instead of attaching an anonymous SDLoc to splatted
vector constants, use the scalar constant's existing SDLoc since that is what is passed 
into getConstant() as a param. But this changes instruction scheduling, so I'll explain
why that happens.

The motivation for this patch starts near:
http://reviews.llvm.org/rL258833
...x86's getZeroVector() could be similarly cleaned up and I thought it would be 'NFC'.
But when I made that change locally, several x86 codegen tests wiggled.

It turns out that the lack of SDLoc consistency in getConstant() changes the way 
ScheduleDAGRRList behaves. This is because the SDLoc contains 'IROrder' and some DAG
scheduler algorithms use IROrder for tie-breaking.

Differential Revision: http://reviews.llvm.org/D16972

llvm-svn: 260582
2016-02-11 20:21:24 +00:00
Quentin Colombet fd9d0a07d8 [GlobalISel] Add the necessary plumbing to lower formal arguments.
llvm-svn: 260579
2016-02-11 19:59:41 +00:00
Peter Collingbourne 7c384ccea2 DwarfDebug: emit type units immediately.
Rather than storing type units in a vector and emitting them at the end
of code generation, emit them immediately and destroy them, reclaiming the
memory we were using for their DIEs.

In one benchmark carried out against Chromium's 50 largest (by bitcode
file size) translation units, total peak memory consumption with type units
decreased by median 17%, or by 7% when compared against disabling type units.

Tested using check-{llvm,clang}, the GDB 7.5 test suite (with
'-fdebug-types-section') and by eyeballing llvm-dwarfdump output on those
Chromium translation units with split DWARF both disabled and enabled, and
verifying that the only changes were to addresses and abbreviation ordering.

Differential Revision: http://reviews.llvm.org/D17118

llvm-svn: 260578
2016-02-11 19:57:46 +00:00
Reid Kleckner 829365aeef [codeview] Fix bug around multi-level wrapper inlining
If there were wrapper functions with no instructions of their own in the
inlining tree, we would fail to emit InlineSite records for them.

llvm-svn: 260571
2016-02-11 19:41:47 +00:00
Quentin Colombet 2e00253750 Play nice with Visual Studio and attributes
llvm-svn: 260568
2016-02-11 19:33:21 +00:00
Quentin Colombet bde158cbc7 [CMake] Produce an empty library for GlobalISel when not building it.
The rational for this change is that LLVMBuild cannot express conditional 
dependencies. Therefore, when we start optionally using GlobalISel library for 
say AArch64, without that change, all the tools that use the AArch64 library 
would need to explicitly link with GlobalISel when we ask for it.

This does not scale.

Instead, we will set the dependencies between the target and GlobalISel and if 
we did not ask to build GlobalISel, the library will just be empty.

Thanks to Chris Bieneman and Mehdi Animi for the idea.

llvm-svn: 260566
2016-02-11 19:18:27 +00:00
Matthias Braun 33c641bddf LiveIntervalAnalysis: Support moving of subregister defs in handleMove
If two definitions write to independent subregisters then they can be
put in any order. LiveIntervalAnalysis::handleMove() did not support
this previously because it looks like moving a definition of a vreg past
another one.

This is a modified version of a patch proposed (two years ago) by
Vincent Lejeune! This version does not touch the read-undef flags and is
extended for the case of moving a subregister def behind all uses - this
can happen for subregister defs that are completely unused.

Differential Revision: http://reviews.llvm.org/D9067

llvm-svn: 260565
2016-02-11 19:03:53 +00:00
Quentin Colombet 74d7d2f00b [GlobalISel] Teach the IRTranslator how to lower returns.
llvm-svn: 260562
2016-02-11 18:53:28 +00:00
Quentin Colombet 9855111b77 [GlobalISel] Add a type to MachineInstr.
We actually need that information only for generic instructions, therefore it
would be nice not to have to pay the extra memory consumption for all
instructions. Especially because a typed non-generic instruction does not make
sense.

The question is then, is it possible to have that information in a union or
something?
My initial thought was that we could have a derived class GenericMachineInstr
with additional information, but in practice it makes little to no sense since
generic MachineInstrs are likely turned into non-generic ones by just switching
the opcode. In other words, we don't want to go through the process of creating
a new, non-generic MachineInstr, object each time we do this switch. The memory
benefit probably is not worth the extra compile time.

Another option would be to keep the type of the MachineInstr in a side table.
This would induce an extra indirection though.

Anyway, I will file a PR to discuss about it and remember we need to come back
to it at some point.

llvm-svn: 260558
2016-02-11 18:22:37 +00:00
Quentin Colombet 37a09a8428 [GlobalISel] Add a hook in TargetConfigPass to run GlobalISel.
llvm-svn: 260553
2016-02-11 17:57:22 +00:00
Quentin Colombet a7fae162e6 [GlobalISel][IRTranslator] Change the ownership of the MIRBuilder field.
llvm-svn: 260551
2016-02-11 17:53:23 +00:00
Quentin Colombet 4f0ec8d2b0 [GlobalISel][IRTranslator] Fix a typo in assert.
llvm-svn: 260550
2016-02-11 17:52:28 +00:00
Quentin Colombet 17c494b91c [GlobalISel][IRTranslator] Teach the pass how to translate Add instructions.
llvm-svn: 260549
2016-02-11 17:51:31 +00:00
Quentin Colombet 2ad1f851a1 [GlobalISel] Add a MachineIRBuilder class.
Helper class to build machine instrs. This is a higher abstraction
than MachineInstrBuilder.

llvm-svn: 260547
2016-02-11 17:44:59 +00:00
Benjamin Kramer e3b963d5ee Drop the hidden visibility from DebugHandlerBase for now.
If a class has hidden visibility all derived classes and all classes
that have it as a member must have hidden visibility too. That may
be fixable here but requires changes to quite a lot of debug info
classes.

This is also one of the things that GCC enforces aggressively while
clang ignores it, making testing more annoying than necessary.

llvm-svn: 260529
2016-02-11 15:41:56 +00:00
Quentin Colombet e1494c353a [GlobalISel][MachineRegisterInfo] Add a method to create generic vregs.
For now, generic virtual registers will not have a register class. We may want
to change that. For instance, if we want to use all the methods from
TargetRegisterInfo with generic virtual registers, we need to either have some
sort of generic register classes that do what we want, or teach those methods
how to deal with nullptr register class.

Although the latter seems easy enough to do, we may still want to differenciate
generic register classes from nullptr to catch cases where nullptr gets
introduced by a bug of some sort.

Anyway, I will file a PR to keep track of that.

llvm-svn: 260474
2016-02-11 00:19:17 +00:00
Quentin Colombet 36ce1b0157 [GlobalISel] Remember the size of generic virtual registers
llvm-svn: 260468
2016-02-10 23:43:48 +00:00
Quentin Colombet 2ecff3bff2 [GlobalISel] More detailed skeleton for the IRTranslator.
llvm-svn: 260456
2016-02-10 22:59:27 +00:00
Reid Kleckner f9c275fe0a [codeview] Describe int local variables using .cv_def_range
Summary:
Refactor common value, scope, and label tracking logic out of DwarfDebug
into a common base class called DebugHandlerBase.

Update an old LLVM IR test case to avoid an assertion in LexicalScopes.

Reviewers: dblaikie, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16931

llvm-svn: 260432
2016-02-10 20:55:49 +00:00
Ahmed Bougacha f8dfb47c02 [CodeGen] Prefer "if (SDValue R = ...)" to "if (R.getNode())". NFCI.
llvm-svn: 260316
2016-02-09 22:54:12 +00:00
Sanjay Patel 73200f72de [SelectionDAG] make getMemBasePlusOffset() accessible; NFCI
I reinvented this functionality in http://reviews.llvm.org/D16828 because it was
hidden away as a static function. The changes in x86 are not based on a complete
audit. I suspect there are other possible uses there, and there are almost certainly
more potential users in other targets.

llvm-svn: 260295
2016-02-09 21:42:04 +00:00
Andrew Kaylor 1224488e0c [regalloc][WinEH] Do not mark intervals as not spillable if they contain a regmask
Differential Revision: http://reviews.llvm.org/D16831

llvm-svn: 260164
2016-02-08 22:52:51 +00:00
Hans Wennborg 850ec6ca18 [X86] Don't zero/sign-extend i1, i8, or i16 return values to 32 bits (PR22532)
This matches GCC and MSVC's behaviour, and saves on code size.

We were already not extending i1 return values on x86_64 after r127766. This
takes that patch further by applying it to x86 target as well, and also for i8
and i16.

The ABI docs have been unclear about the required behaviour here. The new i386
psABI [1] clearly states (Table 2.4, page 14) that i1, i8, and i16 return
vales do not need to be extended beyond 8 bits. The x86_64 ABI doc is being
updated to say the same [2].

Differential Revision: http://reviews.llvm.org/D16907

 [1]. https://01.org/sites/default/files/file_attach/intel386-psabi-1.0.pdf
 [2]. https://groups.google.com/d/msg/x86-64-abi/E8O33onbnGQ/_RFWw_ixDQAJ

llvm-svn: 260133
2016-02-08 19:34:30 +00:00
Matt Arsenault 2bba779272 SelectionDAG: Lower some range metadata to AssertZext
If a range has a lower bound of 0, add an AssertZext from the
nearest floor power of two.

This allows operations with some workitem intrinsics with known
maximum ranges to use fast 24-bit multiplies.

llvm-svn: 260109
2016-02-08 16:28:19 +00:00
Sanjoy Das 86d7d83f2a [StatepointLower] Use None instead of Optional<int>()
llvm-svn: 259956
2016-02-05 23:40:04 +00:00
Wei Mi a62f058989 Some stackslots are allocated to vregs which have no real reference.
LiveRangeEdit::eliminateDeadDef is used to remove dead define instructions
after rematerialization. To remove a VNI for a vreg from its LiveInterval,
LiveIntervals::removeVRegDefAt is used. However, after non-PHI VNIs are all
removed, PHI VNI are still left in the LiveInterval. Such unused vregs will
be kept in RegsToSpill[] at the end of InlineSpiller::reMaterializeAll and
spiller will allocate stackslot for them.

The fix is to get rid of unused reg by checking whether it has non-dbg
reference instead of whether it has non-empty interval.

llvm-svn: 259895
2016-02-05 18:14:24 +00:00
Matt Arsenault 5923973fe2 Fix printing of f16 machine operands
Only single and double FP immediates are correctly printed by
MachineInstr::print() during debug output. Half float type goes to
APFloat::convertToDouble() and hits assertion it is not a double
semantics. This diff prints half machine operands correctly.

This cannot currently be hit by any in-tree target.

Patch by Stanislav Mekhanoshin

llvm-svn: 259857
2016-02-05 00:50:18 +00:00
Nemanja Ivanovic e8cbae32e9 Enable the %s modifier in inline asm template string
This patch corresponds to review:
http://reviews.llvm.org/D16847

There are some files in glibc that use the output operand modifier even though
it was deprecated in GCC. This patch just adds support for it to prevent issues
with such files.

llvm-svn: 259798
2016-02-04 16:18:08 +00:00
Petar Jovanovic 23e44f5e39 [Power PC] softening long double type
This patch implements softening of long double type (ppcf128) on ppc32
architecture and enables operations for this type for soft float.

Patch by Strahinja Petrovic.

Differential Revision: http://reviews.llvm.org/D15811

llvm-svn: 259791
2016-02-04 14:43:50 +00:00
Jonas Paulsson 2293685731 [ScheduleDagInstrs] Improved comments
llvm-svn: 259783
2016-02-04 13:08:48 +00:00
Sanjay Patel e9fa3363b4 rangify; NFCI
llvm-svn: 259722
2016-02-03 22:44:14 +00:00
Reid Kleckner eb3bcdd28b [codeview] Remove EmitLabelDiff in favor emitAbsoluteSymbolDiff
llvm-svn: 259700
2016-02-03 21:24:42 +00:00
Reid Kleckner dac21b43d5 [codeview] Use the MCStreamer interface directly instead of AsmPrinter
This is mostly about having shorter lines and standardizing on one
interface, but it also avoids some needless indirection.

No functional change.

llvm-svn: 259697
2016-02-03 21:15:48 +00:00
Keno Fischer 6c1e47a66b [DWARFDebug] Fix another case of overlapping ranges
Summary:
In r257979, I added code to ensure that we wouldn't merge DebugLocEntries if
the pieces they describe overlap. Unfortunately, I failed to cover the case,
where there may have multiple active Expressions in the entry, in which case we
need to make sure that no two values overlap before we can perform the merge.

This fixed PR26148.

Reviewers: aprantl
Differential Revision: http://reviews.llvm.org/D16742

llvm-svn: 259696
2016-02-03 21:13:33 +00:00
Tim Shen f99f0d5a7e [SelectionDAG] Fix CombineToPreIndexedLoadStore O(n^2) behavior
This patch consists of two parts: a performance fix in DAGCombiner.cpp
and a correctness fix in SelectionDAG.cpp.

The test case tests the bug that's uncovered by the performance fix, and
fixed by the correctness fix.

The performance fix keeps the containers required by the
hasPredecessorHelper (which is a lazy DFS) and reuse them. Since
hasPredecessorHelper is called in a loop, the overall efficiency reduced
from O(n^2) to O(n), where n is the number of SDNodes.

The correctness fix keeps iterating the neighbor list even if it's time
to early return. It will return after finishing adding all neighbors to
Worklist, so that no neighbors are discarded due to the original early
return.

llvm-svn: 259691
2016-02-03 20:58:55 +00:00
Jonas Paulsson ac29f01788 [ScheduleDAGInstrs::buildSchedGraph()] Handling of memory dependecies rewritten.
Recommited, after some fixing with test cases.

Updated test cases:
test/CodeGen/AArch64/arm64-misched-memdep-bug.ll
test/CodeGen/AArch64/tailcall_misched_graph.ll

Temporarily disabled test cases:
test/CodeGen/AMDGPU/split-vector-memoperand-offsets.ll
test/CodeGen/PowerPC/ppc64-fastcc.ll (partially updated)
test/CodeGen/PowerPC/vsx-fma-m.ll
test/CodeGen/PowerPC/vsx-fma-sp.ll

http://reviews.llvm.org/D8705
Reviewers: Hal Finkel, Andy Trick.

llvm-svn: 259673
2016-02-03 17:52:29 +00:00
Jun Bum Lim 59df5e89c2 [MachineCopyPropagation] Fix comment. NFC
Reviewers: MatzeB, qcolombet, jmolloy, mcrosier

Subscribers: llvm-commits, mcrosier

Differential Revision: http://reviews.llvm.org/D16806

llvm-svn: 259656
2016-02-03 15:56:27 +00:00
Marcello Maggioni bfe87568aa RegCoalescer: Making sure re-materialization defines all subranges
The register coalescer can rematerialize constants that define
more of a register than the copy it is going to replace was going
to do.
This is valid in the case the register was undef before the
copy happened.
This patch makes sure that all the subranges defined by the new
rematerialization instructions have at least a dead def.

Review: http://reviews.llvm.org/D16693
llvm-svn: 259614
2016-02-03 00:22:32 +00:00
David Majnemer 30579ec851 [codeview] Improve readability of codeview assembly output
Strictly speaking, this is not an improvement in functionality per se
but a usability improvement to those debugging codeview.

llvm-svn: 259601
2016-02-02 23:18:23 +00:00
Matthias Braun 1377fd6781 MachineVerifier: Check that defs/uses are live in subregisters as well.
llvm-svn: 259552
2016-02-02 20:04:51 +00:00
David Majnemer c9911f28e5 [codeview] Correctly handle inlining functions post-dominated by unreachable
CodeView requires us to accurately describe the extent of the inlined
code.  We did this by grabbing the next debug location in source order
and using *that* to denote where we stopped inlining.  However, this is
not sufficient or correct in instances where there is no next debug
location or the next debug location belongs to the start of another
function.

To get this correct, use the end symbol of the function to denote the
last possible place the inlining could have stopped at.

llvm-svn: 259548
2016-02-02 19:22:34 +00:00