Commit Graph

13 Commits

Author SHA1 Message Date
Daniel Jasper 559aa75382 Revert "r306529 - [X86] Correct dwarf unwind information in function epilogue"
I am 99% sure that this breaks the PPC ASAN build bot:
http://lab.llvm.org:8011/builders/sanitizer-ppc64be-linux/builds/3112/steps/64-bit%20check-asan/logs/stdio

If it doesn't go back to green, we can recommit (and fix the original
commit message at the same time :) ).

llvm-svn: 306676
2017-06-29 13:58:24 +00:00
Petar Jovanovic 7b3a38ec30 [X86] Correct dwarf unwind information in function epilogue
CFI instructions that set appropriate cfa offset and cfa register are now
inserted in emitEpilogue() in X86FrameLowering.

Majority of the changes in this patch:

1. Ensure that CFI instructions do not affect code generation.
2. Enable maintaining correct information about cfa offset and cfa register
in a function when basic blocks are reordered, merged, split, duplicated.

These changes are target independent and described below.

Changed CFI instructions so that they:

1. are duplicable
2. are not counted as instructions when tail duplicating or tail merging
3. can be compared as equal

Add information to each MachineBasicBlock about cfa offset and cfa register
that are valid at its entry and exit (incoming and outgoing CFI info). Add
support for updating this information when basic blocks are merged, split,
duplicated, created. Add a verification pass (CFIInfoVerifier) that checks
that outgoing cfa offset and register of predecessor blocks match incoming
values of their successors.

Incoming and outgoing CFI information is used by a late pass
(CFIInstrInserter) that corrects CFA calculation rule for a basic block if
needed. That means that additional CFI instructions get inserted at basic
block beginning to correct the rule for calculating CFA. Having CFI
instructions in function epilogue can cause incorrect CFA calculation rule
for some basic blocks. This can happen if, due to basic block reordering,
or the existence of multiple epilogue blocks, some of the blocks have wrong
cfa offset and register values set by the epilogue block above them.

Patch by Violeta Vukobrat.

Differential Revision: https://reviews.llvm.org/D18046

llvm-svn: 306529
2017-06-28 10:21:17 +00:00
Simon Pilgrim 46dd55f1e1 [X86][SSE] Change BUILD_VECTOR interleaving ordering to improve coalescing/combine opportunities
We currently generate BUILD_VECTOR as a tree of UNPCKL shuffles of the same type:

e.g. for v4f32:

Step 1: unpcklps 0, 2 ==> X: <?, ?, 2, 0>
      : unpcklps 1, 3 ==> Y: <?, ?, 3, 1>
Step 2: unpcklps X, Y ==>    <3, 2, 1, 0>

The issue is because we are not placing sequential vector elements together early enough, we fail to recognise many combinable patterns - consecutive scalar loads, extractions etc.

Instead, this patch unpacks progressively larger sequential vector elements together:

e.g. for v4f32:

Step 1: unpcklps 0, 2 ==> X: <?, ?, 1, 0>
      : unpcklps 1, 3 ==> Y: <?, ?, 3, 2>
Step 2: unpcklpd X, Y ==>    <3, 2, 1, 0>

This does mean that we are creating UNPCKL shuffle of different value types, but the relevant combines that benefit from this are quite capable of handling the additional BITCASTs that are now included in the shuffle tree.

Differential Revision: https://reviews.llvm.org/D33864

llvm-svn: 304688
2017-06-04 20:12:04 +00:00
Hans Wennborg b00ffd8cb7 Revert r302938 "Add LiveRangeShrink pass to shrink live range within BB."
This also reverts follow-ups r303292 and r303298.

It broke some Chromium tests under MSan, and apparently also internal
tests at Google.

llvm-svn: 303369
2017-05-18 18:50:05 +00:00
Dehao Chen 65dd23e273 Add LiveRangeShrink pass to shrink live range within BB.
Summary: LiveRangeShrink pass moves instruction right after the definition with the same BB if the instruction and its operands all have more than one use. This pass is inexpensive and guarantees optimal live-range within BB.

Reviewers: davidxl, wmi, hfinkel, MatzeB, andreadb

Reviewed By: MatzeB, andreadb

Subscribers: hiraditya, jyknight, sanjoy, skatkov, gberry, jholewinski, qcolombet, javed.absar, krytarowski, atrick, spatel, RKSimon, andreadb, MatzeB, mehdi_amini, mgorny, efriedma, davide, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D32563

llvm-svn: 302938
2017-05-12 19:29:27 +00:00
Matthias Braun 39c3c89cdc MCStreamer: Use "cfi" for CFI related temp labels.
Choosing a "cfi" name makes the intend a bit clearer in an assembly dump
and more importantly the assembly dumps are slightly more stable as the
numbers don't move around anymore when unrelated code calls
createTempSymbol() more or less often.
As they are temp labels the name doesn't influence the generated object
code.

Differential Revision: https://reviews.llvm.org/D27244

llvm-svn: 288290
2016-11-30 23:48:26 +00:00
Simon Pilgrim 2279e59573 [X86][SSE] Avoid specifying unused arguments in SHUFPD lowering
As discussed on PR26491, we are missing the opportunity to make use of the smaller MOVHLPS instruction because we set both arguments of a SHUFPD when using it to lower a single input shuffle.

This patch sets the lowered argument to UNDEF if that shuffle element is undefined. This in turn makes it easier for target shuffle combining to decode UNDEF shuffle elements, allowing combines to MOVHLPS to occur.

A fix to match against MOVHPD stores was necessary as well.

This builds on the improved MOVLHPS/MOVHLPS lowering and memory folding support added in D16956

Adding similar support for SHUFPS will have to wait until have better support for target combining of binary shuffles.

Differential Revision: https://reviews.llvm.org/D23027

llvm-svn: 279430
2016-08-22 12:56:54 +00:00
Simon Pilgrim ecd07946eb [X86][SSE] Regenerated HADD/HSUB tests
llvm-svn: 257992
2016-01-16 14:03:40 +00:00
Sanjay Patel a800b6c04b fixed to test features, not CPUs
llvm-svn: 231523
2015-03-06 21:50:27 +00:00
Andrea Di Biagio 972ff97f8c [X86] Teach how to combine AVX and AVX2 horizontal binop on packed 256-bit vectors.
This patch adds target combine rules to match:
 - [AVX] Horizontal add/sub of packed single/double precision floating point
   values from 256-bit vectors;
 - [AVX2] Horizontal add/sub of packed integer values from 256-bit vectors.

llvm-svn: 210761
2014-06-12 10:53:48 +00:00
Andrea Di Biagio c7af75f9a7 [X86] Refactor the logic to select horizontal adds/subs to a helper function.
This patch moves part of the logic implemented by the target specific
combine rules added at r210477 to a separate helper function.
This should make easier to add more rules for matching AVX/AVX2 horizontal
adds/subs.

This patch also fixes a problem caused by a wrong check performed on indices
of extract_vector_elt dag nodes in input to the scalar adds/subs.

New tests have been added to verify that we correctly check indices of
extract_vector_elt dag nodes when selecting a horizontal operation.

llvm-svn: 210644
2014-06-11 07:57:50 +00:00
Andrea Di Biagio fa508af0fe [X86] Improved target combine rules for selecting horizontal add/sub.
This patch slightly changes the algorithm introduced at revision 210477
to fix a problem where the algorithm was producing incorrect code for 
the VEX.256 encoded versions of horizontal add/sub.

For these cases, we now try to split the two 256-bit vectors into
128-bit chunks before emitting horizontal add/sub dag nodes.

Added a new test case into haddsub-2.ll.

llvm-svn: 210545
2014-06-10 16:42:57 +00:00
Andrea Di Biagio f99dd64f0a [X86] Add target combine rules for horizontal add/sub.
This patch adds new target specific combine rules to identify horizontal
add/sub idioms from BUILD_VECTOR dag nodes.

This patch also teaches the DAGCombiner how to canonicalize sequences of
insert_vector_elt dag nodes according to the following rule:

  (insert_vector_elt (insert_vector_elt A, I0), I1) ->
    (insert_vecto_elt (insert_vector_elt A, I1), I0)

This new canonicalization rule only triggers if the inner insert_vector
dag node has exactly one use; also, both indices must be known constants,
and I1 < I0.
This last rule made it possible to write a simpler algorithm to identify
horizontal add/sub patterns because now we don't have to worry about the
ordering of insert_vector_elt dag nodes.

llvm-svn: 210477
2014-06-09 16:54:41 +00:00