Commit Graph

20 Commits

Author SHA1 Message Date
Hans Wennborg e1a2e90ffa Change eliminateCallFramePseudoInstr() to return an iterator
This will become necessary in a subsequent change to make this method
merge adjacent stack adjustments, i.e. it might erase the previous
and/or next instruction.

It also greatly simplifies the calls to this function from Prolog-
EpilogInserter. Previously, that had a bunch of logic to resume iteration
after the call; now it just continues with the returned iterator.

Note that this changes the behaviour of PEI a little. Previously,
it attempted to re-visit the new instruction created by
eliminateCallFramePseudoInstr(). That code was added in r36625,
but I can't see any reason for it: the new instructions will obviously
not be pseudo instructions, they will not have FrameIndex operands,
and we have already accounted for the stack adjustment.

Differential Revision: http://reviews.llvm.org/D18627

llvm-svn: 265036
2016-03-31 18:33:38 +00:00
James Y Knight 3602286937 [SPARC] Fix stupid oversight in stack realignment support.
If you're going to realign %sp to get object alignment properly (which
the code does), and stack offsets and alignments are calculated going
down from %fp (which they are), then the total stack size had better
be a multiple of the alignment. LLVM did indeed ensure that.

And then, after aligning, the sparc frame code added 96 (for sparcv8)
to the frame size, making any requested alignment of 64-bytes or
higher *guaranteed* to be misaligned. The test case added with r245668
even tests this exact scenario, and asserted the incorrect behavior,
which I somehow failed to notice. D'oh.

This change fixes the frame lowering code to align the stack size
*after* adding the spill area, instead.

Differential Revision: http://reviews.llvm.org/D12349

llvm-svn: 246042
2015-08-26 17:57:51 +00:00
James Y Knight 667395f334 [Sparc] Support user-specified stack object overalignment.
Note: I do not implement a base pointer, so it's still impossible to
have dynamic realignment AND dynamic alloca in the same function.

This also moves the code for determining the frame index reference
into getFrameIndexReference, where it belongs, instead of inline in
eliminateFrameIndex.

[Begin long-winded screed]

Now, stack realignment for Sparc is actually a silly thing to support,
because the Sparc ABI has no need for it -- unlike the situation on
x86, the stack is ALWAYS aligned to the required alignment for the CPU
instructions: 8 bytes on sparcv8, and 16 bytes on sparcv9.

However, LLVM unfortunately implements user-specified overalignment
using stack realignment support, so for now, I'm going to go along
with that tradition. GCC instead treats objects which have alignment
specification greater than the maximum CPU-required alignment for the
target as a separate block of stack memory, with their own virtual
base pointer (which gets aligned). Doing it that way avoids needing to
implement per-target support for stack realignment, except for the
targets which *actually* have an ABI-specified stack alignment which
is too small for the CPU's requirements.

Further unfortunately in LLVM, the default canRealignStack for all
targets effectively returns true, despite that implementing that is
something a target needs to do specifically. So, the previous behavior
on Sparc was to silently ignore the user's specified stack
alignment. Ugh.

Yet MORE unfortunate, if a target actually does return false from
canRealignStack, that also causes the user-specified alignment to be
*silently ignored*, rather than emitting an error.

(I started looking into fixing that last, but it broke a bunch of
tests, because LLVM actually *depends* on having it silently ignored:
some architectures (e.g. non-linux i386) have smaller stack alignment
than spilled-register alignment. But, the fact that a register needs
spilling is not known until within the register allocator. And by that
point, the decision to not reserve the frame pointer has been frozen
in place. And without a frame pointer, stack realignment is not
possible. So, canRealignStack() returns false, and
needsStackRealignment() then returns false, assuming everyone can just
go on their merry way assuming the alignment requirements were
probably just suggestions after-all. Sigh...)

Differential Revision: http://reviews.llvm.org/D12208

llvm-svn: 245668
2015-08-21 04:17:56 +00:00
Matthias Braun 0256486532 PrologEpilogInserter: Rewrite API to determine callee save regsiters.
This changes TargetFrameLowering::processFunctionBeforeCalleeSavedScan():

- Rename the function to determineCalleeSaves()
- Pass a bitset of callee saved registers by reference, thus avoiding
  the function-global PhysRegUsed bitset in MachineRegisterInfo.
- Without PhysRegUsed the implementation is fine tuned to not save
  physcial registers which are only read but never modified.

Related to rdar://21539507

Differential Revision: http://reviews.llvm.org/D10909

llvm-svn: 242165
2015-07-14 17:17:13 +00:00
Alexander Kornienko f00654e31b Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)
Apparently, the style needs to be agreed upon first.

llvm-svn: 240390
2015-06-23 09:49:53 +00:00
Alexander Kornienko 70bc5f1398 Fixed/added namespace ending comments using clang-tidy. NFC
The patch is generated using this command:

tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
  -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
  llvm/lib/


Thanks to Eugene Kosov for the original patch!

llvm-svn: 240137
2015-06-19 15:57:42 +00:00
Quentin Colombet 61b305edfd [ShrinkWrap] Add (a simplified version) of shrink-wrapping.
This patch introduces a new pass that computes the safe point to insert the
prologue and epilogue of the function.
The interest is to find safe points that are cheaper than the entry and exits
blocks.

As an example and to avoid regressions to be introduce, this patch also
implements the required bits to enable the shrink-wrapping pass for AArch64.


** Context **

Currently we insert the prologue and epilogue of the method/function in the
entry and exits blocks. Although this is correct, we can do a better job when
those are not immediately required and insert them at less frequently executed
places.
The job of the shrink-wrapping pass is to identify such places.


** Motivating example **

Let us consider the following function that perform a call only in one branch of
a if:
define i32 @f(i32 %a, i32 %b)  {
 %tmp = alloca i32, align 4
 %tmp2 = icmp slt i32 %a, %b
 br i1 %tmp2, label %true, label %false

true:
 store i32 %a, i32* %tmp, align 4
 %tmp4 = call i32 @doSomething(i32 0, i32* %tmp)
 br label %false

false:
 %tmp.0 = phi i32 [ %tmp4, %true ], [ %a, %0 ]
 ret i32 %tmp.0
}

On AArch64 this code generates (removing the cfi directives to ease
readabilities):
_f:                                     ; @f
; BB#0:
  stp x29, x30, [sp, #-16]!
  mov  x29, sp
  sub sp, sp, #16             ; =16
  cmp  w0, w1
  b.ge  LBB0_2
; BB#1:                                 ; %true
  stur  w0, [x29, #-4]
  sub x1, x29, #4             ; =4
  mov  w0, wzr
  bl  _doSomething
LBB0_2:                                 ; %false
  mov  sp, x29
  ldp x29, x30, [sp], #16
  ret

With shrink-wrapping we could generate:
_f:                                     ; @f
; BB#0:
  cmp  w0, w1
  b.ge  LBB0_2
; BB#1:                                 ; %true
  stp x29, x30, [sp, #-16]!
  mov  x29, sp
  sub sp, sp, #16             ; =16
  stur  w0, [x29, #-4]
  sub x1, x29, #4             ; =4
  mov  w0, wzr
  bl  _doSomething
  add sp, x29, #16            ; =16
  ldp x29, x30, [sp], #16
LBB0_2:                                 ; %false
  ret

Therefore, we would pay the overhead of setting up/destroying the frame only if
we actually do the call.


** Proposed Solution **

This patch introduces a new machine pass that perform the shrink-wrapping
analysis (See the comments at the beginning of ShrinkWrap.cpp for more details).
It then stores the safe save and restore point into the MachineFrameInfo
attached to the MachineFunction.
This information is then used by the PrologEpilogInserter (PEI) to place the
related code at the right place. This pass runs right before the PEI.

Unlike the original paper of Chow from PLDI’88, this implementation of
shrink-wrapping does not use expensive data-flow analysis and does not need hack
to properly avoid frequently executed point. Instead, it relies on dominance and
loop properties.

The pass is off by default and each target can opt-in by setting the
EnableShrinkWrap boolean to true in their derived class of TargetPassConfig.
This setting can also be overwritten on the command line by using
-enable-shrink-wrap.

Before you try out the pass for your target, make sure you properly fix your
emitProlog/emitEpilog/adjustForXXX method to cope with basic blocks that are not
necessarily the entry block.


** Design Decisions **

1. ShrinkWrap is its own pass right now. It could frankly be merged into PEI but
for debugging and clarity I thought it was best to have its own file.
2. Right now, we only support one save point and one restore point. At some
point we can expand this to several save point and restore point, the impacted
component would then be:
- The pass itself: New algorithm needed.
- MachineFrameInfo: Hold a list or set of Save/Restore point instead of one
  pointer.
- PEI: Should loop over the save point and restore point.
Anyhow, at least for this first iteration, I do not believe this is interesting
to support the complex cases. We should revisit that when we motivating
examples.

Differential Revision: http://reviews.llvm.org/D9210

<rdar://problem/3201744>

llvm-svn: 236507
2015-05-05 17:38:16 +00:00
Benjamin Kramer a7c40ef022 Canonicalize header guards into a common format.
Add header guards to files that were missing guards. Remove #endif comments
as they don't seem common in LLVM (we can easily add them back if we decide
they're useful)

Changes made by clang-tidy with minor tweaks.

llvm-svn: 215558
2014-08-13 16:26:38 +00:00
Eric Christopher 55414d4379 Remove the storage and use of the subtarget out of the sparc frame
lowering code.

llvm-svn: 211809
2014-06-26 22:33:50 +00:00
Craig Topper b0c941bebd [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. Sparc edition
llvm-svn: 207502
2014-04-29 07:57:13 +00:00
Craig Topper e73658ddbb [C++] Use 'nullptr'.
llvm-svn: 207394
2014-04-28 04:05:08 +00:00
Venkatraman Govindaraju 1116868a0d [Sparc] Emit large negative adjustments to SP/FP with sethi+xor instead of sethi+or. This generates correct code for both sparc32 and sparc64.
llvm-svn: 195576
2013-11-24 20:23:25 +00:00
Venkatraman Govindaraju a54533ed78 Sparc: No functionality change. Cleanup whitespaces, comment formatting etc.,
llvm-svn: 183243
2013-06-04 18:33:25 +00:00
Venkatraman Govindaraju ca0fe2f57e [Sparc] Add support for leaf functions in sparc backend.
llvm-svn: 182822
2013-05-29 04:46:31 +00:00
Venkatraman Govindaraju 641b0b5a21 [Sparc] Implements hasReservedCallFrame and hasFP.
This is to generate correct framesetup code when the function
 has variable sized allocas.

llvm-svn: 182108
2013-05-17 15:14:34 +00:00
Jakob Stoklund Olesen 2cfe46fd34 Compute correct frame sizes for SPARC v9 64-bit frames.
The save area is twice as big and there is no struct return slot. The
stack pointer is always 16-byte aligned (after adding the bias).

Also eliminate the stack adjustment instructions around calls when the
function has a reserved stack frame.

llvm-svn: 179083
2013-04-09 04:37:47 +00:00
Eli Bendersky 8da87163ca Move the eliminateCallFramePseudoInstr method from TargetRegisterInfo
to TargetFrameLowering, where it belongs. Incidentally, this allows us
to delete some duplicated (and slightly different!) code in TRI.

There are potentially other layering problems that can be cleaned up
as a result, or in a similar manner.

The refactoring was OK'd by Anton Korobeynikov on llvmdev.

Note: this touches the target interfaces, so out-of-tree targets may
be affected.

llvm-svn: 175788
2013-02-21 20:05:00 +00:00
Benjamin Kramer 009b1c1cf1 Round 2 of dead private variable removal.
LLVM is now -Wunused-private-field clean except for
- lib/MC/MCDisassembler/Disassembler.h. Not sure why it keeps all those unaccessible fields.
- gtest.

llvm-svn: 158096
2012-06-06 19:47:08 +00:00
Jia Liu b22310fda6 Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore.
llvm-svn: 150878
2012-02-18 12:03:15 +00:00
Anton Korobeynikov 2f93128109 Rename TargetFrameInfo into TargetFrameLowering. Also, put couple of FIXMEs and fixes here and there.
llvm-svn: 123170
2011-01-10 12:39:04 +00:00