Commit Graph

454 Commits

Author SHA1 Message Date
David L Kreitzer 01a057a0c4 Add a pass to optimize patterns of vectorized interleaved memory accesses for
X86. The pass optimizes as a unit the entire wide load + shuffles pattern
produced by interleaved vectorization. This initial patch optimizes one pattern
(64-bit elements interleaved by a factor of 4). Future patches will generalize
to additional patterns.

Patch by Farhana Aleen

Differential revision: http://reviews.llvm.org/D24681

llvm-svn: 284260
2016-10-14 18:20:41 +00:00
Mehdi Amini f42454b94b Move the global variables representing each Target behind accessor function
This avoids "static initialization order fiasco"

Differential Revision: https://reviews.llvm.org/D25412

llvm-svn: 283702
2016-10-09 23:00:34 +00:00
Petr Hosek e023d62e76 [Triple] Add triple for Fuchsia
Fuchsia is a new operating system.

Differential Revision: https://reviews.llvm.org/D25116

llvm-svn: 283419
2016-10-06 05:17:26 +00:00
Sanjay Patel bfdbea6481 [Target] move reciprocal estimate settings from TargetOptions to TargetLowering
The motivation for the change is that we can't have pseudo-global settings for
codegen living in TargetOptions because that doesn't work with LTO.

Ideally, these reciprocal attributes will be moved to the instruction-level via
FMF, metadata, or something else. But making them function attributes is at least
an improvement over the current state.

The ingredients of this patch are:

    Remove the reciprocal estimate command-line debug option.
    Add TargetRecip to TargetLowering.
    Remove TargetRecip from TargetOptions.
    Clean up the TargetRecip implementation to work with this new scheme.
    Set the default reciprocal settings in TargetLoweringBase (everything is off).
    Update the PowerPC defaults, users, and tests.
    Update the x86 defaults, users, and tests.

Note that if this patch needs to be reverted, the related clang patch checked in
at r283251 should be reverted too.

Differential Revision: https://reviews.llvm.org/D24816

llvm-svn: 283252
2016-10-04 20:46:43 +00:00
Davide Italiano a9f85d68cc [CodeGen] Add support for emitting .init_array instead of .ctors on FreeBSD.
PR: 30494
llvm-svn: 282451
2016-09-26 22:53:15 +00:00
Eric Christopher 5653e5dffc Remove the default subtarget from the x86 port as it isn't necessary (or
correct) anymore.

llvm-svn: 282031
2016-09-20 22:19:33 +00:00
Eric Christopher ef579d2195 Remove a use of subtarget initialization in the X86 backend so we can get rid of the default subtarget.
NFC intended.

llvm-svn: 281982
2016-09-20 16:04:59 +00:00
Craig Topper f4151bea72 [AVX512] Add initial support for the Execution Domain fixing pass to change some EVEX instructions.
llvm-svn: 276393
2016-07-22 05:00:52 +00:00
Nico Weber 5bb284226b Don't optimize movs to pushes in -O0 builds.
https://reviews.llvm.org/D22362

llvm-svn: 275431
2016-07-14 15:40:22 +00:00
Nico Weber ead8f8ffdd Delete some trailing whitespace.
llvm-svn: 275429
2016-07-14 15:07:44 +00:00
Michael Kuperstein cfbac5f361 [X86] Disable FixupSetCC for CodeGenOpt::None
It is an optimization pass, and should not run at -O0. Especially since Fast RA
will not do the required register coalescing anyway, so it's a loss even from
the optimization standpoint.

This also works around (but doesn't quite fix) PR28489.

llvm-svn: 275099
2016-07-11 20:40:44 +00:00
Michael Kuperstein 3e3652aef2 Recommit r274692 - [X86] Transform setcc + movzbl into xorl + setcc
xorl + setcc is generally the preferred sequence due to the partial register
stall setcc + movzbl suffers from. As a bonus, it also encodes one byte smaller.
This fixes PR28146.

The original commit tried inserting an 8bit-subreg into a GR32 (not GR32_ABCD)
which was not appreciated by fast regalloc on 32-bit.

llvm-svn: 274802
2016-07-07 22:50:23 +00:00
Michael Kuperstein edb38a94f8 Revert r274692 to check whether this is what breaks windows selfhost.
llvm-svn: 274771
2016-07-07 16:55:35 +00:00
Michael Kuperstein 1ef6c59b1d [X86] Transform setcc + movzbl into xorl + setcc
xorl + setcc is generally the preferred sequence due to the partial register
stall setcc + movzbl suffers from. As a bonus, it also encodes one byte smaller.

This fixes PR28146.

Differential Revision: http://reviews.llvm.org/D21774

llvm-svn: 274692
2016-07-06 21:56:18 +00:00
David Majnemer 498f2fd11b Address post-review for r270246
This gets rid of some unnecessary SmallStrings in
X86TargetMachine::getSubtargetImpl.

No functionality change is intended.

llvm-svn: 270270
2016-05-20 20:41:24 +00:00
David Majnemer ca29023b02 [X86] Reduce memory allocations in X86TargetMachine::getSubtargetImpl
We performed a number of memory allocations each time getTTI was called,
remove them by using SmallString.
No functionality change intended.

llvm-svn: 270246
2016-05-20 18:16:06 +00:00
Rafael Espindola 8c34dd8257 Delete Reloc::Default.
Having an enum member named Default is quite confusing: Is it distinct
from the others?

This patch removes that member and instead uses Optional<Reloc> in
places where we have a user input that still hasn't been maped to the
default value, which is now clear has no be one of the remaining 3
options.

llvm-svn: 269988
2016-05-18 22:04:49 +00:00
Hans Wennborg 8eb336c14e Re-commit r269828 "X86: Avoid using _chkstk when lowering WIN_ALLOCA instructions"
with an additional fix to make RegAllocFast ignore undef physreg uses. It would
previously get confused about the "push %eax" instruction's use of eax. That
method for adjusting the stack pointer is used in X86FrameLowering::emitSPUpdate
as well, but since that runs after register-allocation, we didn't run into the
RegAllocFast issue before.

llvm-svn: 269949
2016-05-18 16:10:17 +00:00
Rafael Espindola 38af4d6347 Trivial cleanups.
This just clang formats and cleans comments in an area I am about to
post a patch for review.

llvm-svn: 269946
2016-05-18 16:00:24 +00:00
Hans Wennborg 759af30109 Revert r269828 "X86: Avoid using _chkstk when lowering WIN_ALLOCA instructions"
Seems to have broken the Windows ASan bot. Reverting while investigating.

llvm-svn: 269833
2016-05-17 20:38:56 +00:00
Hans Wennborg c3fb51171e X86: Avoid using _chkstk when lowering WIN_ALLOCA instructions
This patch moves the expansion of WIN_ALLOCA pseudo-instructions
into a separate pass that walks the CFG and lowers the instructions
based on a conservative estimate of the offset between the stack
pointer and the lowest accessed stack address.

The goal is to reduce binary size and run-time costs by removing
calls to _chkstk. While it doesn't fix all the code quality problems
with inalloca calls, it's an incremental improvement for PR27076.

Differential Revision: http://reviews.llvm.org/D20263

llvm-svn: 269828
2016-05-17 20:13:29 +00:00
Matthias Braun 31d19d43c7 CodeGen: Move TargetPassConfig from Passes.h to an own header; NFC
Many files include Passes.h but only a fraction needs to know about the
TargetPassConfig class. Move it into an own header. Also rename
Passes.cpp to TargetPassConfig.cpp while we are at it.

llvm-svn: 269011
2016-05-10 03:21:59 +00:00
Ahmed Bougacha 068ac4af39 [X86] Register and initialize the FixupBW pass.
That lets us use it in MIR tests.

llvm-svn: 268830
2016-05-07 01:11:10 +00:00
Nirav Dave 8dd66e5753 Remove HasFnAttribute guards to getFnAttribute calls
These checks are redundant and can be removed

Reviewers: hans

Subscribers: llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D18564

llvm-svn: 264872
2016-03-30 15:41:12 +00:00
Paul Robinson f81836bd18 [PS4] Guarantee an instruction after a 'noreturn' call.
We need the "return address" of a noreturn call to be within the
bounds of the calling function; TrapUnreachable turns 'unreachable'
into a 'ud2' instruction, which has that desired effect.

Differential Revision: http://reviews.llvm.org/D18414

llvm-svn: 264224
2016-03-24 00:10:03 +00:00
Kevin B. Smith 6a83350bee [X86] New pass to change byte and word instructions to zero-extending versions.
Differential Revision: http://reviews.llvm.org/D17032

llvm-svn: 260572
2016-02-11 19:43:04 +00:00
Andrey Turetskiy 2396c38a8a [X86] Fix stack alignment for MCU target, by Anton Nadolskiy.
This patch fixes stack alignments for MCU (should be aligned to 4 bytes).

Differential Revision: http://reviews.llvm.org/D15646

llvm-svn: 260375
2016-02-10 11:57:06 +00:00
Yunzhong Gao eb959722a7 Revert r259576: Disable the vzeroupper insertion pass on PS4.
Will re-implement based on review feedback.

llvm-svn: 259615
2016-02-03 01:25:12 +00:00
Yunzhong Gao b76ccacfb1 Disable the vzeroupper insertion pass on PS4.
See comments in test/CodeGen/X86/avx-vzeroupper.ll for more explanation.

Original patch by: Sean Silva

llvm-svn: 259576
2016-02-02 21:39:23 +00:00
Alexey Bataev 7cf324772f LEA code size optimization pass (Part 1): Remove redundant address recalculations, by Andrey Turetsky
Add new x86 pass which replaces address calculations in load or store instructions with def register of existing LEA (must be in the same basic block), if the LEA calculates address that differs only by a displacement. Works only with -Os or -Oz.
Differential Revision: http://reviews.llvm.org/D13294

llvm-svn: 254712
2015-12-04 10:53:15 +00:00
Reid Kleckner ae44e871cd Revert "Revert "Revert r248959, "[WinEH] Emit int3 after noreturn calls on Win64"""
This reverts commit r249794.

Apparently my checkouts are full of unexpected surprises today.

llvm-svn: 249796
2015-10-09 01:13:17 +00:00
Reid Kleckner b510401785 Revert "Revert r248959, "[WinEH] Emit int3 after noreturn calls on Win64""
This reverts commit r249032.

TODO write commit msg

llvm-svn: 249794
2015-10-09 01:11:37 +00:00
NAKAMURA Takumi 1ed20db720 Revert r248959, "[WinEH] Emit int3 after noreturn calls on Win64"
It broke; LLVM :: CodeGen__Generic__2009-11-16-BadKillsCrash.ll

llvm-svn: 249032
2015-10-01 17:00:56 +00:00
Reid Kleckner 6dec87a8a0 [WinEH] Emit int3 after noreturn calls on Win64
The Win64 unwinder disassembles forwards from each PC to try to
determine if this PC is in an epilogue. If so, it skips calling the EH
personality function for that frame. Typically, this means you cannot
catch an exception in the same frame that you threw it, because 'throw'
calls a noreturn runtime function.

Previously we avoided this problem with the TrapUnreachable
TargetOption, but that's a much bigger hammer than we need. All we need
is a 1 byte non-epilogue instruction right after the call.  Instead,
what we got was an unconditional branch to a shared block containing the
ud2, potentially 7 bytes instead of 1. So, this reverts r206684, which
added TrapUnreachable, and replaces it with something better.

The new code pattern matches for invoke/call followed by unreachable and
inserts an int3 into the DAG. To be 100% watertight, we would need to
insert SEH_Epilogue instructions into all basic blocks ending in a call
with no terminators or successors, but in practice this is unlikely to
come up.

llvm-svn: 248959
2015-09-30 23:09:23 +00:00
Eric Christopher a4e5d3cf8e constify the Function parameter to the TTI creation callback and
propagate to all callers/users/etc.

llvm-svn: 247864
2015-09-16 23:38:13 +00:00
David Majnemer 0ad363eebc [WinEH] Calculate state numbers for the new EH representation
State numbers are calculated by performing a walk from the innermost
funclet to the outermost funclet.   Rudimentary support for the new EH
constructs has been added to the assembly printer, just enough to test
the new machinery.

Differential Revision: http://reviews.llvm.org/D12098

llvm-svn: 245331
2015-08-18 19:07:12 +00:00
Pat Gavlin b399095c3f Add a target environment for CoreCLR.
Although targeting CoreCLR is similar to targeting MSVC, there are
certain important differences that the backend must be aware of
(e.g. differences in stack probes, EH, and library calls).

Differential Revision: http://reviews.llvm.org/D11012

llvm-svn: 245115
2015-08-14 22:41:43 +00:00
Sanjay Patel 09b2c890af [x86] set default reciprocal (division and square root) codegen to match GCC
D8982 ( checked in at http://reviews.llvm.org/rL239001 ) added command-line 
options to allow reciprocal estimate instructions to be used in place of
divisions and square roots.

This patch changes the default settings for x86 targets to allow that recip
codegen (except for scalar division because that breaks too much code) when
using -ffast-math or its equivalent. 

This matches GCC behavior for this kind of codegen.

Differential Revision: http://reviews.llvm.org/D10396

llvm-svn: 240310
2015-06-22 18:29:44 +00:00
Daniel Sanders c81f450f1a Clean up redundant copies of Triple objects. NFC
Summary:

Reviewers: rengolin

Reviewed By: rengolin

Subscribers: llvm-commits, rengolin, jholewinski

Differential Revision: http://reviews.llvm.org/D10382

llvm-svn: 239823
2015-06-16 15:44:21 +00:00
Daniel Sanders 3e5de88dac Replace string GNU Triples with llvm::Triple in TargetMachine. NFC.
Summary:
For the moment, TargetMachine::getTargetTriple() still returns a StringRef.

This continues the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.

Reviewers: rengolin

Reviewed By: rengolin

Subscribers: ted, llvm-commits, rengolin, jholewinski

Differential Revision: http://reviews.llvm.org/D10362

llvm-svn: 239554
2015-06-11 19:41:26 +00:00
Sanjay Patel 08829bac81 [x86] Add a reassociation optimization to increase ILP via the MachineCombiner pass
This is a reimplementation of D9780 at the machine instruction level rather than the DAG.

Use the MachineCombiner pass to reassociate scalar single-precision AVX additions (just a
starting point; see the TODO comments) to increase ILP when it's safe to do so.

The code is closely based on the existing MachineCombiner optimization that is implemented
for AArch64.

This patch should not cause the kind of spilling tragedy that led to the reversion of r236031.

Differential Revision: http://reviews.llvm.org/D10321

llvm-svn: 239486
2015-06-10 20:32:21 +00:00
Daniel Sanders a73f1fdb19 Replace string GNU Triples with llvm::Triple in MCSubtargetInfo and create*MCSubtargetInfo(). NFC.
Summary:
This continues the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.

Reviewers: rafael

Reviewed By: rafael

Subscribers: rafael, ted, jfb, llvm-commits, rengolin, jholewinski

Differential Revision: http://reviews.llvm.org/D10311

llvm-svn: 239467
2015-06-10 12:11:26 +00:00
Sanjay Patel 667a7e2a0f make reciprocal estimate code generation more flexible by adding command-line options (3rd try)
The first try (r238051) to land this was reverted due to ExecutionEngine build failure;
that was hopefully addressed by r238788.

The second try (r238842) to land this was reverted due to BUILD_SHARED_LIBS failure;
that was hopefully addressed by r238953.

This patch adds a TargetRecip class for processing many recip codegen possibilities.
The class is intended to handle both command-line options to llc as well
as options passed in from a front-end such as clang with the -mrecip option.

The x86 backend is updated to use the new functionality.
Only -mcpu=btver2 with -ffast-math should see a functional change from this patch.
All other x86 CPUs continue to *not* use reciprocal estimates by default with -ffast-math.

Differential Revision: http://reviews.llvm.org/D8982

llvm-svn: 239001
2015-06-04 01:32:35 +00:00
Rafael Espindola cf8beece97 Revert "make reciprocal estimate code generation more flexible by adding command-line options (2nd try)"
This reverts commit r238842.

It broke -DBUILD_SHARED_LIBS=ON build.

llvm-svn: 238900
2015-06-03 05:32:44 +00:00
Sanjay Patel 6f031d848e make reciprocal estimate code generation more flexible by adding command-line options (2nd try)
The first try (r238051) to land this was reverted due to bot failures
that were hopefully addressed by r238788.

This patch adds a TargetRecip class for processing many recip codegen possibilities.
The class is intended to handle both command-line options to llc as well
as options passed in from a front-end such as clang with the -mrecip option.

The x86 backend is updated to use the new functionality.
Only -mcpu=btver2 with -ffast-math should see a functional change from this patch.
All other x86 CPUs continue to *not* use reciprocal estimates by default with -ffast-math.

Differential Revision: http://reviews.llvm.org/D8982

llvm-svn: 238842
2015-06-02 15:28:15 +00:00
Reid Kleckner 5b8ebfbc25 Only add the EH state insertion pass on 32-bit Windows
llvm-svn: 238612
2015-05-29 20:43:10 +00:00
Rafael Espindola 445712264d Revert "make reciprocal estimate code generation more flexible by adding command-line options"
This reverts commit r238051.

It broke some bots:

http://lab.llvm.org:8011/builders/llvm-ppc64-linux1/builds/18190

llvm-svn: 238075
2015-05-23 00:22:44 +00:00
Sanjay Patel ba2ba80302 make reciprocal estimate code generation more flexible by adding command-line options
This patch adds a class for processing many recip codegen possibilities.
The TargetRecip class is intended to handle both command-line options to llc as well
as options passed in from a front-end such as clang with the -mrecip option.

The x86 backend is updated to use the new functionality.
Only -mcpu=btver2 with -ffast-math should see a functional change from this patch.
All other CPUs continue to *not* use reciprocal estimates by default with -ffast-math.

Differential Revision: http://reviews.llvm.org/D8982

llvm-svn: 238051
2015-05-22 21:10:06 +00:00
Quentin Colombet 494eb606cd Reapply r238011 with a fix for the trap instruction.
The problem was that I slipped a change required for shrink-wrapping, namely I
used getFirstTerminator instead of the getLastNonDebugInstr that was here before
the refactoring, whereas the surrounding code is not yet patched for that.

Original message:
[X86] Refactor the prologue emission to prepare for shrink-wrapping.

- Add a late pass to expand pseudo instructions (tail call and EH returns).
 Instead of doing it in the prologue emission.
- Factor some static methods in X86FrameLowering to ease code sharing.

NFC.

Related to <rdar://problem/20821487>

llvm-svn: 238035
2015-05-22 18:10:47 +00:00
Tamas Berghammer 466692abdc Revert "[X86] Fix a variable name for r237977 so that it works with every compilers."
Revert "[X86] Refactor the prologue emission to prepare for shrink-wrapping."

This reverts commit 6b3b93fc8b68a2c806aa992ee4bd3d7f61898d4b.
This reverts commit ab0b15dff8539826283a59c2dd700a18a9680e0f.

llvm-svn: 238011
2015-05-22 10:01:56 +00:00