Commit Graph

85953 Commits

Author SHA1 Message Date
Nick Lewycky 947ca8ac52 Fix comment in typo. NFC
llvm-svn: 256761
2016-01-04 16:44:44 +00:00
Joseph Tremoulet 52f729a613 [WinEH] Update CoreCLR EH state numbering
Summary:
Fix the CLR state numbering to generate correct tables, and update the lit
test to verify them.

The CLR numbering assigns one state number to each catchpad and
cleanuppad.

It also computes two tree-like relations over states:
 1) Each state has a "HandlerParentState", which is the state of the next
    outer handler enclosing this state's handler (same as nearest ancestor
    per the ParentPad linkage on EH pads, but skipping over catchswitches).
 2) Each state has a "TryParentState", which:
    a) for a catchpad that's not the last handler on its catchswitch, is
       the state of the next catchpad on that catchswitch.
    b) for all other pads, is the state of the pad whose try region is the
       next outer try region enclosing this state's try region.  The "try
       regions are not present as such in the IR, but will be inferred
       based on the placement of invokes and pads which reach each other
       by exceptional exits.

Catchswitches do not get their own states, but each gets mapped to the
state of its first catchpad.

Table generation requires each state's "unwind dest" state to have a lower
state number than the given state.

Since HandlerParentState can be computed as a function of a pad's
ParentPad, and TryParentState can be computed as a function of its unwind
dest and the TryParentStates of its children, the CLR state numbering
algorithm first computes HandlerParentState in a top-down pass, then
computes TryParentState in a bottom-up pass.

Also reword some comments/names in the CLR EH table generation to make the
distinction between the different kinds of "parent" clear.


Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: AndyAyers, llvm-commits

Differential Revision: http://reviews.llvm.org/D15325

llvm-svn: 256760
2016-01-04 16:16:01 +00:00
Nicolai Haehnle e705aadd67 AMDGPU: Avoid assertions after SGPR spilling failed
Summary:
The comment explains it: emitError does not necessarily exit the compilation
process, and then using NoRegister leads to assertions later on.
This generates incorrect code, of course, but the user should know to not use
the result when an error has been emitted.

It would be nice to have a test-case for this inside the LLVM repository,
but llc exits on error. shader-db tests trigger the underlying issue at least
on Tonga.

Reviewers: arsenm, tstellarAMD, mareko

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15826

llvm-svn: 256757
2016-01-04 15:50:01 +00:00
Michael Zuckerman cf0b6db9ef [AVX512] add PSRAD and PSRAQ Intrinsic
Differential Revision: http://reviews.llvm.org/D15851

llvm-svn: 256754
2016-01-04 13:45:45 +00:00
Michael Zuckerman 000fca44a8 [AVX512] add PSRAW Intrinsic
Differential Revision: http://reviews.llvm.org/D15850

llvm-svn: 256751
2016-01-04 12:50:36 +00:00
Jeroen Ketema 5a02dc46cb [MC] Fix file name in file header
llvm-svn: 256749
2016-01-04 12:22:34 +00:00
Michael Zuckerman 068bc2f219 [AVX512] add PSRLV Intrinsic
Differential Revision: http://reviews.llvm.org/D15838

llvm-svn: 256747
2016-01-04 11:39:06 +00:00
Chandler Carruth 7664127f8c Fix a horrible infloop in value tracking in the face of dead code.
Amazingly, we just never triggered this without:
1) Moving code around for MetadataTracking so that a certain *different*
   amount of inlining occurs in the per-TU compile step.
2) Then you LTO opt or clang with a bootstrap, and get inlining, loop
   opts, and GVN line up everything *just* right.

I don't really know how we didn't hit this before. We really need to be
fuzz testing stuff, it shouldn't be hard to trigger. I'm working on
crafting a reduced nice test case, and will submit that when I have it,
but I want to get LTO build bots going again.

llvm-svn: 256735
2016-01-04 07:23:12 +00:00
Craig Topper 5a6dda5376 [TableGen] Use some free space in Init to store the opcode for UnOpInit/BinOpInit/TernOpInit allowing those types to be a little smaller. NFC
llvm-svn: 256733
2016-01-04 06:28:49 +00:00
David Majnemer ca1c9f074f [X86] Make hasFP constant time
We need a frame pointer if there is a push/pop sequence after the
prologue in order to unwind the stack.  Scanning the instructions to
figure out if this happened made hasFP not constant-time which is a
violation of expectations.  Let's compute this up-front and reuse that
computation when we need it.

llvm-svn: 256730
2016-01-04 04:49:41 +00:00
David Majnemer 42a0730c42 [LICM] Make instruction sinking funclet-aware
We had two bugs here:
- We might try to sink into a catchswitch, causing verifier failures.
- We will succeed in sinking into a cleanuppad but we didn't update the
  funclet operand bundle.

This fixes PR26000.

llvm-svn: 256728
2016-01-04 03:37:39 +00:00
Craig Topper cfd8173327 [TableGen] Change TGParser::SetValue to take an ArrayRef instead of std::vector reference. Use None in many places where a default constructed vector was being passed. NFC
llvm-svn: 256726
2016-01-04 03:15:08 +00:00
Craig Topper 1e23ed9eaa [TableGen] Fix a bug that caused the wrong name for a record built from a multiclass containing a defm called NAME that references another multiclass that contains a defm that uses NAME concatenated with other strings.
It would end up doing the concatenations from the second multiclass twice. This occured because SetValue detected a self assignment when trying to set the value of NAME to a VarInit called NAME. NAME is special here and it will get cleaned up later. So add a flag to suppress the self assignment check for this case.

Strangely the self-assignment error was returning false indicating it wasn't an error, but it wasn't doing the right thing. So this also changes it to report an error.

This fixes the names of some AVX512 FMA instructions that showed this double expansion.

llvm-svn: 256725
2016-01-04 03:05:14 +00:00
Craig Topper e30b8ca149 Use std::is_sorted and std::none_of instead of manual loops. NFC
llvm-svn: 256719
2016-01-03 19:43:40 +00:00
Xinliang David Li 76c3f38774 [PGO] Cleanup: remove reduncant calls in lowering
CoverageMapping data's section and alignment is
already set during creation. No need to call it again
during lowering.

llvm-svn: 256716
2016-01-03 19:38:51 +00:00
Xinliang David Li 5205ca0c70 [PGO] Cleanup: Use covmap header definition in the template file
This is one last remaining instrumentatation related structure
that needs to be migrate to use the centralized template
definition.  With this change, instrumentation code 
related to coverage module header will be kept in sync
with the coverage mapping reader. The remaining code
which makes implicit assumption about covmap control
structure layout in the the lowering pass will cleaned
up in a different patch. This patch is not intended to
have no functional change.

llvm-svn: 256715
2016-01-03 19:26:07 +00:00
Xinliang David Li 946558df2d [PGO] Code refactoring to use header struct def /NFC
llvm-svn: 256712
2016-01-03 18:57:40 +00:00
Simon Pilgrim 5bf96e41c5 [SelectionDAG] Pulled out common code for CONCAT_VECTORS node creation
Pulled out the similar CONCAT_VECTORS creation code from the 2/3 operand getNode() calls (to handle all UNDEF and all BUILD_VECTOR cases). Added a similar handler to the general getNode() call as well.

llvm-svn: 256709
2016-01-03 18:24:19 +00:00
Dimitry Andric 227b928abc Fix several accidental DOS line endings in source files
Summary:
There are a number of files in the tree which have been accidentally checked in with DOS line endings.  Convert these to native line endings.

There are also a few files which have DOS line endings on purpose, and I have set the svn:eol-style property to 'CRLF' on those.

Reviewers: joerg, aaron.ballman

Subscribers: aaron.ballman, sanjoy, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D15848

llvm-svn: 256707
2016-01-03 17:22:03 +00:00
Craig Topper 90b18c4014 Use an ArrayRef to simplify repeated calculation of the array end. NFC
llvm-svn: 256702
2016-01-03 08:45:36 +00:00
Craig Topper 5f167729aa Use std::is_sorted instead of manual loops. NFC
llvm-svn: 256701
2016-01-03 07:33:45 +00:00
Xinliang David Li 37c1fa047d [PGO] simple refactoring (NFC)
llvm-svn: 256695
2016-01-03 04:38:13 +00:00
NAKAMURA Takumi ded575e4eb WinEHPrepare.cpp: Suppress a warning for -Asserts. [-Wunused-variable]
llvm-svn: 256694
2016-01-03 01:41:00 +00:00
Joseph Tremoulet d425dd13ad [Verifier] Add braces to satisfy buildbots. NFC
Fix build break introduced by r256691.

llvm-svn: 256692
2016-01-02 15:50:34 +00:00
Joseph Tremoulet 131a462690 [WinEH] Verify catchswitch handlers
Summary:
The handler list must be nonempty and consist solely of CatchPads.


Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15842

llvm-svn: 256691
2016-01-02 15:25:25 +00:00
Joseph Tremoulet 06125e52a7 [WinEH] Tighten parentPad verifier checks
Summary: A catchswitch cannot be a parent of a cleanuppad or another catchswitch.

Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15841

llvm-svn: 256690
2016-01-02 15:24:24 +00:00
Joseph Tremoulet 71e5676de4 [WinEH] Update catchrets with cloned successors
Summary:
Add a pass to update catchrets when their successors get cloned; the
existing pass doesn't catch these because it walks the funclet whose
blocks are being cloned but the catchret is in a child funclet.

Also update the test for removing incoming PHI values; when the
predecessor is a catchret, the relevant color is the catchret's parentPad,
not its block's color.


Reviewers: andrew.w.kaylor, rnk, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15840

llvm-svn: 256689
2016-01-02 15:22:36 +00:00
Yaron Keren c47c6ac0a5 Correct misleading formatting of several ifs followed by two statements without braces.
While the original code would work with or without braces, it makes sense to
set HaveSemi to true only if (!HaveSemi), otherwise it's already true, so I
put the assignment inside the if block. This addresses PR25998.

llvm-svn: 256688
2016-01-02 13:40:36 +00:00
David Majnemer dbdc9c274d [WinEH] Add additional verification
Recolor the IR to make sure our computed colors are not hiding any bugs.
Also, verifyFunction if we are running some post-preparation operations;
some of these operations can hide latent bugs.

llvm-svn: 256687
2016-01-02 09:26:36 +00:00
David Majnemer 011980cd50 [X86] Add intrinsics for reading and writing to the flags register
LLVM's targets need to know if stack pointer adjustments occur after the
prologue.  This is needed to correctly determine if the red-zone is
appropriate to use or if a frame pointer is required.

Normally, LLVM can figure this out very precisely by reasoning about the
contents of the MachineFunction.  There is an interesting corner case:
inline assembly.

The vast majority of inline assembly which will perform a push or pop is
done so to pair up with pushf or popf as appropriate.  Unfortunately,
this inline assembly doesn't mark the stack pointer as clobbered
because, well, it isn't.  The stack pointer is decremented and then
immediately incremented.  Because of this, LLVM was changed in r256456
to conservatively assume that inline assembly contain a sequence of
stack operations.  This is unfortunate because the vast majority of
inline assembly will not end up manipulating the stack pointer in any
way at all.

Instead, let's provide a more principled solution: an intrinsic.
FWIW, other compilers (MSVC and GCC among them) also provide this
functionality as an intrinsic.

llvm-svn: 256685
2016-01-01 06:50:01 +00:00
Sanjay Patel bee05caa6b [LibCallSimplifier] propagate FMF when shrinking binary calls
llvm-svn: 256682
2015-12-31 23:40:59 +00:00
Craig Topper 74658dfaad [X86] Remove a return after llvm_unreachable.
llvm-svn: 256681
2015-12-31 22:40:48 +00:00
Craig Topper 69653af748 [X86] Move shuffle decoding for constant pool into the X86CodeGen library to remove a layering violation in the Util library.
llvm-svn: 256680
2015-12-31 22:40:45 +00:00
Sanjay Patel aa23114cb4 [LibCallSimplifier] propagate FMF when shrinking unary calls
llvm-svn: 256679
2015-12-31 21:52:31 +00:00
Sanjay Patel 96475cbd22 Variable names start with an upper case letter; NFC
llvm-svn: 256676
2015-12-31 16:16:58 +00:00
Sanjay Patel d707db97a9 fix formatting; NFC
llvm-svn: 256675
2015-12-31 16:10:49 +00:00
Michael Zuckerman 0dc468880d [AVX512] add PSRLQ and PSRLD Intrinsic
Differential Revision: http://reviews.llvm.org/D15770

llvm-svn: 256673
2015-12-31 15:22:04 +00:00
Michael Kuperstein d36e24a166 [X86] Avoid folding scalar loads into unary sse intrinsics
Not folding these cases tends to avoid partial register updates:
sqrtss (%eax), %xmm0
Has a partial update of %xmm0, while
movss (%eax), %xmm0
sqrtss %xmm0, %xmm0
Has a clobber of the high lanes immediately before the partial update,
avoiding a potential stall.

Given this, we only want to fold when optimizing for size.
This is consistent with the patterns we already have for some of
the fp/int converts, and in X86InstrInfo::foldMemoryOperandImpl()

Differential Revision: http://reviews.llvm.org/D15741

llvm-svn: 256671
2015-12-31 09:45:16 +00:00
Asaf Badouh af6569afd2 [X86][PKU] Add {RD,WR}PKRU intrinsics
Differential Revision: http://reviews.llvm.org/D15808

llvm-svn: 256670
2015-12-31 08:31:13 +00:00
Craig Topper fd2c6a3be0 [TableGen] Modify the AsmMatcherEmitter to only apply the table growth from r252440 to the Hexagon target.
This restores the previous behavior of not including the mnemonic in the classes table for every target that starts instruction lines with the mnemonic. Not only did the table size increase by 1 entry, but the class enum increased in size which caused every class in the array to increase in size. It also grew the size of the function that parsers tokens into classes by a substantial amount.

This adds a new HasMnemonicFirst flag to all AsmParsers. It's set to 1 by default and Hexagon target overrides it to 0.

For the X86 target alone this recovers 324KB of size on the llvm-mc executable.

I believe the current state is still a bad design choice for the Hexagon target as it causes most of the parsing to do a linear search through the entire match table to comparing operands against every instruction until it finds one that works. At least for the other targets we do a binary search based on mnemonic over which to do the linear scan.

llvm-svn: 256669
2015-12-31 08:18:23 +00:00
Xinliang David Li e413f1a0fc [PGO]: Implement Func PGO name string compression
This is part of the effort/prepration to reduce the size
instr-pgo (object, binary, memory footprint, and raw data).

The functionality is currently off by default and not yet
used by any clients.

llvm-svn: 256667
2015-12-31 07:57:16 +00:00
Sanjay Patel 41160c2094 [ValueTracking] fix bug computing isKnownToBeAPowerOfTwo() with arithmetic shift right (PR25900)
This is a fix for:
https://llvm.org/bugs/show_bug.cgi?id=25900

If we think that an arithmetic right shift of a power of two is always a power of two, 
an sdiv gets wrongly converted to udiv.

Differential Revision: http://reviews.llvm.org/D15827

llvm-svn: 256655
2015-12-30 22:40:52 +00:00
Teresa Johnson 96f7f81aa3 [ThinLTO] Rename variables used in metadata linking (NFC)
As suggested in review for r255909, rename MDMaterialized to AllowTemps,
and identify the name of the boolean flag being set in calls to
saveMetadataList.

llvm-svn: 256653
2015-12-30 21:13:55 +00:00
Teresa Johnson cc428573a2 Ensure MDNode used as key in metadata linking map cannot be RAUWed
As suggested in review for r255909, add a way to ensure that temporary
MD used as keys in the MetadataToID map during ThinLTO importing are not
RAUWed.

Add support for marking an MDNode as not replaceable. Clear the new
CanReplace flag when adding a temporary MD node to the MetadataToID map
and clear it when destroying the map.

llvm-svn: 256648
2015-12-30 19:32:24 +00:00
Teresa Johnson 26aa93586a [ThinLTO] Check MDNode values saved for metadata linking (NFC)
Add an assert suggested in review for r255909 to ensure that MDNodes
saved in the map used for metadata linking are either temporary or
resolved.

Also add a comment clarifying why we may need to save off non-MDNode
metadata.

llvm-svn: 256646
2015-12-30 19:13:57 +00:00
Sanjay Patel 16395dd709 fix formatting; NFC
llvm-svn: 256645
2015-12-30 18:31:30 +00:00
Teresa Johnson 61b406ec75 Rename MDValue* to Metadata* (NFC)
Renamed MDValue* to Metadata*, and MDValueToValIDMap to MetadataToIDs,
as per review for r255909.

llvm-svn: 256593
2015-12-29 23:00:22 +00:00
Manuel Jacob 67f1d3ac63 [RS4GC] Use DenseMap::count() instead of DenseMap::find()/DenseMap::end(). NFC.
llvm-svn: 256586
2015-12-29 22:16:41 +00:00
Sanjay Patel ac6e910c42 don't repeat function names in comments; NFC
llvm-svn: 256584
2015-12-29 22:11:50 +00:00
Sanjay Patel d1d9db5889 use auto with dyn_casted values; NFC
llvm-svn: 256581
2015-12-29 22:00:37 +00:00
Manuel Jacob e3773d632e [PlaceSafepoints] Assert that the gc.safepoint_poll function is present in the module.
If running the PlaceSafepoints pass on a module which doesn't have the
gc.safepoint_poll function without disabling entry and backedge safepoints,
previously the pass crashed with an obscure error because of a null pointer.
Now it fails the assert instead.

llvm-svn: 256580
2015-12-29 21:57:55 +00:00
Sanjay Patel 7a7abc9a3b use auto with dyn_casted values; NFC
llvm-svn: 256579
2015-12-29 21:49:08 +00:00
Sanjay Patel b120ae96d3 fix formatting; NFC
llvm-svn: 256574
2015-12-29 19:34:53 +00:00
Sanjay Patel 4104f78640 use range-based for-loops; NFCI
llvm-svn: 256573
2015-12-29 19:14:23 +00:00
Sanjay Patel faeee6f44d use range-based for-loop; NFCI
llvm-svn: 256572
2015-12-29 18:30:09 +00:00
Chad Rosier 6b4326367a Add command line options to force function/loop alignments.
These are being added for testing purposes.
http://reviews.llvm.org/D15648

llvm-svn: 256571
2015-12-29 18:18:07 +00:00
Sanjay Patel 59309cc090 don't repeat function names in comments; NFC
llvm-svn: 256569
2015-12-29 18:14:06 +00:00
Geoff Berry 43dc285915 [JumpThreading] Fix opcode bonus in getJumpThreadDuplicationCost()
The code that was meant to adjust the duplication cost based on the
terminator opcode was not being executed in cases where the initial
threshold was hit inside the loop.

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D15536

llvm-svn: 256568
2015-12-29 18:10:16 +00:00
Sanjay Patel 7dd45697ca use range-based for-loops; NFCI
llvm-svn: 256566
2015-12-29 17:15:22 +00:00
Philip Reames 87a8677e3d [MemoryBuiltins] Delete dead code [NFC]
llvm-svn: 256565
2015-12-29 17:04:43 +00:00
Michael Zuckerman 80821ee77c [AVX512] add PSRLW Intrinsic
Differential Revision: http://reviews.llvm.org/D15751

llvm-svn: 256558
2015-12-29 13:04:35 +00:00
Chandler Carruth 5bd31b37ca [ptr-traits] Provide a real MCFragment address for the sentinel instead
of casting the integer '4' to such a pointer. There is no reason to
expect '4' to be a portable or reliable pointer of this form. The only
reason this ever worked is because the PointerIntPair that this actually
gets used with has an artificially *low* presumed alignment that allowed
it to work. When the alignment of PointerIntPair is derived from the
actual type's alignment, the asserts start firing on this pointer. I'm
amazed we never managed to do anything that triggered the alignment
sanitizer with it, as this is just flat out UB.

If folks dislike this approach to providing a sentinel fragment address,
there are a myriad of other alternatives, suggestions welcome. But this
one has the distinct advantage of not requiring the friend dance of
ilist's sentinel (which I'll point out is *also* in play for
MCFragment!) and seems to be using a nicely provided facility in
MCFragment to establish just such dummy nodes.

This is part of a series of patches to allow LLVM to check for complete
pointee types when computing its pointer traits. This is absolutely
necessary to get correct (or reproducible) results for things like how
many low bits are guaranteed to be zero.

llvm-svn: 256552
2015-12-29 09:32:18 +00:00
Chandler Carruth e0115344e6 [ptr-traits] Sink a constructor definition to the .cpp file and add
missing includes so that the pointee types for DenseMap pointer keys and
such are complete prior to us querying the pointer traits for them.

This is part of a series of patches to allow LLVM to check for complete
pointee types when computing its pointer traits. This is absolutely
necessary to get correct (or reproducible) results for things like how
many low bits are guaranteed to be zero.

llvm-svn: 256550
2015-12-29 09:24:39 +00:00
Chandler Carruth 8d736236d3 [ptr-traits] Split the MCFragment type hierarchy out of the MCAssembler
header to its own header, allowing users of fragments to have a narrower
header file, and avoid circular header dependencies when getting the
definition of MCSection prior to inspecting traits on MCSection
pointers.

This is part of a series of patches to allow LLVM to check for complete
pointee types when computing its pointer traits. This is absolutely
necessary to get correct (or reproducible) results for things like how
many low bits are guaranteed to be zero.

Note that this doesn't in any way change the design of MC, it is just
moving code around to allow the *header files* to be more fine grained.
Without this, it is impossible to get a complete type for MCSection
where it is needed.

If anyone would prefer a different slicing of the header files, I'm
happy to oblige of course. =]

llvm-svn: 256548
2015-12-29 09:06:16 +00:00
Craig Topper d270501a6e [TableGen] Remove MnemonicContainsDot from AsmParser. It isn't used. NFC
llvm-svn: 256542
2015-12-29 07:03:30 +00:00
Craig Topper 3294966ed7 [X86] Remove declaration of ATTAsmParser. Its equivalent to the DefaultAsmParser. NFC
llvm-svn: 256541
2015-12-29 07:03:27 +00:00
Chandler Carruth a7dc087b36 [ptr-traits] Merge the MetadataTracking helpers into the Metadata
header.

This is part of a series of patches to allow LLVM to check for complete
pointee types when computing its pointer traits. This is absolutely
necessary to get correct (or reproducible) results for things like how
many low bits are guaranteed to be zero.

The MetadataTracking helpers aren't actually independent. They rely on
constructing a PointerUnion between Metadata and MetadataAsValue
pointers, which requires know the alignment of pointers to those types
which requires them to be complete.

The .cpp file even defined a method declared in Metadata.h! These really
don't seem like something that is separable, and there is no real
layering problem with just placing them together.

llvm-svn: 256531
2015-12-29 02:14:50 +00:00
Eric Christopher 2ae7180b24 Accept dwarf version 5 for CIE versions.
llvm-svn: 256527
2015-12-28 23:02:42 +00:00
Artyom Skrobov 2aca0c622a [Thumb] Fix assembler error 'cannot honor width suffix pop {lr}'
Summary:
* avoid generating POP {LR} in Thumb1 epilogues
* combine MOV LR, Rx + BX LR -> BX Rx in a peephole optimization pass
* combine POP {LR} + B + BX LR -> POP {PC} on v5T+

Test cases by Ana Pazos

Differential Revision: http://reviews.llvm.org/D15707

llvm-svn: 256523
2015-12-28 21:40:45 +00:00
Sanjay Patel b3c53e512f [x86] lower calls to fmin and llvm.minnum.* using minss/minsd/minps/minpd (PR24475)
This is a follow-on to:
http://reviews.llvm.org/rL255700
http://reviews.llvm.org/rL256454
http://reviews.llvm.org/rL256510

llvm-svn: 256522
2015-12-28 21:16:55 +00:00
Easwaran Raman b9f7120e7a Refactor inline costs analysis by removing the InlineCostAnalysis class
InlineCostAnalysis is an analysis pass without any need for it to be one.
Once it stops being an analysis pass, it doesn't maintain any useful state
and the member functions inside can be made free functions. NFC.

Differential Revision: http://reviews.llvm.org/D15701

llvm-svn: 256521
2015-12-28 20:28:19 +00:00
Manuel Jacob 9db5b93ffc [RS4GC] Fix rematerialization of bitcast of bitcast.
Summary:
Previously, only the outer (last) bitcast was rematerialized, resulting in a
use of the unrelocated inner (first) bitcast after the statepoint.  See the
test case for an example.

Reviewers: igor-laevsky, reames

Subscribers: reames, alex, llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D15789

llvm-svn: 256520
2015-12-28 20:14:05 +00:00
Elena Demikhovsky 5494698828 Implemented cost model for masked gather and scatter operations
The cost is calculated for all X86 targets. When gather/scatter instruction
is not supported we calculate the cost of scalar sequence.

Differential revision: http://reviews.llvm.org/D15677

llvm-svn: 256519
2015-12-28 20:10:59 +00:00
Sanjay Patel 9da2b647c7 [x86] lower calls to fmax and llvm.maxnum.* using maxps/maxpd (PR24475)
This is a follow-on to:
http://reviews.llvm.org/rL255700
http://reviews.llvm.org/rL256454

llvm-svn: 256510
2015-12-28 19:20:19 +00:00
Sanjay Patel cc4c71b4fb tidy up; NFC
llvm-svn: 256506
2015-12-28 18:18:22 +00:00
Roman Divacky 73fc84761f Support clrex instruction on ARMv6k. Patch by Andrew Turner.
llvm-svn: 256505
2015-12-28 17:47:23 +00:00
Alexander Kornienko d0af3b3178 Refactor: Simplify boolean conditional return statements in lib/Transforms/ObjCARC
Summary: Use clang-tidy to simplify boolean conditional return statements

Reviewers: craig.topper, bkramer, chandlerc, gottesmm

Subscribers: llvm-commits

Patch by Richard Thomson!

Differential Revision: http://reviews.llvm.org/D9999

llvm-svn: 256502
2015-12-28 16:19:08 +00:00
Alexander Kornienko 66da20a6f2 Refactor: Simplify boolean conditional return statements in llvm/lib/Support
Summary: Use clang-tidy to simplify boolean conditional return statements

Reviewers: rafael, bkramer, ddunbar, Bigcheese, chandlerc, chapuni, nicholas, alexfh

Subscribers: alexfh, craig.topper, llvm-commits

Patch by Richard Thomson!

Differential Revision: http://reviews.llvm.org/D9978

llvm-svn: 256500
2015-12-28 15:46:15 +00:00
Michael Kuperstein 2ea81baf3a [X86] Better support for the MCU psABI (LLVM part)
This adds support for the MCU psABI in a way different from r251223 and r251224,
basically reverting most of these two patches. The problem with the approach
taken in r251223/4 is that it only handled libcalls that originated from the backend.
However, the mid-end also inserts quite a few libcalls and assumes these use the
platform's default calling convention.

The previous patch tried to insert inregs when necessary both in the FE and,
somewhat hackily, in the CG. Instead, we now define a new default calling convention
for the MCU, which doesn't use inreg marking at all, similarly to what x86-64 does.

Differential Revision: http://reviews.llvm.org/D15054

llvm-svn: 256494
2015-12-28 14:39:21 +00:00
Alexander Kornienko 175a7cbf3f Refactor: Simplify boolean conditional return statements in lib/Target/PowerPC
Summary: Use clang-tidy to simplify boolean conditional return statements

Reviewers: uweigand, rafael, wschmidt

Subscribers: craig.topper, llvm-commits

Patch by Richard Thomson!

Differential Revision: http://reviews.llvm.org/D9984

llvm-svn: 256493
2015-12-28 13:38:42 +00:00
Asaf Badouh fba562004b [X86][AVX512] Lower broadcast sub vector to vector inrtrinsics
lower broadcast<type>x<vector> to shuffles.
 there are two cases:
1.src is 128 bits and dest is 512 bits: in this case we will lower it to shuffle with imm = 0.
2.src is 256 bit and dest is 512 bits: in this case we will lower it to shuffle with imm = 01000100b (0x44) that way we will broadcast the 256bit source: ymm[0,1,2,3] => zmm[0,1,2,3,0,1,2,3] then it will mask it with the passthru value (in case it's mask op).



Differential Revision: http://reviews.llvm.org/D15790

llvm-svn: 256490
2015-12-28 08:26:26 +00:00
Asaf Badouh 5546f51011 [X86][AVX512] add fp scalar broadcast intrinsics
Differential Revision: http://reviews.llvm.org/D15790

llvm-svn: 256489
2015-12-28 08:09:25 +00:00
Craig Topper 401675ce5b [AVX512] Remove VEX_LIG from vmovd/vmovq instructions. From what I can tell from the Intel docs these instructions require the L-bit to be 0.
llvm-svn: 256486
2015-12-28 06:32:47 +00:00
Craig Topper af88afb214 [AVX512] Fix some places that used FR64 instead of FR64X.
llvm-svn: 256484
2015-12-28 06:11:45 +00:00
Craig Topper c648c9b92d [AVX512] Bring vmovq instructions names into alignment with the AVX and SSE names. Add a missing encoding to disassembler and assembler.
I believe this also fixes a case where a 64-bit memory form that is documented as being unsupported in 32-bit mode was able to be selected there.

llvm-svn: 256483
2015-12-28 06:11:42 +00:00
Craig Topper b4c56624eb [X86] Move address for store target from outs to ins on a couple instructions.
llvm-svn: 256482
2015-12-28 06:11:39 +00:00
Craig Topper cd4621a8ab [X86] Add proper Uses/Defs/mayLoad flags for AAA/AAD/AAM/AAS/DAA/DAS/XLAT instructions.
llvm-svn: 256481
2015-12-28 06:11:37 +00:00
Chandler Carruth 9153911551 [lcg] Fix a few more formatting goofs found by clang-format. NFC.
llvm-svn: 256480
2015-12-28 01:54:20 +00:00
Craig Topper f3ed5c115c [AVX512] Remove separate instruction and patterns for lowering ctlz_zero_undef. Change the operation for CTLZ_ZERO_UNDEF to Expand so SelectionDAG will convert them to CTLZ before lowering.
llvm-svn: 256477
2015-12-27 21:33:50 +00:00
Craig Topper 4b1808d8e7 [SelectionDAG] Teach LegalizeVectorOps to not unroll CTLZ_ZERO_UNDEF and CTTZ_ZERO_UNDEF if the non-ZERO_UNDEF form is legal or custom. Will be used to simplify X86 code in a follow on commit.
llvm-svn: 256476
2015-12-27 21:33:47 +00:00
Craig Topper c48fa89e44 [AVX512] Remove alternate data type versions of VALIGND, VALIGNQ, VMOVSHDUP and VMOVSLDUP. They don't have any tests and I don't think they can be selected. If they are truly needed they should be implemented with patterns against the normal instructions and not separate instructions.
llvm-svn: 256475
2015-12-27 19:45:21 +00:00
Igor Breger 756c289dd8 AVX512: Change VPMOVB2M DAG lowering , use CVT2MASK node instead TRUNCATE.
Fix TRUNCATE lowering vector to vector i1, use LSB and not MSB.
Implement VPMOVB/W/D/Q2M intrinsic.

Differential Revision: http://reviews.llvm.org/D15675

llvm-svn: 256470
2015-12-27 13:56:16 +00:00
Asaf Badouh b0d91fa42a [X86][AVX512] change broadcast to use maskable pattern
Differential Revision: http://reviews.llvm.org/D15786

llvm-svn: 256469
2015-12-27 12:14:34 +00:00
Chandler Carruth 3a040e6d47 [attrs] Extract the pure inference of function attributes into
a standalone pass.

There is no call graph or even interesting analysis for this part of
function attributes -- it is literally inferring attributes based on the
target library identification. As such, we can do it using a much
simpler module pass that just walks the declarations. This can also
happen much earlier in the pass pipeline which has benefits for any
number of other passes.

In the process, I've cleaned up one particular aspect of the logic which
was necessary in order to separate the two passes cleanly. It now counts
inferred attributes independently rather than just counting all the
inferred attributes as one, and the counts are more clearly explained.

The two test cases we had for this code path are both ... woefully
inadequate and copies of each other. I've kept the superset test and
updated it. We need more testing here, but I had to pick somewhere to
stop fixing everything broken I saw here.

Differential Revision: http://reviews.llvm.org/D15676

llvm-svn: 256466
2015-12-27 08:41:34 +00:00
Chandler Carruth f49f1a87ef [attrs] Split off the forced attributes utility into its own pass that
is (by default) run much earlier than FuncitonAttrs proper.

This allows forcing optnone or other widely impactful attributes. It is
also a bit simpler as the force attribute behavior needs no specific
iteration order.

I've added the pass into the default module pass pipeline and LTO pass
pipeline which mirrors where function attrs itself was being run.

Differential Revision: http://reviews.llvm.org/D15668

llvm-svn: 256465
2015-12-27 08:13:45 +00:00
Craig Topper f8423c05ee [AVX-512] Remove alernate integer forms for VPERMILPS and VPERMILPD. There no tests for them and I don't see any way to select them anyway. If they are really needed they should be implemented as patterns and not full fledged instructions.
llvm-svn: 256462
2015-12-27 06:55:08 +00:00
David Majnemer 334676355a [X86, Win64] Use a frame pointer if pushf is emitted
A frame pointer must be used if stack pointer is modified after the
prologue.  LLVM will emit pushf/popf if we need to save/restore the
FLAGS register, requiring us to have a frame pointer for the function.

There is a small twist: this sequence might exist in user code via
inline-assembly.  For now, conservatively assume that such functions
require a frame pointer.  For real world justification, please see
clang's implementation of __readeflags.

This fixes PR25945.

llvm-svn: 256456
2015-12-27 06:07:26 +00:00
David Majnemer 081e8fe4c0 [WinEH] Add comments explaining the EH tables
This is aids in debugging WinEH, similar functionality is present for
DWARF EH.

llvm-svn: 256455
2015-12-27 06:07:12 +00:00
Sanjay Patel bcff3f7d92 [x86] lower calls to llvm.maxnum.v4f32 using maxps
This is a follow-on to:
http://reviews.llvm.org/rL255700

llvm-svn: 256454
2015-12-26 21:44:55 +00:00
Craig Topper 5ce29aa307 [X86] Fix an unused variable warning in released builds.
llvm-svn: 256453
2015-12-26 20:13:33 +00:00
Craig Topper 7e3ba15529 [X86] Add support for printing shuffle comments for AVX512 PSHUFB instructions.
llvm-svn: 256452
2015-12-26 19:48:43 +00:00
Craig Topper fa5f35e6ad [X86] Fold some variable declarations and initializations into if statements. NFC
llvm-svn: 256451
2015-12-26 19:48:37 +00:00
Chen Li d71999ef1b [gc.statepoint] Change gc.statepoint intrinsic's return type to token type instead of i32 type
Summary: This patch changes gc.statepoint intrinsic's return type to token type instead of i32 type. Using token types could prevent LLVM to merge different gc.statepoint nodes into PHI nodes and cause further problems with gc relocations. The patch also changes the way on how gc.relocate and gc.result look for their corresponding gc.statepoint on unwind path. The current implementation uses the selector value extracted from a { i8*, i32 } landingpad as a hook to find the gc.statepoint, while the patch directly uses a token type landingpad (http://reviews.llvm.org/D15405) to find the gc.statepoint. 

Reviewers: sanjoy, JosephTremoulet, pgavlin, igor-laevsky, mjacob

Subscribers: reames, mjacob, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D15662

llvm-svn: 256443
2015-12-26 07:54:32 +00:00
Craig Topper d400019447 [X86] Fix shuffle decoding for variable VPERMIL to be tolerant of the Constant type not matching due to folding in the constant pool and to get VPERMILPD correct.
llvm-svn: 256433
2015-12-26 04:50:07 +00:00
Craig Topper 53bd5cac86 [X86] Fix copy and paste typo from pasting from another Makefile to restore code.
llvm-svn: 256431
2015-12-25 23:27:57 +00:00
Craig Topper 96c985169b [X86] Put back the include path to the main X86 sources in the AsmParser library to fix the bots.
llvm-svn: 256430
2015-12-25 22:22:16 +00:00
Craig Topper 95e5596228 [X86] Remove X86CodeGen dependency from the AsmParser library.
llvm-svn: 256429
2015-12-25 22:10:11 +00:00
Craig Topper c0453e87dc [X86] Move getX86SubSuperRegisterOrZero to X86MCTargetDesc.cpp so it can be used by AsmParser library without depending on X86CodeGen library.
llvm-svn: 256428
2015-12-25 22:10:08 +00:00
Craig Topper daf2e3ff7a Remove extra forward declarations and scrub includes for all in tree InstPrinters. NFC
llvm-svn: 256427
2015-12-25 22:10:01 +00:00
Craig Topper c7277d9485 [X86] Move AVX512 STATIC_ROUNDING enum to X86BaseInfo.h to fix a layering violation in AsmParser.
llvm-svn: 256426
2015-12-25 22:09:49 +00:00
Craig Topper 91dab7baee [X86] Replace MVT::SimpleValueType in the AsmParser library and getX86SubSuperRegister with just an unsigned representing size.
This a is step towards fixing a layering violation so the X86 AsmParser won't depending on CodeGen types.

llvm-svn: 256425
2015-12-25 22:09:45 +00:00
Craig Topper 2c7d7c2584 [X86] Don't pass the default value to the High argument of getX86SubSuperRegister. Most place don't care about this argument. NFC
llvm-svn: 256424
2015-12-25 19:44:16 +00:00
Craig Topper d59bc5188d [X86] getX86SubSuperRegisterOrZero shouldn't call getX86SubSuperRegister recursively. It should call itself instead. Otherwise it might fire an assertion when it was designed not too.
llvm-svn: 256422
2015-12-25 17:07:32 +00:00
Craig Topper 3453a43da9 [X86] Add missing X86II::MRM_C4, MRM_C5, etc. encodings to getMemoryOperandNo. These aren't used by any instructions, but could be someday. NFC
llvm-svn: 256421
2015-12-25 17:07:30 +00:00
Craig Topper f804af209d [X86] Use assert instead of if and llvm_unreachable. NFC
llvm-svn: 256420
2015-12-25 17:07:27 +00:00
Craig Topper 3fb423ef8b [X86] Minor identation fixes. NFC
llvm-svn: 256419
2015-12-25 17:07:24 +00:00
David Majnemer 11234ed7d3 [CodeGen] Use generic printAsOperand machinery instead of hand rolling it
We already know how to properly print out basic blocks in
printAsOperand, we should not roll it ourselves in
AsmPrinter::EmitBasicBlockStart.  No functionality change is intended.

llvm-svn: 256413
2015-12-25 09:37:26 +00:00
Craig Topper 370c8d6c6b [IR] Mark the Type subclass helper methods 'inline' and move their definitions to DerivedTypes.h so they can be inlined by the compiler.
llvm-svn: 256406
2015-12-25 04:06:20 +00:00
Craig Topper 582d8ecf6a [Transforms] Use asserts instead of ifs around llvm_unreachable. NFC
llvm-svn: 256405
2015-12-25 02:04:17 +00:00
Dan Gohman 8887d1faed [WebAssembly] Fix handling of COPY instructions in WebAssemblyRegStackify.
Move RegStackify after coalescing and teach it to use LiveIntervals instead
of depending on SSA form. This avoids a problem where a register in a COPY
instruction is stackified and then subsequently coalesced with a register
that is not stackified.

This also puts it after the scheduler, which allows us to simplify the
EXPR_STACK constraint, as we no longer have instructions being reordered
after stackification and before coloring.

llvm-svn: 256402
2015-12-25 00:31:02 +00:00
Sanjay Patel ae945e7927 [InstCombine] transform more extract/insert pairs into shuffles (PR2109)
This is an extension of the shuffle combining from r203229:
http://reviews.llvm.org/rL203229

The idea is to widen a short input vector with undef elements so the
existing shuffle transform for extract/insert can kick in.

The motivation is to finally solve PR2109:
https://llvm.org/bugs/show_bug.cgi?id=2109

For that example, the IR becomes:

%1 = bitcast <2 x i32>* %P to <2 x float>*
%ld1 = load <2 x float>, <2 x float>* %1, align 8
%2 = shufflevector <2 x float> %ld1, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>
%i2 = shufflevector <4 x float> %A, <4 x float> %2, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
ret <4 x float> %i2

And x86 SSE output improves from:

movq	(%rdi), %xmm1           ## xmm1 = mem[0],zero
movdqa	%xmm1, %xmm2
shufps	$229, %xmm2, %xmm2      ## xmm2 = xmm2[1,1,2,3]
shufps	$48, %xmm0, %xmm1       ## xmm1 = xmm1[0,0],xmm0[3,0]
shufps	$132, %xmm1, %xmm0      ## xmm0 = xmm0[0,1],xmm1[0,2]
shufps	$32, %xmm0, %xmm2       ## xmm2 = xmm2[0,0],xmm0[2,0]
shufps	$36, %xmm2, %xmm0       ## xmm0 = xmm0[0,1],xmm2[2,0]
retq

To the almost optimal:

movhpd	(%rdi), %xmm0

Note: There's a tension in the existing transform related to generating
arbitrary shufflevector masks. We avoid that in other places in InstCombine
because we're scared that codegen can't handle strange masks, but it looks
like we're ok with producing those here. I purposely chose weird insert/extract
indexes for the regression tests to see the effect in these cases. 
For PowerPC+Altivec, AArch64, and X86+SSE/AVX, I think the codegen is equal or
better for these examples.

Differential Revision: http://reviews.llvm.org/D15096

llvm-svn: 256394
2015-12-24 21:17:56 +00:00
Dave Bartolomeo a779f5a401 Remove unused constants from TypeTableBuilder.cpp.
llvm-svn: 256389
2015-12-24 19:15:56 +00:00
Bill Seurer 8771bbfbe2 Fix case of path name
llvm-svn: 256388
2015-12-24 18:54:35 +00:00
Dave Bartolomeo dd38b1bf12 Fix CodeView library name and non-CMake builds
llvm-svn: 256387
2015-12-24 18:51:35 +00:00
Dave Bartolomeo 89ba802b92 LLVM CodeView library
Summary: This diff is the initial implementation of the LLVM CodeView library. There is much more work to be done, namely a CodeView dumper and tests. This patch should help others make progress on the LLVM->CodeView debug info emission while I continue with the implementation of the dumper and tests.

This library implements support for emitting debug info in the CodeView format. This phase of the implementation only includes support for CodeView type records. Clients that need to emit type records will use a class derived from TypeTableBuilder. TypeTableBuilder provides member functions for writing each kind of type record; each of these functions eventually calls the writeRecord virtual function to emit the actual bits of the record. Derived classes override writeRecord to implement the folding of duplicate records and the actual emission to the appropriate destination. LLVMCodeView provides MemoryTypeTableBuilder, which creates the table in memory. In the future, other classes derived from TypeTableBuilder will write to other destinations, such as the type stream in a PDB.

The rest of the types in LLVMCodeView define the actual CodeView type records and all of the supporting enums and other types used in the type records. The TypeIndex class is of particular interest, because it is used by clients as a handle to a type in the type table.

The library provides a relatively low-level interface based on the actual on-disk format of CodeView. For example, type records refer to other type records by TypeIndex, rather than by an actual pointer to the referent record. This allows clients to emit type records one at a time, rather than having to keep the entire transitive closure of type records in memory until everything has been emitted. At some point, having a higher-level interface layered on top of this one may be useful for debuggers and other tools that want a more holistic view of the debug info. The lower-level interface should be sufficient for compilers and linkers to do the debug info manipulation that they need to do efficiently.

Reviewers: rnk, majnemer

Subscribers: silvas, rnk, jevinskie, llvm-commits

Differential Revision: http://reviews.llvm.org/D14961

llvm-svn: 256385
2015-12-24 18:12:38 +00:00
Marina Yatsina 8dfd5cbb73 [X86][ms-inline asm] Add support for memory operands that include structs
Add ability to reference struct symbols in memory operands.
Test case will be added on the clang side (review http://reviews.llvm.org/D15749)

Differential Revision: http://reviews.llvm.org/D15748

llvm-svn: 256381
2015-12-24 12:09:51 +00:00
Benjamin Kramer 7a5c8c8fe3 [ProfileData] Make helper function static.
No functional change.

llvm-svn: 256375
2015-12-24 10:03:37 +00:00
Benjamin Kramer fe2b541546 [FunctionImport] Move pass into anonymous namespace.
No functional change.

llvm-svn: 256374
2015-12-24 10:03:35 +00:00
Chandler Carruth 85dbea99ee Add a missing const qualifier on the context instruction. This somehow
has always been missing. =/

llvm-svn: 256371
2015-12-24 09:08:08 +00:00
Asaf Badouh 9a5a83a518 [X86][PKU] Add {RD,WR}PKRU encoding
Differential Revision: http://reviews.llvm.org/D15711

llvm-svn: 256366
2015-12-24 08:25:00 +00:00
Elena Demikhovsky 9e225a2f52 AVX-512: Kreg set 0/1 optimization
The patterns that set a mask register to 0/1
KXOR %kn, %kn, %kn / KXNOR %kn, %kn, %kn
are replaced with
KXOR %k0, %k0, %kn / KXNOR %k0, %k0, %kn - AVX-512 targets optimization.

KNL does not recognize dependency-breaking idioms for mask registers,
so kxnor %k1, %k1, %k2 has a RAW dependence on %k1.
Using %k0 as the undef input register is a performance heuristic based
on the assumption that %k0 is used less frequently than the other mask
registers, since it is not usable as a write mask.

Differential Revision: http://reviews.llvm.org/D15739

llvm-svn: 256365
2015-12-24 08:12:22 +00:00
Igor Breger 268f6f53c5 AVX512: VPMOVM2B/W/D/Q intrinsic implementation.
Differential Revision: http://reviews.llvm.org//D15747

llvm-svn: 256364
2015-12-24 07:11:53 +00:00
Craig Topper 73275a2951 Use range-based for loops. NFC
llvm-svn: 256363
2015-12-24 05:20:40 +00:00
Matt Arsenault 4339b3ff35 AMDGPU: Fix getRegisterBitWidth for vectors
llvm-svn: 256362
2015-12-24 05:14:55 +00:00
Nico Weber 95cc9d5f14 Revert r256336, it caused PR25939
llvm-svn: 256361
2015-12-24 04:01:06 +00:00
Tom Stellard 5ebdfbe562 AMDGPU/SI: Fix encoding of flat instructions on VI
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15735

llvm-svn: 256360
2015-12-24 03:18:18 +00:00
Tom Stellard 668f793049 AMDGPU/SI: Remove non-existent flat instructions
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15734

llvm-svn: 256357
2015-12-24 02:41:55 +00:00
Philip Reames cb0f947a2a [Statepoints] Use Indirect operands for spill slots
Teach the statepoint lowering code to emit Indirect stackmap entries for spill inserted by StatepointLowering (i.e. SelectionDAG), but Direct stackmap entries for in-IR allocas which represent manual stack slots. This is what the docs call for (http://llvm.org/docs/StackMaps.html#stack-map-format), but we've been emitting both as Direct. This was pointed out recently on the mailing list as a bug. It also blocks http://reviews.llvm.org/D15632 which extends the lowering to handle vector-of-pointers since only Indirect references can encode a variable sized slot.

To implement this, I introduced a new flag on the StackObject class used to maintian information about stack slots. I original considered (and prototyped in http://reviews.llvm.org/D15632), the idea of using the existing isSpillSlot flag, but end up deciding that was a bit too risky and that the cost of adding a new flag was low. Having the new flag will also allow us - in the future - to emit better comments in verbose assembly which indicate where a particular stack spill around a call comes from. (deopt, gc, regalloc).

Differential Revision: http://reviews.llvm.org/D15759

llvm-svn: 256352
2015-12-23 23:44:28 +00:00
Philip Reames 4e66c84722 [MemOperands] Clarify code around dropping memory operands [NFC]
Clarify a comment about what it means to drop memory operands from an instruction.  While I'm adding change the name of the method slightly to make it a bit more clear what's going on when reading calling code.

llvm-svn: 256346
2015-12-23 19:16:04 +00:00
Keno Fischer 9bc46b117b [Function] Properly remove use when clearing personality
Summary:
We need to actually remove the use of the personality function,
otherwise we can run into trouble if we want to e.g. delete
the personality function because ther's no way to get rid of
its uses. Do this by resetting to ConstantPointerNull value
that the operands are set to when first allocated.

Reviewers: vsk, dexonsmith

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15752

llvm-svn: 256345
2015-12-23 18:27:23 +00:00
JF Bastien 61ad8b3907 Fix SCEV r256338.
llvm-svn: 256344
2015-12-23 18:18:53 +00:00
Sanjoy Das 2fbfb25ad6 [SCEV] Fix getLoopBackedgeTakenCounts
The way `getLoopBackedgeTakenCounts` is written right now isn't
correct. It will try to compute and store the BE counts of a Loop
 #{child loop} number of times (which may be zero).

llvm-svn: 256338
2015-12-23 17:48:14 +00:00
Chad Rosier fba65d2fd3 [LIR] General refactoring to simplify code and the ease future code review.
Move several checks into isLegalStores. Also, delineate between those stores
that are memset-able and those that are memcpy-able.

http://reviews.llvm.org/D15683
Patch by Haicheng Wu <haicheng@codeaurora.org>!

llvm-svn: 256336
2015-12-23 17:29:33 +00:00
Philip Reames 42bd26f29d [MachineLICM] Fix handling of memoperands
As far as I can tell, the correct interpretation of an empty memoperands list is that we didn't have sufficient room to store information about the MachineInstr, NOT that the MachineInstr doesn't access any particular bit of memory. This appears to be fairly consistent in a number of places, but I'm not 100% sure of this interpretation. I'd really appreciate someone more knowledgeable confirming my reading of the code.

This patch fixes two latent bugs in MachineLICM - given the above assumption - and adds comments to document the meaning and required handling. I don't have test cases; these were noticed by inspection.

Differential Revision: http://reviews.llvm.org/D15730

llvm-svn: 256335
2015-12-23 17:05:57 +00:00
Simon Pilgrim 17377bdd45 [X86][AVX] Only shuffle the lower half of vectors if the upper half is undefined
First step towards making better use of AVX's implicit zeroing of the upper half of a 256-bit vector by instructions that only act on the lower 128-bit vector - discussed on D14151.

As well as the fact that 128-bit shuffle instructions are generally more capable, this can be performant for older CPUs with 128-bit ALUs (e.g. Jaguar, Sandy Bridge) that must treat 256-bit vectors as multiple micro-ops.

Moved the similar subvector extraction shuffle combines from PerformShuffleCombine256 to lowerVectorShuffle as well.

Note: I've avoided combining shuffles that reference elements from the upper halves of the input vectors - this may be reviewed in future work as well (AVX1 would probably always gain, but AVX2 does have some cross-lane shuffle instructions).

Differential Revision: http://reviews.llvm.org/D15477

llvm-svn: 256332
2015-12-23 13:10:07 +00:00
David Majnemer 2bc2538470 [OperandBundles] Have GlobalsModRef play nice with operand bundles
A call site's use of a Value might not correspond to an argument
operand but to a bundle operand.

llvm-svn: 256329
2015-12-23 09:58:46 +00:00
David Majnemer 63ad9e0543 [OperandBundles] Have TailCallElim play nice with operand bundles
A call site's use of a Value might not correspond to an argument
operand but to a bundle operand.

This fixes PR25928.

llvm-svn: 256328
2015-12-23 09:58:43 +00:00
David Majnemer 02f4787e45 [OperandBundles] Have InstCombine play nice with operand bundles
Don't assume a call's use corresponds to an argument operand, it might
correspond to a bundle operand.

llvm-svn: 256327
2015-12-23 09:58:41 +00:00
David Majnemer 464be3724a [OperandBundles] Have DeadArgElim play nice with operand bundles
A call site's use of a Value might not correspond to an argument
operand but to a bundle operand.

llvm-svn: 256326
2015-12-23 09:58:36 +00:00
Igor Breger 7b46b4e798 AVX512BW: Enable packed word shift for 512bit vector. Enable lowering scalar immidiate shift v64i8 .Fix predicate for AVX1/2 shifts.
Differential Revision: http://reviews.llvm.org/D15713

llvm-svn: 256324
2015-12-23 08:06:50 +00:00
David Majnemer c640f863e0 [WinEH] Don't visit the same catchswitch twice
We visited the same catchswitch twice because it was both the child of
another funclet and the predecessor of a cleanuppad.

Instead, change the numbering algorithm to only recurse if the unwind
destination of the inner funclet agrees with the unwind destination of
the catchswitch.

This fixes PR25926.

llvm-svn: 256317
2015-12-23 03:59:04 +00:00
Paul Robinson 22d0d31a72 Form reform for MCDwarf.
MCDwarf emits a canned abbreviation table, but was not emitting proper
forms for DWARF version 4, which is the default after r249655.

Differential Revision: http://reviews.llvm.org/D15732

llvm-svn: 256313
2015-12-23 01:57:31 +00:00
Philip Reames ee8f055327 [GC] Make GCStrategy::isGCManagedPointer a type predicate not a value predicate [NFC]
Reasons:
1) The existing form was a form of false generality.  None of the implemented GCStrategies use anything other than a type.  Its becoming more and more clear we're going to need some type of strong GC pointer in the type system and we shouldn't pretend otherwise at this point.
2) The API was awkward when applied to vectors-of-pointers.  The old one could have been made to work, but calling isGCManagedPointer(Ty->getScalarType()) is much cleaner than the Value alternatives.  
3) The rewriting implementation effectively assumes the type based predicate as well.  We should be consistent.

llvm-svn: 256312
2015-12-23 01:42:15 +00:00
Dan Gohman 08d58bcf6a [WebAssembly] Add a TODO comment for a possible future optimization.
llvm-svn: 256306
2015-12-23 00:22:04 +00:00
Manuel Jacob a4efd8ac2e [RS4GC] Fix base pair printing for constants.
Previously, "%" + name of the value was printed for each derived and base
pointer.  This is correct for instructions, but wrong for e.g. globals.

llvm-svn: 256305
2015-12-23 00:19:45 +00:00
Akira Hatanaka 1cb242eb13 Provide a way to specify inliner's attribute compatibility and merging.
This reapplies r256277 with two changes:

- In emitFnAttrCompatCheck, change FuncName's type to std::string to fix
  a use-after-free bug.
- Remove an unnecessary install-local target in lib/IR/Makefile. 

Original commit message for r252949:

Provide a way to specify inliner's attribute compatibility and merging
rules using table-gen. NFC.

This commit adds new classes CompatRule and MergeRule to Attributes.td,
which are used to generate code to check attribute compatibility and
merge attributes of the caller and callee.

rdar://problem/19836465

llvm-svn: 256304
2015-12-22 23:57:37 +00:00
Cong Hou 6a2c71af0b [BPI] Fix two potential divide-by-zero operations that are introduced in r256263.
llvm-svn: 256303
2015-12-22 23:45:55 +00:00
Dan Gohman a2b2cdc813 [WebAssembly] Trim unneeded #includes. NFC.
llvm-svn: 256301
2015-12-22 23:45:21 +00:00
Dan Gohman cc38ba1954 [WebAssembly] Minor code simplification. NFC.
llvm-svn: 256300
2015-12-22 23:39:16 +00:00
Changpeng Fang b41574a961 AMDGPU/SI: Use flat for global load/store when targeting HSA
Summary:
  For some reason doing executing an MUBUF instruction with the addr64
  bit set and a zero base pointer in the resource descriptor causes
  the memory operation to be dropped when the shader is executed using
  the HSA runtime.

  This kind of MUBUF instruction is commonly used when the pointer is
  stored in VGPRs.  The base pointer field in the resource descriptor
  is set to zero and and the pointer is stored in the vaddr field.

  This patch resolves the issue by only using flat instructions for
  global memory operations when targeting HSA. This is an overly
  conservative fix as all other configurations of MUBUF instructions
  appear to work.

  NOTE: re-commit by fixing a failure in Codegen/AMDGPU/llvm.dbg.value.ll

Reviewers: tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15543

llvm-svn: 256282
2015-12-22 20:55:23 +00:00
Rafael Espindola 10d9a033db Also add unnamed_addr to functions.
llvm-svn: 256281
2015-12-22 20:43:30 +00:00
Akira Hatanaka 9c05cc5670 Revert r256277 and r256279.
Some of the bots failed again.

llvm-svn: 256280
2015-12-22 20:29:09 +00:00
Akira Hatanaka 3f1bf25db1 Add a .td file I forgot to add in r256277.
llvm-svn: 256279
2015-12-22 20:06:50 +00:00
Akira Hatanaka a61deb249b Provide a way to specify inliner's attribute compatibility and merging.
This reapplies r252990 and r252949. I've added member function getKind
to the Attr classes which returns the enum or string of the attribute.

Original commit message for r252949:

Provide a way to specify inliner's attribute compatibility and merging
rules using table-gen. NFC.

This commit adds new classes CompatRule and MergeRule to Attributes.td,
which are used to generate code to check attribute compatibility and
merge attributes of the caller and callee.

rdar://problem/19836465

llvm-svn: 256277
2015-12-22 20:00:05 +00:00
Rafael Espindola 5349d87a69 Delete dead GlobalAliases.
llvm-svn: 256276
2015-12-22 19:50:22 +00:00
Rafael Espindola 4b0d24c00a Revert "AMDGPU/SI: Use flat for global load/store when targeting HSA"
This reverts commit r256273.

It broke CodeGen/AMDGPU/llvm.dbg.value.ll

llvm-svn: 256275
2015-12-22 19:46:44 +00:00
Rafael Espindola 2cc46b3701 Merge duplicated code.
The code for deleting dead global variables and functions was
duplicated.

This is in preparation for also deleting dead global aliases.

llvm-svn: 256274
2015-12-22 19:38:07 +00:00
Changpeng Fang 9b8a9be058 AMDGPU/SI: Use flat for global load/store when targeting HSA
Summary:
  For some reason doing executing an MUBUF instruction with the addr64
  bit set and a zero base pointer in the resource descriptor causes
  the memory operation to be dropped when the shader is executed using
  the HSA runtime.

  This kind of MUBUF instruction is commonly used when the pointer is
  stored in VGPRs.  The base pointer field in the resource descriptor
  is set to zero and and the pointer is stored in the vaddr field.

  This patch resolves the issue by only using flat instructions for
  global memory operations when targeting HSA. This is an overly
  conservative fix as all other configurations of MUBUF instructions
  appear to work.

Reviewers: tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15543

llvm-svn: 256273
2015-12-22 19:32:28 +00:00
Rafael Espindola 9f0bebc3da Use early continue to reduce indentation.
llvm-svn: 256272
2015-12-22 19:26:18 +00:00
Rafael Espindola e4ed0e56ce Simplify iterator management. NFC.
Not passing an iterator to processGlobal will allow it to work with
other GlobalValues.

llvm-svn: 256271
2015-12-22 19:16:50 +00:00
Cong Hou e93b8e1539 [BPI] Replace weights by probabilities in BPI.
This patch removes all weight-related interfaces from BPI and replace
them by probability versions. With this patch, we won't use edge weight
anymore in either IR or MC passes. Edge probabilitiy is a better
representation in terms of CFG update and validation.


Differential revision: http://reviews.llvm.org/D15519 

llvm-svn: 256263
2015-12-22 18:56:14 +00:00
Manuel Jacob 4e4f60ded0 Remove deprecated llvm.experimental.gc.result.{int,float,ptr} intrinsics.
Summary:
These were deprecated 11 months ago when a generic
llvm.experimental.gc.result intrinsic, which works for all types, was added.

Reviewers: sanjoy, reames

Subscribers: sanjoy, chenli, llvm-commits

Differential Revision: http://reviews.llvm.org/D15719

llvm-svn: 256262
2015-12-22 18:44:45 +00:00
Vedant Kumar d167586a28 [Support] Allow multiple paired calls to {start,stop}Timer()
Differential Revision: http://reviews.llvm.org/D15619

Reviewed-by: rafael
llvm-svn: 256258
2015-12-22 17:36:17 +00:00
Manuel Jacob 990dfa6fe5 [RS4GC] Fix crash in the case that a live variable has a constant base.
Summary:
Previously, RS4GC crashed in CreateGCRelocates() because it assumed
that every base is also in the array of live variables, which isn't true if a
live variable has a constant base.

This change fixes the crash by making sure CreateGCRelocates() won't try to
relocate a live variable with a constant base.  This would be unnecessary
anyway because anything with a constant base won't move.

Reviewers: reames

Subscribers: llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D15556

llvm-svn: 256252
2015-12-22 16:50:44 +00:00
Jun Bum Lim 6755c3bc5f [AArch64] Promote loads from stored
This is a recommit of r256004 which was reverted in r256160. The issue was the
incorrect promotion for half and byte loads transformed into mov instructions.
This fix will replace half and byte type loads only with bit field extracts.

Original commit message:

This change promotes load instructions which directly read from stored by
replacing them with mov instructions. If the store is wider than the load,
the load will be replaced with a bitfield extract.
For example :
  STRWui %W1, %X0, 1
  %W0 = LDRHHui %X0, 3
becomes
  STRWui %W1, %X0, 1
  %W0 = UBFMWri %W1, 16, 31

llvm-svn: 256249
2015-12-22 16:36:16 +00:00
Chad Rosier a108010385 Typo. NFC.
llvm-svn: 256242
2015-12-22 15:06:47 +00:00
Asaf Badouh 13ffa4bf7c [X86][AVX512] Add rcp14 and rsqrt14 intrinsics
Differential Revision: http://reviews.llvm.org/D15414

llvm-svn: 256237
2015-12-22 11:40:04 +00:00
Keno Fischer 4eccf11373 [ASMPrinter] Fix missing handling of DW_OP_bit_piece
In r256077, I added printing for DIExpressions in DEBUG_VALUE comments,
but neglected to handle DW_OP_bit_piece operands. Thanks to
Mikael Holmen and Joerg Sonnenberger for spotting this.

llvm-svn: 256236
2015-12-22 07:14:50 +00:00
Kostya Serebryany b0fb6e8508 [libFuzzer] add AFL-style dictionary for C++, remove the old file with tokens
llvm-svn: 256229
2015-12-22 01:50:51 +00:00
David Majnemer ff1d084aa2 [MC] Don't use the architecture to govern which object file format to use
InitMCObjectFileInfo was trying to override the triple in awkward ways.
For example, a triple specifying COFF but not Windows was forced as ELF.
This makes it easy for internal invariants to get violated, such as
those which triggered PR25912.

This fixes PR25912.

llvm-svn: 256226
2015-12-22 01:39:04 +00:00
Teresa Johnson d213aa469e Handle empty Subprogram list when linking metadata.
Use an iterator that handles an empty subprogram list.

Fixes PR25915.

llvm-svn: 256224
2015-12-22 01:17:19 +00:00
Easwaran Raman bdb6f1dcc3 Determine callee's hotness and adjust threshold based on that. NFC.
This uses the same criteria used in CFE's CodeGenPGO to identify hot and cold
callees and uses values of inlinehint-threshold and inlinecold-threshold
respectively as the thresholds for such callees.

Differential Revision: http://reviews.llvm.org/D15245

llvm-svn: 256222
2015-12-22 00:32:35 +00:00
Evgeniy Stepanov 8827f2db85 [safestack] Add option for non-TLS unsafe stack pointer.
This patch adds an option, -safe-stack-no-tls, for using normal
storage instead of thread-local storage for the unsafe stack pointer.
This can be useful when SafeStack is applied to an operating system
kernel.

http://reviews.llvm.org/D15673

Patch by Michael LeMay.

llvm-svn: 256221
2015-12-22 00:13:11 +00:00
Xinliang David Li 5fe0455563 [PGO] Fix another comdat related issue for COFF
The linker requires that a comdat section must be associated
with a another comdat section that precedes it. This
means the comdat section's name needs to use the  profile name
var's name.

Patch tested by Johan Engelen.

llvm-svn: 256220
2015-12-22 00:11:15 +00:00
Vedant Kumar 11dc6dc71e [Support] Timer: Use emplace_back() and range-based loops (NFC)
llvm-svn: 256217
2015-12-21 23:41:38 +00:00
Vedant Kumar 3f79e32593 [Support] Timer: simplify the init() method
llvm-svn: 256215
2015-12-21 23:27:44 +00:00
Dylan McKay 751a449e2f [AVR] Added configuration file and machine function information class
This commit adds the 'AVRMachineFunctionInfo' class, which simply stores
basic properties about generated machine functions.

llvm-svn: 256213
2015-12-21 23:13:15 +00:00
Eric Christopher 213a5daab7 Fix line endings after r256155. NFC.
llvm-svn: 256211
2015-12-21 23:04:27 +00:00
Evgeniy Stepanov fda72c52a2 [cfi] Fix LowerBitSets on 32-bit targets.
This code attempts to truncate IntPtrTy to i32, which may be the same
type.

llvm-svn: 256205
2015-12-21 22:14:04 +00:00
David Majnemer 03e2cc3007 [MC, COFF] Support link /incremental conditionally
Today, we always take into account the possibility that object files
produced by MC may be consumed by an incremental linker.  This results
in us initialing fields which vary with time (TimeDateStamp) which harms
hermetic builds (e.g. verifying a self-host went well) and produces
sub-optimal code because we cannot assume anything about the relative
position of functions within a section (call sites can get redirected
through incremental linker thunks).

Let's provide an MCTargetOption which controls this behavior so that we
can disable this functionality if we know a-priori that the build will
not rely on /incremental.

llvm-svn: 256203
2015-12-21 22:09:27 +00:00
Jun Bum Lim a23e5f7516 Enhance BranchProbabilityInfo::calcUnreachableHeuristics for InvokeInst
This is recommit of r256028 with minor fixes in unittests:
  CodeGen/Mips/eh.ll
  CodeGen/Mips/insn-zero-size-bb.ll

Original commit message:

When identifying blocks post-dominated by an unreachable-terminated block
in BranchProbabilityInfo, consider only the edge to the normal destination
block if the terminator is InvokeInst and let calcInvokeHeuristics() decide
edge weights for the InvokeInst.

llvm-svn: 256202
2015-12-21 22:00:51 +00:00
Xinliang David Li ab361efee7 Resubmit r256193 with test fix: assertion failure analyzed
llvm-svn: 256201
2015-12-21 21:52:27 +00:00
Xinliang David Li 13da1f149e Revert r256193: build bot failure triggered
llvm-svn: 256198
2015-12-21 21:00:33 +00:00
Cong Hou 8df93ce455 [X86][SSE] Transform truncations between vectors of integers into X86ISD::PACKUS/PACKSS operations during DAG combine.
This patch transforms truncation between vectors of integers into
X86ISD::PACKUS/PACKSS operations during DAG combine. We don't do it in
lowering phase because after type legalization, the original truncation
will be turned into a BUILD_VECTOR with each element that is extracted
from a vector and then truncated, and from them it is difficult to do
this optimization. This greatly improves the performance of truncations
on some specific types.

Cost table is updated accordingly.


Differential revision: http://reviews.llvm.org/D14588

llvm-svn: 256194
2015-12-21 20:42:43 +00:00
Xinliang David Li 6c494cd0df [PGO] Fix profile var comdat generation problem with COFF
When targeting COFF, it is required that a comdat section to
have a global obj with the same name as the comdat (except for
comdats with select kind to be associative). This fix makes
sure that the comdat is keyed on the data variable for COFF.

Also improved test coverage for this.

llvm-svn: 256193
2015-12-21 20:41:20 +00:00
Michael Zolotukhin 0c97988e54 [ValueTracking] Properly handle non-sized types in isAligned function.
Reviewers: apilipenko, reames, sanjoy, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15597

llvm-svn: 256192
2015-12-21 20:38:18 +00:00
Adrian Prantl ce8581389b Fix PR24563 (LiveDebugVariables unconditionally propagates all DBG_VALUEs)
LiveDebugVariables unconditionally propagates all DBG_VALUE down the
dominator tree, which happens to work fine if there already is another
DBG_VALUE or the DBG_VALUE happends to describe a single-assignment vreg
but is otherwise wrong if the DBG_VALUE is coming from only one of the
predecessors.

In r255759 we introduced a proper data flow analysis scheduled after
LiveDebugVariables that correctly propagates DBG_VALUEs across basic block
boundaries. With the new pass in place, the incorrect propagation in
LiveDebugVariables can be retired witout loosing any of the benefits
where LiveDebugVariables happened to do the right thing.

llvm-svn: 256188
2015-12-21 20:03:00 +00:00
Adrian Prantl 5d9acc2443 Teach ARMLoadStoreOptimizer to ignore DBG_VALUE instructions when merging
instructions.

As noted in PR24563.
rdar://problem/23963293

llvm-svn: 256183
2015-12-21 19:25:03 +00:00
Tom Stellard 2b65ed306d AMDGPU/SI: Fix encoding for FLAT_SCRATCH registers on VI
Summary:
These register has different encodings on CI and VI, so we add pseudo
FLAT_SCRACTH registers to be used before MC, and subtarget specific
registers to be used by the MC layer.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15661

llvm-svn: 256178
2015-12-21 18:44:27 +00:00
Tom Stellard 9da8620cdb AMDGPU/SI: Change assembly name for flat scratch registers to flat_scratch
This matches what the assembler accepts.

llvm-svn: 256177
2015-12-21 18:44:21 +00:00