Commit Graph

70184 Commits

Author SHA1 Message Date
Matheus Almeida 595fcab2d0 [mips] Implement jr.hb and jalr.hb (Jump Register and Jump and Link Register with Hazard Barrier).
Summary: These instructions are available in ISAs >= mips32/mips64. For mips32r6/mips64r6, jr.hb has a new encoding format.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D4019

llvm-svn: 210654
2014-06-11 15:05:56 +00:00
Cameron McInally 5d1b7b94e4 Add AVX512 masked leadz instrinsic support.
llvm-svn: 210652
2014-06-11 12:54:45 +00:00
Andrea Di Biagio c7af75f9a7 [X86] Refactor the logic to select horizontal adds/subs to a helper function.
This patch moves part of the logic implemented by the target specific
combine rules added at r210477 to a separate helper function.
This should make easier to add more rules for matching AVX/AVX2 horizontal
adds/subs.

This patch also fixes a problem caused by a wrong check performed on indices
of extract_vector_elt dag nodes in input to the scalar adds/subs.

New tests have been added to verify that we correctly check indices of
extract_vector_elt dag nodes when selecting a horizontal operation.

llvm-svn: 210644
2014-06-11 07:57:50 +00:00
Jiangning Liu d623c528c5 Create macro INITIALIZE_TM_PASS.
Pass initialization requires to initialize TargetMachine for back-end
specific passes. This commit creates a new macro INITIALIZE_TM_PASS to
simplify this kind of initialization.

llvm-svn: 210641
2014-06-11 07:04:37 +00:00
Jiangning Liu b2ae37fb67 Global merge for global symbols.
This commit is to improve global merge pass and support global symbol merge.
The global symbol merge is not enabled by default. For aarch64, we need some
more back-end fix to make it really benifit ADRP CSE.

llvm-svn: 210640
2014-06-11 06:44:53 +00:00
Jiangning Liu 3e5b855a51 Rename global-merge to enable-global-merge.
llvm-svn: 210639
2014-06-11 06:35:26 +00:00
Craig Topper 213d2f79e5 Convert StringMapEntry::Create to use StringRef instead of start/end pointers. Simpliies all in tree call sites. No functional change.
llvm-svn: 210638
2014-06-11 05:35:56 +00:00
Rafael Espindola ace0080a4a Try to fix the msvc build.
llvm-svn: 210636
2014-06-11 04:41:37 +00:00
Rafael Espindola 181adb5f57 Uses generic_category instead of system_category.
Some c++ libraries (libstdc++ at least) don't seem to map to the generic
category in in the system_category's default_error_condition.

llvm-svn: 210635
2014-06-11 04:34:41 +00:00
Saleem Abdulrasool faa29bd529 MC: add enumeration of WinEH data encoding
Most Windows platforms use auxiliary data for unwinding.  This information is
stored in the .pdata section.  The encoding format for the data differs between
architectures and Windows variants.  Windows MIPS and Alpha use identical
formats; Alpha64 is the same with different widths.  Windows x86_64 and Itanium
share the representation.  All Windows CE entries are identical irrespective of
the architecture.  ARMv7 (Windows [NT] on ARM) has its own format.

This enumeration will become the differentiator once the windows EH emission
infrastructure is generalised, allowing us to emit the necessary unwinding
information for Windows on ARM.

llvm-svn: 210634
2014-06-11 04:19:25 +00:00
Rafael Espindola a813d608a9 Remove windows_error.
MSVC doesn't seem to provide any is_error_code_enum enumeration for the
windows errors.

Fortunately very few places in llvm have to handle raw windows errors, so
we can just construct the corresponding error_code directly.

llvm-svn: 210631
2014-06-11 03:58:34 +00:00
Rafael Espindola 6a9aae77d4 There is no posix_category in std, use generic_category.
llvm-svn: 210630
2014-06-11 03:49:13 +00:00
Matt Arsenault 10da3b2516 Use cast instead of assert + dyn_cast
llvm-svn: 210628
2014-06-11 03:30:06 +00:00
Matt Arsenault c9df794042 R600: Add helper functions.
Extract these from some of my other patches, since this
is the only thing really making them dependent on each other.

llvm-svn: 210627
2014-06-11 03:29:54 +00:00
Saleem Abdulrasool 8076cab0ce CodeGen: refactor DwarfException
DwarfException served as a base class for exception handling directive emission.
However, this is also used by other exception models (e.g. Win64EH).  Rename
this class to EHStreamer and split it out of DwarfException.h.  NFC.

Use the opportunity to fix up some of the documentation comments to match
current LLVM style.  Also rename some functions to conform better with current
LLVM coding style.

llvm-svn: 210622
2014-06-11 01:19:03 +00:00
Eric Christopher a475d5c54a Remove duplicate copy of InstrItineraryData from the TargetMachine,
it's already on the subtarget.

llvm-svn: 210619
2014-06-11 00:53:17 +00:00
Eric Christopher 7c9d4e058a Move to a private function to initialize the subtarget dependencies
so that we can use initializer lists for the AArch64 Subtarget.

llvm-svn: 210616
2014-06-11 00:46:34 +00:00
Eric Christopher 1a2120312b Move to a private function to initialize the subtarget dependencies
so that we can use initializer lists for the X86Subtarget.

llvm-svn: 210614
2014-06-11 00:25:19 +00:00
Eric Christopher 946a6581ea Sort includes.
llvm-svn: 210613
2014-06-11 00:25:16 +00:00
Juergen Ributzka 2dace6e54b [FastISel][X86] Extend support for {s|u}{add|sub|mul}.with.overflow intrinsics.
llvm-svn: 210610
2014-06-10 23:52:44 +00:00
Eric Christopher cd996edec5 Use unique_ptr for X86Subtarget pointer members.
llvm-svn: 210606
2014-06-10 23:26:47 +00:00
Eric Christopher 841da85198 Move AArch64TargetLowering to AArch64Subtarget.
This currently necessitates a TargetMachine for the TargetLowering
constructor and TLOF.

llvm-svn: 210605
2014-06-10 23:26:45 +00:00
Zachary Turner 6610b99cb5 Revert "Remove support for runtime multi-threading."
This reverts revision r210600.

llvm-svn: 210603
2014-06-10 23:15:43 +00:00
Zachary Turner f6054ca18c Remove support for runtime multi-threading.
This patch removes the functions llvm_start_multithreaded() and
llvm_stop_multithreaded(), and changes llvm_is_multithreaded()
to return a constant value based on the value of the compile-time
definition LLVM_ENABLE_THREADS.

Previously, it was possible to have compile-time support for
threads on, and runtime support for threads off, in which case
certain mutexes were not allocated or ever acquired.  Now, if the
build is created with threads enabled, mutexes are always acquired.

A test before/after patch of compiling a very large TU showed no
noticeable performance impact of this change.

Reviewers: rnk

Differential Revision: http://reviews.llvm.org/D4076

llvm-svn: 210600
2014-06-10 23:01:20 +00:00
Eric Christopher f63bc64df5 Move AArch64InstrInfo to AArch64Subtarget.
llvm-svn: 210599
2014-06-10 22:57:25 +00:00
Eric Christopher 58f3266722 Remove a method that was just replacing direct access to a member.
llvm-svn: 210598
2014-06-10 22:57:21 +00:00
Eric Christopher 6c786a1dd1 Remove the use of TargetMachine from X86InstrInfo.
llvm-svn: 210596
2014-06-10 22:34:31 +00:00
Eric Christopher 1f8ad4f4a7 Move X86RegisterInfo away from using the TargetMachine and only
using the subtarget.

llvm-svn: 210595
2014-06-10 22:34:28 +00:00
Rafael Espindola f5d07fa586 Mark a few functions noexcept.
This reduces the difference between std::error_code and llvm::error_code.

llvm-svn: 210591
2014-06-10 21:26:47 +00:00
Eric Christopher 68d7559e97 Use the TargetMachine on the DAG or the MachineFunction instead
of using the cached TargetMachine.

llvm-svn: 210589
2014-06-10 21:25:13 +00:00
Tom Stellard 4e07b1d76b R600/SI: Emit an error when attempting to spill VGPRs v4
I can't get VGPR spilling to work reliable, so for now just emit
an error when the register allocator tries to spill VGPRs.

v2:
  - Fix build
v3:
  - Added crash fix when spilling SPGRs
v4:
  - Use V_MOV_B32 as a dummy instruction instead of S_NOP

Patch by: Darren Powell

https://bugs.freedesktop.org/show_bug.cgi?id=75276

llvm-svn: 210588
2014-06-10 21:20:41 +00:00
Tom Stellard 060ae39022 R600/SI: Fix a crash when spilling SGPRs
We need to make sure only one new instruction is added when spilling
otherwise the register allocator may crash.

This fixes a crash in the game Antichamber.

https://bugs.freedesktop.org/show_bug.cgi?id=75276

llvm-svn: 210587
2014-06-10 21:20:38 +00:00
Eric Christopher 2af33756c7 We already have a reference to the TargetMachine, use that.
llvm-svn: 210580
2014-06-10 20:39:39 +00:00
Eric Christopher 576d36ae05 Have isInTailCallPosition take the DAG so that we can use the
version of TargetLowering/Machine from there on the way to avoiding
TargetMachine in TargetLowering.

llvm-svn: 210579
2014-06-10 20:39:38 +00:00
Eric Christopher 09fc276d08 Reorder includes to be sorted.
llvm-svn: 210578
2014-06-10 20:39:35 +00:00
Reid Kleckner b01961c2c1 Revert "Patch by Ray Donnelly to print register names instead of numbers."
This reverts commit r206683.

The code was confusing SEH register numbers with DWARF register numbers.
The test case it was committed with was obviously incorrect.  The
disassembler was roundtripping '.seh_pushreg %rsi' as '.seh_pushreg
%rbp', and other exciting things.

Noticed by Vadim Chugunov.

llvm-svn: 210574
2014-06-10 20:16:36 +00:00
Matt Arsenault a73fd935d8 Fix error in tablegen when either operand of !if is an empty list.
!if([Something], []) would error with "No type for list".

llvm-svn: 210572
2014-06-10 20:10:08 +00:00
Eric Christopher db5028bd5b Fix typos.
llvm-svn: 210571
2014-06-10 20:07:29 +00:00
Matt Arsenault 6042506b5c R600: Use BCNT_INT for evergreen
llvm-svn: 210569
2014-06-10 19:18:28 +00:00
Matt Arsenault 8333e4378e R600/SI: Implement i64 ctpop
llvm-svn: 210568
2014-06-10 19:18:24 +00:00
Matt Arsenault b5b5110b5c R600/SI: Use bcnt instruction for ctpop
llvm-svn: 210567
2014-06-10 19:18:21 +00:00
Matt Arsenault 6e43965fbc R600: Handle fcopysign
llvm-svn: 210564
2014-06-10 19:00:20 +00:00
Matt Arsenault b2cbf799d1 R600/SI: Handle sign_extend and zero_extend to i64 with patterns.
llvm-svn: 210563
2014-06-10 18:54:59 +00:00
Eric Christopher 19b1d73e88 Add a FIXME.
llvm-svn: 210559
2014-06-10 18:31:18 +00:00
Eric Christopher fcb06ca908 Move AArch64SelectionDAGInfo down to the subtarget.
llvm-svn: 210557
2014-06-10 18:21:53 +00:00
Juergen Ributzka 89fe23e888 [FastISel] Collect statistics about failing intrinsic calls.
Add more instruction-specific statistics about failing intrinsic calls during
FastISel.

llvm-svn: 210556
2014-06-10 18:17:00 +00:00
Eric Christopher 17254eea62 Remove the cached little endian variable. We can get it easily off
of the DataLayout.

llvm-svn: 210555
2014-06-10 18:11:20 +00:00
Eric Christopher 078a2b62ab Have AArch64SelectionDAGInfo take a DataLayout parameter rather
than a TargetMachine.

llvm-svn: 210554
2014-06-10 18:06:28 +00:00
Eric Christopher 57c2319bb3 Remove caching of the subtarget for AArch64SelectionDAGInfo.
llvm-svn: 210553
2014-06-10 18:06:25 +00:00
Eric Christopher 6f2a203f24 Move DataLayout onto the AArch64 subtarget.
llvm-svn: 210552
2014-06-10 18:06:23 +00:00
Zachary Turner a40ccf620b Test commit, wraps some lines to fit in 80 columns.
llvm-svn: 210551
2014-06-10 18:03:04 +00:00
Eric Christopher 29aab7b355 Move AArch64FrameLowering into the subtarget.
llvm-svn: 210549
2014-06-10 17:44:12 +00:00
Eric Christopher bc76b97797 Remove the uses of AArch64TargetMachine and AArch64Subtarget from
AArch64FrameLowering.

llvm-svn: 210548
2014-06-10 17:33:39 +00:00
Reed Kotler 063d4fba36 Do Materialize Floating Point in Mips Fast-Isel
Summary:
Implement materialize of floating point literals in Mips Fast-Isel

Reopened version of D3659

Test Plan: simplestorefp1.ll

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D4071

llvm-svn: 210546
2014-06-10 16:45:44 +00:00
Andrea Di Biagio fa508af0fe [X86] Improved target combine rules for selecting horizontal add/sub.
This patch slightly changes the algorithm introduced at revision 210477
to fix a problem where the algorithm was producing incorrect code for 
the VEX.256 encoded versions of horizontal add/sub.

For these cases, we now try to split the two 256-bit vectors into
128-bit chunks before emitting horizontal add/sub dag nodes.

Added a new test case into haddsub-2.ll.

llvm-svn: 210545
2014-06-10 16:42:57 +00:00
Tom Stellard d172270c44 Hexagon: Expand i1 SELECT_CC
il is legal for Hexagon, so I should have marked this as Expand for
SELECT_CC when I removed setOperationAction(ISD::SELECT_CC, MVT::Other,
Expand); in r210541.

llvm-svn: 210544
2014-06-10 16:42:41 +00:00
Adam Nemet 7f62b23e92 [X86] AVX512: Add vmovntdqa
Along with the corresponding intrinsic and tests.

llvm-svn: 210543
2014-06-10 16:39:53 +00:00
Renato Golin 65eea557ae Fix a bug in the Thumb1 ARM Load/Store optimizer
Previously, the basic block was searched for future uses of the base register,
and if necessary any writeback to the base register was reset using a SUB
instruction (e.g. before calling a function) just before such a use. However,
this step happened *before* the merged LDM/STM instruction was built. So if
there was (e.g.) a function call directly after the not-yet-formed LDM/STM,
the pass would first insert a SUB instruction to reset the base register,
and then (at the same location, incorrectly) insert the LDM/STM itself.

This patch fixes PR19972. Patch by Moritz Roth.

llvm-svn: 210542
2014-06-10 16:39:21 +00:00
Tom Stellard 3787b12255 SelectionDAG: Don't use MVT::Other to determine legality of ISD::SELECT_CC
The SelectionDAG bad a special case for ISD::SELECT_CC, where it would
allow targets to specify:

setOperationAction(ISD::SELECT_CC, MVT::Other, Expand);

to indicate that they wanted to expand ISD::SELECT_CC for all types.
This wasn't applied correctly everywhere, and it makes writing new
DAG patterns with ISD::SELECT_CC difficult.

llvm-svn: 210541
2014-06-10 16:01:29 +00:00
Tom Stellard b9a023383e SelectionDAG: Enable (and (setcc x), (setcc y)) -> (setcc (and x, y)) for vectors
This prevents a future commit from regressing:

test/CodeGen/R600/setcc-equivalent.ll

llvm-svn: 210540
2014-06-10 16:01:25 +00:00
Tom Stellard 3ca1bfc728 SelectionDAG: Expand SELECT_CC to SELECT + SETCC
This consolidates code from the Hexagon, R600, and XCore targets.

No functionality change intended.

llvm-svn: 210539
2014-06-10 16:01:22 +00:00
Bill Schmidt f910a0650e [PPC64LE] Recognize shufflevector patterns for little endian
Various masks on shufflevector instructions are recognizable as
specific PowerPC instructions (vector pack, vector merge, etc.).
There is existing code in PPCISelLowering.cpp to recognize the correct
patterns for big endian code.  The masks for these instructions are
different for little endian code due to the big-endian numbering
employed by these instructions.  This patch adds the recognition code
for little endian.

I've added a new test case test/CodeGen/PowerPC/vec_shuffle_le.ll for
this.  The existing recognizer test (vec_shuffle.ll) is unnecessarily
verbose and difficult to read, so I felt it was better to add a new
test rather than modify the old one.

llvm-svn: 210536
2014-06-10 14:35:01 +00:00
Chad Rosier d863ae39d1 [AArch64] Emit .ident compiler version attribute.
Patch by Ana Pazos<apazos@codeaurora.org>!

llvm-svn: 210535
2014-06-10 14:32:08 +00:00
Artyom Skrobov 6c8682e2e9 Condition codes AL and NV are invalid in the aliases that use
inverted condition codes (CINC, CINV, CNEG, CSET, and CSETM).

Matching aliases based on "immediate classes", when disassembling,
wasn't previously supported, hence adding MCOperandPredicate
into class Operand, and implementing the support for it
in AsmWriterEmitter.

The parsing for those aliases was already custom, so just adding
the missing condition into AArch64AsmParser::parseCondCode.

llvm-svn: 210528
2014-06-10 13:11:35 +00:00
Artyom Skrobov 8b98532af9 Anonymous definitions in foreach blocks triggered a 'def already exists'
llvm-svn: 210526
2014-06-10 12:41:14 +00:00
Tim Northover 9ffd0b020f AArch64: disallow x30 & x29 as the destination for indirect tail calls
As Ana Pazos pointed out, these have to be restored to their incoming values
before a function returns; i.e. before the tail call. So they can't be used
correctly as the destination register.

llvm-svn: 210525
2014-06-10 10:50:24 +00:00
Tim Northover 7b9f86da5d Revert "X86: elide comparisons after cmpxchg instructions."
This reverts commit r210523. It was committed prematurely without waiting for
review.

llvm-svn: 210524
2014-06-10 10:50:11 +00:00
Tim Northover 84ad29ca1f X86: elide comparisons after cmpxchg instructions.
The C++ and C semantics of the compare_and_swap operations actually
require us to return a boolean "success" value. In LLVM terms this
means a second comparison of the output of "cmpxchg" against the input
desired value.

However, x86's "cmpxchg" instruction sets all flags for the comparison
formed, so we can skip any secondary comparison. (N.b. this isn't true
for cmpxchg8b/16b, which only set ZF).

rdar://problem/13201607

llvm-svn: 210523
2014-06-10 10:49:07 +00:00
Tim Northover c141ad4b75 AArch64: teach FastISel how to handle offset FrameIndices
Previously we were abandonning the attempt, leading to some combination of
extra work (when selection of a load/store fails completely) and inferior code
(when this leads to a real memcpy call instead of inlining).

rdar://problem/17187463

llvm-svn: 210520
2014-06-10 09:52:44 +00:00
Tim Northover c19445d07a AArch64: make FastISel memcpy emission more robust.
We were hitting an assert if FastISel couldn't create the load or store we
requested. Currently this happens for large frame-local addresses, though
CodeGen could be improved there.

rdar://problem/17187463

llvm-svn: 210519
2014-06-10 09:52:40 +00:00
Eric Christopher 0fb16ab204 Delete X86JITInfo in the subtarget destructor.
llvm-svn: 210516
2014-06-10 08:03:42 +00:00
Juergen Ributzka b2e4edb5c8 [ConstantHoisting][X86] Improve the cost model for small constants with large types (i64 and above).
This improves the X86 cost model for small constants with large types. Before
this commit we would even hoist trivial constants such as i96 2.

This is related to <rdar://problem/17070936>

llvm-svn: 210504
2014-06-10 00:32:29 +00:00
Reid Kleckner 16bf89ecb2 Reorder Value and User fields to save 8 bytes of padding on 64-bit
Reviewered by: rafael

Differential Revision: http://reviews.llvm.org/D4073

llvm-svn: 210501
2014-06-09 23:32:20 +00:00
Richard Trieu a23043cb9c Removing an "if (!this)" check from two print methods. The condition will
never be true in a well-defined context.  The checking for null pointers
has been moved into the caller logic so it does not rely on undefined behavior.

llvm-svn: 210497
2014-06-09 22:53:16 +00:00
Bill Schmidt 6b5a7dfc24 [PPC64LE] Generate correct code for unaligned little-endian vector loads
The code in PPCTargetLowering::PerformDAGCombine() that handles
unaligned Altivec vector loads generates a lvsl followed by a vperm.
As we've seen in numerous other places, the vperm instruction has a
big-endian bias, and this is fixed for little endian by complementing
the permute control vector and swapping the input operands.  In this
case the lvsl is providing the permute control vector.  Rather than
generating an lvsl and a complement operation, it is sufficient to
generate an lvsr instruction instead.  Thus for LE code generation we
will generate an lvsr rather than an lvsl, and swap the other input
arguments on the vperm.

The existing test/CodeGen/PowerPC/vec_misalign.ll is updated to test
the code generation for PPC64 and PPC64LE, in addition to the existing
PPC32/G5 testing.

llvm-svn: 210493
2014-06-09 22:00:52 +00:00
Alexey Samsonov 8000e2734e Generate better location ranges for some register-described variables.
Don't terminate location ranges for register-described variables
at the end of machine basic block if this register is never modified
in the function body, except for the prologue and epilogue. Prologue
location is guessed by FrameSetup flags on MachineInstructions, while
epilogue location is deduced from debug locations of instructions
in the basic blocks ending with return instructions.

This patch is mostly targeted to fix non-trivial debug locations for
variables addressed via stack and frame pointers.

It is not really a generic fix. We can still produce poor debug info
for register-described variables if this register *is* modified somewhere
in the function, but in unrelated places. This might be the case for the debug
info in optimized binaries (e.g. for local variables in inlined functions).
LiveDebugVariables pass in CodeGen attempts to fix this problem by adjusting
DBG_VALUE instructions, but this pass is tied to greedy register allocator,
which is used in optimized builds only. Proper fix would likely involve
generalizing LiveDebugVariables to all register allocators. See more discussion
in http://reviews.llvm.org/D3933 review thread.

I'm proceeding with this patch to fix immediate severe problems and
important cases, e.g. fix completely broken debug info with AddressSanitizer
and fix PR19307 (missing debug info for by-value std::string arguments).

llvm-svn: 210492
2014-06-09 21:53:47 +00:00
Saleem Abdulrasool abac6e92a0 ARM: add VLA extension for WoA Itanium ABI
The armv7-windows-itanium environment is nearly identical to the MSVC ABI. It
has a few divergences, mostly revolving around the use of the Itanium ABI for
C++. VLA support is one of the extensions that are amongst the set of the
extensions.

This adds support for proper VLA emission for this environment. This is
somewhat similar to the handling for __chkstk emission on X86 and the large
stack frame emission for ARM. The invocation style for chkstk is still
controlled via the -mcmodel flag to clang.

Make an explicit note that this is an extension.

llvm-svn: 210489
2014-06-09 20:18:42 +00:00
Matt Arsenault 44f60d0a60 Look through addrspacecasts when turning ptr comparisons into
index comparisons.

llvm-svn: 210488
2014-06-09 19:20:29 +00:00
Alp Toker 51420a8d62 Remove old fenv.h workaround for a historic clang driver bug
Tested and works fine with clang using libstdc++.

All indications are that this was fixed some time ago and isn't a problem with
any clang version we support.

I've added a note in PR6907 which is still open for some reason.

llvm-svn: 210485
2014-06-09 19:00:52 +00:00
Alp Toker c817d6a5b5 Fold FEnv.h into the implementation
Support headers shouldn't use config.h definitions, and they should never be
undefined like this.

ConstantFolding.cpp was the only user of this facility and already includes
config.h for other math features, so it makes sense to move the checks there at
point of use.

(The implicit config.h was also quite dangerous -- removing the FEnv.h include
would have silently disabled math constant folding without causing any tests to
fail. Need to investigate -Wundef once the cleanup is done.)

This eliminates the last config.h include from LLVM headers, paving the way for
more consistent configuration checks.

llvm-svn: 210483
2014-06-09 18:28:53 +00:00
Eric Christopher a08f30bd40 Move all of the x86 subtarget initialized variables down into the x86 subtarget
from the x86 target machine. Should be no functional change.

llvm-svn: 210479
2014-06-09 17:08:19 +00:00
Matt Arsenault 93840c095a R600/SI: Rename VOP3 helper class to be more general
It has other uses besides shift instructions.

llvm-svn: 210478
2014-06-09 17:00:46 +00:00
Andrea Di Biagio f99dd64f0a [X86] Add target combine rules for horizontal add/sub.
This patch adds new target specific combine rules to identify horizontal
add/sub idioms from BUILD_VECTOR dag nodes.

This patch also teaches the DAGCombiner how to canonicalize sequences of
insert_vector_elt dag nodes according to the following rule:

  (insert_vector_elt (insert_vector_elt A, I0), I1) ->
    (insert_vecto_elt (insert_vector_elt A, I1), I0)

This new canonicalization rule only triggers if the inner insert_vector
dag node has exactly one use; also, both indices must be known constants,
and I1 < I0.
This last rule made it possible to write a simpler algorithm to identify
horizontal add/sub patterns because now we don't have to worry about the
ordering of insert_vector_elt dag nodes.

llvm-svn: 210477
2014-06-09 16:54:41 +00:00
Matt Arsenault 689f325099 R600/SI: Keep 64-bit not on SALU
llvm-svn: 210476
2014-06-09 16:36:31 +00:00
Matt Arsenault 13ccc8f1bc R600: Fix selection failure for vector bswap
llvm-svn: 210475
2014-06-09 16:20:25 +00:00
Bill Schmidt 42995e8c74 [PPC64LE] Generate correct little-endian code for v16i8 multiply
The existing code in PPCTargetLowering::LowerMUL() for multiplying two
v16i8 values assumes that vector elements are numbered in big-endian
order.  For little-endian targets, the vector element numbering is
reversed, but the vmuleub, vmuloub, and vperm instructions still
assume big-endian numbering.  To account for this, we must adjust the
permute control vector and reverse the order of the input registers on
the vperm instruction.

The existing test/CodeGen/PowerPC/vec_mul.ll is updated to be executed
on powerpc64 and powerpc64le targets as well as the original powerpc
(32-bit) target.

llvm-svn: 210474
2014-06-09 16:06:29 +00:00
Evgeniy Stepanov 70d1b0a818 [msan] Workaround for invalid origins in shufflevector.
Makes origin propagation ignore literal undef operands, and,
in general, any operand we don't have origin for.

https://code.google.com/p/memory-sanitizer/issues/detail?id=56

llvm-svn: 210472
2014-06-09 14:29:34 +00:00
Sasa Stankovic e435f5b2d4 [mips] Fix a bug for NaCl target - Don't report the error when non-dangerous
load/store is in branch delay slot.

Differential Revision: http://llvm-reviews.chandlerc.com/D4048

llvm-svn: 210470
2014-06-09 14:09:28 +00:00
Andrea Di Biagio dfbdc71ea1 [X86] Avoid emitting unnecessary test instructions.
This patch teaches the backend how to check for the 'NoSignedWrap' flag on
binary operations to improve the emission of 'test' instructions.

If the result of a binary operation is known not to overflow we know that
resetting the Overflow flag is unnecessary and so we can avoid emitting
the test instruction.

Patch by Marcello Maggioni.

llvm-svn: 210468
2014-06-09 12:34:50 +00:00
Andrea Di Biagio 4db1abea15 [DAG] Expose NoSignedWrap, NoUnsignedWrap and Exact flags to SelectionDAG.
This patch modifies SelectionDAGBuilder to construct SDNodes with associated
NoSignedWrap, NoUnsignedWrap and Exact flags coming from IR BinaryOperator
instructions.

Added a new SDNode type called 'BinaryWithFlagsSDNode' to allow accessing
nsw/nuw/exact flags during codegen.

Patch by Marcello Maggioni.

llvm-svn: 210467
2014-06-09 12:32:53 +00:00
Alexey Volkov 5260dba323 [X86] Use ADD/SUB instead of INC/DEC for Silvermont
According to Intel Software Optimization Manual 
on Silvermont INC or DEC instructions require 
an additional uop to merge the flags.
As a result, a branch instruction depending 
on an INC or a DEC instruction incurs a 1 cycle penalty.

Differential Revision: http://reviews.llvm.org/D3990

llvm-svn: 210466
2014-06-09 11:40:41 +00:00
Artyom Skrobov 82ae94f704 [AArch64] Missing aliases for CMP/CMN [W]SP with no shift
llvm-svn: 210464
2014-06-09 11:10:14 +00:00
Zoran Jovanovic 2855142ac5 [mips][mips64r6] Add LDPC instruction
Differential Revision: http://reviews.llvm.org/D3822

llvm-svn: 210460
2014-06-09 09:49:51 +00:00
Evgeniy Stepanov 2be29929be Fix line numbers for code inlined from __nodebug__ functions.
Instructions from __nodebug__ functions don't have file:line
information even when inlined into no-nodebug functions. As a result,
intrinsics (SSE and other) from <*intrin.h> clang headers _never_
have file:line information.

With this change, an instruction without !dbg metadata gets one from
the call instruction when inlined.

Fixes PR19001.

llvm-svn: 210459
2014-06-09 09:09:19 +00:00
Evgeniy Stepanov f7c29a9e25 [msan] Fix vector pack intrinsic handling.
This fixes a crash on MMX intrinsics, as well as a corner case in handling of
all unsigned pack intrinsics.

PR19953.

llvm-svn: 210454
2014-06-09 08:40:16 +00:00
Patrik Hagglund aad35e7fc4 Fix gcc warning (enumeral and non-enumeral type in conditional expression)
llvm-svn: 210450
2014-06-09 07:35:07 +00:00
Chad Rosier 3fe0c876c4 [AArch64] Fix the ordering of the accumulate operand in SchedRW list.
Patch by Dave Estes <cestes@codeaurora.org>
http://reviews.llvm.org/D4037

llvm-svn: 210446
2014-06-09 01:54:00 +00:00
Chad Rosier d96e9f14ee [AArch64] When combining constant mul of power of 2 plus/minus 1, prefer shift
plus add.  The shift can be folded into the add.  This only effects codegen
when the constant is 3.

llvm-svn: 210445
2014-06-09 01:25:51 +00:00
Jingyue Wu 5c7b1aed5d [SeparateConstOffsetFromGEP] inbounds zext => sext for better splitting
For each array index that is in the form of zext(a), convert it to sext(a)
if we can prove zext(a) <= max signed value of typeof(a). The conversion
helps to split zext(x + y) into sext(x) + sext(y).

Reviewed in http://reviews.llvm.org/D4060

llvm-svn: 210444
2014-06-08 23:49:34 +00:00
Craig Topper 66f09ad041 [C++11] Use 'nullptr'.
llvm-svn: 210442
2014-06-08 22:29:17 +00:00