Commit Graph

121771 Commits

Author SHA1 Message Date
Asaf Badouh 2744d21fb8 [X86][AVX512] extend support in Scalar conversion
add scalar FP to Int conversion with truncation intrinsics
add scalar conversion FP32 from/to FP64 intrinsics
add rounding mode and SAE mode encoding for these intrinsics

Differential Revision: http://reviews.llvm.org/D12665

llvm-svn: 248117
2015-09-20 14:31:19 +00:00
Igor Breger 4c4cd789c9 AVX512: vsqrtss/sd encoding and intrinsics implementation.
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D12102

llvm-svn: 248116
2015-09-20 09:13:41 +00:00
Asaf Badouh 572bbceecc [X86][AVX512DQ] Add fpclass instruction
Differential Revision: http://reviews.llvm.org/D12931

llvm-svn: 248115
2015-09-20 08:46:07 +00:00
Michael Kuperstein 58e86bc893 [X86] Fix sitofp and uitofp instruction matching failures with long double and avx512
The operation action for i32 and i64 cannot be set to legal, as long double 
needs custom lowering.

Patch by: mitch.l.bodart@intel.com
Differential Revision: http://reviews.llvm.org/D12372

llvm-svn: 248114
2015-09-20 08:12:17 +00:00
Igor Breger 1d55f20bee AVX512: Implemented intrinsics for vshuff32x4, vshuff64x2, vshufi64x2, vshufi32x4
Added tests for intrinsics.

Differential Revision: http://reviews.llvm.org/D12525

llvm-svn: 248113
2015-09-20 07:18:53 +00:00
Sanjoy Das 9119bf4c0b [IndVars] Don't repeat function names in comment; NFC.
Only changes comments.

llvm-svn: 248112
2015-09-20 06:58:03 +00:00
Igor Breger 0ede3cbb5c AVX512: Implement instructions encoding, lowering and intrinsics
vinserti64x4, vinserti64x2, vinserti32x8, vinserti32x4, vinsertf64x4, vinsertf64x2, vinsertf32x8, vinsertf32x4
Added tests for encoding, lowering and intrinsics.

Differential Revision: http://reviews.llvm.org/D11893

llvm-svn: 248111
2015-09-20 06:52:42 +00:00
Saleem Abdulrasool 4966f58ac2 ARM: cleanup formatting
clang-format a line which was poorly formatted.  NFC.

llvm-svn: 248110
2015-09-20 03:19:09 +00:00
Sanjoy Das 428db150d1 [IndVars] Fix a bug in r248045.
Because -indvars widens induction variables through arithmetic,
`NeverNegative` cannot be a property of the `WidenIV` (a `WidenIV`
manages information for all transitive uses of an IV being widened,
including uses of `-1 * IV`).  Instead it must live on `NarrowIVDefUse`
which manages information for a specific def-use edge in the transitive
use list of an induction variable.

This change also adds a test case that demonstrates the problem with
r248045.

llvm-svn: 248107
2015-09-20 01:52:18 +00:00
Davide Italiano e210ee56f2 Fixup r248096, commit the *correct* test.
llvm-svn: 248097
2015-09-19 20:52:47 +00:00
Davide Italiano a539f63ae1 [obj2yaml] Fix "time of check to time of use" bug. Add a test.
llvm-svn: 248096
2015-09-19 20:49:34 +00:00
Simon Pilgrim 27f81776ad [X86][AVX2] Use general sext IR for vpmovsx stack folding tests
llvm-svn: 248093
2015-09-19 17:04:18 +00:00
Simon Pilgrim d0448ee59f [X86][SSE] Vectorize CTTZ + CTTZ_ZERO_UNDEF
Now that we have fast vector CTPOP implementations we can use this to speed up vector CTTZ using the pattern (cttz(x) = ctpop((x & -x) - 1))

Additionally, for AVX512CD that provides lzcnt instructions we can use the pattern (cttz_undef(x) = (width - 1) - ctlz(x & -x))

Differential Revision: http://reviews.llvm.org/D12663

llvm-svn: 248091
2015-09-19 13:22:57 +00:00
Simon Pilgrim 996725eb17 [InstCombine] Use SimplifyDemandedVectorEltsLow helper function. NFCI.
Use the SimplifyDemandedVectorEltsLow helper function introduced in D12680.

llvm-svn: 248089
2015-09-19 11:41:53 +00:00
NAKAMURA Takumi 5881d349f9 [CMake] Update LLVM_TEST_DEPENDS not to use macho-dump. It has been unused since r247235.
llvm-svn: 248088
2015-09-19 07:19:30 +00:00
Matt Arsenault 1fafdc82d6 AMDGPU: Remove dead code
getCFGStructurizerRegClass is not used for SI, so
move it into R600 specific stuff.

llvm-svn: 248087
2015-09-19 06:41:10 +00:00
Bob Wilson 8823b84fae NFC: Fix indentation and add braces to clarify nested of else-statement.
llvm-svn: 248086
2015-09-19 06:20:59 +00:00
Maksim Panchenko 0510cd5161 [PrologEpilogInserter] Minor refactoring.
Differential Revision: http://reviews.llvm.org/D12924

llvm-svn: 248084
2015-09-19 04:42:15 +00:00
Maksim Panchenko 07b754daf8 Test commit. Fix comment. NFC.
llvm-svn: 248082
2015-09-19 04:01:19 +00:00
David Majnemer 47ce0b81b0 [InstCombine] FoldICmpCstShrCst failed for ashr when comparing against -1
(icmp eq (ashr C1, %V) -1) may have multiple answers if C1 is not a
power of two and has the sign bit set.

This fixes PR24873.

llvm-svn: 248074
2015-09-19 00:48:31 +00:00
David Majnemer e5977ebecc [InstCombine] FoldICmpCstShrCst didn't handle icmps of -1 in the ashr case correctly
llvm-svn: 248073
2015-09-19 00:48:26 +00:00
Matt Arsenault cc5d106263 AMDGPU: Add failing testcase for live interval construction
llvm-svn: 248067
2015-09-19 00:03:56 +00:00
Sanjoy Das f69d0e3384 [IndVars] Widen more comparisons for non-negative induction vars
Summary:
If an induction variable is provably non-negative, its sign extension is
equal to its zero extension.  This means narrow uses like

  icmp slt iNarrow %indvar, %rhs

can be widened into

  icmp slt iWide zext(%indvar), sext(%rhs)

Reviewers: atrick, mcrosier, hfinkel

Subscribers: hfinkel, reames, llvm-commits

Differential Revision: http://reviews.llvm.org/D12745

llvm-svn: 248045
2015-09-18 21:21:02 +00:00
Luke Larson e7dba51330 Fix typo and test commit
llvm-svn: 248042
2015-09-18 21:15:45 +00:00
Rafael Espindola 34b9ef680f This code never uses r_addend, so it can just use Elf_Rel.
llvm-svn: 248040
2015-09-18 21:12:38 +00:00
Chris Bieneman 895d96a919 [CMake] Adding ALWAYS_GENERATE option to symlink utility functions.
This implements the behavior required for clang symlinks which should be always generated.

llvm-svn: 248039
2015-09-18 21:08:32 +00:00
Davide Italiano 92a5e83528 [Object/ELF] Change comment to reflect reality.
llvm-svn: 248032
2015-09-18 20:41:15 +00:00
Cong Hou d40105d321 Update edge weights properly when merging blocks in if-conversion.
In if-conversion, there is a utility function MergeBlocks() that is used to merge blocks. However, when new edges are built in this function the edge weight is either not provided or not updated properly, leading to a modified CFG with incorrect edge weights. This patch corrects this issue.

Differential Revision: http://reviews.llvm.org/D12513

llvm-svn: 248030
2015-09-18 20:22:41 +00:00
Eric Christopher a835956bda Limit the range of processors supported by ARM fast isel to v6 or
later as that's all that is tested right now.

Fixes PR24858.

llvm-svn: 248027
2015-09-18 20:08:18 +00:00
Teresa Johnson 92191730f8 Remove couple of new Bitcode enum vals that snuck in via r247927 (NFC)
These are meant to be part of the follow on patch I am sending for
review shortly.

llvm-svn: 248023
2015-09-18 19:38:53 +00:00
Larisse Voufo 532bf7153c Clean up: Refactoring the hardcoded value of 6 for FindAvailableLoadedValue()'s parameter MaxInstsToScan. (Complete version of r247497. See D12886)
llvm-svn: 248022
2015-09-18 19:14:35 +00:00
James Y Knight e72b0dbf97 Make MachineScheduler debug output less confusing.
At least...a little bit.

llvm-svn: 248020
2015-09-18 18:52:20 +00:00
Cong Hou f9f9ffb98b Scaling up values in ARMBaseInstrInfo::isProfitableToIfCvt() before they are scaled by a probability to avoid precision issue.
In ARMBaseInstrInfo::isProfitableToIfCvt(), there is a simple cost model in which the number of cycles is scaled by a probability to estimate the cost. However, when the number of cycles is small (which is usually the case), there is a precision issue after the computation. To avoid this issue, this patch scales those cycles by 1024 (chosen to make the multiplication a litter faster) before they are scaled by the probability. Other variables are also scaled up for the final comparison.

Differential Revision: http://reviews.llvm.org/D12742

llvm-svn: 248018
2015-09-18 18:19:40 +00:00
Matthias Braun 77771cfd97 SelectionDAGDumper: Leave out the <multiple use> markers
They mostly clutter the output while it is still possible to see which
node has multiple users without them.

Differential Revision: http://reviews.llvm.org/D12569

llvm-svn: 248013
2015-09-18 17:57:33 +00:00
Matthias Braun bab3fb45e5 SelectionDAGDumper: Avoid unnecessary newlines
Before:
  t0 = EntryToken:ch

    t0: <multiple use>
        t0: <multiple use>
      t1 = CopyFromReg:v4f32,ch t0, Register:v4f32  %vreg0

      t25 = IMPLICIT_DEF:v4f32

    t26 = HADDPSrr:v4f32 t1, t25

  t23 = CopyToReg:ch,glue t0, Register:v4f32  %XMM0, t26

    t23: <multiple use>
    t23: <multiple use>
  t24 = RETQ:ch Register:v4f32  %XMM0, t23, t23:1

After:
    t0: <multiple use>
        t0: <multiple use>
      t1 = CopyFromReg:v4f32,ch t0, Register:v4f32  %vreg0
    t26 = X86ISD::FHADD:v4f32 t1, undef:v4f32
  t23 = CopyToReg:ch,glue t0, Register:v4f32  %XMM0, t26
    t23: <multiple use>
    t21 = TargetConstant:i16<0>
    t23: <multiple use>
  t24 = X86ISD::RET_FLAG:ch t23, t21, Register:v4f32  %XMM0, t23:1

Differential Revision: http://reviews.llvm.org/D12568

llvm-svn: 248012
2015-09-18 17:57:31 +00:00
Matthias Braun f89b7c7188 SelectionDAGDumper: Hide [ID=X], [ORD=X] and source locations by default.
You can show them with the new -dag-dump-verbose switch.

Differential Revision: http://reviews.llvm.org/D12566

llvm-svn: 248011
2015-09-18 17:57:28 +00:00
Matthias Braun 0b7d6c14c9 SelectionDAG: Introduce PersistentID to SDNode for assert builds.
This gives us more human readable numbers to identify nodes in debug
dumps.

Before:
  0x7fcbd9700160: ch = EntryToken

  0x7fcbd985c7c8: i64 = Register %RAX

   ...

      0x7fcbd9700160: <multiple use>
    0x7fcbd985c578: i64,ch = MOV64rm 0x7fcbd985c6a0, 0x7fcbd985cc68, 0x7fcbd985c200, 0x7fcbd985cd90, 0x7fcbd985ceb8, 0x7fcbd9700160<Mem:LD8[@foo]> [ORD=2]

  0x7fcbd985c8f0: ch,glue = CopyToReg 0x7fcbd9700160, 0x7fcbd985c7c8, 0x7fcbd985c578 [ORD=3]

    0x7fcbd985c7c8: <multiple use>
    0x7fcbd985c8f0: <multiple use>
    0x7fcbd985c8f0: <multiple use>
  0x7fcbd985ca18: ch = RETQ 0x7fcbd985c7c8, 0x7fcbd985c8f0, 0x7fcbd985c8f0:1 [ORD=3]

Now:
  t0: ch = EntryToken

  t5: i64 = Register %RAX

    ...

      t0: <multiple use>
    t3: i64,ch = MOV64rm t10, t12, t11, t13, t14, t0<Mem:LD8[@foo]> [ORD=2]

  t6: ch,glue = CopyToReg t0, t5, t3 [ORD=3]

    t5: <multiple use>
    t6: <multiple use>
    t6: <multiple use>
  t7: ch = RETQ t5, t6, t6:1 [ORD=3]

Differential Revision: http://reviews.llvm.org/D12564

llvm-svn: 248010
2015-09-18 17:41:00 +00:00
Chris Bieneman 08249706cd [CMake] More cleanup of installing symlinks.
In order to support building clang out-of-tree the install_symlink script needs to be installed, and it needs to be found by searching the CMAKE_MODULE_PATH.

This change renames install_symlink -> LLVMInstallSymlink so it doesn't conflict with naming from other projects, and adds searching behavior in AddLLVM.cmake

llvm-svn: 248009
2015-09-18 17:39:58 +00:00
Geoff Berry 43ec15e57e [AArch64] Improved bitfield instruction selection.
Summary:
For bitfield insert OR matching, check both operands for larger pattern
first before checking for smaller pattern.

Add pattern for unsigned bitfield insert-in-zero done with SHL+AND.

Resolves PR21631.

Reviewers: jmolloy, t.p.northover

Subscribers: aemerson, rengolin, llvm-commits, mcrosier

Differential Revision: http://reviews.llvm.org/D12908

llvm-svn: 248006
2015-09-18 17:11:53 +00:00
Rafael Espindola 7dbb577826 Remove temporary file on signal.
Without this lld leaves temporary files behind when it crashes.

llvm-svn: 247994
2015-09-18 15:17:53 +00:00
Yaron Keren a89b833c4f Simplify SmallBitVector::applyMask by consolidating common code for 32- and 64-bit builds
and assert when mask is too large to apply in the small case,
previously the extra words were silently ignored.
clang-format the entire function to match current code standards.

This is a rewrite of r247972 which was reverted in r247983 due to
warning and possible UB on 32-bits hosts.

llvm-svn: 247993
2015-09-18 15:08:24 +00:00
Daniel Sanders df19a5e605 [mips][microMIPS] Fix an invalid read for lwm32 and reserved reglist values.
Summary:
Some values of 'reglist' are reserved and cause the disassembler to read past
the end of the Regs array. Treat lwm32's containing reserved values as invalid
instructions.

Reviewers: zoran.jovanovic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12959

llvm-svn: 247990
2015-09-18 14:20:54 +00:00
Chad Rosier d90e2ebdf6 [AArch64] Reorder cases to improve readability. NFC.
llvm-svn: 247989
2015-09-18 14:15:19 +00:00
Chad Rosier 84a0afdeff [AArch64] Remove some redundant cases. NFC.
llvm-svn: 247988
2015-09-18 14:13:18 +00:00
Aaron Ballman 2d0f38c5fb Silencing a -Wsign-compare warning; NFC.
llvm-svn: 247986
2015-09-18 13:31:42 +00:00
Igor Laevsky 0fa4819dd8 [LazyValueInfo] Report nonnull range for nonnull pointers
Currently LazyValueInfo will report only alloca's as having nonnull range. 
For loads with !nonnull metadata it will bailout with no additional information. 
Same is true for calls returning nonnull pointers.

This change extends LazyValueInfo to handle additional nonnull instructions.

Differential Revision: http://reviews.llvm.org/D12932

llvm-svn: 247985
2015-09-18 13:01:48 +00:00
Artur Pilipenko 84bc62f7a3 Support align attribute for return values
Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D12844

llvm-svn: 247984
2015-09-18 12:33:31 +00:00
Aaron Ballman eda0a48e53 Reverting r247972 (and subordinate commit r247972) as the 32-bit left-shift is undefined behavior on implementations where uinptr_t is 32-bits. One such platform is Windows, MSVC, x86.
llvm-svn: 247983
2015-09-18 12:18:41 +00:00
Artur Pilipenko 253d71efeb Nit cleanup in LangRef about dereferenceable metadata
Reviewed By: vsk

Differential Revision: http://reviews.llvm.org/D12847

llvm-svn: 247982
2015-09-18 12:07:10 +00:00
Michael Kruse 020296a968 [Support] Reapply r245289 "Always wait for GraphViz before opening the viewer"
The change was accidentally undone by r245290.

Original log message:
When calling DisplayGraph and a PS viewer is chosen, two programs are executed: The GraphViz generator and the PostScript viewer. Always wait for the generator to finish to ensure that the .ps file is written before opening the viewer for that file. DisplayGraph's wait parameter refers to whether to wait until the user closes the viewer.

This happened on Windows and if none of the options to open the .dot file directly applies, also on Linux.

Differential Revision: http://reviews.llvm.org/D11876

llvm-svn: 247980
2015-09-18 10:56:30 +00:00