Commit Graph

249895 Commits

Author SHA1 Message Date
Justin Lebar 3cf25461e0 [CUDA] Add --ptxas-path= flag.
Summary:
This lets you build with one CUDA installation but use ptxas from
another install.

This is useful e.g. if you want to avoid bugs in an old ptxas without
actually upgrading wholesale to a newer CUDA version.

Reviewers: tra

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D27788

llvm-svn: 289847
2016-12-15 18:44:57 +00:00
Sanjay Patel a97358bc8e [x86] use a single shufps for 256-bit vectors when it can save instructions
This is the 256-bit counterpart to the 128-bit transform checked in here:
https://reviews.llvm.org/rL289837

This patch is based on the draft by @sroland (Roland Scheidegger) that is
attached to PR27885:
https://llvm.org/bugs/show_bug.cgi?id=27885

llvm-svn: 289846
2016-12-15 18:43:46 +00:00
Matthew Simpson 2c8de192a1 [AArch64] Guard Misaligned 128-bit store penalty by subtarget feature
This patch checks that the SlowMisaligned128Store subtarget feature is set
when penalizing such stores in getMemoryOpCost.

Differential Revision: https://reviews.llvm.org/D27677

llvm-svn: 289845
2016-12-15 18:36:59 +00:00
Ahmed Bougacha 2a26a5f1f0 [AArch64][GlobalISel] Remove redundant RBI comments. NFC.
It's brittle, and Doxygen already picks the overriden method's comment
anyway.

llvm-svn: 289844
2016-12-15 18:22:15 +00:00
Teresa Johnson 1b859a2306 [ThinLTO] Ensure callees get hot threshold when first seen on cold path
This is split out from D27696, since it turned out to be a bug fix and
not part of the NFC efficiency change.

Keep the same adjusted (possibly decayed) threshold in both the worklist
and the ImportList. Otherwise if we encountered it first along a cold
path, the callee would be added to the worklist with a lower decayed
threshold than when it is later encountered along a hot path. But the
logic uses the threshold recorded in the ImportList entry to check if
we should re-add it, and without this patch the threshold recorded there
is the same along both paths so we don't re-add it. Using the
same possibly decayed threshold in the ImportList ensures we re-add it
later with the higher non-decayed hot path threshold.

llvm-svn: 289843
2016-12-15 18:21:01 +00:00
Chris Bieneman 1662da2832 [CMake] Ensure Python files are inside the LLDB framework bundle
When building the LLDB Framework we need to ensure that the Python files get put into the Framework before the Framework's install target can be invoked.

All files inside the Framework's Resources bundle will get copied over during the install action.

llvm-svn: 289842
2016-12-15 18:19:10 +00:00
Chris Bieneman 679d02f2a1 [CMake] Only support LLDB_BUILD_FRAMEWORK on CMake 3.7 and later
CMake's framework target generation was unable to generate POST_BUILD steps (see: https://gitlab.kitware.com/cmake/cmake/issues/16363).

It turns out working around this is really not reasonable. The more reasonable solution to me is just to not support LLDB.framework unless you are on CMake 3.7 or newer.

Since CMake 3.7.1 is released that's how I'm going to handle this.

llvm-svn: 289841
2016-12-15 18:18:47 +00:00
Chris Bieneman dc9b0db8e3 [CMake] Minor change to symlink generation for LLDB
If OUTPUT_DIR is not specified we can assume the symlink is linking to a file in the same directory, so we can use $<TARGET_FILE_NAME:${target}> to create a relative symlink.

In the case of LLDB, when we build a framework, we are creating symlinks in a different directory than the file we're pointing to, and we don't install those links. To make this work in the build directory we can use $<TARGET_FILE:${target}> instead, which uses the full path to the target.

llvm-svn: 289840
2016-12-15 18:17:07 +00:00
Ahmed Bougacha 24f5776216 [Driver] Bump default x86 cpu to Penryn when targeting macosx10.12+.
10.12 dropped support for all pre-Penryn Macs.

llvm-svn: 289839
2016-12-15 18:14:27 +00:00
Kostya Kortchinsky 47be0edfa3 [scudo] Use DefaultSizeClassMap for 32-bit
Summary:
With the recent changes to the Secondary, we use less bits for UnusedBytes,
which allows us in return to increase the bits used for Offset. That means
that we can use a Primary SizeClassMap allowing for a larger maximum size.

Reviewers: kcc, alekseyshl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27816

llvm-svn: 289838
2016-12-15 18:06:55 +00:00
Sanjay Patel a0d8a278a7 [x86] use a single shufps when it can save instructions
This is a tiny patch with a big pile of test changes.
This partially fixes PR27885:
https://llvm.org/bugs/show_bug.cgi?id=27885

My motivating case looks like this:

  - vpshufd {{.*#+}} xmm1 = xmm1[0,1,0,2]
  - vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3]
  - vpblendw {{.*#+}} xmm0 = xmm0[0,1,2,3],xmm1[4,5,6,7]

  + vshufps {{.*#+}} xmm0 = xmm0[0,2],xmm1[0,2]

And this happens several times in the diffs. For chips with domain-crossing penalties,
the instruction count and size reduction should usually overcome any potential 
domain-crossing penalty due to using an FP op in a sequence of int ops. For chips such
as recent Intel big cores and Atom, there is no domain-crossing penalty for shufps, so
using shufps is a pure win.

So the test case diffs all appear to be improvements except one test in 
vector-shuffle-combining.ll where we miss an opportunity to use a shift to generate 
zero elements and one test in combine-sra.ll where multiple uses prevent the expected
shuffle combining.

Differential Revision: https://reviews.llvm.org/D27692

llvm-svn: 289837
2016-12-15 18:03:38 +00:00
Kelvin Li 51336dd0b4 Fix typo in comment. NFC.
llvm-svn: 289836
2016-12-15 17:55:32 +00:00
Mike Aizatsky 94752697ee [sanitizers] dont dump coverage if not asked to
llvm-svn: 289835
2016-12-15 17:30:58 +00:00
Simon Pilgrim 7522f54feb [X86][SSE] Fix domains for scalar store instructions
As discussed on D27692

llvm-svn: 289834
2016-12-15 17:09:24 +00:00
Robert Lougher 6ea759a83e Revert "[SimplifyCFG] In sinkLastInstruction correctly set debugloc of common inst"
Reverting as it is causing buildbot failures (address sanitizer).

llvm-svn: 289833
2016-12-15 16:59:13 +00:00
Jacques Pienaar ccffe38352 [lanai] Simplify small section check in LowerGlobalAddress and treat ldata sections specially.
Move the check for the code model into isGlobalInSmallSectionImpl and return false (not in small section) for variables placed in sections prefixed with .ldata (workaround for a tool limitation).

llvm-svn: 289832
2016-12-15 16:56:16 +00:00
Kuba Mracek 659949cb03 [tsan] Add interceptor for libcxx __shared_count::__release_shared()
We already have an interceptor for __shared_weak_count::__release_shared, this patch handles __shared_count::__release_shared in the same way. This should get rid of TSan false positives when using std::future.

Differential Revision: https://reviews.llvm.org/D27797

llvm-svn: 289831
2016-12-15 16:45:28 +00:00
Simon Pilgrim ba46422694 [X86][AVX512] Moved instruction domain lookups to the right table. NFCI.
Avoid duplicating instructions in the int32/int64 domains.

llvm-svn: 289830
2016-12-15 16:38:51 +00:00
Saleem Abdulrasool 05b8fde8ee CodeGen: ubsan is built static on windows, give handlers local storage
The UBSAN runtime is built static on Windows.  This requires that we give local
storage always.  This impacts Windows where the linker would otherwise have to
generate a thunk to access the symbol via the IAT.  This should repair the
windows clang build bots.

llvm-svn: 289829
2016-12-15 16:30:20 +00:00
Robert Lougher cf17674211 [SimplifyCFG] In sinkLastInstruction correctly set debugloc of "common" inst
Simplify CFG will try to sink the last instruction in a series of basic blocks,
creating a "common" instruction in the successor block (sinkLastInstruction).
When it does this, the debug location of the single instruction should be the
merged debug locations of the commoned instructions.

Differential Revision: https://reviews.llvm.org/D27590

llvm-svn: 289828
2016-12-15 16:17:53 +00:00
George Rimar 6cbfce7785 [ELF] - Make LLD accept Ttext-segment X/Ttext-segment=X aliases for -Ttext.
It os used in work/emulators/qemu-user-static port.
Which tries to use -Ttext-segment and then:

# In case ld does not support -Ttext-segment, edit the default linker
# script via sed to set the .text start addr.  This is needed on FreeBSD
# at least.
<here it calls -verbose to extract and edit default bfd linker script.>

Actually now we are do not fully support -Ttext properly (see D27613),
but we also seems never will provide anything close to default script, like bfd do,
so at least this patch introduces proper alias handling.

llvm-svn: 289827
2016-12-15 16:12:34 +00:00
Krzysztof Parzyszek 0ca1987977 Fix ubsan failures in lane mask shifts
llvm-svn: 289826
2016-12-15 16:08:49 +00:00
Simon Pilgrim d7518896ff [X86][SSE] Fix domains for VZEXT_LOAD type instructions
Add the missing domain equivalences for movss, movsd, movd and movq zero extending loading instructions.

Differential Revision: https://reviews.llvm.org/D27684

llvm-svn: 289825
2016-12-15 16:05:29 +00:00
George Rimar 879a657680 [ELF] - Apply format (2). NFC.
llvm-svn: 289824
2016-12-15 15:38:58 +00:00
George Rimar 93c64025fc [ELF] - Apply format. NFC.
llvm-svn: 289823
2016-12-15 15:38:09 +00:00
Alexander Timofeev a57511c451 Fix for regression after Global Load Scalarization patch
llvm-svn: 289822
2016-12-15 15:17:19 +00:00
Hafiz Abid Qadeer f6ee79c926 Fix build for mingw.
Summary: I was building lldb using cross mingw-w64 toolchain on Linux and observed some issues. This is first patch in the series to fix that build. It mostly corrects the case of include files and adjusts some #ifdefs from _MSC_VER to _WIN32 and vice versa. I built lldb on windows with VS after applying this patch to make sure it does not break the build there.

Reviewers: zturner, labath, abidh

Subscribers: ki.stfu, mgorny, lldb-commits

Differential Revision: https://reviews.llvm.org/D27759

llvm-svn: 289821
2016-12-15 15:00:41 +00:00
Krzysztof Parzyszek 91b5cf8412 Extract LaneBitmask into a separate type
Specifically avoid implicit conversions from/to integral types to
avoid potential errors when changing the underlying type. For example,
a typical initialization of a "full" mask was "LaneMask = ~0u", which
would result in a value of 0x00000000FFFFFFFF if the type was extended
to uint64_t.

Differential Revision: https://reviews.llvm.org/D27454

llvm-svn: 289820
2016-12-15 14:36:06 +00:00
Simon Pilgrim 2f7f0e7a48 [CostModel][X86] Updated reverse shuffle costs
llvm-svn: 289819
2016-12-15 14:24:07 +00:00
Alexey Bataev 4160264e30 [TEST] Initial commit of tests for minmax horizontal reductions.
llvm-svn: 289817
2016-12-15 13:21:29 +00:00
Eric Liu 0c0aea0c0a [change-namespace] fix a case references to templated using alias are qualified types.
llvm-svn: 289816
2016-12-15 13:02:41 +00:00
Roman Gareev 2606c48a1d Restrict ranges of extension maps
To prevent copy statements from accessing arrays out of bounds, ranges of their
extension maps are restricted, according to the constraints of domains.

Reviewed-by: Michael Kruse <llvm@meinersbur.de>

Differential Revision: https://reviews.llvm.org/D25655

llvm-svn: 289815
2016-12-15 12:35:59 +00:00
Alexey Bataev 2db6045b29 Revert "[TESTS] Initial commit of tests, by Andrew Tischenko"
This reverts commit ee709f8988653a0334fbf100cdbbdd83a3933347.

llvm-svn: 289814
2016-12-15 12:26:18 +00:00
Ehsan Amiri 795b0671c5 [InstCombine] New opportunities for FoldAndOfICmp and FoldXorOfICmp
A number of new patterns for simplifying and/xor of icmp:

(icmp ne %x, 0) ^ (icmp ne %y, 0) => icmp ne %x, %y if the following is true:
1- (%x = and %a, %mask) and (%y = and %b, %mask)
2- %mask is a power of 2.

(icmp eq %x, 0) & (icmp ne %y, 0) => icmp ult %x, %y if the following is true:
1- (%x = and %a, %mask1) and (%y = and %b, %mask2)
2- Let %t be the smallest power of 2 where %mask1 & %t != 0. Then for any
   %s that is a power of 2 and %s & %mask2 != 0, we must have %s <= %t.
For example if %mask1 = 24 and %mask2 = 16, setting %s = 16 and %t = 8
violates condition (2) above. So this optimization cannot be applied.

llvm-svn: 289813
2016-12-15 12:25:13 +00:00
Alexey Bataev 3da2619b6f Revert "[TESTS] Initial commit of tests, by Andrew Tischenko"
This reverts commit 5898c713bee5e96aae87c73e11f3f4a7d19c74ed.

llvm-svn: 289812
2016-12-15 12:24:20 +00:00
Simon Pilgrim 9876ed07f6 [CostModel] Fix long standing bug with reverse shuffle mask detection
Incorrect 'undef' mask index matching meant that broadcast shuffles could be detected as reverse shuffles

llvm-svn: 289811
2016-12-15 12:12:45 +00:00
George Rimar ec02b8d4c0 [ELF] - Partial support of --gdb-index command line option (Part 3).
Patch continues work started in D24706 and D25821.

in this patch symbol table and constant pool areas were
added to .gdb_index section output.

This one finishes the implementation of --gdb-index functionality in LLD.

Differential revision: https://reviews.llvm.org/D26283

llvm-svn: 289810
2016-12-15 12:07:53 +00:00
Alexey Bataev 70f090d568 [TESTS] Initial commit of tests, by Andrew Tischenko
llvm-svn: 289809
2016-12-15 12:06:27 +00:00
Roman Gareev 15db81ef71 [NFC] Fix typos in getMacroKernelParams.
llvm-svn: 289808
2016-12-15 12:00:57 +00:00
Alexey Bataev 67c90c7d95 [TESTS] Initial commit of tests, by Andrew Tischenko
llvm-svn: 289807
2016-12-15 11:48:24 +00:00
Roman Gareev 8babe1a216 The order of the loops defines the data reused in the BLIS implementation of
gemm ([1]). In particular, elements of the matrix B, the second operand of
matrix multiplication, are reused between iterations of the innermost loop.
To keep the reused data in cache, only elements of matrix A, the first operand
of matrix multiplication, should be evicted during an iteration of the
innermost loop. To provide such a cache replacement policy, elements of the
matrix A can, in particular, be loaded first and, consequently, be
least-recently-used.

In our case matrices are stored in row-major order instead of column-major
order used in the BLIS implementation ([1]). One of the ways to address it is
to accordingly change the order of the loops of the loop nest. However, it
makes elements of the matrix A to be reused in the innermost loop and,
consequently, requires to load elements of the matrix B first. Since the LLVM
vectorizer always generates loads from the matrix A before loads from the
matrix B and we can not provide it. Consequently, we only change the BLIS micro
kernel and the computation of its parameters instead. In particular, reused
elements of the matrix B are successively multiplied by specific elements of
the matrix A .

Refs.:
[1] - http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf

Reviewed-by: Tobias Grosser <tobias@grosser.es>

Differential Revision: https://reviews.llvm.org/D25653

llvm-svn: 289806
2016-12-15 11:47:38 +00:00
Nemanja Ivanovic 552c8e960e [Power9] Allow AnyExt immediates for XXSPLTIB
In some situations, the BUILD_VECTOR node that builds a v18i8 vector by
a splat of an i8 constant will end up with signed 8-bit values and other
situations, it'll end up with unsigned ones. Handle both situations.

Fixes PR31340.

llvm-svn: 289804
2016-12-15 11:16:20 +00:00
Dylan McKay 4f590f28e7 [AVR] Support floats in the instrumention pass
This also refactors some common code into the 'GetTypeName' method.

llvm-svn: 289803
2016-12-15 11:02:41 +00:00
Eric Fiselier f34964bdd7 Fix XFAILS for is_trivially_destructible trait
llvm-svn: 289802
2016-12-15 11:00:07 +00:00
Pavel Labath 1f2c1b6ccd Remove linux/personality.h wrapper
This code is currently unused.

Removing it should make porting of the linux plugin to NetBSD easier, and we can
always add it later if needed.

llvm-svn: 289801
2016-12-15 10:47:40 +00:00
Simon Pilgrim 9ebeac3eed [CostModel][X86] Add tests for reverse shuffle costs
llvm-svn: 289800
2016-12-15 10:45:53 +00:00
Eric Liu 26cf68af3a [change-namespace] handling templated type aliases correctly.
Summary: This fixes templated type aliases and templated type aliases in classes.

Reviewers: hokein

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D27801

llvm-svn: 289799
2016-12-15 10:42:35 +00:00
Prakhar Bahuguna bc35f21f70 Add missing triple target for numeric section flag test
llvm-svn: 289798
2016-12-15 10:20:48 +00:00
Malcolm Parsons 8e67aa9a9b [clang-tidy] Enhance modernize-use-auto to templated function casts
Summary:
Use auto when declaring variables that are initialized by calling a templated
function that returns its explicit first argument.

Fixes PR26763.

Reviewers: aaron.ballman, alexfh, staronj, Prazek

Subscribers: Eugene.Zelenko, JDevlieghere, cfe-commits

Differential Revision: https://reviews.llvm.org/D27166

llvm-svn: 289797
2016-12-15 10:19:56 +00:00
George Rimar 232d11cb54 [ELF] - Attempt to fix ubuntu 64x buildbot (2).
Fixed inaccurate member type: uint32_t -> size_t
(http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/2984/steps/build/logs/stdio).

llvm-svn: 289796
2016-12-15 09:59:18 +00:00