127380 Commits

Author SHA1 Message Date
Teresa Johnson 833571ecb4 Refactor PGO function naming and MD5 hashing support out of ProfileData
Summary:
Move the function renaming logic into the Function class, and the
MD5Hash routine into the MD5 header.

This will enable these routines to be shared with ThinLTO, which
will be changed to store the MD5 hash instead of full function name
in the combined index for significant size reductions. And using the same
function naming for locals in the function index facilitates future
integration with indirect call value profiles.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17006

llvm-svn: 260197
2016-02-09 05:12:44 +00:00
Nick Lewycky e5fa25a094 Use std::forward to make ErrorOr<T> constructible from a value that has a user-defined conversion to T. No functionality change intended.
llvm-svn: 260196
2016-02-09 04:47:58 +00:00
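The forwarding idea in the ErrorOr commit above can be sketched roughly as follows. This is a minimal illustration, not the actual ErrorOr code: `Holder`, `Celsius`, and `holdTemperature` are hypothetical names introduced here, and the constraint shown is only one plausible way to restrict the constructor.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a value wrapper; this is not LLVM's ErrorOr.
template <typename T> class Holder {
  T Value;

public:
  // Accept any U convertible to T and forward it into the stored value, so a
  // source with a user-defined conversion to T is accepted directly.
  template <typename U, typename = typename std::enable_if<
                            std::is_convertible<U, T>::value>::type>
  Holder(U &&Val) : Value(std::forward<U>(Val)) {}

  const T &get() const { return Value; }
};

// Hypothetical type with a user-defined conversion to double.
struct Celsius {
  double Degrees;
  operator double() const { return Degrees; }
};

// Builds a Holder<double> from a Celsius through Celsius::operator double().
inline double holdTemperature(const Celsius &C) {
  Holder<double> H(C);
  return H.get();
}
```

Calling `holdTemperature(Celsius{21.5})` stores 21.5 in the holder. Forwarding the argument rather than taking it by value avoids an extra copy and keeps move-only sources usable.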
Sanjoy Das ca2edc7ad5 [GMR/OperandBundles] Teach getModRefBehavior about operand bundles
In general, memory restrictions on a called function (e.g. readnone)
cannot be transferred to a CallSite that has operand bundles.  It is
possible to make this inference smarter, but let's fix the behavior to be
correct first.

llvm-svn: 260193
2016-02-09 02:31:47 +00:00
Richard Smith a64e1adf84 Remove TrailingObjects::operator delete. It's still suffering from
compiler-specific issues. Instead, repeat an 'operator delete' definition in
each derived class that is actually deleted, and give up on the static type
safety of an error when sized delete is accidentally used on a type derived
from TrailingObjects.

llvm-svn: 260190
2016-02-09 02:09:16 +00:00
David L Kreitzer 104364e6b5 Fix the LLVM_ENABLE_MODULES build after adding TargetOpcodes.def in r259726.
Differential Revision: http://reviews.llvm.org/D17005

llvm-svn: 260186
2016-02-09 01:35:45 +00:00
Sanjoy Das 1c481f50d2 Add an "addUsedAAAnalyses" helper function
Summary:
Passes that call `getAnalysisIfAvailable<T>` also need to call
`addUsedIfAvailable<T>` in `getAnalysisUsage` to indicate to the
legacy pass manager that it uses `T`.  This contract was being
violated by passes that used `createLegacyPMAAResults`.  This change
fixes this by exposing a helper in AliasAnalysis.h,
`addUsedAAAnalyses`, that is complementary to createLegacyPMAAResults
and does the right thing when called from `getAnalysisUsage`.

Reviewers: chandlerc

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D17010

llvm-svn: 260183
2016-02-09 01:21:57 +00:00
Sanjoy Das 55394d929c Remove SCEVAAWrapperPass from createLegacyPMAAResults; NFC
Summary:
createLegacyPMAAResults is only called by CGSCC and Module passes, so
the call to getAnalysisIfAvailable<SCEVAAWrapperPass>() never
succeeds (SCEVAAWrapperPass is a function pass).

Reviewers: chandlerc

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D17009

llvm-svn: 260182
2016-02-09 01:21:50 +00:00
Richard Smith 1b65c3279d Re-commit r259942 (reverted in r260053) with a different workaround for the MSVC bug.
This fixes undefined behavior in C++14 due to the size of the object being
deleted being different from sizeof(dynamic type) when it is allocated with
trailing objects.

MSVC seems to have several bugs around using-declarations changing the access
of a member inherited from a base class, so use forwarding functions instead of
using-declarations to make TrailingObjects::operator delete accessible where
desired.

llvm-svn: 260180
2016-02-09 01:03:42 +00:00
David Blaikie fed557ef76 Simplify some expressions involving unique_ptr and ErrorOr
llvm-svn: 260179
2016-02-09 01:02:24 +00:00
Wei Mi fc1cab305f This patch is to fix PR26529 caused by r259736.
IndVarSimplify assumes scAddRecExpr to be expanded in literal form instead of
canonical form by calling disableCanonicalMode after it creates SCEVExpander.
When CanonicalMode is disabled, SCEVExpander::expand should always return a PHI
node for scAddRecExpr. r259736 broke that assumption.

The fix is to let SCEVExpander::expand skip the reuse Value logic if
CanonicalMode is false.

Besides IndVarSimplify, the LSR pass also calls disableCanonicalMode
before doing its rewrite. We can remove the original check of LSRMode in the reuse
Value logic and use CanonicalMode instead.

llvm-svn: 260174
2016-02-09 00:07:08 +00:00
Davide Italiano 92e6c2896c [llvm-nm] Remove excessive parentheses, noticed by David Blaikie.
llvm-svn: 260173
2016-02-08 23:50:23 +00:00
Rong Xu d0dfb67fe1 [PGO] Revert r260146 as it breaks Darwin platforms.
r260146 | xur | 2016-02-08 13:07:46 -0800 (Mon, 08 Feb 2016) | 13 lines
[PGO] Differentiate Clang instrumentation and IR level instrumentation profiles

llvm-svn: 260170
2016-02-08 23:11:16 +00:00
Michael Zolotukhin 1da4afdfc9 Factor out UnrollAnalyzer to Analysis, and add unit tests for it.
Summary:
The Unrolling Analyzer is already pretty complicated, and it is becoming harder and harder to exercise it with the usual IR tests, since those can only check the final decision: whether the loop is unrolled or not. This change factors the framework out of LoopUnrollPass into Analysis, which makes it possible to use unit tests.
The change itself is supposed to be NFC, except adding a couple of tests.

I plan to add more tests as I add new functionality and find/fix bugs.

Reviewers: chandlerc, hfinkel, sanjoy

Subscribers: zzheng, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D16623

llvm-svn: 260169
2016-02-08 23:03:59 +00:00
Simon Pilgrim a207436b01 [X86][SSE1] Add MOVLHPS/MOVHLPS lowering and memory folding support
As discussed on PR26491, this patch adds support for lowering v4f32 shuffles to the MOVLHPS/MOVHLPS instructions. It also adds support for memory folding with their MOVLPS/MOVHPS load equivalents.

This first patch only really helps SSE1 targets as SSE2+ targets will widen the shuffle mask and use v2f64 equivalents (although they still combine to MOVLHPS/MOVHLPS for v2f64 splats). This will have to be addressed in a future patch, most likely when we add support for binary target shuffle combines.

Differential Revision: http://reviews.llvm.org/D16956

llvm-svn: 260168
2016-02-08 23:03:46 +00:00
Davide Italiano 71c85df860 [llvm-nm] Yet another attempt at simplifying code.
llvm-svn: 260166
2016-02-08 22:58:26 +00:00
Andrew Kaylor 1224488e0c [regalloc][WinEH] Do not mark intervals as not spillable if they contain a regmask
Differential Revision: http://reviews.llvm.org/D16831

llvm-svn: 260164
2016-02-08 22:52:51 +00:00
Justin Bogner 34a34aa89e llvm-cov: Fix reading gcov data that does not have function names
In order for recent gcov versions to read the coverage data, you have
to use UseCfgChecksum=true and FunctionNamesInData=false options for
coverage profiling pass. This is because gcov is expecting the
function section in .gcda to be exactly 3 words in size, containing
ident and two checksums.

While llvm-cov is compatible with UseCfgChecksum=true, it always
expects a function name in .gcda function sections (it's not
compatible with FunctionNamesInData=false). Thus it's currently
impossible to generate one set of coverage files that works with both
gcov and llvm-cov.

This change fixes the reading of coverage information to only read the
function name if it's present.

Patch by Arseny Kapoulkine. Thanks!

llvm-svn: 260162
2016-02-08 22:49:40 +00:00
Justin Bogner 740f2ca672 cmake: Use "set" instead of "option" for LLVM_ENABLE_LTO
Apparently option is for bools and cmake-gui will display this
strangely with option.

Pointed out by edward-san - thanks!

llvm-svn: 260154
2016-02-08 21:55:19 +00:00
Dan Gohman 06b4958260 [WebAssembly] Update the br_if instructions' operand orders to match the spec.
llvm-svn: 260152
2016-02-08 21:50:13 +00:00
Sanjay Patel 4d36bbaf19 rangify; NFC
llvm-svn: 260151
2016-02-08 21:32:43 +00:00
Rong Xu 1288a19421 [PGO] Differentiate Clang instrumentation and IR level instrumentation profiles
This patch uses one bit in profile version to differentiate Clang
instrumentation and IR level instrumentation profiles.

PGOInstrumentation generates a COMDAT variable __llvm_profile_raw_version so
that the compiler runtime can set the right profile kind.
PGOInstrumentation now checks this bit to make sure it's an IR level
instrumentation profile.

Differential Revision: http://reviews.llvm.org/D15540

llvm-svn: 260146
2016-02-08 21:07:46 +00:00
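The "one bit in profile version" scheme described in the commit above can be sketched as below. This is only an illustration under stated assumptions: the bit position and the names `IRLevelBit`, `makeRawVersion`, `isIRLevelProfile`, and `versionNumber` are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: the low bits hold the version number and one high bit marks
// an IR-level instrumentation profile. The bit position is hypothetical.
constexpr std::uint64_t IRLevelBit = 1ULL << 56;
constexpr std::uint64_t VersionMask = ~IRLevelBit;

// Combine a version number with the profile-kind flag.
inline std::uint64_t makeRawVersion(std::uint64_t Version, bool IsIRLevel) {
  assert((Version & IRLevelBit) == 0 && "version must not use the flag bit");
  return IsIRLevel ? (Version | IRLevelBit) : Version;
}

// True when the raw version word marks an IR-level profile.
inline bool isIRLevelProfile(std::uint64_t Raw) {
  return (Raw & IRLevelBit) != 0;
}

// Strip the flag bit to recover the plain version number.
inline std::uint64_t versionNumber(std::uint64_t Raw) {
  return Raw & VersionMask;
}
```

A reader of the raw version word can then reject a profile of the wrong kind before interpreting any of its records.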
Sanjay Patel 264d7e5b68 [x86] convert masked store of one element to scalar store
Another opportunity to reduce masked stores: in D16691, we decided not to attempt the 'one mask element is set'
transform in InstCombine, but this should be a win for any AVX machine.

Code comments note that this transform could be extended for other targets / cases.

Differential Revision: http://reviews.llvm.org/D16828

llvm-svn: 260145
2016-02-08 21:05:08 +00:00
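The transform in the masked-store commit above rests on a simple equivalence: when exactly one mask bit is set, a masked vector store writes the same memory as one scalar store. A minimal sketch of that equivalence in plain C++ follows; the 4-lane width and the names `Vec`, `maskedStore`, and `storeWithScalarFastPath` are assumptions for illustration and do not model the x86 lowering itself.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t Lanes = 4; // assumed vector width
using Vec = std::array<float, Lanes>;

// Reference semantics of a masked store: write only lanes whose bit is set.
inline void maskedStore(Vec &Mem, const Vec &Val, unsigned Mask) {
  for (std::size_t I = 0; I < Lanes; ++I)
    if (Mask & (1u << I))
      Mem[I] = Val[I];
}

// Same result as maskedStore, but uses one scalar store when exactly one
// mask bit is set. Returns true when the scalar path was taken.
inline bool storeWithScalarFastPath(Vec &Mem, const Vec &Val, unsigned Mask) {
  Mask &= (1u << Lanes) - 1; // ignore bits beyond the vector width
  const bool ExactlyOne = Mask != 0 && (Mask & (Mask - 1)) == 0;
  if (!ExactlyOne) {
    maskedStore(Mem, Val, Mask);
    return false;
  }
  std::size_t Lane = 0;
  while ((Mask & (1u << Lane)) == 0)
    ++Lane;
  Mem[Lane] = Val[Lane]; // single scalar store
  return true;
}
```

With a mask of 0x4 only lane 2 is written and the scalar path is taken; any mask with zero or several bits set falls back to the general masked store.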
Justin Bogner 9fd2039faf cmake: Accept "thin" or "full" as arguments to -DLLVM_ENABLE_LTO
Mehdi suggested in a review of r259766 that it's also useful to easily
set the type of LTO. Augment the cmake variable to support that.

llvm-svn: 260143
2016-02-08 21:01:24 +00:00
Xinliang David Li f5b462d1e7 Fix build bot failure
llvm-svn: 260138
2016-02-08 20:08:21 +00:00
Tom Stellard 309617645d AMDGPU/SI: Implement a work-around for smrd corrupting vccz bit
Summary:
We will hit this once we have enabled uniform branches.  The
smrd-vccz-bug.ll test will be added with the uniform branch commit.

Reviewers: mareko, arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D16725

llvm-svn: 260137
2016-02-08 19:49:20 +00:00
Hans Wennborg 303d3dd110 Add triple to h-registers-3.ll to make bots happy after r260133
llvm-svn: 260136
2016-02-08 19:45:24 +00:00
Hans Wennborg 850ec6ca18 [X86] Don't zero/sign-extend i1, i8, or i16 return values to 32 bits (PR22532)
This matches GCC and MSVC's behaviour, and saves on code size.

We were already not extending i1 return values on x86_64 after r127766. This
takes that patch further by applying it to the x86 target as well, and also for i8
and i16.

The ABI docs have been unclear about the required behaviour here. The new i386
psABI [1] clearly states (Table 2.4, page 14) that i1, i8, and i16 return
values do not need to be extended beyond 8 bits. The x86_64 ABI doc is being
updated to say the same [2].

Differential Revision: http://reviews.llvm.org/D16907

 [1]. https://01.org/sites/default/files/file_attach/intel386-psabi-1.0.pdf
 [2]. https://groups.google.com/d/msg/x86-64-abi/E8O33onbnGQ/_RFWw_ixDQAJ

llvm-svn: 260133
2016-02-08 19:34:30 +00:00
Tim Northover e316f76222 AArch64: match correct order in subtraction pattern.
The accumulator in multiply-and-subtract instructions is actually subtracted
*from* so these patterns were computing the wrong value.

llvm-svn: 260131
2016-02-08 19:33:18 +00:00
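The operand-order point in the AArch64 commit above comes down to the difference between "accumulator minus product" and "product minus accumulator". A minimal sketch, with hypothetical function names and no claim about the actual TableGen patterns:

```cpp
#include <cassert>
#include <cstdint>

// Multiply-and-subtract as the commit describes it: the product is
// subtracted from the accumulator.
inline std::int64_t mulSub(std::int64_t Acc, std::int64_t A, std::int64_t B) {
  return Acc - A * B;
}

// The reversed order, which yields the negated result.
inline std::int64_t mulSubReversed(std::int64_t Acc, std::int64_t A,
                                   std::int64_t B) {
  return A * B - Acc;
}
```

For an accumulator of 10 and factors 2 and 3, the first form gives 4 and the reversed form gives -4, which is why a pattern with the operands swapped computes the wrong value.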
Sanjay Patel e08381a529 fix typos; NFC
llvm-svn: 260130
2016-02-08 19:27:33 +00:00
Adrian Prantl 817c47bb42 Simplify this unittest.
Thanks to dblaikie for the suggestion!

llvm-svn: 260125
2016-02-08 19:13:15 +00:00
Matt Arsenault 92edab2df9 AMDGPU: Remove bfi and bfm intrinsics
Nothing is using them.

llvm-svn: 260123
2016-02-08 19:06:01 +00:00
Teresa Johnson d7e88e515c [ThinLTO] Remove imported available externally defs from comdats.
Summary:
Available externally definitions are considered declarations for the
linker and eventually dropped. As such they are not allowed to be
in comdats. Remove any such imported functions from comdats.

Reviewers: rafael

Subscribers: davidxl, llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D16120

llvm-svn: 260122
2016-02-08 18:47:20 +00:00
Xinliang David Li a82d6c0a4b [PGO] Enable compression in pgo instrumentation
This reduces sizes of instrumented object files, final binaries,
process images, and raw profile data.

The format of the indexed profile data remains the same.

Differential Revision: http://reviews.llvm.org/D16388 
 

llvm-svn: 260117
2016-02-08 18:13:49 +00:00
Silviu Baranga ea63a7f512 [SCEV][LAA] Re-commit r260085 and r260086, this time with a fix for the memory
sanitizer issue. The PredicatedScalarEvolution's copy constructor
wasn't copying the Generation value, and was leaving it un-initialized.

Original commit message:

[SCEV][LAA] Add no wrap SCEV predicates and use them to improve strided pointer detection

Summary:
This change adds no wrap SCEV predicates with:
  - support for runtime checking
  - support for expression rewriting:
      sext({x,+,y}) -> {sext(x),+,sext(y)}
      zext({x,+,y}) -> {zext(x),+,sext(y)}

Note that we are sign extending the increment of the SCEV, even for
the zext case. This is needed to cover the fairly common case where y would
be a (small) negative integer. In order to do this, this change adds two new
flags: nusw and nssw that are applicable to AddRecExprs and permit the
transformations above.

We also change isStridedPtr in LAA to be able to make use of
these predicates. With this feature we should now always be able to
work around overflow issues in the dependence analysis.

Reviewers: mzolotukhin, sanjoy, anemet

Subscribers: mzolotukhin, sanjoy, llvm-commits, rengolin, jmolloy, hfinkel

Differential Revision: http://reviews.llvm.org/D15412

llvm-svn: 260112
2016-02-08 17:02:45 +00:00
Adrian Prantl cbec16036b Add a unit test for r259973.
llvm-svn: 260111
2016-02-08 17:02:34 +00:00
Haicheng Wu b35f772b90 [JumpThreading] Change a return of ComputeValueKnownInPredecessors()
Change a return statement of ComputeValueKnownInPredecessors() to be the same as
the other return statements of the function. Otherwise, it might return true with
an empty Result when the current basic block has no predecessors and trigger the
first assert of JumpThreading::ProcessThreadableEdges().

llvm-svn: 260110
2016-02-08 17:00:39 +00:00
Matt Arsenault 2bba779272 SelectionDAG: Lower some range metadata to AssertZext
If a range has a lower bound of 0, add an AssertZext from the
nearest floor power of two.

This allows operations with some workitem intrinsics with known
maximum ranges to use fast 24-bit multiplies.

llvm-svn: 260109
2016-02-08 16:28:19 +00:00
Michael Zuckerman 529c27f408 [AVX512][PROLQ][PROLD] Change imm8 to int
Differential Revision: http://reviews.llvm.org/D16983

llvm-svn: 260101
2016-02-08 15:13:32 +00:00
Igor Breger 1a39a34eae [SLP] Fix placement of debug statement (NFC)
By Ayal Zaks (ayal.zaks@intel.com)

Differential Revision: http://reviews.llvm.org/D16976

llvm-svn: 260094
2016-02-08 14:11:39 +00:00
Igor Breger 78991582c3 AVX512: Change builtin function name for scalar intrinsics. Add "mask" to function name to reflect the function behavior.
Differential Revision: http://reviews.llvm.org/D16958

llvm-svn: 260089
2016-02-08 12:38:03 +00:00
Silviu Baranga 41b4973329 Revert r260086 and r260085. They have broken the memory
sanitizer bots.

llvm-svn: 260087
2016-02-08 11:56:15 +00:00
Silviu Baranga 70a98bb9e8 [LoopVersioning] Don't assert when there are no memchecks
We shouldn't assert when there are no memchecks, since we
can have SCEV checks. There is already an assert covering
the case where there are no SCEV checks or memchecks.

This also changes the LAA pointer wrapping versioning test
to use the loop versioning pass (this was how I managed to
trigger the assert in the loop versioning pass).

llvm-svn: 260086
2016-02-08 11:15:29 +00:00
Silviu Baranga a35fadc7c4 [SCEV][LAA] Add no wrap SCEV predicates and use them to improve strided pointer detection
Summary:
This change adds no wrap SCEV predicates with:
  - support for runtime checking
  - support for expression rewriting:
      sext({x,+,y}) -> {sext(x),+,sext(y)}
      zext({x,+,y}) -> {zext(x),+,sext(y)}

Note that we are sign extending the increment of the SCEV, even for
the zext case. This is needed to cover the fairly common case where y would
be a (small) negative integer. In order to do this, this change adds two new
flags: nusw and nssw that are applicable to AddRecExprs and permit the
transformations above.

We also change isStridedPtr in LAA to be able to make use of
these predicates. With this feature we should now always be able to
work around overflow issues in the dependence analysis.

Reviewers: mzolotukhin, sanjoy, anemet

Subscribers: mzolotukhin, sanjoy, llvm-commits, rengolin, jmolloy, hfinkel

Differential Revision: http://reviews.llvm.org/D15412

llvm-svn: 260085
2016-02-08 10:45:50 +00:00
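The sext rewrite in the commit above is only valid while the narrow recurrence does not wrap, which is what the runtime-checked predicate is meant to guarantee. The sketch below illustrates this with an 8-bit recurrence extended to 32 bits. The bit widths and the names `sextOfNarrowRec`, `wideRecOfSext`, and `rewriteHoldsFor` are assumptions chosen for illustration; this is not the ScalarEvolution implementation.

```cpp
#include <cassert>
#include <cstdint>

// sext({X,+,Y}) at iteration I: evaluate the recurrence in 8 bits (wrapping
// modulo 2^8), then sign-extend the result to 32 bits.
inline std::int32_t sextOfNarrowRec(std::int8_t X, std::int8_t Y,
                                    std::int32_t I) {
  const std::int32_t Exact = X + I * Y; // exact in 32 bits for small I
  const std::uint8_t Low = static_cast<std::uint8_t>(Exact); // low 8 bits
  return Low >= 128 ? static_cast<std::int32_t>(Low) - 256
                    : static_cast<std::int32_t>(Low);
}

// {sext(X),+,sext(Y)} at iteration I: extend first, then iterate in 32 bits.
inline std::int32_t wideRecOfSext(std::int8_t X, std::int8_t Y,
                                  std::int32_t I) {
  return static_cast<std::int32_t>(X) + I * static_cast<std::int32_t>(Y);
}

// True when both forms agree for iterations 0 .. Iterations-1, that is, when
// the narrow recurrence never wraps in the signed sense over that range.
inline bool rewriteHoldsFor(std::int8_t X, std::int8_t Y,
                            std::int32_t Iterations) {
  assert(Iterations >= 0 && Iterations < (1 << 16) && "keep 32-bit math exact");
  for (std::int32_t I = 0; I < Iterations; ++I)
    if (sextOfNarrowRec(X, Y, I) != wideRecOfSext(X, Y, I))
      return false;
  return true;
}
```

Starting at 100 with a step of -1, the two forms agree for 50 iterations. With a step of 10 they agree for the first three iterations but diverge at the fourth, where the 8-bit value 130 wraps to -126.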
Maxim Ostapenko b1e3f60fb9 [asan] Introduce new hidden -asan-use-private-alias option.
As discussed in https://github.com/google/sanitizers/issues/398, with the current
implementation of poisoning globals we can get CHECK failures or false
positives when mixing instrumented and non-instrumented code, because ASan
poisons innocent globals from the non-sanitized binary/library. We can use private
aliases to avoid such errors. In addition, to preserve ODR violation detection,
we introduce a new __odr_asan_gen_XXX symbol for each instrumented global that
indicates whether this global was already registered. To detect an ODR violation at
runtime, we only need to check the value of the indicator and report an error if it
isn't equal to zero.

Differential Revision: http://reviews.llvm.org/D15642

llvm-svn: 260075
2016-02-08 08:30:57 +00:00
Dan Gohman 4918702542 [WebAssembly] Add another optimization idea to README.txt.
llvm-svn: 260070
2016-02-08 03:42:36 +00:00
Craig Topper 3bb3f73be3 [X86] Change FeatureIFMA string to 'avx512ifma'. Matches gcc and fixes PR26461.
llvm-svn: 260069
2016-02-08 01:23:15 +00:00
Craig Topper 52fa32de18 [Support] Use hexdigit. NFC
llvm-svn: 260068
2016-02-08 01:03:01 +00:00
Craig Topper 686e595d41 [Support] Fix the examples and assertion for format_hex_no_prefix to take into account that there are no prefix characters to include in Width.
llvm-svn: 260067
2016-02-08 01:02:55 +00:00
NAKAMURA Takumi d0759abc19 Disable llvm/test/tools/llvm-profdata/value-prof.proftext on win32 for now. Investigating.
llvm-svn: 260064
2016-02-07 23:03:38 +00:00
Simon Pilgrim f116e4acc7 [X86][SSE] Resolve target shuffle inputs to sentinels to permit more combines
The combineX86ShufflesRecursively only supports unary shuffles, but was missing the opportunity to combine binary shuffles with a zero / undef second input.

This patch resolves target shuffle inputs, converting the shuffle mask elements to SM_SentinelUndef/SM_SentinelZero where possible. It then resolves the updated mask to check if we have created a faux unary shuffle.

Additionally, we now attempt to recursively call combineX86ShufflesRecursively for all input operands (we used to just recurse for unary integer shuffles and unary unpacks) - it safely returns early if it's not a target shuffle.

Differential Revision: http://reviews.llvm.org/D16683

llvm-svn: 260063
2016-02-07 22:51:06 +00:00
Simon Pilgrim a8d76d8741 [X86][SSE] Regenerate PSHUFB shuffle mask comments tests
llvm-svn: 260061
2016-02-07 22:22:09 +00:00
Daniel Berlin e19fa17016 Make check line consistent
llvm-svn: 260055
2016-02-07 20:57:46 +00:00
Nico Weber e40dca7285 Revert 259942, r259943, r259948.
The Windows bots have been failing for the last two days, with:

FAILED: C:\PROGRA~2\MICROS~1.0\VC\bin\amd64\cl.exe -c LLVMContextImpl.cpp
D:\buildslave\clang-x64-ninja-win7\llvm\lib\IR\LLVMContextImpl.cpp(137) :
    error C2248: 'llvm::TrailingObjects<llvm::AttributeSetImpl,
                                        llvm::IndexAttrPair>::operator delete' :
        cannot access private member declared in class 'llvm::AttributeSetImpl'
    TrailingObjects.h(298) : see declaration of
        'llvm::TrailingObjects<llvm::AttributeSetImpl,
                               llvm::IndexAttrPair>::operator delete'
    AttributeImpl.h(213) : see declaration of 'llvm::AttributeSetImpl'

llvm-svn: 260053
2016-02-07 20:09:18 +00:00
Keno Fischer 3c30544ace [docs] Add a note that the Visual Studio C++ tools are required
Watching new contributors trying to build LLVM on Windows, one of the
very common failure modes was getting a version of Visual Studio
that did not have a C++ compiler for CMake to pick up. Trying to create
a C++ project in Visual Studio will cause Visual Studio to go and
download the C++ tools.

llvm-svn: 260049
2016-02-07 19:36:54 +00:00
Philip Reames 656b3b4c5d [docs] Remove now confusing references to configure/autoconf
llvm-svn: 260042
2016-02-07 16:35:04 +00:00
Philip Reames 38a8a5fc1d [docs] Wordsmithing to program layout description in GettingStarted
This just incrementally improves what was already there; it's questionable whether this content belongs in the getting started guide at all.

Patch by Ben Nathanson w/permission w/minor edits by me.

llvm-svn: 260040
2016-02-07 16:23:32 +00:00
Philip Reames a8feaf6184 [docs] Clarify disk space usage of debug builds
llvm-svn: 260039
2016-02-07 15:58:35 +00:00
Roman Divacky 8a63c71699 Fix a typo.
llvm-svn: 260038
2016-02-07 15:50:55 +00:00
Philip Reames d244502a1a [docs] Remove a stale and confusing section from GettingStarted
The mentioned environment variable doesn't appear to have any use in the LLVM repository.  If it is still relevant for clang, we can consider adding it to the clang getting started page.

Patch inspired by documentation work by Ben Nathanson at the LLVM Bloomberg sprint.

llvm-svn: 260037
2016-02-07 15:49:57 +00:00
Philip Reames f03808162b [docs] Update the docs to describe how to build the docs with cmake
llvm-svn: 260035
2016-02-07 15:42:12 +00:00
Simon Pilgrim a3d674470c [X86][SSE] Added support for MOVHPD/MOVLPD + MOVHPS/MOVLPS shuffle decoding.
llvm-svn: 260034
2016-02-07 15:39:22 +00:00
Asaf Badouh ad5c3fc47d [X86][AVX512] add intrinsics of Scalar FP to integer conversion with rounding mode
Differential Revision: http://reviews.llvm.org/D16629

llvm-svn: 260033
2016-02-07 14:59:13 +00:00
Simon Pilgrim 73fc26b44a [X86][SSE] Pulled out repeated target shuffle decodes into helper functions. NFCI.
Pulled out the code used by PSHUFB/VPERMV/VPERMV3 shuffle mask decoding into common helper functions.

The helper functions handle masks coming from BROADCAST/BUILD_VECTOR and ConstantPool nodes respectively.

llvm-svn: 260032
2016-02-07 14:33:03 +00:00
Jeroen Ketema ece55b045b Fix typo in default getNoPreservedMask implementation
llvm-svn: 260026
2016-02-07 11:31:56 +00:00
Igor Breger 0aeda37464 AVX512: VPBROADCASTB/W/D/Q from GPR intrinsics implementation.
Differential Revision: http://reviews.llvm.org/D16813

llvm-svn: 260024
2016-02-07 08:30:50 +00:00
Duncan P. N. Exon Smith c917c7a7b1 LangRef: Fix example code for cmpxchg
Patch by Daniel Robertson!

llvm-svn: 260018
2016-02-07 05:06:35 +00:00
Daniel Berlin 905a646c24 Don't use module context here. It's unnecessary and makes it harder to write unittests
llvm-svn: 260015
2016-02-07 02:03:39 +00:00
Daniel Berlin 1b51a2957d Compute live-in for MemorySSA
llvm-svn: 260014
2016-02-07 01:52:19 +00:00
Daniel Berlin 7898ca658e Only insert into definingblocks once per block
llvm-svn: 260013
2016-02-07 01:52:15 +00:00
Simon Pilgrim 4108368a89 [X86][AVX2] Regenerated broadcast domain tests
llvm-svn: 260010
2016-02-06 22:09:25 +00:00
Simon Pilgrim 672808a853 [X86][SSE] Add tests for MOVHLPS/MOVLHPS shuffle lowering.
As raised in PR26491, we don't make use of these instructions at the moment.

llvm-svn: 260008
2016-02-06 20:11:52 +00:00
Simon Pilgrim 0acc32a3b3 [X86][AVX512] Added support for VPMOVZX shuffle decoding.
llvm-svn: 260007
2016-02-06 19:51:21 +00:00
Philip Reames c4139663ce [docs] Warn against slow serial builds
llvm-svn: 260006
2016-02-06 19:43:40 +00:00
Justin Lebar 1fdb5e6942 [NVPTX] Mark nvvm synchronizing intrinsics as convergent.
Summary:
This is the attribute purpose-made for e.g. __syncthreads.  It appears
that NoDuplicate may not be sufficient to prevent Sink from touching a
call to __syncthreads.

Reviewers: jingyue, hfinkel

Subscribers: llvm-commits, jholewinski, jhen, rnk, tra, majnemer

Differential Revision: http://reviews.llvm.org/D16941

llvm-svn: 260005
2016-02-06 19:32:44 +00:00
Philip Reames 39580a4a30 [docs] Redirect new contributors to the right starting point
llvm-svn: 260004
2016-02-06 19:29:23 +00:00
Philip Reames 9840241245 [docs] Clarify a couple of getting started issues identified during Sprint
llvm-svn: 260003
2016-02-06 19:20:26 +00:00
Simon Pilgrim 83e04913e5 [X86][AVX512] Fixed prefix ordering for lzcnt tests.
Let AVX512 targets share the same CHECKs.

llvm-svn: 260000
2016-02-06 18:07:19 +00:00
Simon Pilgrim a09a154c2e [X86][SSE] Regenerate vector shift tests
llvm-svn: 259999
2016-02-06 17:57:15 +00:00
Simon Pilgrim bfa5f236e4 [X86][SSE] Moved shuffle decode CASE macros earlier. NFC.
To allow the helper functions to make use of them.

llvm-svn: 259997
2016-02-06 17:02:15 +00:00
Simon Pilgrim e1b6db901f [X86][SSE] Refactored PMOVZX shuffle decoding to use scalar input types
First step towards being able to decode AVX512 PMOVZX instructions without a massive bloat in the shuffle decode switch statement.

This should also make it easier to decode X86ISD::VZEXT target shuffles in the future.

llvm-svn: 259995
2016-02-06 16:33:42 +00:00
Teresa Johnson 5e22e4461d [ThinLTO] Include linkage type in function summary
Summary:
Adds the linkage type to both the per-module and combined function
summaries, which subsumes the current islocal bit. This will eventually
be used to optimize linkage types based on global summary-based
analysis.

Reviewers: joker.eph

Subscribers: joker.eph, davidxl, llvm-commits

Differential Revision: http://reviews.llvm.org/D16943

llvm-svn: 259993
2016-02-06 16:07:35 +00:00
Simon Pilgrim 63b1ecab7d line endings fix
llvm-svn: 259992
2016-02-06 15:38:25 +00:00
Simon Pilgrim 9e369f2a51 [X86][SSE] Don't replace an existing 32-bit load with its duplicate
If we are already loading a single 32-bit float/integer then just reuse it.

Fix for regression in D16729

llvm-svn: 259991
2016-02-06 15:37:09 +00:00
Simon Pilgrim 11e4d1146f Comment fix
llvm-svn: 259990
2016-02-06 14:21:49 +00:00
Ashutosh Nema 3bc6d46e62 Corrected tests for Loop Versioning LICM, by adding “REQUIRES: asserts”.
Earlier they were failing under no-assert build.

llvm-svn: 259989
2016-02-06 12:34:41 +00:00
Ashutosh Nema 5f0e4726e9 Fixed short underline error in LangRef.rst for recently added
metadata 'llvm.loop.licm_versioning.disable' description.

llvm-svn: 259988
2016-02-06 09:24:37 +00:00
Ashutosh Nema df6763abe8 New Loop Versioning LICM Pass
Summary:
When alias analysis is uncertain about the aliasing between any two accesses,
it will return MayAlias. This uncertainty from alias analysis restricts LICM
from proceeding further. In cases where alias analysis is uncertain we might
use loop versioning as an alternative.

Loop Versioning will create a version of the loop with aggressive aliasing
assumptions in addition to the original with conservative (default) aliasing
assumptions. The version of the loop making aggressive aliasing assumptions
will have all the memory accesses marked as no-alias. These two versions of
loop will be preceded by a memory runtime check. This runtime check consists
of bound checks for all unique memory accessed in loop, and it ensures the
lack of memory aliasing. The result of the runtime check determines which of
the loop versions is executed: If the runtime check detects any memory
aliasing, then the original loop is executed. Otherwise, the version with
aggressive aliasing assumptions is used.

The pass is off by default and can be enabled with command line option 
-enable-loop-versioning-licm.

Reviewers: hfinkel, anemet, chatur01, reames

Subscribers: MatzeB, grosser, joker.eph, sanjoy, javed.absar, sbaranga,
             llvm-commits

Differential Revision: http://reviews.llvm.org/D9151

llvm-svn: 259986
2016-02-06 07:47:48 +00:00
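The versioning scheme described in the Loop Versioning LICM commit above can be sketched by hand in source form: a runtime overlap check selects between the original loop and a copy in which the invariant load has been hoisted. This is an illustrative sketch only. The names `mayOverlap` and `scaleAll` are hypothetical, and the pass itself works on LLVM IR rather than C++ source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Runtime memory check: true when the int at P overlaps [Begin, Begin+Count).
inline bool mayOverlap(const int *Begin, std::size_t Count, const int *P) {
  const auto Lo = reinterpret_cast<std::uintptr_t>(Begin);
  const auto Hi = reinterpret_cast<std::uintptr_t>(Begin + Count);
  const auto Addr = reinterpret_cast<std::uintptr_t>(P);
  return Addr + sizeof(int) > Lo && Addr < Hi;
}

// Multiplies every element of Dst by *Scale. Returns true when the version
// with aggressive (no-alias) assumptions was executed.
inline bool scaleAll(int *Dst, std::size_t Count, const int *Scale) {
  assert(Dst != nullptr && Scale != nullptr);
  if (mayOverlap(Dst, Count, Scale)) {
    // Original loop: *Scale may change as Dst is written, so reload it.
    for (std::size_t I = 0; I < Count; ++I)
      Dst[I] *= *Scale;
    return false;
  }
  // Versioned loop: no aliasing, so the load is hoisted out of the loop.
  const int Hoisted = *Scale;
  for (std::size_t I = 0; I < Count; ++I)
    Dst[I] *= Hoisted;
  return true;
}
```

When `Scale` points outside `Dst`, the hoisted version runs. When it points at `Dst[0]`, the check fails and the original loop runs, so the stores to `Dst[0]` are still observed by later iterations.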
Adrian Prantl b925b9c85f Relax assertion in ReplaceableMetadataImpl::replaceAllUsesWith().
There is a legitimate use-case in clang where we need to replace a
temporary placeholder node with the temporary node that may be a
forward declaration.

<rdar://problem/24493203>

llvm-svn: 259973
2016-02-06 01:56:55 +00:00
David Blaikie 23919372d1 [llvm-dwp] Merge cu_index from DWPs
This is almost feature complete - just missing tu_index merging now.

llvm-svn: 259971
2016-02-06 01:15:26 +00:00
Lang Hames 120a9b418b [Orc] Slightly improve the x86-64 resolver block machine code.
Replace leaq + movq of a pointer with a single movabsq.

llvm-svn: 259968
2016-02-06 00:55:08 +00:00
Richard Smith dc1414b3f9 llvm-bcanalyzer: Produce summary information for the BLOCKINFO block, it can be
a significant fraction of the file size (for files that otherwise have few
records). Also include an average size per record in the summary information.

llvm-svn: 259965
2016-02-06 00:46:09 +00:00
George Burgess IV 304ccee528 Add note of suboptimal behavior in MemorySSA. NFC.
llvm-svn: 259963
2016-02-06 00:42:52 +00:00
Evandro Menezes d761ca2308 [AArch64] Add the scheduling model for Exynos-M1
Summary:
Add the core scheduling model for the Samsung Exynos-M1 (ARMv8-A).


Reviewers: jmolloy, rengolin, christof, MinSeongKIM, t.p.northover

Subscribers: aemerson, rengolin, MatzeB

Differential Revision: http://reviews.llvm.org/D16644

llvm-svn: 259958
2016-02-06 00:01:41 +00:00
Sanjoy Das 86d7d83f2a [StatepointLower] Use None instead of Optional<int>()
llvm-svn: 259956
2016-02-05 23:40:04 +00:00
Eric Christopher 76f6e70bc7 Make the OCaml tests temporarily unsupported until they can be updated.
llvm-svn: 259954
2016-02-05 23:28:03 +00:00
Lang Hames d677fa8332 [Orc] Fix a typo in the comments for the x86_64 resolver block.
llvm-svn: 259953
2016-02-05 23:27:48 +00:00
Xinliang David Li 1d90b73a3d Variable naming style fix /nfc
llvm-svn: 259952
2016-02-05 23:24:42 +00:00
Richard Smith ef9ac7a512 Attempt #2 to work around MSVC rejects-valid.
llvm-svn: 259948
2016-02-05 23:05:09 +00:00
Richard Smith 4838bd9c22 Attempt to work around an MSVC rejects-valid. Apparently it gets the access
check wrong when inheriting a member through two levels of private inheritance,
where the middle one is a class template specialization.

llvm-svn: 259943
2016-02-05 22:48:19 +00:00
Richard Smith ebfdf26d93 More workarounds for undefined behavior exposed when compiling in C++14 with
-fsized-deallocation. Disable sized deallocation for all objects derived from
TrailingObjects, as we expect the storage allocated for these objects to be
larger than the size of their dynamic type.

llvm-svn: 259942
2016-02-05 22:32:52 +00:00
Xinliang David Li 6219836edd [PGO] Speed up name tab reading
The change allows skipping duplicate strings early to avoid redundant
md5 computation and string copying/swapping.

llvm-svn: 259941
2016-02-05 22:32:01 +00:00
Davide Italiano da57013776 [llvm-nm] Prefer empty() over size() == 0.
Thanks to David Blaikie for pointing this out!

llvm-svn: 259938
2016-02-05 22:10:42 +00:00
Davide Italiano d535365794 [llvm-nm] Transform a switch() statement into a pair of if(s).
This is more uniform wrt what other tools do and makes the code
a little bit more readable.

llvm-svn: 259937
2016-02-05 22:07:09 +00:00
Davide Italiano a090a00e45 [llvm-nm] Simplify code logic. NFCI.
llvm-svn: 259917
2016-02-05 21:10:48 +00:00
Hans Wennborg 00ab73dcb0 CallAnalyzer::analyzeCall: change the condition back to "Cost < Threshold"
In r252595, I inadvertently changed the condition to "Cost <= Threshold",
which caused a significant size regression in Chrome. This commit rectifies
that.

llvm-svn: 259915
2016-02-05 20:32:42 +00:00
Jun Bum Lim 1de2d44dcf [AArch64] Refactoring aarch64-ldst-opt. NFC.
Remove narrow load / store instructions from getMatchingPairOpcode(),
and add getMatchingWideOpcode().

llvm-svn: 259914
2016-02-05 20:02:03 +00:00
Tom Stellard b9f235e5ce TableGen: Add IsOptional field to AsmOperandClass
Summary:
This makes it possible to specify some operands as optional to the AsmMatcher.
Setting this field to true will prevent the AsmMatcher from emitting
'too few operands' errors when there are missing optional operands.

Reviewers: olista01, ab

Subscribers: nhaustov, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15755

llvm-svn: 259913
2016-02-05 19:59:33 +00:00
Matt Arsenault 7f83397d72 AMDGPU: Account for LDS alignment
The current situation isn't great, because the amount of padding
required is determined by the inverse order of the first encountered
use. We should eventually somehow sort these to minimize wasted space.

Another problem is that the alignment of kernel arguments isn't
respected. The group_segment_alignment is always emitted as
the default 16, and typed arguments with higher alignments
or an explicitly set alignment are also ignored.

llvm-svn: 259912
2016-02-05 19:47:29 +00:00
Matt Arsenault cf84e26fb6 AMDGPU: Preserve alignments on newly created globals
Also switch to internal linkage, and include the name of the function in
the name.

llvm-svn: 259911
2016-02-05 19:47:23 +00:00
Reid Kleckner 98762d2429 [codeview] Dump a missing field and change its signedness
llvm-svn: 259904
2016-02-05 19:15:45 +00:00
Tom Stellard 1242ce9695 AMDGPU: Remove some purely R600 functions from AMDGPUInstrInfo
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D16862

llvm-svn: 259900
2016-02-05 18:44:57 +00:00
Tom Stellard 5dde1d2eb3 AMDGPU: Fix ordering of CPU and FS parameters in TargetMachine constructors
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D16863

llvm-svn: 259897
2016-02-05 18:29:17 +00:00
Reid Kleckner 6da9115e53 Fix echo.ll test failing due to DOS line endings
llvm-svn: 259896
2016-02-05 18:21:28 +00:00
Wei Mi a62f058989 Some stackslots are allocated to vregs which have no real reference.
LiveRangeEdit::eliminateDeadDef is used to remove dead define instructions
after rematerialization. To remove a VNI for a vreg from its LiveInterval,
LiveIntervals::removeVRegDefAt is used. However, after non-PHI VNIs are all
removed, PHI VNI are still left in the LiveInterval. Such unused vregs will
be kept in RegsToSpill[] at the end of InlineSpiller::reMaterializeAll and
spiller will allocate stackslot for them.

The fix is to get rid of unused reg by checking whether it has non-dbg
reference instead of whether it has non-empty interval.

llvm-svn: 259895
2016-02-05 18:14:24 +00:00
Tom Stellard 6e1967ef66 AMDGPU/SI: Correctly initialize SIInsertWaits pass
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D16724

llvm-svn: 259894
2016-02-05 17:42:38 +00:00
Dan Gohman d46b09267b [WebAssembly] Update the select instructions' operand orders to match the spec.
llvm-svn: 259893
2016-02-05 17:14:59 +00:00
Nemanja Ivanovic d05e072b74 Add the missing test case for PR26193
llvm-svn: 259888
2016-02-05 15:03:17 +00:00
Nemanja Ivanovic d389c7a3cc Fix for PR 26193
This is a simple fix for a PowerPC intrinsic that was incorrectly defined
(the return type was incorrect).

llvm-svn: 259886
2016-02-05 14:50:29 +00:00
Benjamin Kramer 85c824f131 Move classes defined in a cpp file into an anonymous namespace.
No functionality change intended.

llvm-svn: 259883
2016-02-05 13:50:53 +00:00
Benjamin Kramer 9a3bd23668 Prefix external symbols in llvm-c-test.
This makes it less likely to clash with other stuff that might be linked
in by chance, e.g. ncurses exposes an external function called simply
"echo", so linking ncurses statically into the binary explodes in funny
ways.

llvm-svn: 259882
2016-02-05 13:31:14 +00:00
Renato Golin 6274e5222d Revert "[AArch64] Improve load/store optimizer to handle LDUR + LDR (take 3)."
This reverts commit r259812 as it broke AArch64 self-hosting.

llvm-svn: 259881
2016-02-05 12:14:30 +00:00
Dmitry Polukhin f4124a72c1 [DebugInfo] Eliminate compilation warning about unused variable LSDA
The warning was:
lib/DebugInfo/DWARF/DWARFDebugFrame.cpp:643:20: warning: variable ‘LSDA’ set but not used

llvm-svn: 259877
2016-02-05 09:24:34 +00:00
Michael Zolotukhin 73957179d3 [LoopUnrolling] Try harder to avoid rebuilding LCSSA when possible.
In r255133 (reapplied r253126) we started to avoid redundant
recomputation of LCSSA after loop-unrolling. This patch moves one step
further in this direction - now we can avoid it for a much wider range of
loops, as we start to look at IR and try to figure out if the
transformation actually breaks LCSSA phis or makes it necessary to
insert new ones.

Differential Revision: http://reviews.llvm.org/D16838

llvm-svn: 259869
2016-02-05 02:17:36 +00:00
David Majnemer 408b5e6603 [MC] Add support for encoding CodeView variable definition ranges
CodeView, like most other debug formats, represents the live range of a
variable so that debuggers might print them out.

They use a variety of records to represent how a particular variable
might be available (in a register, in a frame pointer, etc.) along with
a set of ranges where this debug information is relevant.

However, the format only allows us to use ranges which are limited to a
maximum of 0xF000 in size.  This means that we need to split our debug
information into chunks of 0xF000.

Because the layout of code is not known until *very* late, we must use a
new fragment to record the information we need until we can know
*exactly* what the range is.

llvm-svn: 259868
2016-02-05 01:55:49 +00:00
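Note on r259868: the chunking constraint described in that message can be sketched as follows. This is only an illustration under stated assumptions: `Range`, `MaxChunk`, and `splitRange` are hypothetical names, not the MC layer's actual types or API, and the real implementation defers the split until code layout is known.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a code range: start offset plus size in bytes.
struct Range {
  uint32_t Start;
  uint32_t Size;
};

// Maximum range size, taken from the commit message above.
constexpr uint32_t MaxChunk = 0xF000;

// Split [Start, Start + Size) into consecutive pieces no larger than MaxChunk.
std::vector<Range> splitRange(uint32_t Start, uint32_t Size) {
  std::vector<Range> Chunks;
  while (Size > 0) {
    uint32_t ChunkSize = Size < MaxChunk ? Size : MaxChunk;
    Chunks.push_back({Start, ChunkSize});
    Start += ChunkSize;
    Size -= ChunkSize;
  }
  return Chunks;
}
```

For example, a range of 0x1F000 bytes would come out as two full chunks followed by a 0x1000-byte remainder.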
Joseph Tremoulet adc2376375 [RS4GC] Pass DenseMap by reference, NFC
Summary:
Passing the rematerialized values map to insertRematerializationStores by
value looks to be a simple oversight; update it to pass by reference.


Reviewers: reames, sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16911

llvm-svn: 259867
2016-02-05 01:42:52 +00:00
Amaury Sechet b6df435db9 Add various binary operations in the LLVM C API echo test
Summary: This diff increases the tested surface of the C API.

Reviewers: bogner, chandlerc, echristo, dblaikie, joker.eph, Wallbraker

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16910

llvm-svn: 259863
2016-02-05 01:27:11 +00:00
Adam Nemet 9455c1d2b1 [LoopLoadElim] Don't allow versioning when optForSize
This was requested in the review of D16300.

llvm-svn: 259861
2016-02-05 01:14:05 +00:00
Adam Nemet 0cf866ac6c Fix typo in comment
llvm-svn: 259860
2016-02-05 01:14:00 +00:00
Matt Arsenault 5923973fe2 Fix printing of f16 machine operands
Only single and double FP immediates are correctly printed by
MachineInstr::print() during debug output. Half float type goes to
APFloat::convertToDouble() and hits an assertion because it does not have
double semantics. This diff prints half machine operands correctly.

This cannot currently be hit by any in-tree target.

Patch by Stanislav Mekhanoshin

llvm-svn: 259857
2016-02-05 00:50:18 +00:00
Easwaran Raman 312b34fc30 Fix build breakage introduced by r259846.
llvm-svn: 259855
2016-02-05 00:45:02 +00:00
George Burgess IV 43d8365ecb Add a test for MemorySSA. NFC.
We don't currently have many tests that deal with operations on multiple
local MemoryLocations. This new test helps out a bit in that regard.

llvm-svn: 259854
2016-02-05 00:42:02 +00:00
Amaury Sechet f01be74f16 Add Support to llvm-c-test dependencies
Summary: As per title. It is required and doesn't get linked in in some builds.

Reviewers: chapuni, joker.eph

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16903

llvm-svn: 259853
2016-02-05 00:19:50 +00:00
Xinliang David Li eb7d7f8729 Function name change /NFC
llvm-svn: 259851
2016-02-04 23:59:09 +00:00
Easwaran Raman d68aae24e4 Refactor profile summary support code. NFC.
Summary computation is not just for instrumented profiling and so I have moved
the ProfileSummary class to ProfileCommon.h (named so to allow code unrelated
to summary but common to instrumented and sampled profiling to be placed there)

Differential Revision: http://reviews.llvm.org/D16661

llvm-svn: 259846
2016-02-04 23:34:31 +00:00
Amaury Sechet e8ea7d8b1d Improve testing for the C API
Summary:
This basically adds an echo test case in C. The support is limited right now, but full support would just be too much to review at once.

The echo test case simply gets a module as input and tries to output the exact same module. This allows checking that both the reading and writing APIs work as expected.

I want to improve this test over time to support more and more of the API, in order to improve coverage (coverage is quite poor right now).

Test Plan: Run the test.

Reviewers: chandlerc, bogner

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10725

llvm-svn: 259844
2016-02-04 23:26:19 +00:00
Nemanja Ivanovic b6fdce4ca0 Fix for PR 26356
Using the load immediate only when the immediate (whether signed or unsigned)
can fit in a 16-bit signed field. Namely, from -32768 to 32767 for signed and
0 to 65535 for unsigned. This patch also ensures that we sign-extend under the
right conditions.

llvm-svn: 259840
2016-02-04 23:14:42 +00:00
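Note on r259840: the two immediate ranges quoted in that message can be written down as a small predicate. This is a hedged sketch, not the PowerPC backend's code; `fitsLoadImmediate` is a hypothetical name, and only the numeric bounds come from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when Imm lies in the range the commit message gives for
// using a load immediate: [-32768, 32767] when the value is treated as
// signed, [0, 65535] when it is treated as unsigned.
bool fitsLoadImmediate(int64_t Imm, bool IsSigned) {
  if (IsSigned)
    return Imm >= -32768 && Imm <= 32767;
  return Imm >= 0 && Imm <= 65535;
}
```

Anything outside these bounds would have to be materialized some other way; how the backend does that is not described in the message.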
Evandro Menezes 898acf9db8 Fix typo and test commit
llvm-svn: 259839
2016-02-04 23:07:57 +00:00
Nemanja Ivanovic 220b4fe4a9 Provide a test case for r259798
llvm-svn: 259835
2016-02-04 22:36:10 +00:00
Chad Rosier 35706ad6bb [AArch64] Bound the number of instructions we scan when searching for updates.
This only impacts the creation of pre-/post-index instructions.  The bound was
set high enough such that it did not change code generation for SPEC200X.

llvm-svn: 259828
2016-02-04 21:26:02 +00:00
Vedant Kumar 4de5cb8aef [docs] Fix typo in YamlIO.rst
Patch by Mario Lang!

llvm-svn: 259825
2016-02-04 20:42:43 +00:00
Niels Ole Salscheider fc81413531 Install cmake files to lib/cmake/llvm
This is the right location for platform-specific files.

On some distributions (e. g. Exherbo), a package can be installed for several
architectures in parallel, but the architecture-independent files are shared.
Therefore, we must not install architecture-dependent files (like the CMake
config and export files) to share/.

llvm-svn: 259821
2016-02-04 20:08:19 +00:00
Simon Pilgrim 7823fd2535 [X86][SSE] Select domain for 32/64-bit partial loads for EltsFromConsecutiveLoads
Choose between MOVD/MOVSS and MOVQ/MOVSD depending on the target vector type.

This has a lot fewer test changes than trying to add this to X86InstrInfo::setExecutionDomain.....

llvm-svn: 259816
2016-02-04 19:27:51 +00:00
Wei Mi 33e7bc0029 Fix a regression for r259736.
When SCEV expansion tries to reuse an existing value, it must ensure
that using the Value at the InsertPt will not break LCSSA. The fix adds a
check that InsertPt is either inside the candidate Value's parent loop, or
the candidate Value's parent loop is nullptr.

llvm-svn: 259815
2016-02-04 19:17:33 +00:00
Xinliang David Li b56be389c8 Fix format in comment
llvm-svn: 259814
2016-02-04 19:14:10 +00:00
Xinliang David Li 402477d2ba [PGO] Add interfaces to annotate instr with VP data
Add interfaces to do value profile data IR annotation
and read. Needed by both FE and IR based PGO.

llvm-svn: 259813
2016-02-04 19:11:43 +00:00
Chad Rosier 05f8020cdf [AArch64] Improve load/store optimizer to handle LDUR + LDR (take 3).
This patch allows the mixing of scaled and unscaled load/stores to form
load/store pairs.

PR24465
http://reviews.llvm.org/D12116
Many thanks to Ahmed and Michael for fixes and code review.

This is a reapplication of r246769 and r259790.  The tramp3d failure was caused
by an incorrect refactoring in the patch.  Specifically, we weren't always
properly clearing the SExtIdx flag.

llvm-svn: 259812
2016-02-04 18:59:49 +00:00
Sanjoy Das 76c48e0e70 [SCEV] Add boolean accessors for NSW, NUW and NW; NFC
llvm-svn: 259809
2016-02-04 18:21:54 +00:00
David Majnemer a4859dfa46 Correctly handle {Always,Never}StepIntoLine
llvm-svn: 259806
2016-02-04 17:57:12 +00:00
David Majnemer 4d123512a2 Add support for S_DEFRANGE and S_DEFRANGE_SUBFIELD
llvm-svn: 259805
2016-02-04 17:37:30 +00:00
David Majnemer 6f01e05d7e Make the dumper's output for variable ranges easier to read
llvm-svn: 259804
2016-02-04 17:29:13 +00:00
Sanjay Patel 5f4cc6fbd9 use 'auto' for iterators; NFCI
llvm-svn: 259802
2016-02-04 17:00:35 +00:00
Silviu Baranga 33b3bd17dd [AArch64] Multiply extended 32-bit ints with `[U|S]MADDL'
During instruction selection, the AArch64 backend can recognise the
following pattern and generate an [U|S]MADDL instruction, i.e. a
multiply of two 32-bit operands with a 64-bit result:

(mul (sext i32), (sext i32))

However, when one of the operands is constant, the sign extension
gets folded into the constant in SelectionDAG::getNode(). This means
that the instruction selection sees this:

(mul (sext i32), i64)

...which doesn't match the pattern. Sign-extension and 64-bit
multiply instructions are generated, which are slower than one 32-bit
multiply.

Add a pattern to match this and generate the correct instruction, for
both signed and unsigned multiplies.

Patch by Chris Diamand!

llvm-svn: 259800
2016-02-04 16:47:09 +00:00
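Note on r259800: at the source level, the two DAG shapes discussed in that message typically arise from code like the following. The function names are hypothetical, and whether a particular compiler version actually selects a widening multiply for either function depends on the target and optimization flags, so treat this as an illustration of the input shapes only.

```cpp
#include <cassert>
#include <cstdint>

// Both operands are sign-extended 32-bit values:
// roughly (mul (sext i32), (sext i32)).
int64_t mulWidened(int32_t A, int32_t B) {
  return static_cast<int64_t>(A) * static_cast<int64_t>(B);
}

// One operand is a constant. After folding, the multiply sees a
// sign-extended value and a plain 64-bit constant:
// roughly (mul (sext i32), i64).
int64_t mulByConstant(int32_t A) {
  return static_cast<int64_t>(A) * 1000;
}
```

In both functions the result needs 64 bits even though every input fits in 32, which is the situation a 32x32->64 multiply is designed for.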
Benjamin Kramer e4dff62f64 The canonical way to XFAIL a test for all targets is XFAIL: *, not XFAIL:
Fix the lit bug that enabled this "feature" (empty triple is substring
of all possible target triples) and change the two outliers to use the
documented * syntax.

llvm-svn: 259799
2016-02-04 16:21:38 +00:00
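Note on r259799: the "feature" came from substring matching, where an empty pattern is found in every string. lit itself is written in Python; the sketch below only reproduces the matching semantics in C++, and `xfailMatches` is a hypothetical name rather than anything in lit.

```cpp
#include <cassert>
#include <string>

// Substring match of an XFAIL pattern against a target triple.
// An empty Pattern is found at position 0 of any Triple, so it
// matches every target.
bool xfailMatches(const std::string &Pattern, const std::string &Triple) {
  return Triple.find(Pattern) != std::string::npos;
}
```

A stricter matcher would reject the empty pattern explicitly and require `*` to mean "all targets", which is the documented spelling.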
Nemanja Ivanovic e8cbae32e9 Enable the %s modifier in inline asm template string
This patch corresponds to review:
http://reviews.llvm.org/D16847

There are some files in glibc that use the output operand modifier even though
it was deprecated in GCC. This patch just adds support for it to prevent issues
with such files.

llvm-svn: 259798
2016-02-04 16:18:08 +00:00
Renato Golin c455e2f441 [PPC] Move PPC test to a PPC-specific dir
llvm-svn: 259797
2016-02-04 16:14:59 +00:00
Simon Pilgrim 6788f33cf2 [X86][SSE] Add general 32-bit LOAD + VZEXT_MOVL support to EltsFromConsecutiveLoads
This patch adds support for consecutive (load/undef elements) 32-bit loads, followed by trailing undef/zero elements to be combined to a single MOVD load.

Differential Revision: http://reviews.llvm.org/D16729

llvm-svn: 259796
2016-02-04 16:12:56 +00:00
Chad Rosier 18896c0f5e Revert "[AArch64] Improve load/store optimizer to handle LDUR + LDR."
This reverts commit r259790. tramp3d-v4 is still having problems.

llvm-svn: 259795
2016-02-04 16:01:40 +00:00
Simon Pilgrim 528e94e9a2 [X86][SSE] Added i686 target tests to make sure we are correctly loading consecutive entries as 64-bit integers
llvm-svn: 259794
2016-02-04 15:51:55 +00:00
Elena Demikhovsky 86528270b9 AVX-512: Fixed a bug in FMA instruction selection on KNL
The FMA instruction was selected from AVX2 set instead of AVX-512

Differential Revision: http://reviews.llvm.org/D16884

llvm-svn: 259792
2016-02-04 15:11:11 +00:00
Petar Jovanovic 23e44f5e39 [Power PC] softening long double type
This patch implements softening of long double type (ppcf128) on ppc32
architecture and enables operations for this type for soft float.

Patch by Strahinja Petrovic.

Differential Revision: http://reviews.llvm.org/D15811

llvm-svn: 259791
2016-02-04 14:43:50 +00:00
Chad Rosier feec2aeb0f [AArch64] Improve load/store optimizer to handle LDUR + LDR.
This patch allows the mixing of scaled and unscaled load/stores to form
load/store pairs.

PR24465
http://reviews.llvm.org/D12116
Many thanks to Ahmed and Michael for fixes and code review.

This is a reapplication of r246769, which was reverted in r246782 due to a
test-suite failure.  I'm unable to reproduce the issue at this time.

llvm-svn: 259790
2016-02-04 14:42:55 +00:00
Michael Zuckerman 7d73360479 [AVX512] add vfmadd132ss and vfmadd132sd Intrinsic
Differential Revision: http://reviews.llvm.org/D16589

llvm-svn: 259789
2016-02-04 14:41:08 +00:00
Simon Pilgrim 8159cf11bc [X86] Add AVX512 vector zext tests
llvm-svn: 259786
2016-02-04 14:06:19 +00:00
Jonas Paulsson 2293685731 [ScheduleDagInstrs] Improved comments
llvm-svn: 259783
2016-02-04 13:08:48 +00:00
Simon Pilgrim 1d2d6c5a57 [X86] Moved SEXT -> SIGN_EXTEND_VECTOR_INREG combine into helper. NFC.
llvm-svn: 259771
2016-02-04 09:27:19 +00:00
Andrey Turetskiy bca0f99224 [X86] Use hash table in LEA optimization pass.
Use hash table (key is a memory operand) to store found LEA instructions to reduce compile time.

Differential Revision: http://reviews.llvm.org/D16404

llvm-svn: 259770
2016-02-04 08:57:03 +00:00
Justin Bogner e12385bd6a cmake: Add a flag to enable LTO
This adds -DLLVM_ENABLE_LTO, rather than forcing people to manually
add -flto to the various _FLAGS variables.

llvm-svn: 259766
2016-02-04 07:28:30 +00:00
Craig Topper 775fb73de7 [Support] Use range-based for loop. NFC
llvm-svn: 259763
2016-02-04 06:51:41 +00:00
Craig Topper d08f32f66a [Support] Use hexdigit instead of manually coding the same thing. NFC
llvm-svn: 259762
2016-02-04 06:51:38 +00:00
Xinliang David Li 1e4c809c6c [PGO] Profile interface cleanup
- Remove unused valuemapper parameter
- Add totalcount optional parameter

llvm-svn: 259756
2016-02-04 05:29:51 +00:00
Jingyue Wu f650441b04 [NVPTX] Disable performance optimizations when OptLevel==None
Reviewers: jholewinski, tra, eliben

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D16874

llvm-svn: 259749
2016-02-04 04:15:36 +00:00
Nemanja Ivanovic 155402c9c2 Test case for PR 26381
llvm-svn: 259740
2016-02-04 01:58:20 +00:00
Wei Mi a49559befb [SCEV] Try to reuse existing value during SCEV expansion
Current SCEV expansion will expand SCEV as a sequence of operations
and doesn't utilize values that already exist. This will introduce
redundant computation which may not be cleaned up thoroughly by
following optimizations.

This patch introduces an ExprValueMap which is a map from SCEV to the
set of equal values with the same SCEV. When a SCEV is expanded, the
set of values is checked and reused whenever possible before generating
a sequence of operations.

The original commit triggered regressions in Polly tests. The regressions
exposed two problems which have been fixed in current version.

1. Polly will generate a new function based on the old one. To generate an
instruction for the new function, it builds SCEV for the old instruction,
applies some transformation on the SCEV generated, then expands the transformed
SCEV and inserts the expanded value into the new function. Because SCEV expansion
may reuse a value cached in ExprValueMap, the value in the old function may be
inserted into the new function, which is wrong.
   In SCEVExpander::expand, there is logic to check that the cached value to
be used dominates the insertion point. However, for the above
case, the check always passes. That is because the insertion point is
in a new function, which is unreachable from the old function, and
DominatorTreeBase::dominates treats an unreachable node as
dominated by any other node.
   The fix is to simply add a check that the cached value to be used in
expansion should be in the same function as the insertion point instruction.

2. When the SCEV is of scConstant type, expanding it directly is cheaper than
reusing a normal cached value. Although the cached value set in ExprValueMap
contains a Constant type value, it is not easy to find -- the cached
Value set is not sorted according to the potential cost. Existing reuse logic
in SCEVExpander::expand simply chooses the first legal element from the cached
value set.
   The fix is that when the SCEV is of scConstant type, don't try the reuse
logic; simply expand it.

Differential Revision: http://reviews.llvm.org/D12090

llvm-svn: 259736
2016-02-04 01:27:38 +00:00
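Note on r259736: the two fixes listed in that message amount to two extra guards on the reuse path. The toy model below shows only those guards. `CachedValue`, `FunctionId`, and `pickReusable` are hypothetical and do not correspond to SCEVExpander's real interface, and the dominance check the real expander performs is deliberately left out.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an IR value; not an LLVM type.
struct CachedValue {
  int FunctionId; // identifies the function that owns the value
};

// Pick the first cached value that may be reused at the insertion point,
// or nullptr if the expression should be expanded from scratch.
const CachedValue *pickReusable(bool ScevIsConstant,
                                const std::vector<CachedValue> &Cached,
                                int InsertFunctionId) {
  // Fix 2: constants are cheaper to expand directly than to reuse.
  if (ScevIsConstant)
    return nullptr;
  for (const CachedValue &V : Cached) {
    // Fix 1: never reuse a value that lives in a different function.
    if (V.FunctionId == InsertFunctionId)
      return &V;
  }
  return nullptr;
}
```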
Richard Smith 69cb000974 Fix undefined behavior when compiling in C++14 mode (with sized deletion
enabled): ensure that we do not invoke the sized deallocator for MemoryBuffer
subclasses that have tail-allocated data.

llvm-svn: 259735
2016-02-04 01:21:16 +00:00
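Note on r259735: with sized deallocation enabled, a plain `delete` on an object with tail-allocated data may pass `sizeof` of the class to the deallocator even though more memory was allocated. One common way to avoid that is a class-specific unsized `operator delete`. The sketch below assumes that approach; `NamedBuffer` is a hypothetical class, not MemoryBuffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Hypothetical class that stores a string directly after the object.
class NamedBuffer {
  std::size_t Len;
  explicit NamedBuffer(std::size_t L) : Len(L) {}

public:
  static NamedBuffer *create(const char *Name) {
    std::size_t L = std::strlen(Name);
    // Allocate the object plus the trailing characters in one block.
    void *Mem = ::operator new(sizeof(NamedBuffer) + L + 1);
    NamedBuffer *B = new (Mem) NamedBuffer(L);
    std::memcpy(reinterpret_cast<char *>(B + 1), Name, L + 1);
    return B;
  }

  const char *name() const { return reinterpret_cast<const char *>(this + 1); }
  std::size_t size() const { return Len; }

  // Declaring only the one-argument form makes `delete` select it, so the
  // global sized deallocator is never called with sizeof(NamedBuffer).
  void operator delete(void *P) { ::operator delete(P); }
};
```

Usage is the ordinary `NamedBuffer *B = NamedBuffer::create("abc"); ... delete B;`.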
Reid Kleckner cb91e7d395 [codeview] Don't attempt a cross-section label diff
This only comes up when we're trying to find the next .cv_loc label.

Fixes PR26467

llvm-svn: 259733
2016-02-04 00:21:42 +00:00
Kostya Serebryany ce925c580e [libFuzzer] hot fix a test
llvm-svn: 259732
2016-02-04 00:12:28 +00:00
Kostya Serebryany b92602ada0 [libFuzzer] don't write the test unit when a leak is detected (since we don't know which unit causes the leak)
llvm-svn: 259731
2016-02-04 00:02:17 +00:00
Gerolf Hoflehner 2432bd0ddd [SimplifyCFG] Fix for "endless" loop after dead code removal (Alternative to
D16251)

Summary:
This is a simpler fix to the problem than the dominator approach in
http://reviews.llvm.org/D16251. It adds only values into the gather() while loop
that have not been seen before.

The actual endless loop is in the constant compare gather() routine in
Utils/SimplifyCFG.cpp. The same value ret.0.off0.i is pushed back into the
queue:
%.ret.0.off0.i = or i1 %.ret.0.off0.i, %cmp10.i

Here is what happens at the IR level:

for.cond.i:                                       ; preds = %if.end6.i, %if.end.i54
%ix.0.i = phi i32 [ 0, %if.end.i54 ], [ %inc.i55, %if.end6.i ]
%ret.0.off0.i = phi i1 [false, %if.end.i54], [%.ret.0.off0.i, %if.end6.i] <<<
%cmp2.i = icmp ult i32 %ix.0.i, %11
br i1 %cmp2.i, label %for.body.i, label %LBJ_TmpSimpleNeedExt.exit

if.end6.i:                                        ; preds = %for.body.i
%cmp10.i = icmp ugt i32 %conv.i, %add9.i
%.ret.0.off0.i = or i1 %ret.0.off0.i, %cmp10.i <<<

When if.end.i54 gets eliminated, the definition of ret.0.off0.i is removed.
The result is the expression %.ret.0.off0.i = or i1 %.ret.0.off0.i, %cmp10.i
(Note the first ‘or’ operand is now %.ret.0.off0.i, and *NOT* %ret.0.off0.i).
Now there is a use of .ret.0.off0.i before a definition, which triggers the
“endless” loop in gather():

while(!DFT.empty()) {

    V = DFT.pop_back_val();   // V is .ret.0.off0.i

    if (Instruction *I = dyn_cast<Instruction>(V)) {
      // If it is a || (or && depending on isEQ), process the operands.
      if (I->getOpcode() == (isEQ ? Instruction::Or : Instruction::And)) {
        DFT.push_back(I->getOperand(1));  // This is now .ret.0.off0.i also
        DFT.push_back(I->getOperand(0));

        continue; // “endless loop” for .ret.0.off0.i
      }

Reviewers: reames, ahatanak

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16839

llvm-svn: 259730
2016-02-03 23:54:25 +00:00
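Note on r259730: a worklist that pushes operands unconditionally loops forever on a self-referencing value, while one that tracks visited values terminates. The sketch below models that with a toy expression type. `Expr` and `gatherLeaves` are hypothetical and much simpler than the gather() routine in SimplifyCFG.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy expression node: either an 'or' of two operands or a leaf.
struct Expr {
  bool IsOr = false;
  Expr *Op0 = nullptr;
  Expr *Op1 = nullptr;
};

// Collect the leaves reachable from Root through 'or' nodes.
std::vector<Expr *> gatherLeaves(Expr *Root) {
  std::vector<Expr *> Leaves;
  std::vector<Expr *> DFT{Root};
  std::set<Expr *> Visited{Root};
  while (!DFT.empty()) {
    Expr *V = DFT.back();
    DFT.pop_back();
    if (V->IsOr) {
      // Push an operand only the first time it is seen, so a value that
      // uses itself cannot be queued again and again.
      if (V->Op1 && Visited.insert(V->Op1).second)
        DFT.push_back(V->Op1);
      if (V->Op0 && Visited.insert(V->Op0).second)
        DFT.push_back(V->Op0);
      continue;
    }
    Leaves.push_back(V);
  }
  return Leaves;
}
```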
Vedant Kumar 2d5b5d3d3a [InstrProfiling] Fix a comment (NFC)
llvm-svn: 259727
2016-02-03 23:22:43 +00:00
David L Kreitzer f24d409dce Unify the target opcode enum in TargetOpcodes.h and the FixedInstrs array in
CodeGenTarget.cpp to avoid the ordering dependence. NFCI.

Differential Revision: http://reviews.llvm.org/D16826

llvm-svn: 259726
2016-02-03 23:17:32 +00:00
Junmo Park e90057a5f3 Minor code cleanups. NFC.
llvm-svn: 259725
2016-02-03 23:16:39 +00:00
David Majnemer d74490f2cc Print the OffsetStart field's relocation
llvm-svn: 259723
2016-02-03 22:45:21 +00:00
Sanjay Patel e9fa3363b4 rangify; NFCI
llvm-svn: 259722
2016-02-03 22:44:14 +00:00
Sanjay Patel 460ce9cd9b clean up; NFC
llvm-svn: 259720
2016-02-03 22:37:37 +00:00
David Majnemer ac10cfde97 [llvm-readobj] Add support for dumping S_DEFRANGE symbols
llvm-svn: 259719
2016-02-03 22:36:46 +00:00
Reid Kleckner 0d4ecb6ff5 Replace static const int with enum to fix obnoxious linker errors about a missing definition
llvm-svn: 259712
2016-02-03 21:45:39 +00:00
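Note on r259712: before C++17, a `static const int` member initialized in the class still needs an out-of-class definition once it is odr-used, for example by binding it to a `const int &`. An enumerator has no storage, so the problem cannot arise. The sketch below uses hypothetical names (`Limits`, `MaxDepth`, `readThroughRef`) to show the enum form.

```cpp
#include <cassert>

struct Limits {
  // With `static const int MaxDepth = 8;` and no out-of-class definition,
  // the reference binding in maxDepth() could fail to link before C++17.
  enum { MaxDepth = 8 };
};

// Takes its argument by reference, which would odr-use a static member.
int readThroughRef(const int &V) { return V; }

int maxDepth() {
  // The enumerator converts to a temporary int; no symbol is required.
  return readThroughRef(Limits::MaxDepth);
}
```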
Reid Kleckner 17495274fd [unittests] Move TargetRegistry test from Support to MC
This removes the dependency from SupportTests to all of the LLVM
backends, and makes it link faster.

llvm-svn: 259705
2016-02-03 21:41:24 +00:00
Reid Kleckner c2e2311627 Silence -Wsign-conversion issue in ProgramTest.cpp
Unfortunately, ProgramInfo::ProcessId is signed on Unix and unsigned on
Windows, breaking the standard fix of using '0U' in the gtest
expectation.

llvm-svn: 259704
2016-02-03 21:41:12 +00:00
Ana Pazos b3596028cf Fix pointers to go on the right hand side. NFC.
Summary:
Fixed pointers to go on the right hand side following coding guidelines. NFC.

Patch by Mandeep Singh Grang.

Reviewers: majnemer, arsenm, sanjoy

Differential Revision: http://reviews.llvm.org/D16866

llvm-svn: 259703
2016-02-03 21:34:39 +00:00
David Majnemer a53b5bbb18 [LoopStrengthReduce] Don't rewrite PHIs with incoming values from CatchSwitches
Bail out if we have a PHI on an EHPad that gets a value from a
CatchSwitchInst.  Because the CatchSwitchInst cannot be split, there is
no good place to stick any instructions.

This fixes PR26373.

llvm-svn: 259702
2016-02-03 21:30:34 +00:00
David Majnemer fa8681e452 [ScalarEvolutionExpander] Simplify findInsertPointAfter
No functional change is intended.  The loop could only execute, at most,
once.

llvm-svn: 259701
2016-02-03 21:30:31 +00:00
Reid Kleckner eb3bcdd28b [codeview] Remove EmitLabelDiff in favor emitAbsoluteSymbolDiff
llvm-svn: 259700
2016-02-03 21:24:42 +00:00
Reid Kleckner dac21b43d5 [codeview] Use the MCStreamer interface directly instead of AsmPrinter
This is mostly about having shorter lines and standardizing on one
interface, but it also avoids some needless indirection.

No functional change.

llvm-svn: 259697
2016-02-03 21:15:48 +00:00
Keno Fischer 6c1e47a66b [DWARFDebug] Fix another case of overlapping ranges
Summary:
In r257979, I added code to ensure that we wouldn't merge DebugLocEntries if
the pieces they describe overlap. Unfortunately, I failed to cover the case
where there may be multiple active Expressions in the entry, in which case we
need to make sure that no two values overlap before we can perform the merge.

This fixed PR26148.

Reviewers: aprantl
Differential Revision: http://reviews.llvm.org/D16742

llvm-svn: 259696
2016-02-03 21:13:33 +00:00
Todd Fiala 675bdcedff Address NDEBUG-related linkage issues for Value::assertModuleIsMaterialized()
The IR/Value class had a linkage issue present when LLVM was built
as a library, and the LLVM library build time had different settings
for NDEBUG than the client of the LLVM library.  Clients could get
into a state where the LLVM lib expected
Value::assertModuleIsMaterialized() to be inline-defined in the header
but clients expected that method to be defined in the LLVM library.

See this llvm-commits thread for more details:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160201/329667.html

llvm-svn: 259695
2016-02-03 21:13:23 +00:00
Tim Shen f99f0d5a7e [SelectionDAG] Fix CombineToPreIndexedLoadStore O(n^2) behavior
This patch consists of two parts: a performance fix in DAGCombiner.cpp
and a correctness fix in SelectionDAG.cpp.

The test case tests the bug that's uncovered by the performance fix, and
fixed by the correctness fix.

The performance fix keeps the containers required by the
hasPredecessorHelper (which is a lazy DFS) and reuses them. Since
hasPredecessorHelper is called in a loop, the overall complexity is reduced
from O(n^2) to O(n), where n is the number of SDNodes.

The correctness fix keeps iterating the neighbor list even if it's time
to early return. It will return after finishing adding all neighbors to
Worklist, so that no neighbors are discarded due to the original early
return.

llvm-svn: 259691
2016-02-03 20:58:55 +00:00
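Note on r259691: both halves of that change can be seen in a small model of a lazy DFS whose `Visited` set and `Worklist` are owned by the caller and survive across queries. `Node` is a hypothetical stand-in for SDNode, and this function is a simplified sketch, not the SelectionDAG implementation.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

struct Node {
  std::vector<const Node *> Operands;
};

// Returns true if Target is reachable through operands from the nodes the
// caller seeded into Worklist. Visited and Worklist persist between calls,
// so each node is expanded at most once over a whole series of queries.
bool hasPredecessorHelper(const Node *Target,
                          std::unordered_set<const Node *> &Visited,
                          std::vector<const Node *> &Worklist) {
  if (Visited.count(Target))
    return true; // already discovered by an earlier query
  while (!Worklist.empty()) {
    const Node *M = Worklist.back();
    Worklist.pop_back();
    bool Found = false;
    for (const Node *Op : M->Operands) {
      if (Visited.insert(Op).second)
        Worklist.push_back(Op);
      if (Op == Target)
        Found = true; // remember the hit, but finish the operand list
    }
    if (Found)
      return true;
  }
  return false;
}
```

Returning from inside the operand loop would leave the remaining operands of `M` neither visited nor queued, so a later query sharing the same containers could miss them.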
Reid Kleckner 45b6159ed3 Minor performance tweaks to llvm-tblgen (and a few that might be a good idea)
Summary:
This patch adds a reserve call to an expensive function
(`llvm::LoadIntrinsics`), and may fix a few other low hanging
performance fruit (I've put them in comments for now, so we can
discuss).

**Motivation:**

As I'm sure other developers do, when I build LLVM, I build the entire
project with the same config (`Debug`, `MinSizeRel`, `Release`, or
`RelWithDebInfo`). However, the `Debug` config also builds llvm-tblgen
in `Debug` mode. Later build steps that run llvm-tblgen then can
actually be the slowest steps in the entire build. Nobody likes slow
builds.

Reviewers: rnk, dblaikie

Differential Revision: http://reviews.llvm.org/D16832

Patch by Alexander G. Riccio

llvm-svn: 259683
2016-02-03 19:34:28 +00:00
Saleem Abdulrasool f36005a358 ARM: support TLS for WoA
Add support for TLS access for Windows on ARM.  This generates a similar access
to MSVC for ARM.

The changes to the tablegen data are needed to support loading an external symbol
global that is not for a call.  The adjustments to the DAG to DAG transforms are
needed to preserve the 32-bit move.

llvm-svn: 259676
2016-02-03 18:21:59 +00:00
Wei Mi 97de385868 Revert r259662, which caused regressions on polly tests.
llvm-svn: 259675
2016-02-03 18:05:57 +00:00
Quentin Colombet 7ec03dc7f8 [InstCombine] Revert r238452: Fold IntToPtr and PtrToInt into preceding loads.
According to git bisect, this is the root cause of a miscompile for Regex in
libLLVMSupport. I am still working on reducing a test case.
The actual bug may be elsewhere and this commit just exposed it.

Anyway, at the moment, to reproduce, follow these steps:
1. Build clang and libLTO in release mode.
2. Create a new build directory <stage2> and cd into it.
3. Use clang and libLTO from #1 to build llvm-extract in Release mode + asserts
   using -O2 -flto
4. Run llvm-extract  -ralias '.*bar' -S test/Other/extract-alias.ll

Result:
program doesn't contain global named '.*bar'!

Expected result:
@a0a0bar = alias void ()* @bar
@a0bar = alias void ()* @bar

declare void @bar()

Note: In step #3, if you don't use lto or asserts, the miscompile disappears.
llvm-svn: 259674
2016-02-03 18:04:13 +00:00
Jonas Paulsson ac29f01788 [ScheduleDAGInstrs::buildSchedGraph()] Handling of memory dependencies rewritten.
Recommitted, after some fixing with test cases.

Updated test cases:
test/CodeGen/AArch64/arm64-misched-memdep-bug.ll
test/CodeGen/AArch64/tailcall_misched_graph.ll

Temporarily disabled test cases:
test/CodeGen/AMDGPU/split-vector-memoperand-offsets.ll
test/CodeGen/PowerPC/ppc64-fastcc.ll (partially updated)
test/CodeGen/PowerPC/vsx-fma-m.ll
test/CodeGen/PowerPC/vsx-fma-sp.ll

http://reviews.llvm.org/D8705
Reviewers: Hal Finkel, Andy Trick.

llvm-svn: 259673
2016-02-03 17:52:29 +00:00
Xinliang David Li ae2556f7ee Fix comments /NFC
llvm-svn: 259672
2016-02-03 17:51:16 +00:00
Joseph Tremoulet e1014a398e [Unittest] Clean up formatting, NFC
Summary:
Use an early return to reduce indentation.
Remove unused local.

Reviewers: dblaikie, lhames

Subscribers: lhames, llvm-commits

Differential Revision: http://reviews.llvm.org/D16513

llvm-svn: 259663
2016-02-03 17:11:24 +00:00
Wei Mi ed133978a0 [SCEV] Try to reuse existing value during SCEV expansion
Current SCEV expansion will expand SCEV as a sequence of operations
and doesn't utilize values that already exist. This will introduce
redundant computation which may not be cleaned up thoroughly by
following optimizations.

This patch introduces an ExprValueMap which is a map from SCEV to the
set of equal values with the same SCEV. When a SCEV is expanded, the
set of values is checked and reused whenever possible before generating
a sequence of operations.

Differential Revision: http://reviews.llvm.org/D12090

llvm-svn: 259662
2016-02-03 17:05:12 +00:00
Renato Golin 6027dd38ef [ARM] Move GNUEABI divmod to __aeabi_divmod*
The GNU toolchain emits __aeabi_divmod for soft-divide on ARM cores
which happens to be a lot faster than __divsi3/__modsi3 when the core
has hardware divide instructions. Do the same here.

Fixes PR26450.

llvm-svn: 259657
2016-02-03 16:10:54 +00:00
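Note on r259657: the appeal of a combined divmod routine is that one call yields both quotient and remainder. The sketch below uses the standard `std::div` as a portable stand-in for that idea. It does not call the ARM runtime routine, and `QuotRem` and `divMod` are hypothetical names.

```cpp
#include <cassert>
#include <cstdlib>

struct QuotRem {
  int Quot;
  int Rem;
};

// Compute quotient and remainder together instead of issuing a separate
// division and modulo.
QuotRem divMod(int Num, int Den) {
  assert(Den != 0 && "division by zero");
  std::div_t R = std::div(Num, Den);
  return {R.quot, R.rem};
}
```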
Jun Bum Lim 59df5e89c2 [MachineCopyPropagation] Fix comment. NFC
Reviewers: MatzeB, qcolombet, jmolloy, mcrosier

Subscribers: llvm-commits, mcrosier

Differential Revision: http://reviews.llvm.org/D16806

llvm-svn: 259656
2016-02-03 15:56:27 +00:00
Daniel Sanders 3b1a2dbffa [mips] Remove redundant inclusions of MipsAnalyzeImmediate.h
llvm-svn: 259655
2016-02-03 15:54:12 +00:00
James Molloy 6e518a3b50 [DemandedBits] Revert r249687 due to PR26071
This regresses a test in LoopVectorize, so I'll need to go away and think about how to solve this in a way that isn't broken.

From the writeup in PR26071:

What's happening is that ComputeKnownZeroes is telling us that all bits except the LSB are zero. We're then deciding that only the LSB needs to be demanded from the icmp's inputs.

This is where we're wrong - we're assuming that after simplification the bits that were known zero will continue to be known zero. But they're not - during trivialization the upper bits get changed (because an XOR isn't shrunk), so the icmp fails.

The fault is in demandedbits - its contract does clearly state that a non-demanded bit may either be zero or one.

llvm-svn: 259649
2016-02-03 15:05:06 +00:00
Nemanja Ivanovic 82e1168989 Fix for PR 26381
Simple fix: constant values were not being sign-extended in FastISel.
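
To see why the kind of extension matters, consider widening an 8-bit constant whose bit pattern is 0xFF (that is, -1 when read as signed). The following is a minimal sketch in plain C++, not the FastISel code; the helper names are hypothetical.

```cpp
#include <cstdint>

// Zero extension fills the new high bits with zeros.
constexpr std::uint64_t zeroExtend8(std::uint8_t bits) {
  return static_cast<std::uint64_t>(bits);
}

// Sign extension copies the sign bit (bit 7) into the new high bits.
constexpr std::uint64_t signExtend8(std::uint8_t bits) {
  return (bits & 0x80u) ? (0xFFFFFFFFFFFFFF00ull | bits)
                        : static_cast<std::uint64_t>(bits);
}

// 0xFF zero-extends to 255 but sign-extends to all ones (-1 as signed).
static_assert(zeroExtend8(0xFF) == 0xFFull, "zero extension keeps 255");
static_assert(signExtend8(0xFF) == 0xFFFFFFFFFFFFFFFFull,
              "sign extension yields all ones");
```

A signed constant that is widened with the wrong helper silently changes value, which is the class of bug the commit message describes.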

llvm-svn: 259645
2016-02-03 12:53:38 +00:00
Simon Atanasyan e774126c96 [mips] Add SHF_MIPS_GPREL flag to the MIPS .sbss and .sdata sections
MIPS ABI states that .sbss and .sdata sections must have SHF_MIPS_GPREL
flag. See Figure 4–7 on page 69 in the following document:
ftp://www.linux-mips.org/pub/linux/mips/doc/ABI/mipsabi.pdf.

Differential Revision: http://reviews.llvm.org/D15740

llvm-svn: 259641
2016-02-03 11:50:22 +00:00
Dylan McKay bff960a926 [TableGen] Add 'register alternative name matching' support
Summary:
This adds a new attribute that targets can set in TableGen; it causes a function to be generated that matches register alternative names. This is very similar to `ShouldEmitMatchRegisterName`, except it works on alt names.

This patch is currently used by the out of tree part of the AVR backend. It reduces code duplication greatly, and has the effect that you do not need to hardcode altname to register mappings in C++.

It will not work on targets which have registers which share the same aliases.

Reviewers: stoklund, arsenm, dsanders, hfinkel, vkalintiris

Subscribers: hfinkel, dylanmckay, llvm-commits

Differential Revision: http://reviews.llvm.org/D16312

llvm-svn: 259636
2016-02-03 10:30:16 +00:00
Simon Pilgrim 18bcf93efb [X86][AVX] Add support for 64-bit VZEXT_LOAD of 256/512-bit vectors to EltsFromConsecutiveLoads
Follow up to D16217 and D16729

This change uncovered an odd pattern where VZEXT_LOAD v4i64 was being lowered to a load of the lower v2i64 (so the 2nd i64 destination element wasn't being zeroed). I can't find any use/reason for this, so I have removed the pattern and replaced it so that only the 1st i64 element is loaded and the upper bits are all zeroed. This matches the description for X86ISD::VZEXT_LOAD.

Differential Revision: http://reviews.llvm.org/D16768

llvm-svn: 259635
2016-02-03 09:41:59 +00:00
Xinliang David Li 876ed52c8a Add a compatibility test
llvm-svn: 259632
2016-02-03 06:27:38 +00:00
Xinliang David Li 3c88288927 Fix a typo in comment
llvm-svn: 259631
2016-02-03 06:24:11 +00:00
Xinliang David Li a398d2d94a Fix uninitialized variable use problem
llvm-svn: 259630
2016-02-03 06:23:16 +00:00
Xinliang David Li 6c93ee8d36 [PGO] Profile summary reader/writer support
With this patch, the profile summary data will be available in the indexed
profile data file so that the profile reader/compiler optimizer can start
to make use of it.

Differential Revision: http://reviews.llvm.org/D16258

llvm-svn: 259626
2016-02-03 04:08:18 +00:00
Peter Collingbourne 0c0d7e2d0f LowerBitSets: Don't bother to do any work if the llvm.bitset.test intrinsic is unused.
llvm-svn: 259625
2016-02-03 03:48:46 +00:00
Peter Collingbourne 83cc981c49 Add #include "llvm/Support/raw_ostream.h" to fix Windows build.
llvm-svn: 259623
2016-02-03 03:16:37 +00:00
Peter Collingbourne 9f7ec14009 Transforms: Move GlobalOpt's Evaluator to Utils where it can be reused.
llvm-svn: 259621
2016-02-03 02:51:00 +00:00
Nick Lewycky a093ab4ad6 Fix typo in comment. NFC
llvm-svn: 259620
2016-02-03 02:15:49 +00:00
Peter Collingbourne 4e3605a2af docs: Document how bitsets may be used to encode type information.
llvm-svn: 259619
2016-02-03 02:01:08 +00:00
Kyle Butt d62d8b771d Codegen: [PPC] Fix PPCVSXFMAMutate to handle duplicates.
The purpose of PPCVSXFMAMutate is to elide copies by changing FMA forms
on PPC.

    %vreg6<def> = COPY %vreg96
    %vreg6<def,tied1> = XSMADDASP %vreg6<tied0>, %vreg5<kill>, %vreg7
    ;v6 = v6 + v5 * v7

is replaced by

    %vreg5<def,tied1> = XSMADDMSP %vreg5<tied0>, %vreg7, %vreg96
    ;v5 = v5 * v7 + v96

This was broken in the case where the target register was also used as a
multiplicand. Fix this case by checking for it and replacing both uses
with the copied register.

    %vreg6<def> = COPY %vreg96
    %vreg6<def,tied1> = XSMADDASP %vreg6<tied0>, %vreg5<kill>, %vreg6
    ;v6 = v6 + v5 * v6

is replaced by

    %vreg5<def,tied1> = XSMADDMSP %vreg5<tied0>, %vreg96, %vreg96
    ;v5 = v5 * v96 + v96

llvm-svn: 259617
2016-02-03 01:41:09 +00:00
Yunzhong Gao eb959722a7 Revert r259576: Disable the vzeroupper insertion pass on PS4.
Will re-implement based on review feedback.

llvm-svn: 259615
2016-02-03 01:25:12 +00:00
Marcello Maggioni bfe87568aa RegCoalescer: Making sure re-materialization defines all subranges
The register coalescer can rematerialize constants that define
more of a register than the copy being replaced would have
defined.
This is valid when the register was undef before the
copy happened.
This patch makes sure that all the subranges defined by the new
rematerialization instructions have at least a dead def.

Review: http://reviews.llvm.org/D16693
llvm-svn: 259614
2016-02-03 00:22:32 +00:00
NAKAMURA Takumi a8d480d9d5 DiagnosticInfoWithDebugLocBase: Appease Twine for now.
FIXME: We should get rid of Twine in the record.
llvm-svn: 259612
2016-02-03 00:09:22 +00:00
Adam Nemet d52ed84160 [LoopVersioning] Expose loop versioning as a pass too
Summary:
LoopVersioning is a transform utility that transform passes can use to
run-time disambiguate may-aliasing accesses. I'd like to also expose it as
a pass to allow it to be unit-tested.

I am planning to add support for non-aliasing annotation in
LoopVersioning and I'd like to be able to write tests directly using
this pass.

(After that feature is done, the pass could also be used to look for
optimization opportunities that are hidden behind incomplete alias
information at compile time.)

The pass drives LoopVersioning in its default way which is to fully
disambiguate may-aliasing accesses no matter how many checks are
required.

Reviewers: hfinkel, ashutosh.nema, sbaranga

Subscribers: zzheng, mssimpso, llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D16612

llvm-svn: 259610
2016-02-03 00:06:10 +00:00
George Burgess IV 60adac46f2 Attempt #2 to unbreak r259595.
llvm-svn: 259602
2016-02-02 23:26:01 +00:00
David Majnemer 30579ec851 [codeview] Improve readability of codeview assembly output
Strictly speaking, this is not an improvement in functionality per se
but a usability improvement for those debugging codeview.

llvm-svn: 259601
2016-02-02 23:18:23 +00:00
Kostya Serebryany d88d1305c4 [libFuzzer] don't create too many trace-based mutations as it may be too slow
llvm-svn: 259600
2016-02-02 23:17:45 +00:00
George Burgess IV b5a229f779 Attempt to fix builds broken by r259595.
llvm-svn: 259599
2016-02-02 23:15:26 +00:00
George Burgess IV e1100f533f This patch adds MemorySSA to LLVM.
Please see include/llvm/Transforms/Utils/MemorySSA.h for a description
of MemorySSA, and what it does.

Differential Revision: http://reviews.llvm.org/D7864

llvm-svn: 259595
2016-02-02 22:46:49 +00:00
Philip Reames b7571043f2 [LVI] Fix debug output
Due to staleness in a patch I committed yesterday, the debug output was reporting overdefined cases as being undefined.  Confusing, to say the least.  The mistake appears to have only affected the debug output, thankfully.

llvm-svn: 259594
2016-02-02 22:43:08 +00:00
Anna Zaks 3b50e70bbe [asan] Add iOS support to AddressSanitizer
Differential Revision: http://reviews.llvm.org/D15625

llvm-svn: 259586
2016-02-02 22:05:07 +00:00
Philip Reames ed8cd0d36e [LVI] Code motion only [NFC]
I introduced a declaration in 259583 to keep the diff readable.  This change just moves the definition up to remove the declaration again.

llvm-svn: 259585
2016-02-02 22:03:19 +00:00
Philip Reames d1f829d374 [LVI] Refactor to use newly introduced intersect utility
This patch uses the newly introduced 'intersect' utility (from 259461: [LVI] Introduce an intersect operation on lattice values) to simplify existing code in LVI.

While not introducing any new concepts, this change is probably not NFC.  The common 'intersect' function is more powerful than the ad-hoc implementations we'd had in a couple of places.  Given that, we may see optimizations triggering a bit more often.

llvm-svn: 259583
2016-02-02 21:57:37 +00:00
Justin Bogner 246345a834 Remove utils/buildit
The autoconf build system was removed - this doesn't even work and
doesn't need to be here.

llvm-svn: 259582
2016-02-02 21:56:16 +00:00
Hemant Kulkarni 782edae7d6 Correct size calculations for ELF files
llvm-svn: 259578
2016-02-02 21:41:49 +00:00
Yunzhong Gao b76ccacfb1 Disable the vzeroupper insertion pass on PS4.
See comments in test/CodeGen/X86/avx-vzeroupper.ll for more explanation.

Original patch by: Sean Silva

llvm-svn: 259576
2016-02-02 21:39:23 +00:00
Lang Hames 3923698b3f [Orc] Stub addresses should be based on stub size, not pointer size.
This didn't affect X86_64, which is the only client of this code at the moment,
as stubs and pointers are both 8 bytes there. It will affect other platforms
though.
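
The arithmetic behind this fix can be sketched as follows. This is a hedged illustration, not the Orc code: `stubAddress` is a hypothetical helper, and the sizes in the comments are made-up values chosen only to show the difference.

```cpp
#include <cstdint>

// Hypothetical helper: address of the stub at `index` in a contiguous block
// of stubs starting at `base`, where each stub occupies `stride` bytes.
constexpr std::uint64_t stubAddress(std::uint64_t base, std::uint64_t index,
                                    std::uint64_t stride) {
  return base + index * stride;
}

// When stubs and pointers are both 8 bytes, either stride gives the same
// address, so the bug stays hidden.
static_assert(stubAddress(0x1000, 3, 8) == 0x1018, "8-byte stubs");

// On an imagined target with 4-byte pointers and 8-byte stubs, striding by
// pointer size lands in the middle of the stub block instead of on stub 3.
static_assert(stubAddress(0x1000, 3, 4) == 0x100C, "wrong: pointer stride");
static_assert(stubAddress(0x1000, 3, 8) != stubAddress(0x1000, 3, 4),
              "strides disagree once the sizes differ");
```

Passing the stub size as the stride keeps the computed address correct regardless of the target's pointer width.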

llvm-svn: 259575
2016-02-02 21:38:30 +00:00
Matt Arsenault de4208122b AMDGPU: Do not promote allocas with non-inbounds GEPs
If we can't assume the pointer value is within the bounds
of the object, it seems risky to try to replace the pointer
calculations.

llvm-svn: 259573
2016-02-02 21:16:12 +00:00
Matt Arsenault 7e747f1a38 AMDGPU: Handle promoting memmove
Also add missing tests for the others.

llvm-svn: 259558
2016-02-02 20:28:10 +00:00
Quentin Colombet b8fb2ba1bb [X86] Fix the merging of SP updates in prologue/epilogue insertions.
When the merging was involving LEAs, we were taking the wrong immediate
from the list of operands.

rdar://problem/24446069

llvm-svn: 259553
2016-02-02 20:11:17 +00:00
Matthias Braun 1377fd6781 MachineVerifier: Check that defs/uses are live in subregisters as well.
llvm-svn: 259552
2016-02-02 20:04:51 +00:00
Matt Arsenault 8b175672cb AMDGPU: Skip promote alloca with no optimizations
llvm-svn: 259551
2016-02-02 19:32:42 +00:00
Matt Arsenault fb8cdbae0c AMDGPU: Minor cleanups for AMDGPUPromoteAlloca
Mostly convert to use range loops.

llvm-svn: 259550
2016-02-02 19:32:35 +00:00
Lang Hames e28b118be0 [Orc] Turn OrcX86_64::IndirectStubsInfo into a template helper class:
GenericIndirectStubsInfo.

This will allow architecture support classes for other architectures to re-use
this code.

llvm-svn: 259549
2016-02-02 19:31:15 +00:00
David Majnemer c9911f28e5 [codeview] Correctly handle inlining functions post-dominated by unreachable
CodeView requires us to accurately describe the extent of the inlined
code.  We did this by grabbing the next debug location in source order
and using *that* to denote where we stopped inlining.  However, this is
not sufficient or correct in instances where there is no next debug
location or the next debug location belongs to the start of another
function.

To get this correct, use the end symbol of the function to denote the
last possible place the inlining could have stopped at.

llvm-svn: 259548
2016-02-02 19:22:34 +00:00
Matt Arsenault e5737f7cac AMDGPU: Report AMDGPUPromoteAlloca changed the function
llvm-svn: 259547
2016-02-02 19:18:57 +00:00
Matt Arsenault ad1348459f AMDGPU: Whitelist handled intrinsics
We shouldn't crash on unhandled intrinsics.
Also simplify failure handling in loop.

llvm-svn: 259546
2016-02-02 19:18:53 +00:00
Matt Arsenault 853a1fc6d9 AMDGPU: Use inbounds when calculating workitem offset
When promoting allocas to LDS, we know we are indexing
into a specific area just created, and the calculation
will also never overflow.

Also emit some of the muls as nsw nuw, because instcombine
infers this already from the range metadata. I think
putting this on the other adds and muls might be OK too,
but I'm not 100% sure.

llvm-svn: 259545
2016-02-02 19:18:48 +00:00