Commit Graph

147980 Commits

Author SHA1 Message Date
Akira Hatanaka e327f09832 [AArch64] Improve code generation for logical instructions taking
immediate operands.

This commit adds an AArch64 dag-combine that optimizes code generation
for logical instructions taking immediate operands. The optimization
uses demanded bits to change a logical instruction's immediate operand
so that the immediate can be folded into the immediate field of the
instruction.

rdar://problem/18231627

Differential Revision: https://reviews.llvm.org/D5591

llvm-svn: 300913
2017-04-20 22:47:56 +00:00
Sanjay Patel cc663b82fa [InstCombine] function names start with lower-case letter; NFC
Forgot to make this fix with the signature change in r300911.

llvm-svn: 300912
2017-04-20 22:37:01 +00:00
Sanjay Patel c9485ca895 [InstCombine] allow shl+shr demanded bits folds with splat constants
llvm-svn: 300911
2017-04-20 22:33:54 +00:00
Sanjay Patel be2dcaf45a [InstCombine] add tests for shl+shr demanded bits splat vector folds; NFC
llvm-svn: 300907
2017-04-20 22:18:47 +00:00
Tim Northover 100b7f6eae AArch64: lower "fence singlethread" to a pure compiler barrier.
Single-threaded fences aren't required to provide any synchronization with
other processing elements so there's no need for a DMB. They should still be a
barrier for compiler optimizations though.

llvm-svn: 300905
2017-04-20 21:57:45 +00:00
Tim Northover 46e58354da ARM: lower "fence singlethread" to a pure compiler barrier.
Single-threaded fences aren't required to provide any synchronization with
other processing elements so there's no need for a DMB. They should still be a
barrier for compiler optimizations though.

llvm-svn: 300904
2017-04-20 21:56:52 +00:00
Xinliang David Li 99e3ca1526 Use basicblock split block utility function
Instead of calling BasicBlock::SplitBasicBlock directly in 
CodeExtractor.

Differential Revision: https://reviews.llvm.org/D32308

llvm-svn: 300899
2017-04-20 21:40:22 +00:00
Sanjay Patel 3e1ae72fcf [InstCombine] allow shl demanded bits folds with splat constants
More fixes are needed to enable the helper SimplifyShrShlDemandedBits().

llvm-svn: 300898
2017-04-20 21:33:02 +00:00
Craig Topper ff23889609 [InstCombine] Use APInt::intersects and APInt::isSubsetOf to improve a few more places in SimplifyDemandedBits.
llvm-svn: 300896
2017-04-20 21:24:37 +00:00
Chad Rosier 4279c58ec4 [AArch64] Whitespace/ordering fixes for Falkor machine description. NFC.
llvm-svn: 300893
2017-04-20 21:11:17 +00:00
Chad Rosier a56bdbe62d [AArch64] Refine Falkor machine description for pre/post-inc and stores.
llvm-svn: 300892
2017-04-20 21:11:09 +00:00
Sanjay Patel fb5b3e773a [InstCombine] allow ashr/lshr demanded bits folds with splat constants
llvm-svn: 300888
2017-04-20 20:59:02 +00:00
Craig Topper 17f37ba3b9 [InstCombine] Use APInt::isSubsetOf to simplify some code in SimplifyDemandedBits. NFC
This allows us to use less temporary APInt for And and Invert operations.

llvm-svn: 300885
2017-04-20 20:47:35 +00:00
Sanjay Patel 7e77bed813 [InstCombine] add tests for demanded bits ashr/lshr splat constants; NFC
llvm-svn: 300884
2017-04-20 20:44:54 +00:00
Adrian Prantl ada104888e Don't emit locations that need a DW_OP_stack_value in DWARF 2 & 3.
https://bugs.llvm.org/show_bug.cgi?id=32382

llvm-svn: 300883
2017-04-20 20:42:33 +00:00
Benjamin Kramer 9ed1371adb [Support] Make asan poisoning for recyclers more aggressive by also poisoning the 'next' pointer.
llvm-svn: 300882
2017-04-20 20:28:18 +00:00
Benjamin Kramer 49f1c3297b Remove stray ^S. NFC.
llvm-svn: 300880
2017-04-20 20:03:36 +00:00
Paul Robinson e5dcdb8558 [DWARF] Fix a couple of typos
llvm-svn: 300879
2017-04-20 20:03:03 +00:00
Tim Northover 8b1240b0f0 ARM: handle post-indexed NEON ops where the offset isn't the access width.
Before, we assumed that any ConstantInt offset was precisely the access width,
so we could use the "[rN]!" form. ISelLowering only ever created that kind, but
further simplification during combining could lead to unexpected constants and
incorrect codegen.

Should fix PR32658.

llvm-svn: 300878
2017-04-20 19:54:02 +00:00
Adrian McCarthy 175d70ee5c VarStreamArrayIterator needed non-const operator* overload.
Without this change, the operator-> provided by iterator_facade lost type
qualifiers.

Differential Revision: https://reviews.llvm.org/D32235

llvm-svn: 300877
2017-04-20 19:34:06 +00:00
Craig Topper 0ec3f2f39a [InstCombine] Remove redundant code from SimplifyDemandedBits handling for Or. The code above it is equivalent if you work through the bitwise math.
llvm-svn: 300876
2017-04-20 19:31:22 +00:00
Paul Robinson 70b34533c2 [DWARF] Versioning for DWARF constants; verify FORMs
Associate the version-when-defined with definitions of standard DWARF
constants.  Identify the "vendor" for DWARF extensions.
Use this information to verify FORMs in .debug_abbrev are defined as
of the DWARF version specified in the associated unit.
Removed two tests that had specified DWARF v1 (which essentially does
not exist).

Differential Revision: http://reviews.llvm.org/D30785

llvm-svn: 300875
2017-04-20 19:16:51 +00:00
Benjamin Kramer 4f1bed1bcf [go bindings] Rmove duplicated conversion function definitions after r300843.
llvm-svn: 300872
2017-04-20 19:06:11 +00:00
Chad Rosier 9f25dd56a8 [AArch64] Improve scheduling of logical operations on Falkor.
llvm-svn: 300871
2017-04-20 18:50:21 +00:00
Weiming Zhao 962c5a3aec [Thumb-1] Fix corner cases for compressed jump tables
Summary:
When synthesized TBB/TBH is expanded, we need to avoid the case of:
   BaseReg is redefined after the load of branching target. E.g.:

    %R2 = tLEApcrelJT <jt#1>
    %R1 =  tLDRr %R1, %R2    ==> %R2 = tLEApcrelJT <jt#1>
    %R2 = tLDRspi %SP, 12        %R2 = tLDRspi %SP, 12
    tBR_JTr %R1                  tTBB_JT %R2, %R1
`
Reviewers: jmolloy

Reviewed By: jmolloy

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D32250

llvm-svn: 300870
2017-04-20 18:37:14 +00:00
Davide Italiano b965121ba8 [CodeExtractor] Remove a bunch of unneeded constructors.
Differential Revision:  https://reviews.llvm.org/D32305

llvm-svn: 300869
2017-04-20 18:33:40 +00:00
Benjamin Kramer 997fd5eeb4 [Recycler] Add asan/msan annotations.
This enables use after free and uninit memory checking for memory
returned by a recycler. SelectionDAG currently relies on the opcode of a
free'd node being ISD::DELETED_NODE, so poke a hole in the asan poison
for SDNode opcodes. This means that we won't find some issues, but only
in SDag.

llvm-svn: 300868
2017-04-20 18:29:37 +00:00
Benjamin Kramer 58dadd59d9 Fix use-after-frees on memory allocated in a Recycler.
This will become asan errors once the patch lands that poisons the
memory after free. The x86 change is a hack, but I don't see how to
solve this properly at the moment.

llvm-svn: 300867
2017-04-20 18:29:14 +00:00
Artyom Skrobov 938fc1341d Fixing outdated comment [NFC]
Since r32105 back in 2006, RegisterPass doesn't support
passes without a default constructor.

llvm-svn: 300866
2017-04-20 18:20:02 +00:00
Andrew Kaylor 73b4a9a4a4 Fix formatting of constrained FP intrinsic documentation
llvm-svn: 300865
2017-04-20 18:18:36 +00:00
Yaxun Liu 5d977f8ed4 CodeGen: Let frame index value type match alloca addr space
Recently alloca address space has been added to data layout. Due to this
change, pointer returned by alloca may have different size as pointer in
address space 0.

However, currently the value type of frame index is assumed to be of the
same size as pointer in address space 0.

This patch fixes that.

Most targets assume alloca returning pointer in address space 0, which
is the default alloca address space. Therefore it is NFC for them.

AMDGCN target with amdgiz environment requires this change since it
assumes alloca returning pointer to addr space 5 and its size is 32,
which is different from the size of pointer in addr space 0 which is 64.

Differential Revision: https://reviews.llvm.org/D32021

llvm-svn: 300864
2017-04-20 18:15:34 +00:00
Reid Kleckner 62731e1c89 Remove duplicate AttributeList::removeAttributes implementation
Have the AttributeList overload delegate to the AttrBuilder one.
Simplify the AttrBuilder overload by avoiding getSlotAttributes, which
creates temporary AttributeLists.

Simplify `AttrBuilder::removeAttributes(AttributeList, unsigned)` by
using getAttributes instead of manually iterating over slots.

Extracted from https://reviews.llvm.org/D32262

NFC

llvm-svn: 300863
2017-04-20 18:08:36 +00:00
Sanjay Patel 13985cd111 [DAGCombiner] use more local variables in isAlias(); NFCI
llvm-svn: 300860
2017-04-20 18:02:27 +00:00
Sam Clegg 90d99413ac [WebAssembly] Add known failures for wasm object file backend
Subscribers: jfb, dschuff

Differential Revision: https://reviews.llvm.org/D32300

llvm-svn: 300859
2017-04-20 17:18:15 +00:00
Zachary Turner 2a593bc508 Resubmit "[BitVector] Add operator<<= and operator>>=."
This was failing due to the use of assigning a Mask to an
unsigned, rather than to a BitWord.  But most systems do not
have sizeof(unsigned) == sizeof(unsigned long), so the mask
was getting truncated.

llvm-svn: 300857
2017-04-20 16:56:54 +00:00
Craig Topper bcfd2d1789 [APInt] Rename getSignBit to getSignMask
getSignBit is a static function that creates an APInt with only the sign bit set. getSignMask seems like a better name to convey its functionality. In fact several places use it and then store in an APInt named SignMask.

Differential Revision: https://reviews.llvm.org/D32108

llvm-svn: 300856
2017-04-20 16:56:25 +00:00
Amara Emerson 20f2b01222 [SVE] Fix mismatched sign comparison warning in unit test from r300842.
llvm-svn: 300855
2017-04-20 16:54:49 +00:00
Sanjay Patel 2d0e88fb9b [DAGCombiner] fix variable names in isAlias(); NFCI
We started with zero-based params and switched to one-based locals...
Also, variables start with a capital and functions do not.

llvm-svn: 300854
2017-04-20 16:36:37 +00:00
Zachary Turner b3dac3816f Revert "[BitVector] Add operator<<= and operator>>=."
This is causing test failures on Linux / BSD systems.  Reverting
while I investigate.

llvm-svn: 300852
2017-04-20 16:35:22 +00:00
Craig Topper a8129a1122 [APInt] Add isSubsetOf method that can check if one APInt is a subset of another without creating temporary APInts
This question comes up in many places in SimplifyDemandedBits. This makes it easy to ask without allocating additional temporary APInts.

The BitVector class provides a similar functionality through its (IMHO badly named) test(const BitVector&) method. Though its output polarity is reversed.

I've provided one example use case in this patch. I plan to do more as a follow up.

Differential Revision: https://reviews.llvm.org/D32258

llvm-svn: 300851
2017-04-20 16:17:13 +00:00
Sanjay Patel b7701bc9af [DAGCombiner] give names to repeated calcs in isAlias(); NFCI
llvm-svn: 300850
2017-04-20 16:15:08 +00:00
Craig Topper 83dc1c60aa In SimplifyDemandedUseBits, use computeKnownBits directly to handle Constants
Currently we don't explicitly process ConstantDataSequential, ConstantAggregateZero, or ConstantVector, or Undef before applying the Depth limit. Instead they occur after the depth check in the non-instruction path.

For the constant types that we do handle, the code is replicated from computeKnownBits.

This patch fixes the missing constant handling and the reduces the amount of code by just using computeKnownBits directly for any type of Constant.

Differential Revision: https://reviews.llvm.org/D32123

llvm-svn: 300849
2017-04-20 16:14:58 +00:00
Zachary Turner 7500b0ece8 [BitVector] Add operator<<= and operator>>=.
Differential Revision: https://reviews.llvm.org/D32244

llvm-svn: 300848
2017-04-20 15:57:58 +00:00
Daniel Sanders 5377fb3419 [globalisel] Enable tracing the legalizer with --debug-only=legalize-mir
Reviewers: t.p.northover, ab, qcolombet, aditya_nandakumar, rovka, kristof.beyls

Reviewed By: kristof.beyls

Subscribers: dberris, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D31750

llvm-svn: 300847
2017-04-20 15:46:12 +00:00
Amaury Sechet a52b03d2ea Introduce LLVMDIBuilderRef
Summary:
This patch adds a definition of `LLVMDIBuilderRef` that represents an `llvm::DIBuilder`.

Authored by Harlan Haskins

Reviewers: deadalnix, aprantl, probinson, dblaikie, echristo, whitequark

Reviewed By: deadalnix, whitequark

Subscribers: CodaFi, loladiro

Differential Revision: https://reviews.llvm.org/D32122

llvm-svn: 300843
2017-04-20 14:22:47 +00:00
Amara Emerson 23e79ec2b3 [MVT][SVE] Scalable vector MVTs (3/3)
Adds MVT::ElementCount to represent the length of a
vector which may be scalable, then adds helper functions
that work with it.

Patch by Graham Hunter.

Differential Revision: https://reviews.llvm.org/D32019

llvm-svn: 300842
2017-04-20 13:54:09 +00:00
Amara Emerson bfbdebd00e [MVT][SVE] Scalable vector MVTs (2/3)
Adds scalable vector machine value types, and updates
the switch statements required for tablegen.

Patch by Graham Hunter.

Differential Revision: https://reviews.llvm.org/D32018

llvm-svn: 300840
2017-04-20 13:36:58 +00:00
Petar Jovanovic 2b6fe3ffa6 [mips][msa] Mask vectors holding shift amounts
Masked vectors which hold shift amounts when creating the following nodes:
ISD::SHL, ISD::SRL or ISD::SRA.
Instructions that use said nodes, which have had their arguments altered are
sll, srl, sra, bneg, bclr and bset.

For said instructions, the shift amount or the bit position that is
specified in the corresponding vector elements will be interpreted as the
shift amount/bit position modulo the size of the element in bits.

The problem lies in compiling with -O2 enabled, where the instructions for
formats .w and .d are not generated, but are instead optimized away.
In this case, having shift amounts that are either negative or greater than
the element bit size results in generation of incorrect results when
constant folding.

We remedy this by masking the operands for the nodes mentioned above before
actually creating them, so that the final result is correct before placed
into the constant pool.

Patch by Stefan Maksimovic.

Differential Revision: https://reviews.llvm.org/D31331

llvm-svn: 300839
2017-04-20 13:26:46 +00:00
Amara Emerson 5054782052 [MVT][SVE] Scalable vector MVTs (1/3)
This patch adds a few helper functions to obtain new vector
value types based on existing ones without needing to care
about whether they are scalable or not.

I've confined their use to a few common locations right now,
and targets that don't have scalable vectors should never
need to care about these.

Patch by Graham Hunter.

Differential Revision: https://reviews.llvm.org/D32017

llvm-svn: 300838
2017-04-20 13:08:17 +00:00
John Brawn 66719f63d0 [ARM] Fix handling of mapping symbols when changing sections
ChangeSection incorrectly registers LastEMSInfo as belonging to the previous
section, not the current section. This happens to work when changing sections
using .section, as the previous section is set to the current section before
the call to ChangeSection, but not when using .popsection.

Differential Revision: https://reviews.llvm.org/D32225

llvm-svn: 300831
2017-04-20 10:18:13 +00:00
John Brawn 5ca5daa6b9 [AArch64] Fix handling of zero immediate in fmov instructions
Currently fmov #0 with a vector destination is handle incorrectly and results in
fmov #-1.9375 being emitted but should instead give an error. This is due to the
way we cope with fmov #0 with a scalar destination being an alias of fmov zr, so
fix this by actually doing it through an alias.

Differential Revision: https://reviews.llvm.org/D31949

llvm-svn: 300830
2017-04-20 10:13:54 +00:00
John Brawn dcf037a6f0 [AArch64] Fix handling of integer fp immediates
When an integer is used as an fp immediate we're failing to check the return
value of getFP64Imm, so invalid values are silently permitted. Fix this by
merging together the integer and real handling.

llvm-svn: 300828
2017-04-20 10:10:10 +00:00
Diana Picus 7c6dee9f16 [ARM] Rename HW div feature to HW div Thumb. NFCI.
The hardware div feature refers only to Thumb, but because of its name
it is tempting to use it to check for hardware division in general,
which may cause problems in ARM mode. See https://reviews.llvm.org/D32005.

This patch adds "Thumb" to its name, to make its scope clear. One
notable place where I haven't made the change is in the feature flag
(used with -mattr), which is still hwdiv. Changing it would also require
changes in a lot of tests, including clang tests, and it doesn't seem
like it's worth the effort.

Differential Revision: https://reviews.llvm.org/D32160

llvm-svn: 300827
2017-04-20 09:38:25 +00:00
Craig Topper 6a63842708 [APInt] In slt/sgt(uint64_t), only call getMinSignedBits if the APInt is not a single word.
llvm-svn: 300824
2017-04-20 06:04:03 +00:00
Craig Topper fded30584e [APInt] Call the slow case counting methods directly in isMask/isShiftedMask. We already handled the single word case. NFC
llvm-svn: 300823
2017-04-20 06:04:01 +00:00
Craig Topper 9ce5ef9475 [SelectionDAG] Fix another place that was passing a large value to APInt::lshrInPlace.
llvm-svn: 300821
2017-04-20 04:55:01 +00:00
Craig Topper d3884b8402 [SelectionDAG] Use getActiveBits() and countTrailingZeros() to avoid creating temporary APInts with lshr and trunc. NFCI
llvm-svn: 300819
2017-04-20 04:23:43 +00:00
Craig Topper 4db0c69373 Recommit "[APInt] Add back the asserts that check that the APInt shift methods aren't called with values larger than BitWidth."
This includes a fix to clamp a right shift of larger than BitWidth in DAG combining.

llvm-svn: 300816
2017-04-20 03:49:18 +00:00
Craig Topper 6fd0a5c99d Revert r300811 "[APInt] Add back the asserts that check that the APInt shift methods aren't called with values larger than BitWidth."
This is failing a self host debug build.

llvm-svn: 300813
2017-04-20 02:46:21 +00:00
Craig Topper baa392e4e0 [APInt] Implement APInt::intersects without creating a temporary APInt in the multiword case
Summary: This is a simple question we should be able to answer without creating a temporary to hold the AND result. We can also get an early out as soon as we find a word that intersects.

Reviewers: RKSimon, hans, spatel, davide

Reviewed By: hans, davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32253

llvm-svn: 300812
2017-04-20 02:11:27 +00:00
Craig Topper e49252cea1 [APInt] Add back the asserts that check that the APInt shift methods aren't called with values larger than BitWidth.
The underlying tcShiftRight/tcShiftLeft functions support the larger bit widths but the APInt interface shouldn't rely on that.

llvm-svn: 300811
2017-04-20 02:03:09 +00:00
Serge Pavlov 802aa667d5 Do not run frame verification if target does not use frame instructions
llvm-svn: 300807
2017-04-20 01:34:04 +00:00
Ahmed Bougacha db2c16aebb Revert "[libFuzzer] XFAIL fuzzer-oom.test on Darwin."
This reverts commit r300127.

r300759 implemented StopTheWorld for Darwin, so the test passes again.

llvm-svn: 300801
2017-04-20 00:16:13 +00:00
Kostya Serebryany f60f61d0b3 [libFuzzer] extend help for -minimize_crash to cover ASAN_OPTIONS=dedup_token_length=3
llvm-svn: 300800
2017-04-19 23:58:05 +00:00
Craig Topper b3624e4f45 [APInt] Implement operator==(uint64_t) similar to ugt/ult(uint64_t) to remove one of the out of line EqualsSlowCase methods.
llvm-svn: 300799
2017-04-19 23:57:51 +00:00
Craig Topper 0e7f6fb6b4 [APInt] Don't call getActiveBits() in ult/ugt(uint64_t) if its a single word.
The compiled code already needs to check single/multi word for the countLeadingZeros call inside of getActiveBits, but it isn't able to optimize out the leadingZeros call in the single word case that can't produce a value larger than 64.

This shrank the opt binary by about 5-6k on my local x86-64 build.

llvm-svn: 300798
2017-04-19 23:55:48 +00:00
Sanjoy Das 25e71d87b5 Statepoint Docs: fix incorrect uses of it's
llvm-svn: 300797
2017-04-19 23:55:03 +00:00
Craig Topper 9700a60a61 [APInt] Use ugt(uint64_t) for the compare in getLimitedValue(uint64_t) since the code is identical to it. NFC
llvm-svn: 300796
2017-04-19 23:52:59 +00:00
Reid Kleckner f1de9e83c2 [DAE] Simplify attribute list creation, NFC
Removes a use of getSlotAttributes, which I intend to change.

llvm-svn: 300795
2017-04-19 23:45:45 +00:00
Kuba Mracek 7fe92fc521 Revert r300789: There are Windows bot failures.
llvm-svn: 300794
2017-04-19 23:44:33 +00:00
Adrian Prantl c12cee3600 Fix bug that caused DwarfExpression to drop DW_OP_deref from FI locations
- introduced in r300522 and found via the Swift LLDB testsuite.

The fix is to set the location kind to memory whenever an FrameIndex
location is emitted.

rdar://problem/31707602

llvm-svn: 300793
2017-04-19 23:42:25 +00:00
Adrian Prantl 295c952b67 Revert "Fix bug that caused DwarfExpression to drop DW_OP_deref from FI locations"
This reverts commit r300790.

llvm-svn: 300792
2017-04-19 23:42:17 +00:00
Kannan Narayanan 2fb5960121 Revert earlier change. ds permute operations affect lgkm counter.
Differential Revision: https://reviews.llvm.org/D32254

llvm-svn: 300791
2017-04-19 23:39:19 +00:00
Adrian Prantl 78ff122709 Fix bug that caused DwarfExpression to drop DW_OP_deref from FI locations
- introduced in r300522 and found via the Swift LLDB testsuite.

The fix is to set the location kind to memory whenever an FrameIndex
location is emitted.

rdar://problem/31707602

llvm-svn: 300790
2017-04-19 23:34:14 +00:00
Kuba Mracek a89fd60a91 [libFuzzer] Always build libFuzzer
There are two reasons why users might want to build libfuzzer:
- To fuzz LLVM itself
- To get the libFuzzer.a archive file, so that they can attach it to their code
This change always builds libfuzzer, and supports the second use case if the specified flag is set.

The point of this patch is to have something that can potentially be shipped with the compiler, and this also ensures that the version of libFuzzer is correct to use with that compiler.

Differential Revision: https://reviews.llvm.org/D32096

llvm-svn: 300789
2017-04-19 23:34:08 +00:00
Reid Kleckner 0a5ed3d5dc [GlobalOpt] Simplify attribute code stripping nest, NFC
llvm-svn: 300787
2017-04-19 23:26:44 +00:00
Reid Kleckner aa0cec7d6d Simplify test for sret attribute in instcombine
This change is correct because the verifier requires that at most one
argument be marked 'sret'.

NFC, removes a use of AttributeList slot APIs.

llvm-svn: 300784
2017-04-19 23:17:47 +00:00
Galina Kistanova 2cc97d92ce Temporarily revert r299221 to fix nondeterminism in ThinLTO builder.
llvm-svn: 300783
2017-04-19 23:16:14 +00:00
Philip Reames 0d98ada0d4 Refresh the statepoint docs a bit
The documentation had gotten a bit stale.  The revised one are by no means perfect, but I tried to remove the obvious incorrect or misleading statements.

llvm-svn: 300782
2017-04-19 23:16:13 +00:00
Matthias Braun 372ee59766 X86FrameLowering: Fix getFrameIndexReference() for 'fixed' objects
Debug information is calculated with getFrameIndexReference() which was
missing some logic for the fixed object cases (= parameters on the stack).

rdar://24557797

Differential Revision: https://reviews.llvm.org/D32204

llvm-svn: 300781
2017-04-19 23:10:43 +00:00
Eugene Zelenko d341c93268 [Object] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 300779
2017-04-19 23:02:10 +00:00
Kostya Serebryany c5d3d49034 [sanitizer-coverage] remove some more stale code
llvm-svn: 300778
2017-04-19 22:42:11 +00:00
Evgeniy Stepanov 7c9b086ef5 Remove two unused variables (-Werror).
llvm-svn: 300777
2017-04-19 22:27:23 +00:00
Craig Topper 3cc8da3248 [APInt] Cast more calls to add/sub/mul overflow functions to void. I missed the unittests in r300758.
llvm-svn: 300773
2017-04-19 22:11:05 +00:00
Sanjay Patel 0658a95a35 [DAG] add splat vector support for 'or' in SimplifyDemandedBits
I've changed one of the tests to not fold away, but we didn't and still don't do the transform
that the comment claims we do (and I don't know why we'd want to do that).

Follow-up to:
https://reviews.llvm.org/rL300725
https://reviews.llvm.org/rL300763

llvm-svn: 300772
2017-04-19 22:00:00 +00:00
Kostya Serebryany be87d480ff [sanitizer-coverage] remove stale code
llvm-svn: 300769
2017-04-19 21:48:09 +00:00
Kostya Serebryany a9e6cb8633 [libFuzzer] remove -output_csv option. It duplicates the default output and got out of sync
llvm-svn: 300768
2017-04-19 21:34:58 +00:00
Sanjay Patel ae382bb6af [DAG] add splat vector support for 'xor' in SimplifyDemandedBits
This allows forming more 'not' ops, so we get improvements for ISAs that have and-not.

Follow-up to:
https://reviews.llvm.org/rL300725

llvm-svn: 300763
2017-04-19 21:23:09 +00:00
Matthias Braun 8aaa368d00 ARMFrameLowering: Reserve emergency spill slot for large arguments
Re-commit after revert in r300668. Changed getMaxFPOffset() to a
more conservative heuristic instead of trying to be clever and missing
for some exotic calling conventions.

We need to reserve an emergency spill slot in cases with large argument
types that could overflow immediate offsets for FP relative address
calculations.

rdar://31317893

Differential Revision: https://reviews.llvm.org/D31643

llvm-svn: 300761
2017-04-19 21:11:44 +00:00
Craig Topper 9b71a402c2 [APInt] Cast calls to add/sub/mul overflow methods to void if only their overflow bool out param is used.
This is preparation for a clang change to improve the [[nodiscard]] warning to not be ignored on methods that return a class marked [[nodiscard]] that are defined in the class itself. See D32207.

We should consider adding wrapper methods to APInt that return the overflow flag directly and discard the APInt result. This would eliminate the void casts and the need to create a bool before the call to pass to the out param.

llvm-svn: 300758
2017-04-19 21:09:45 +00:00
Simon Pilgrim 9a828c1f2b [InstCombine] Add frem constant folding test (PR3316)
llvm-svn: 300757
2017-04-19 21:09:19 +00:00
Matt Arsenault 4a48623e4f AMDGPU: Custom lower illegal small select types
Promote them to i32 vectors to avoid unpacking and re-packing
the vectors.

llvm-svn: 300754
2017-04-19 20:53:07 +00:00
Dehao Chen db569bae55 Code style change as suggested in https://reviews.llvm.org/D32177 (NFC)
llvm-svn: 300753
2017-04-19 20:52:21 +00:00
Eli Friedman 70ad2751d5 [ARM] Remove redundant computeKnownBits helper.
Move the BFI logic to computeKnownBitsForTargetNode, and delete
the redundant CMOV logic.

This is intended as a cleanup, but it's probably possible to construct
a case where moving the BFI logic allows more combines.

Differential Revision: https://reviews.llvm.org/D31795

llvm-svn: 300752
2017-04-19 20:50:57 +00:00
Aditya Nandakumar 75ad9ccbfa [GISEL]: Move getConstantVReg to Utils
NFCI

llvm-svn: 300751
2017-04-19 20:48:50 +00:00
Simon Pilgrim 19a6dddd6d [InstCombine] Add frem constant folding test (PR32177)
llvm-svn: 300750
2017-04-19 20:47:58 +00:00
Eli Friedman f281d490cc [ARM] Use TableGen patterns to select vtbl. NFC.
Differential Revision: https://reviews.llvm.org/D32103

llvm-svn: 300749
2017-04-19 20:39:39 +00:00
Craig Topper f0dd3c975c [APInt] Use SignExtend64 instead of reinventing it. NFC
llvm-svn: 300747
2017-04-19 20:32:11 +00:00
Eli Friedman e77d2b86b4 [SCEV] Make SCEV or modeling more aggressive.
Use haveNoCommonBitsSet to figure out whether an "or" instruction
is equivalent to addition. This handles more cases than just
checking for a constant on the RHS.

Differential Revision: https://reviews.llvm.org/D32239

llvm-svn: 300746
2017-04-19 20:19:58 +00:00
Dehao Chen a364f09f18 Using address range map to speedup finding inline stack for address.
Summary:
In the current implementation, to find inline stack for an address incurs expensive linear search in 2 places:

* linear search for the top-level DIE
* recursive linear traverse the DIE tree to find the path to the leaf DIE

In this patch, a map is built from address to its corresponding leaf DIE. The inline stack is built by traversing from the leaf DIE up to the root DIE. This speeds up batch symbolization by ~10X without noticible memory overhead.

Reviewers: dblaikie

Reviewed By: dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32177

llvm-svn: 300742
2017-04-19 20:09:38 +00:00
Dehao Chen cfe5b2bf74 Update the madd.ll test with utils/update_llc_test_checks.py (NFC)
llvm-svn: 300740
2017-04-19 20:08:14 +00:00
Dehao Chen 58601674d2 PR32710: Disable using PMADDWD for unsigned short.
Summary: PMADDWD can only handle signed short.

Reviewers: mkuper, wmi

Reviewed By: mkuper

Subscribers: andreadb, llvm-commits

Differential Revision: https://reviews.llvm.org/D32236

llvm-svn: 300737
2017-04-19 19:50:34 +00:00
Matt Arsenault 021a218dd2 AMDGPU: Don't emit amd_kernel_code_t for callable functions
This is inserted directly in the text section. The relocation
for the function ends up resolving to the beginning of the
amd_kernel_code_t header rather than the actual function
entry point.

Also skip some of the comments for initialization
that only makes sense for kernels.

llvm-svn: 300736
2017-04-19 19:38:10 +00:00
Aditya Nandakumar 8a76f915ae [tblgen] GCC/MS builtin to target intrisics map.
Patch by Ettore Speziale

Allow TableGen to generate static functions to perform GCC/MS builtin name to
target specific intrinsic ID mapping.

https://reviews.llvm.org/D31150

llvm-svn: 300735
2017-04-19 19:14:20 +00:00
Artem Tamazov 557a057d4f [AMDGPU][mc][tests][NFC] Update bulk ISA tests for Gfx7 and Gfx8
Added approx. 1100 gfx7 and 1040 gfx8 test cases.

llvm-svn: 300734
2017-04-19 19:12:06 +00:00
Matt Arsenault d3406bc45c StructurizeCFG: Directly invert cmp instructions
The most common case for a branch condition is
a single use compare. Directly invert the branch
predicate rather than adding a lot of xor i1 true
which the DAG will have to fold later.

This produces nicer to read structurizer output.

This produces some random changes in codegen
due to the DAG swapping branch conditions itself,
and then does a poor job of dealing with those
inverts.

llvm-svn: 300732
2017-04-19 18:29:07 +00:00
Sanjoy Das 5945447d84 [GVN] Don't coerce non-integral pointers to integers or vice versa
Summary:
See http://llvm.org/docs/LangRef.html#non-integral-pointer-type

The NewGVN test does not fail without these changes (perhaps it does
try to coerce pointers <-> integers to begin with?), but I added the
test case anyway.

Reviewers: dberlin

Subscribers: mcrosier, llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D32208

llvm-svn: 300730
2017-04-19 18:21:09 +00:00
Richard Smith c99e91c421 Update comment to match r300252.
llvm-svn: 300728
2017-04-19 18:17:51 +00:00
Tim Northover ff168c68dc ARM: TLS calling convention doesn't preserve r9 or r12 on Darwin.
llvm-svn: 300726
2017-04-19 18:07:54 +00:00
Sanjay Patel ded7d59f0e [DAG] add splat vector support for 'and' in SimplifyDemandedBits
The patch itself is simple: stop discriminating against vectors in visitAnd() and again in 
SimplifyDemandedBits().

Some notes for reference:

1. We're not consistent about calls to SimplifyDemandedBits in the various visitXXX functions. 
   Sometimes, we check if the RHS is a constant first. Other times (like here), we just dive in.
2. I'd like to break the vector shackles in steps for the sake of risk minimization, but we could
    make similar simultaneous changes in other places if we think that would be better.
3. I don't know what the intent of the changed tests in this patch was supposed to be, but since 
   they wiggled in a positive way, I'm just going with that. :)
4. In the rotate tests, note that we can see through non-splat constants. This is a result of D24253.
5. My motivation for being here now is to make D31944 look better, so this is step 1 of N towards 
   improving the vector codegen in that patch without writing any actual new code.

Differential Revision: https://reviews.llvm.org/D32230

llvm-svn: 300725
2017-04-19 18:05:06 +00:00
Peter Collingbourne 3fa4bb4024 IR: Remove some comments that are documenting the obvious. NFC.
llvm-svn: 300724
2017-04-19 18:00:05 +00:00
Benjamin Kramer e25268de9d [MathExtras] Fix undefined behavior (shift by bit width)
While there add some unit tests for uint64_t. Found by ubsan.

llvm-svn: 300721
2017-04-19 17:46:15 +00:00
Matt Arsenault 6cb7b8a42f AMDGPU: Don't align callable functions to 256
llvm-svn: 300720
2017-04-19 17:42:39 +00:00
Matt Arsenault 4c1ecded63 AMDGPU: Change DivergenceAnalysis for function arguments
Stop assuming all functions are kernels.

llvm-svn: 300719
2017-04-19 17:42:34 +00:00
Reid Kleckner 9d16fa09c6 Prefer addAttr(Attribute::AttrKind) over the AttributeList overload
This should simplify the call sites, which typically want to tweak one
attribute at a time. It should also avoid creating ephemeral
AttributeLists that live forever.

llvm-svn: 300718
2017-04-19 17:28:52 +00:00
Davide Italiano ffcb4df204 [InstCombine] Reduce visitLoadInst() code duplication. NFCI.
llvm-svn: 300717
2017-04-19 17:26:57 +00:00
Craig Topper c67fe57e1e [APInt] Move the 'return *this' from the slow cases of assignment operators inline. We should let the compiler see that the fast/slow cases both return *this.
I don't think we chain assignments together very often so this shouldn't matter much.

llvm-svn: 300715
2017-04-19 17:01:58 +00:00
Sanjay Patel a3c297dba4 [InstSimplify] fold identity shuffles (recursing if needed)
This patch simplifies the examples from D31509 and D31927 (PR30630) and catches 
the basic identity shuffle tests that Zvi recently added.

I'm not sure if we have something like this in DAGCombiner, but we should?

It's worth noting that "MaxRecurse / RecursionLimit" is only 3 on entry at the moment. 
We might want to bump that up if there are longer shuffle chains like this in the wild.

For now, we're ignoring shuffles that have undef mask elements because it's not
clear how those should be handled.

Differential Revision: https://reviews.llvm.org/D31960

llvm-svn: 300714
2017-04-19 16:48:22 +00:00
Sanjay Patel 8bd52286d3 use 'auto' with 'dyn_cast' and fix formatting; NFC
llvm-svn: 300713
2017-04-19 16:22:19 +00:00
Zachary Turner 2be153b240 Add an #include for <climits> for CHAR_BIT.
llvm-svn: 300711
2017-04-19 15:50:43 +00:00
Zachary Turner f19b0c7f6b [Support] Add some helpers to generate bitmasks.
Frequently you you want a bitmask consisting of a specified
number of 1s, either at the beginning or end of a word.

The naive way to do this is to write

template<typename T>
T leadingBitMask(unsigned N) {
  return (T(1) << N) - 1;
}

but using this function you cannot produce a word with every
bit set to 1 (i.e. leadingBitMask<uint8_t>(8)) because left
shift is undefined when N is greater than or equal to the
number of bits in the word.

This patch provides an efficient, branch-free implementation
that works for all values of N in [0, CHAR_BIT*sizeof(T)]

Differential Revision: https://reviews.llvm.org/D32212

llvm-svn: 300710
2017-04-19 15:45:31 +00:00
Dehao Chen e0b77b24d9 Revert r300697 which causes buildbot failure.
llvm-svn: 300708
2017-04-19 15:28:58 +00:00
Krzysztof Parzyszek 333b2bf2ed [Hexagon] Generate proper offset in opt-addr-mode
Also, make a few changes to allow using the pass in .mir testcases.
Among other things, change the abbreviation from opt-amode to amode-opt,
because otherwise lit would expand the "opt" part to the full path to
the opt binary.

llvm-svn: 300707
2017-04-19 15:15:51 +00:00
Krzysztof Parzyszek 634f57e0bb [Hexagon] Remove RDefMap, use Liveness:getNearestAliasedRef instead
llvm-svn: 300706
2017-04-19 15:14:30 +00:00
Krzysztof Parzyszek 0de74f315d [RDF] Switch NodeList to SmallVector from std::vector
The list has a single element 75+% of the time, reservation of 4 elements
is sufficient in 95% of cases.

llvm-svn: 300705
2017-04-19 15:12:44 +00:00
Krzysztof Parzyszek 7c69a3b490 [RDF] Use faster version of findBlock
llvm-svn: 300704
2017-04-19 15:11:23 +00:00
Krzysztof Parzyszek 6aa3a3f00b [RDF] Cache register units for reg masks instead of recalculating them
llvm-svn: 300702
2017-04-19 15:10:09 +00:00
Krzysztof Parzyszek 5bfaf56ee5 [Hexagon] Cache reached blocks in bit tracker instead of scanning list
llvm-svn: 300701
2017-04-19 15:08:31 +00:00
Sanjay Patel c9d36f181f [PowerPC] add test and auto-generate checks; NFC
llvm-svn: 300700
2017-04-19 14:58:09 +00:00
Sanjay Patel 5a2235bbd0 [ARM] add test and auto-generate checks; NFC
llvm-svn: 300698
2017-04-19 14:55:50 +00:00
Dehao Chen 74f3e0d426 Using address range map to speedup finding inline stack for address.
Summary:
In the current implementation, to find inline stack for an address incurs expensive linear search in 2 places:

* linear search for the top-level DIE
* recursive linear traverse the DIE tree to find the path to the leaf DIE

In this patch, a map is built from address to its corresponding leaf DIE. The inline stack is built by traversing from the leaf DIE up to the root DIE. This speeds up batch symbolization by ~10X without noticible memory overhead.

Reviewers: dblaikie

Reviewed By: dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32177

llvm-svn: 300697
2017-04-19 14:50:57 +00:00
Davide Italiano a9f047a594 [InstSimplify] Deduce correct type for vector GEP.
InstSimplify returned the wrong type when simplifying a vector GEP
and we ended up crashing when trying to replace all uses with the
new value. Fixes PR32697.

Differential Revision: https://reviews.llvm.org/D32180

llvm-svn: 300693
2017-04-19 14:23:42 +00:00
Nirav Dave 8563fc4664 [DAG] Loop over remaining candidates on successful merge of stores of
extracted vectors types. NFCI.

llvm-svn: 300688
2017-04-19 13:52:38 +00:00
Dylan McKay da2d74642a [AVR] Remove the 'multibyte' asm test
It tests registers which are not actually used on AVR.

llvm-svn: 300684
2017-04-19 12:13:45 +00:00
Simon Pilgrim 5536ecd9f0 Regenerate test. NFCI.
llvm-svn: 300683
2017-04-19 12:06:40 +00:00
Dylan McKay 7838104382 [AVR] Fix the test suite
A bunch of tests failed because memory operations have been reordered.

I am unsure which commit changed this behaviour as the AVR build was
failing at that point with an unrelated error.

This commit just reoders some of the CHECK lines in some tests to suit
current llc output.

llvm-svn: 300682
2017-04-19 12:02:52 +00:00
Igor Breger 4fdf1e489c [GlobalIsel][X86] support G_TRUNC selection.
Summary:
[GlobalIsel][X86] support G_TRUNC selection.
Add regbank-select and legalizer tests. Currently legalization of trunc i64 on 32bit platform not supported.

Reviewers: ab, zvi, rovka

Reviewed By: zvi

Subscribers: dberris, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D32115

llvm-svn: 300678
2017-04-19 11:34:59 +00:00
Simon Pilgrim 94d5dba8ae [X86] Add D32039/PR31357 tests to show current BSWAP codegen
llvm-svn: 300672
2017-04-19 11:06:22 +00:00
Simon Pilgrim b13ab63503 [X86][SSE] Add scheduling latency/throughput tests for (most) SSE2 instructions
llvm-svn: 300671
2017-04-19 10:52:09 +00:00
Renato Golin 742aed8683 Revert "ARMFrameLowering: Reserve emergency spill slot for large arguments"
This reverts commit r300639, as it broke self-hosting on ARM. PR32709.

llvm-svn: 300668
2017-04-19 09:02:52 +00:00
Igor Breger 746b3c336f [GlobalISel][X86] Split select tests. NFC.
llvm-svn: 300666
2017-04-19 08:40:44 +00:00
Diana Picus 49472ff1cf [ARM] GlobalISel: Add support for G_MUL
Support G_MUL, very similar to G_ADD and G_SUB. The only difference is
in the instruction selector, where we have to select either MUL or MULv5
depending on the target.

llvm-svn: 300665
2017-04-19 07:29:46 +00:00
Kristof Beyls 0f36e68f62 [GlobalISel] Support vector-of-pointers in LLT
This fixes PR32471.

As comment 10 on that bug report highlights
(https://bugs.llvm.org//show_bug.cgi?id=32471#c10), there are quite a
few different defendable design tradeoffs that could be made, including
not representing pointers at all in LLT.

I decided to go for representing vector-of-pointer as a concept in LLT,
while keeping the size of the LLT type 64 bits (this is an increase from
48 bits before). My rationale for keeping pointers explicit is that on
some targets probably it's very handy to have the distinction between
pointer and non-pointer (e.g. 68K has a different register bank for
pointers IIRC). If we keep a scalar pointer, it probably is easiest to
also have a vector-of-pointers to keep LLT relatively conceptually clean
and orthogonal, while we don't have a very strong reason to break that
orthogonality.  Once we gain more experience on the use of LLT, we can
of course reconsider this direction.

Rejecting vector-of-pointer types in the IRTranslator is also an option
to avoid the crash reported in PR32471, but that is only a very
short-term solution; also needs quite a bit of code tweaks in places,
and is probably fragile. Therefore I didn't consider this the best
option.

llvm-svn: 300664
2017-04-19 07:23:57 +00:00
Kristof Beyls 7a71350363 [GlobalISel] Remove non-determinism from IRTranslator.
This showed up in r300535/r300537, which were reverted in r300538 due to
some of the introduced tests in there failing on some bots, due to the
non-determinism fixed in this commit.

Re-committing r300535/r300537 will add 2 tests for the change in this
commit.

llvm-svn: 300663
2017-04-19 06:38:37 +00:00
Chandler Carruth ae3386aa74 Revert r300657 due to crashes in stage2 of bootstraps:
http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/2476/steps/build-stage2-LLVMgold.so/logs/stdio
http://bb.pgr.jp/builders/clang-3stage-x86_64-linux/builds/15036/steps/build_llvmclang/logs/stdio

I've updated the commit thread, reverting to get the bots back to green.

Original commit summary:
[JumpThread] We want to fold (not thread) when all predecessor go to single BB's successor.

llvm-svn: 300662
2017-04-19 06:23:20 +00:00
Xin Tong 636a332906 [JumpThread] We want to fold (not thread) when all predecessor go to single BB's successor. .
Summary: In case all predecessor go to a single successor of current BB. We want to fold (not thread).

Reviewers: efriedma, sanjoy

Reviewed By: sanjoy

Subscribers: dberlin, majnemer, llvm-commits

Differential Revision: https://reviews.llvm.org/D30869

llvm-svn: 300657
2017-04-19 05:15:57 +00:00
Tim Shen 86652f262a Cleanup some GraphTraits iteration code
Use children<> and nodes<> in appropriate places to cleanup the code.

Also, as part of the cleanup,
change the signature of DominatorTreeBase's Split.
It is a protected non-virtual member function called only twice,
both from within the class,
and the removed passed argument in both cases is '*this'.
The reason for the existence of that argument seems to be that
back before r43115 Split was a free function,
so an argument to get '*this' was needed - but now that is no longer the
case.

Patch by Yoav Ben-Shalom!

Differential Revision: https://reviews.llvm.org/D32118

llvm-svn: 300656
2017-04-19 03:22:50 +00:00
Serge Pavlov 5943a96d81 ARM: Use methods to access data stored with frame instructions
In r300196 several methods were added to TarfetInstrInfo to access
data stored with call frame setup/destroy instructions. This change
replaces calls to getOperand with calls to such special methods in
ARM target.

Differential Revision: https://reviews.llvm.org/D32127

llvm-svn: 300655
2017-04-19 03:12:05 +00:00
Reid Kleckner 6190625381 Remove buggy 'addAttributes(unsigned, AttrBuilder)' overload
The 'addAttributes(unsigned, AttrBuilder)' overload delegated to 'get'
instead of 'addAttributes'.

Since we can implicitly construct an AttrBuilder from an AttributeSet,
just standardize on AttrBuilder.

llvm-svn: 300651
2017-04-19 01:51:13 +00:00
Kostya Serebryany 1f231e7cc7 [libFuzzer] update -help: mention -exact_artifact_path in help for -minimize_crash and -cleanse_crash
llvm-svn: 300642
2017-04-19 01:22:04 +00:00
Leslie Zhai b86e9a1c14 [AVR] Migrate to new MCAsmInfo CodePointerSize
Reviewers: dylanmckay, rengolin, kzhuravl, jroelofs

Reviewed By: kzhuravl, jroelofs

Subscribers: kzhuravl, llvm-commits

Differential Revision: https://reviews.llvm.org/D32154

llvm-svn: 300641
2017-04-19 01:20:43 +00:00
Matthias Braun 661d3d4b00 ARMFrameLowering: Reserve emergency spill slot for large arguments
We need to reserve an emergency spill slot in cases with large argument
types that could overflow immediate offsets for FP relative address
calculations.

rdar://31317893

Differential Revision: https://reviews.llvm.org/D31643

llvm-svn: 300639
2017-04-19 01:16:07 +00:00
Craig Topper ff6922ad23 [DataLayout] Removed default value from a variable that isn't used without being overwritten. Make variable an enum instead of an int to avoid a cast later. NFC
llvm-svn: 300634
2017-04-19 00:31:38 +00:00
Dean Michael Berris 97923db891 [XRay][tools] Fix yaml matching to be more permissive
Account for a potentially empty function name.

Follow-up to D32153.

llvm-svn: 300631
2017-04-19 00:10:09 +00:00
Xin Tong 59cb7782cb Allow suppressing host and target info in VersionPrinter
Summary:
VersionPrinter by default outputs information about the Host CPU
and Default target. Printing this information requires linking in
a large amount of data, such as supported target triples as C
strings, which in turn bloats the binary size.

Enable a new CMake option LLVM_VERSION_PRINTER_SHOW_HOST_TARGET_INFO
which controls printing of the host and target info. This allows
the target triple names to be dead-code stripped. This is a nice
win for LLVM clients that wish to minimize their binary size, such
as graphics drivers.

By default this is ON, so there is no change in the default behavior.
Clients who wish to suppress this printing can do so by setting this
option to off via CMake.

A test app on Linux that uses ParseCommandLineOptions() shows a binary
size reduction of 23KB (from 149K to 126K) for a Release build, and 24KB
(from 135K to 111K) in a MinSizeRel build.

Reviewers: klimek, beanz, bogner, chandlerc, compnerd

Reviewed By: compnerd

Patch by pammon (Peter Ammon) !

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30904

llvm-svn: 300630
2017-04-19 00:03:36 +00:00
Dylan McKay eb24b850c5 [AVR] Fix the build
'PointerSize' was renamed to 'CodePointerSize'.

llvm-svn: 300629
2017-04-18 23:53:10 +00:00
Dean Michael Berris 918802bed4 [XRay][tools] Add option to llvm-xray extract to symbolize functions
Summary:
This allows us to, if the symbol names are available in the binary, be
able to provide the function name in the YAML output.

Reviewers: dblaikie, pelikan

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32153

llvm-svn: 300624
2017-04-18 23:23:54 +00:00
Craig Topper 88c64f324f [ConstantRange] Optimize APInt creation in getSignedMax/getSignedMin.
We were creating an APInt at the top of these methods that isn't always returned. For ranges wider than 64-bits this results in an allocation and deallocation when its not used.

In getSignedMax we were creating Upper-1 to use in a compare and then creating it again for a return value. The compiler is unable to determine that these can be shared. So help it out and create the Upper-1 in a temporary that can be reused.

This provides a little compile time improvement.

llvm-svn: 300621
2017-04-18 23:02:39 +00:00
Sanjay Patel ff981f9256 [x86] add tests for potential andn optimization; NFC
llvm-svn: 300617
2017-04-18 22:36:59 +00:00
Reid Kleckner fe64c0137e Fix crash in AttributeList::addAttributes, add test
llvm-svn: 300614
2017-04-18 22:10:18 +00:00
Sanjoy Das f09c1e346e Add a getPointerOperandType() helper to LoadInst and StoreInst; NFC
I will use this in a later change.

llvm-svn: 300613
2017-04-18 22:00:54 +00:00
Craig Topper 09bb760baa [MemoryBuiltins] Add isMallocOrCallocLikeFn so BasicAA can check for both at the same time
BasicAA wants to know if a function is either a malloc or calloc like function. Currently we have to check both separately. This means both calls check if its an intrinsic, query TLI, check the nobuiltin attribute, scan the AllocationFnData, etc.

This patch adds a isMallocOrCallocLikeFn so we can go through all of the checks once per call.

This also changes the one other location I saw that called both together.

Differential Revision: https://reviews.llvm.org/D32188

llvm-svn: 300608
2017-04-18 21:43:46 +00:00
Davide Italiano 80fe987b42 [LoopReroll] Prefer hasNUses/hasNUses or more as they're cheaper. NFCI.
llvm-svn: 300607
2017-04-18 21:42:21 +00:00
Matt Arsenault 3138075dd4 DAG: Make mayBeEmittedAsTailCall parameter const
llvm-svn: 300603
2017-04-18 21:16:46 +00:00
Matt Arsenault aa31dce3c5 Fix typo
llvm-svn: 300597
2017-04-18 20:59:46 +00:00
Matt Arsenault 161e2b4223 AMDGPU: Make MFI fields private
llvm-svn: 300596
2017-04-18 20:59:40 +00:00
Craig Topper eae6db0e5c [MemoryBuiltins] Use ImmutableCallSite instead of CallSite to remove a const_cast and const correct. NFCI
llvm-svn: 300585
2017-04-18 20:17:23 +00:00
Daniel Berlin 9d0042b47c NewGVN: Fix memory congruence verification. The return true should be a return false. Merge the appropriate if statements so it doesn't happen again.
llvm-svn: 300584
2017-04-18 20:15:47 +00:00
Chih-Hung Hsieh 877923a87f [X86] Keep EXTRACT_VECTOR_ELT result type as f128 for Android x86_64.
Android x86_64 target uses f128 type and stores f128 values in %xmm* registers.
SoftenFloatRes_EXTRACT_VECTOR_ELT should not convert result value
from f128 to i128.

Differential Revision: http://reviews.llvm.org/D32102

llvm-svn: 300583
2017-04-18 20:15:18 +00:00
Craig Topper ae8bd67d96 [APInt] Inline the single word case of lshrInPlace similar to what we do for <<=.
llvm-svn: 300577
2017-04-18 19:13:27 +00:00
Simon Pilgrim 9398649fea [X86][SSE] Add scheduling latency/throughput tests for (most) SSE1 instructions
llvm-svn: 300576
2017-04-18 19:04:40 +00:00
Easwaran Raman 76aba5f6d7 [SLP vectorizer] Allow phi node reordering in tryToVectorizeList.
In tryToVectorizeList, under a very limited circumstance (when entered
from tryToVectorizePair), the values may be reordered (swapped) and the
SLP tree is built with the new order. This extends that to the case when
starting from phis in vectorizeChainsInBlock when there are exactly two
phis. The textual order of phi nodes shouldn't really matter. Without
this change, the loop body in the accompnaying test case is fully vectorized
when we swap the orde of the phis but not with this order. While this
doesn't solve the phi-ordering problem in a general way (for more than 2
phis), this is simple fix that piggybacks on an existing mechanism and
is useful in cases like multiplying two complex numbers.

Differential revision: https://reviews.llvm.org/D32065

llvm-svn: 300574
2017-04-18 18:16:57 +00:00
Simon Pilgrim e8ad1da4e2 [X86] Use for-range loop. NFCI.
llvm-svn: 300567
2017-04-18 17:18:54 +00:00
Craig Topper fc947bcfba [APInt] Use lshrInPlace to replace lshr where possible
This patch uses lshrInPlace to replace code where the object that lshr is called on is being overwritten with the result.

This adds an lshrInPlace(const APInt &) version as well.

Differential Revision: https://reviews.llvm.org/D32155

llvm-svn: 300566
2017-04-18 17:14:21 +00:00
Daniel Berlin ec9deb7f54 NewGVN: Don't waste time value numbering unreachable blocks
llvm-svn: 300565
2017-04-18 17:06:11 +00:00
Nirav Dave 855ef45602 [DAG] Improve store merge candidate pruning.
Remove non-consecutive stores from store merge candidate search as
they cannot be merged and will prevent us from finding subsequent
mergeable store cases.

Reviewers: jyknight, bogner, javed.absar, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32086

llvm-svn: 300561
2017-04-18 15:36:34 +00:00
Nirav Dave e50544cdf2 Add base-index-based store merge test
llvm-svn: 300559
2017-04-18 15:12:13 +00:00
Zvi Rackover d942397e24 LoopRerollPass: Prefer Value::hasOneUse() over Value::getNumUses(). NFC.
getNumUses() can be more expensive as it iterates over all list's elements.

llvm-svn: 300558
2017-04-18 14:55:43 +00:00
Gil Rapaport fb1d915ab2 [LV] Cache block mask values
This patch is part of D28975's breakdown.

Add caching for block masks similar to the cache already used for edge masks,
replacing generation per user with reusing the first generated value which
dominates all uses.

Differential Revision: https://reviews.llvm.org/D32054

llvm-svn: 300557
2017-04-18 14:43:43 +00:00
Sanjay Patel 78d163c79e [ConstantRange] fix doxygen comment formatting; NFC
llvm-svn: 300554
2017-04-18 14:27:24 +00:00
Nikolai Bozhenov 95fc644148 Make globalaa-retained.ll test catching more cases.
Summary:
* Add checks for store. That is needed because GlobalsAA is called
  twice in the current pipeline with different sets of Function passes
  following it. However, the loads are eliminated using instcombine
  which happens everywhere. On the other hand, DeadStoreElimination is
  performed only once so by checking for store we'll be able to catch
  more cases when GlobalsAA is invalidated unintentionally.
* Add empty function above/below the test so that we don't depend on
  the relative order of instcombine/dead-store-elimination and the
  pass that invalidates the analysis (inside the same
  FunctionPassManager).

Reviewers: kristof.beyls

Reviewed By: kristof.beyls

Subscribers: llvm-commits, n.bozhenov

Differential Revision: https://reviews.llvm.org/D32015
Patch by Andrei Elovikov <andrei.elovikov@intel.com>

llvm-svn: 300553
2017-04-18 13:29:26 +00:00
Nikolai Bozhenov 9e4a1c39db [GVNHoist] Mark GlobalsAA as preserved by GVNHoist.
Reviewers: sebpop, hiraditya

Reviewed By: sebpop

Subscribers: n.bozhenov, llvm-commits

Differential Revision: https://reviews.llvm.org/D32158
Patch by Andrei Elovikov <andrei.elovikov@intel.com>

llvm-svn: 300552
2017-04-18 13:25:49 +00:00
Nirav Dave b97768499f Add store Merge test.
llvm-svn: 300551
2017-04-18 13:25:19 +00:00
Oliver Stannard 7ad2e8aae1 [ARM] Add hardware build attributes in assembler
In the assembler, we should emit build attributes based on the target
selected with command-line options. This matches the GNU assembler's
behaviour. We only do this for build attributes which describe the
hardware that is expected to be available, not the ones that describe
ABI compatibility.

This is done by moving some of the attribute emission code to
ARMTargetStreamer, so that it can be shared between the assembly and
code-generation code paths. Since the assembler only creates a
MCSubtargetInfo, not an ARMSubtarget, the code had to be changed to
check raw features, and not use the convenience functions in
ARMSubtarget.

If different attributes are later specified using the .eabi_attribute
directive, then they will take precedence, as happens when the same
.eabi_attribute is specified twice.

This must be enabled by an option, because we don't want to do this when
parsing inline assembly. The attributes would match the ones emitted at
the start of the file, so wouldn't actually change the emitted object
file, but the extra directives would be added to every inline assembly
block when emitting assembly, which we'd like to avoid.

The majority of the changes in the build-attributes.ll test are just
re-ordering the directives, because the hardware attributes are now
emitted before the ABI ones. However, I did fix one bug which I spotted:
Tag_CPU_arch_profile was not being emitted for v6M.

Differential revision: https://reviews.llvm.org/D31812

llvm-svn: 300547
2017-04-18 12:52:35 +00:00
Diana Picus a3a0cccb2c [ARM] GlobalISel: Add support for G_SUB
Support G_SUB throughout the GlobalISel pipeline. It is exactly the same
as G_ADD, nothing fancy.

llvm-svn: 300546
2017-04-18 12:35:28 +00:00
Andrea Di Biagio 517e3fc34c [SampleProfile] Don't assert when printing the DebugLoc of a branch. NFC.
llvm-svn: 300544
2017-04-18 11:27:58 +00:00
Andrea Di Biagio e3edef0977 [SampleProfile] Skip intrinsic calls when visiting callsites in InlineHotFunctions.
Before this patch, we always called method 'findCalleeFunctionSamples()' on
intrinsic calls. However, intrinsic calls like llvm.dbg.value() are not viable
candidates for obvious reasons.

No functional change intended.

Differential Revision: https://reviews.llvm.org/D32008

llvm-svn: 300541
2017-04-18 10:08:53 +00:00
Kristof Beyls a4e79cca77 Revert "[GlobalISel] Support vector-of-pointers in LLT"
This reverts r300535 and r300537.
The newly added tests in test/CodeGen/AArch64/GlobalISel/arm64-fallback.ll
produces slightly different code between LLVM versions being built with different compilers.
E.g., dependent on the compiler LLVM is built with, either one of the following
can be produced:

remark: <unknown>:0:0: unable to legalize instruction: %vreg0<def>(p0) = G_EXTRACT_VECTOR_ELT %vreg1, %vreg2; (in function: vector_of_pointers_extractelement)
remark: <unknown>:0:0: unable to legalize instruction: %vreg2<def>(p0) = G_EXTRACT_VECTOR_ELT %vreg1, %vreg0; (in function: vector_of_pointers_extractelement)

Non-determinism like this is clearly a bad thing, so reverting this until
I can find and fix the root cause of the non-determinism.

llvm-svn: 300538
2017-04-18 09:26:36 +00:00
Kristof Beyls c10e625076 Fix gcc build after r300535.
llvm-svn: 300537
2017-04-18 08:47:55 +00:00
Diana Picus e2626bb7c2 [ARM] Check for correct HW div when lowering divmod
For subtargets that use the custom lowering for divmod, e.g. gnueabi,
we used to check if the subtarget has hardware divide and then lower to
a div-mul-sub sequence if true, or to a libcall if false.

However, judging by the usage of hasDivide vs hasDivideInARMMode, it
seems that hasDivide only refers to Thumb. For instance, in the
ARMTargetLowering constructor, the code that specifies whether to use
libcalls for (S|U)DIV looks like this:

bool hasDivide = Subtarget->isThumb() ? Subtarget->hasDivide()
                                      : Subtarget->hasDivideInARMMode();

In the case of divmod for arm-gnueabi, using only hasDivide() to
determine what to do means that instead of lowering to __aeabi_idivmod
to get the remainder, we lower to div-mul-sub and then further lower the
div to __aeabi_idiv. Even worse, if we have hardware divide in ARM but
not in Thumb, we generate a libcall instead of using it (this is not an
issue in practice since AFAICT none of the cores that we support have
hardware divide in ARM but not Thumb).

This patch fixes the code dealing with custom lowering to take into
account the mode (Thumb or ARM) when deciding whether or not hardware
division is available.

Differential Revision: https://reviews.llvm.org/D32005

llvm-svn: 300536
2017-04-18 08:32:27 +00:00
Kristof Beyls fb73eb0324 [GlobalISel] Support vector-of-pointers in LLT
This fixes PR32471.

As comment 10 on that bug report highlights
(https://bugs.llvm.org//show_bug.cgi?id=32471#c10), there are quite a
few different defendable design tradeoffs that could be made, including
not representing pointers at all in LLT.

I decided to go for representing vector-of-pointer as a concept in LLT,
while keeping the size of the LLT type 64 bits (this is an increase from
48 bits before). My rationale for keeping pointers explicit is that on
some targets probably it's very handy to have the distinction between
pointer and non-pointer (e.g. 68K has a different register bank for
pointers IIRC). If we keep a scalar pointer, it probably is easiest to
also have a vector-of-pointers to keep LLT relatively conceptually clean
and orthogonal, while we don't have a very strong reason to break that
orthogonality. Once we gain more experience on the use of LLT, we can
of course reconsider this direction.

Rejecting vector-of-pointer types in the IRTranslator is also an option
to avoid the crash reported in PR32471, but that is only a very
short-term solution; also needs quite a bit of code tweaks in places,
and is probably fragile. Therefore I didn't consider this the best
option.

llvm-svn: 300535
2017-04-18 08:12:45 +00:00
Leslie Zhai d6fe0db8eb test commit
llvm-svn: 300532
2017-04-18 07:28:54 +00:00
Craig Topper 9eaef07519 [APInt] Cleanup the reverseBits slow case a little.
Use lshrInPlace. Use single bit extract and operator|=(uint64_t) to avoid a few temporary APInts.

llvm-svn: 300527
2017-04-18 05:02:21 +00:00
Craig Topper a8a4f0db79 [APInt] Make operator<<= shift in place. Improve the implementation of tcShiftLeft and use it to implement operator<<=.
llvm-svn: 300526
2017-04-18 04:39:48 +00:00
Adrian Prantl 6825fb64e9 PR32382: Fix emitting complex DWARF expressions.
The DWARF specification knows 3 kinds of non-empty simple location
descriptions:
1. Register location descriptions
  - describe a variable in a register
  - consist of only a DW_OP_reg
2. Memory location descriptions
  - describe the address of a variable
3. Implicit location descriptions
  - describe the value of a variable
  - end with DW_OP_stack_value & friends

The existing DwarfExpression code is pretty much ignorant of these
restrictions. This used to not matter because we only emitted very
short expressions that we happened to get right by accident.  This
patch makes DwarfExpression aware of the rules defined by the DWARF
standard and now chooses the right kind of location description for
each expression being emitted.

This would have been an NFC commit (for the existing testsuite) if not
for the way that clang describes captured block variables. Based on
how the previous code in LLVM emitted locations, DW_OP_deref
operations that should have come at the end of the expression are put
at its beginning. Fixing this means changing the semantics of
DIExpression, so this patch bumps the version number of DIExpression
and implements a bitcode upgrade.

There are two major changes in this patch:

I had to fix the semantics of dbg.declare for describing function
arguments. After this patch a dbg.declare always takes the *address*
of a variable as the first argument, even if the argument is not an
alloca.

When lowering a DBG_VALUE, the decision of whether to emit a register
location description or a memory location description depends on the
MachineLocation — register machine locations may get promoted to
memory locations based on their DIExpression. (Future) optimization
passes that want to salvage implicit debug location for variables may
do so by appending a DW_OP_stack_value. For example:
  DBG_VALUE, [RBP-8]                        --> DW_OP_fbreg -8
  DBG_VALUE, RAX                            --> DW_OP_reg0 +0
  DBG_VALUE, RAX, DIExpression(DW_OP_deref) --> DW_OP_reg0 +0

All testcases that were modified were regenerated from clang. I also
added source-based testcases for each of these to the debuginfo-tests
repository over the last week to make sure that no synchronized bugs
slip in. The debuginfo-tests compile from source and run the debugger.

https://bugs.llvm.org/show_bug.cgi?id=32382
<rdar://problem/31205000>

Differential Revision: https://reviews.llvm.org/D31439

llvm-svn: 300522
2017-04-18 01:21:53 +00:00
George Burgess IV b71bc44bf4 Add const to a const method. NFC
llvm-svn: 300520
2017-04-18 01:04:05 +00:00
Davide Italiano 3e9986f368 [Target] Use hasOneUse() instead of getNumUses().
The latter does a liner scan over a linked list, therefore is
much more expensive.

llvm-svn: 300518
2017-04-18 00:29:54 +00:00
Peter Collingbourne 76423dce15 Object: Shrink the size of irsymtab::Symbol by a word. NFCI.
Instead of storing an UncommonIndex on the Symbol, use a flag bit to store
whether the Symbol has an Uncommon. This shrinks Chromium's .bc files (after
D32061) by about 1%.

Differential Revision: https://reviews.llvm.org/D32070

llvm-svn: 300514
2017-04-17 23:43:49 +00:00
Dehao Chen 1ea8bd8109 Build SymbolMap in SampleProfileLoader to help matchin function names with suffix.
Summary: If there is suffix added in the function name (e.g. module hash added by thinLTO), we will not be able to find a match in profile as the suffix does not exist in profile. This patch build a map from function name to Function *. The map includes the entry for the stripped function name so that inlineHotFunctions can find the corresponding function to promote/inline.

Reviewers: davidxl, dnovillo, tejohnson

Reviewed By: davidxl

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D31952

llvm-svn: 300507
2017-04-17 22:23:05 +00:00
Haicheng Wu 68f82a31d3 Change the testcase tail-merge-after-mbp.ll to tail-merge-after-mbp.mir
Differential Revision: https://reviews.llvm.org/D32037

llvm-svn: 300506
2017-04-17 22:22:38 +00:00
Craig Topper c228068d90 [SimplifyCFG] Use hasNUses instead of comparing getNumUses to a constant."
The use list is a linked list so getNumUses requires a linear scan through the whole list. hasNUses will stop scanning at N and see if that is the end.

llvm-svn: 300505
2017-04-17 22:13:00 +00:00
Craig Topper 9575d8ff36 [APInt] Merge the multiword code from lshrInPlace and tcShiftRight into a single implementation
This merges the two different multiword shift right implementations into a single version located in tcShiftRight. lshrInPlace now calls tcShiftRight for the multiword case.

I retained the memmove fast path from lshrInPlace and used a memset for the zeroing. The for loop is basically tcShiftRight's implementation with the zeroing and the intra-shift of 0 removed.

Differential Revision: https://reviews.llvm.org/D32114

llvm-svn: 300503
2017-04-17 21:43:43 +00:00
Jacob Gravelle 0bb7541233 [WebAssembly] Fix WebAssemblyOptimizeReturned after r300367
Summary:
Refactoring changed paramHasAttr(1 + i) to paramHasAttr(0), fix that to
paramHasAttr(i).
Add more tests to WebAssemblyOptimizeReturned that catch that
regression.

Reviewers: dschuff

Subscribers: jfb, sbc100, llvm-commits

Differential Revision: https://reviews.llvm.org/D32136

llvm-svn: 300502
2017-04-17 21:40:28 +00:00
Benjamin Kramer 61d85bc9ae [SCEV] Fix another unused variable warning in release builds.
llvm-svn: 300500
2017-04-17 21:07:26 +00:00
Wei Mi 66c4dd2e29 Fix an unused variable error in rL300494.
llvm-svn: 300499
2017-04-17 21:00:45 +00:00
Kostya Serebryany ac7a9eae0b [libFuzzer] experimental option -cleanse_crash: tries to replace all bytes in a crash reproducer with garbage, while still preserving the crash
llvm-svn: 300498
2017-04-17 20:58:21 +00:00
Sylvestre Ledru b0cec31db2 Add a linker script to version LLVM symbols
Summary:
This patch adds a very simple linker script to version the lib's symbols
and thus trying to avoid crashes if an application loads two different
LLVM versions (as long as they do not share data between them).

Note that we deliberately *don't* make LLVM_5.0 depend on LLVM_4.0:
they're incompatible and the whole point of this patch is
to tell the linker that.


Avoid unexpected crashes when two LLVM versions are used in the same process.

Author: Rebecca N. Palmer <rebecca_palmer@zoho.com>
Author: Lisandro Damían Nicanor Pérez Meyer <lisandro@debian.org>
Author: Sylvestre Ledru <sylvestre@debian.org>
Bug-Debian:  https://bugs.debian.org/848368


Reviewers: beanz, rnk

Reviewed By: rnk

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D31524

llvm-svn: 300496
2017-04-17 20:51:50 +00:00
Davide Italiano cdc937d0fc [InstCombine] Matchers work with both ConstExpr and Instructions.
So, `cast<Instruction>` is not guaranteed to succeed. Change the
code so that we create a new constant and use it in the newly
created instruction, as it's done in other places in InstCombine.

OK'ed by Sanjay/Craig. Fixes PR32686.

llvm-svn: 300495
2017-04-17 20:49:50 +00:00
Wei Mi 8c4053372e [SCEV] Add a local cache for getZeroExtendExpr and getSignExtendExpr to prevent
the exponential behavior.

The patch is to fix PR32043. Functions getZeroExtendExpr and getSignExtendExpr
may call themselves recursively more than once. This is potentially a 2^N
complexity behavior. The exponential behavior was not commonly exposed before
because of existing global cache mechnism like UniqueSCEVs or some early return
mechanism when flags FlagNSW or FlagNUW are seen. However, we still have case
which can expose the exponential behavior, like the case in PR32043, so we add
a local cache in getZeroExtendExpr and getSignExtendExpr. If the input of the
functions -- SCEV and type pair have been seen before, we can find the extended
expression directly in the local cache.

Differential Revision: https://reviews.llvm.org/D30350

llvm-svn: 300494
2017-04-17 20:40:05 +00:00
Sanjay Patel 9083dc9680 [InstSimplify] add/move tests for (icmp X, C1 & icmp X, C2); NFC
We simplify based on range intersection, but we're missing folds.

llvm-svn: 300493
2017-04-17 20:38:33 +00:00
Dehao Chen 6afb3167e9 Update the test to fix the buildbot failure introduced by r300486 (NFC)
llvm-svn: 300492
2017-04-17 20:35:32 +00:00
Derek Schuff f7a4f3dd95 [WebAssembly] Encode block signatures as SLEB instead of ULEB
Use SLEB (varint) for block_type immediates in accordance with the spec.

Patch by Yury Delendik

llvm-svn: 300490
2017-04-17 20:28:28 +00:00
Dehao Chen ef700d550e Add GNU_discriminator support for inline callsites in llvm-symbolizer.
Summary: LLVM symbolize cannot recognize GNU_discriminator for inline callsites. This patch adds support for it.

Reviewers: dblaikie

Reviewed By: dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32134

llvm-svn: 300486
2017-04-17 20:10:39 +00:00
Matt Arsenault a3566f2149 AMDGPU: Use MachineRegisterInfo to find max used register
Avoid looping through program to determine register counts.
This avoids needing to look at regmask operands.

Also fixes some counting errors with flat_scr when there
are no stack objects.

llvm-svn: 300482
2017-04-17 19:48:30 +00:00
Matt Arsenault 869fec278c AMDGPU: Change stack alignment
While the incoming stack for a kernel is 256-byte aligned,
this refers to the base address of the entire wave. This isn't
useful information for most of codegen. Fixes unnecessarily
aligning stack objects in callees.

llvm-svn: 300481
2017-04-17 19:48:24 +00:00
Brendon Cahoon 7769a0854e [CodeGenPrepare] Fix crash due to an invalid CFG
The splitIndirectCriticalEdges function generates and invalid CFG when the
'Target' basic block is a loop to itself. When this occurs, the code that
updates the predecessor terminator needs to update the terminator in the split
basic block.

This occurs when there is an edge from block D back to D. Since D is split in
to D0 and D1, the code needs to update the terminator in D1. But D1 is not in
the OtherPreds vector, so it was not getting updated.

Differential Revision: https://reviews.llvm.org/D32126

llvm-svn: 300480
2017-04-17 19:11:04 +00:00
Benjamin Kramer 54c781a0b5 Unbreak build of the wasm backend after r300463.
llvm-svn: 300479
2017-04-17 19:08:41 +00:00
Peter Collingbourne 4cba6ecb06 Bitcode: Add missing build dep to fix shlib build.
llvm-svn: 300478
2017-04-17 18:53:27 +00:00
Craig Topper 7abfbdf8a4 [APInt] Remove self move check from move assignment operator
This was added to work around a bug in MSVC 2013's implementation of stable_sort. That bug has been fixed as of MSVC 2015 so we shouldn't need this anymore.

Technically the current implementation has undefined behavior because we only protect the deleting of the pVal array with the self move check. There is still a memcpy of that.VAL to VAL that isn't protected. In the case of self move those are the same local and memcpy is undefined for src and dst overlapping.

This reduces the size of the opt binary on my local x86-64 build by about 4k.

Differential Revision: https://reviews.llvm.org/D32116

llvm-svn: 300477
2017-04-17 18:44:27 +00:00
Craig Topper 5b4f5b0887 [IR] Implement DataLayout::getPointerTypeSizeInBits using getPointerSizeInBits directly
Currently we use getTypeSizeInBits which contains a switch statement to dispatch based on what the Type is. We know we always have a pointer type here, but the compiler isn't able to figure out that out to remove the switch.

This patch changes it to just call handle the pointer type directly by calling getPointerSizeInBits without going through a switch.

getPointerTypeSizeInBits is called pretty often, particularly by getOrEnforceKnownAlignment which is used by InstCombine. This should speed that up a little bit.

Differential Revision: https://reviews.llvm.org/D31841

llvm-svn: 300475
2017-04-17 18:22:36 +00:00
Tim Northover 46e36f0953 AArch64: put nonlazybind special handling behind a flag for now.
It's basically a terrible idea anyway but objc_msgSend gets emitted like that.
We can decide on a better way to deal with it in the unlikely event that anyone
actually uses it.

llvm-svn: 300474
2017-04-17 18:18:47 +00:00
Konstantin Zhuravlyov dfe2ffe8c5 AMDGPU: Test handling of R_AMDGPU_ABS64 in RelocVisitor
llvm-svn: 300472
2017-04-17 18:12:45 +00:00
Craig Topper bf1ded2169 [IR] Put the Use list waymarking bits in the bit positions documentation says they are using
The documentation for the waymarking algorithm says that we use the lower 2 bits of Use::Prev to store the way marking bits. But because we use a PointerIntPair with the default PointerLikeTypeTraits, we're using bits 2:1 on 64-bit targets.

There's also a trick employed for distinguishing Users that have Uses stored with them and Users that have Uses stored in a separate array. The documentation says we use the LSB of the first byte of the real User object or the User* that occurs at the end of the Use array. But again due to the PointerLikeTypeTraits we're really using bit 2(64-bit) or bit 1(32-bit) and not the LSB. This is a little worrying because the first byte of the User object is the vtable ptr so we're assuming the vtable has 8 byte or 4 byte alignment where what is documented would only require 2 byte alignment.

This patch provides a custom traits override for these two cases to put the bits where the documentation says they are. It also has the side effect of removing some shifts from the waymarking traversal implementation.

Differential Revision: https://reviews.llvm.org/D31733

llvm-svn: 300471
2017-04-17 18:12:30 +00:00
Konstantin Zhuravlyov 12096848fd AMDGPU: Set CodePointerSize to 8 for amdgcn
llvm-svn: 300470
2017-04-17 18:02:09 +00:00
Peter Collingbourne c74cf06ee4 Object: Use offset+size as the irsymtab string representation.
This is consistent with the bitcode string table.

Differential Revision: https://reviews.llvm.org/D31922

llvm-svn: 300465
2017-04-17 17:55:24 +00:00
Peter Collingbourne a0f371a106 Bitcode: Add a string table to the bitcode format.
Add a top-level STRTAB block containing a string table blob, and start storing
strings for module codes FUNCTION, GLOBALVAR, ALIAS, IFUNC and COMDAT in
the string table.

This change allows us to share names between globals and comdats as well
as between modules, and improves the efficiency of loading bitcode files by
no longer using a bit encoding for symbol names. Once we start writing the
irsymtab to the bitcode file we will also be able to share strings between
it and the module.

On my machine, link time for Chromium for Linux with ThinLTO decreases by
about 7% for no-op incremental builds or about 1% for full builds. Total
bitcode file size decreases by about 3%.

As discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2017-April/111732.html

Differential Revision: https://reviews.llvm.org/D31838

llvm-svn: 300464
2017-04-17 17:51:36 +00:00
Konstantin Zhuravlyov dc77b2e960 Distinguish between code pointer size and DataLayout::getPointerSize() in DWARF info generation
llvm-svn: 300463
2017-04-17 17:41:25 +00:00
Tim Northover 879a0b2e1b AArch64: support nonlazybind
It's almost certainly not a good idea to actually use it in most cases (there's
a pretty large code size overhead on AArch64), but we can't do those
experiments until it's supported.

llvm-svn: 300462
2017-04-17 17:27:56 +00:00
Craig Topper d23004c37b Introduce APInt::isSignBitSet/isSignBitClear. Use in place isSignBitSet in place of isNegative in known bits tracking.
This makes statements like KnownZero.isNegative() (which means the value we're tracking is positive) less confusing.

llvm-svn: 300457
2017-04-17 16:38:20 +00:00
Matt Arsenault 7205f3c2e4 AMDGPU: SimplifyDemandedElts for image intrinsics
Causes some VGPR usage improvements in shaderdb, but
introduces some SGPR spilling regressions due to random
scheduling changes later.

llvm-svn: 300453
2017-04-17 15:12:44 +00:00
Davide Italiano ce161a7812 [LCSSA] Don't insert tokens into the worklist at all.
We're gonna skip them anyway, so there's no point in inserting them
in the first place.

llvm-svn: 300452
2017-04-17 14:32:05 +00:00
Amaury Sechet f8429754d8 Introducing LLVMMetadataRef
Summary:
This seems like an uncontroversial first step toward providing access to the metadata hierarchy that now exists in LLVM. This should allow for good debug info support from C.

Future plans are to deprecate API that take mixed bags of values and metadata (mainly the LLVMMDNode family of functions) and migrate the rest toward the use of LLVMMetadataRef.

Once this is in place, mapping of DIBuilder will be able to start.

Reviewers: mehdi_amini, echristo, whitequark, jketema, Wallbraker

Reviewed By: Wallbraker

Subscribers: Eugene.Zelenko, axw, mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D19448

llvm-svn: 300447
2017-04-17 11:52:54 +00:00
Max Kazantsev 751579cac0 [LoopPeeling] Get rid of Phis that become invariant after N steps
This patch is a generalization of the improvement introduced in rL296898.
Previously, we were able to peel one iteration of a loop to get rid of a Phi that becomes
an invariant on the 2nd iteration. In more general case, if a Phi becomes invariant after
N iterations, we can peel N times and turn it into invariant.
In order to do this, we for every Phi in loop's header we define the Invariant Depth value
which is calculated as follows:

Given %x = phi <Inputs from above the loop>, ..., [%y, %back.edge].

If %y is a loop invariant, then Depth(%x) = 1.
If %y is a Phi from the loop header, Depth(%x) = Depth(%y) + 1.
Otherwise, Depth(%x) is infinite.
Notice that if we peel a loop, all Phis with Depth = 1 become invariants,
and all other Phis with finite depth decrease the depth by 1.
Thus, peeling N first iterations allows us to turn all Phis with Depth <= N
into invariants.

Reviewers: reames, apilipenko, mkuper, skatkov, anna, sanjoy

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31613

llvm-svn: 300446
2017-04-17 09:52:02 +00:00
Serguei Katkov 11d9c4f691 [BPI] NFC: reorder ifs to bail out earlier
This is non-functional change to re-order if statements to bail out earlier
from unreachable and ColdCall heuristics.

Reviewers: sanjoy, reames, junbuml, vsk, chandlerc

Reviewed By: chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31704

llvm-svn: 300442
2017-04-17 06:39:47 +00:00
Max Kazantsev 8ed6b66d85 [LoopPeeling] Fix condition for phi-eliminating peeling
When peeling loops basing on phis becoming invariants, we make a wrong loop size check.
UP.Threshold should be compared against the total numbers of instructions after the transformation,
which is equal to 2 * LoopSize in case of peeling one iteration.
We should also check that the maximum allowed number of peeled iterations is not zero.

Reviewers: sanjoy, anna, reames, mkuper

Reviewed By: mkuper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31753

llvm-svn: 300441
2017-04-17 05:38:28 +00:00
Serguei Katkov 2616bbb16d [BPI] Use metadata info before any other heuristics
Metadata potentially is more precise than any heuristics we use, so
it makes sense to use first metadata info if it is available. However it makes
sense to examine it against other strong heuristics like unreachable one.
If edge coming to unreachable block has higher probability then it is expected 
by unreachable heuristic then we use heuristic and remaining probability is
distributed among other reachable blocks equally.

An example where metadata might be more strong then unreachable heuristic is
as follows: it is possible that there are two branches and for the branch A
metadata says that its probability is (0, 2^25). For the branch B
the probability is (1, 2^25).
So the expectation is that first edge of B is hotter than first edge of A
because first edge of A did not executed at least once.
If first edge of A points to the unreachable block then using the unreachable
heuristics we'll set the probability for A to (1, 2^20) and now edge of A
becomes hotter than edge of B.
This is unexpected behavior.

This fixed the biggest part of https://bugs.llvm.org/show_bug.cgi?id=32214

Reviewers: sanjoy, junbuml, vsk, chandlerc

Reviewed By: chandlerc

Subscribers: llvm-commits, reames, davidxl

Differential Revision: https://reviews.llvm.org/D30631

llvm-svn: 300440
2017-04-17 04:33:04 +00:00
Craig Topper 218a359fbd [InstCombine] Simplify 1/X for vectors.
llvm-svn: 300439
2017-04-17 03:41:47 +00:00
Craig Topper eee53c030a [InstCombine] Add test cases for missing support for simplifying 1/X for vectors. NFC
llvm-svn: 300438
2017-04-17 03:41:44 +00:00
Craig Topper 1a18a7c51e [InstCombine] Add support for vector srem->urem.
llvm-svn: 300437
2017-04-17 01:51:24 +00:00
Craig Topper b60f300afb [InstCombine] Add missing testcases for srem->urem conversion. The vector version isn't currently supported. NFC
llvm-svn: 300436
2017-04-17 01:51:21 +00:00
Craig Topper f248468359 [InstCombine] Add support for turning vector sdiv into udiv.
llvm-svn: 300435
2017-04-17 01:51:19 +00:00
Craig Topper 43b012b1b3 [InstCombine] Add test cases for missing support for turning vector sdiv into udiv. NFC
llvm-svn: 300434
2017-04-17 01:51:16 +00:00
Davide Italiano ee654bf5f1 [LCSSA] Simplify a loop. NFCI.
llvm-svn: 300433
2017-04-17 00:02:45 +00:00
Craig Topper da886c665b [InstCombine][ValueTracking] When computing known bits for Srem make sure we don't compute known bits for the LHS twice.
If we already called computeKnownBits for the RHS being a constant power of 2, we've already computed everything we can and should just stop. I think previously we would still recurse if we had determined the result was negative or had not determined the sign bit at all.

llvm-svn: 300432
2017-04-16 21:46:12 +00:00
Davide Italiano dd37c67d81 [LCSSA] Fix non-determinism due to iterating over a SmallPtrSet.
Use a SmallSetVector instead.

llvm-svn: 300431
2017-04-16 21:07:04 +00:00
Craig Topper 0d304f01b4 [InstCombine] In SimplifyDemandedUseBits, don't bother to mask known bits of constants with DemandedMask.
Just because we didn't demand them doesn't mean they aren't known.

llvm-svn: 300430
2017-04-16 20:55:58 +00:00
Benjamin Kramer f5f593b674 [X86] Remove special handling for 16 bit for A asm constraints.
Our 16 bit support is assembler-only + the terrible hack that is
.code16gcc. Simply using 32 bit registers does the right thing for the
latter.

Fixes PR32681.

llvm-svn: 300429
2017-04-16 20:13:08 +00:00
Bryant Wong c819ba8874 MemorySSA: Stop tracking def-or-use blocks.
The tracking is unused, since MemoryPhis are not pruned as of r282419.

Differential Revision: https://reviews.llvm.org/D32121

llvm-svn: 300428
2017-04-16 19:45:51 +00:00
Sanjay Patel 35ed2413af [InstSimplify] improve getTrue/getFalse; NFCI
The ConstantInt version has the same assert, and using null/allOnes is likely less efficient.
The only advantage of these local variants (and there's probably a better way to achieve this?)
is to save typing "ConstantInt::" over and over.

llvm-svn: 300426
2017-04-16 17:43:11 +00:00
Dimitry Andric 6380f0ffce Garbage collect HAVE_EXECINFO_H from config.h.cmake after r300062. NFCI.
llvm-svn: 300425
2017-04-16 17:22:44 +00:00