Commit Graph

140188 Commits

Author SHA1 Message Date
Craig Topper bf9e5a16a4 [X86] Don't use loadv2i64 on SSE version of PMULHRSW. Use memopv2i64 instead.
This bug was introduced in r285501.

llvm-svn: 285510
2016-10-30 00:02:55 +00:00
NAKAMURA Takumi ff76cfefc0 NativeFormatting.cpp: Fix build for mingw. Where would writePadding() be?
llvm-svn: 285509
2016-10-29 23:14:18 +00:00
Teresa Johnson 38d4df714c [ThinLTO] Rename doPromoteLocalToGlobal to shouldPromoteLocalToGlobal (NFC)
Rename as suggested in code review for D26063.

llvm-svn: 285508
2016-10-29 21:52:23 +00:00
Teresa Johnson 1b9c2be8f4 [ThinLTO] Use NoPromote flag in summary during promotion
Summary:
Replace the check of whether a GV has a section with the flag check
in the summary. This is in preparation for using the NoPromote flag
to convey other situations when we can't promote (e.g. locals used in
inline asm).

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26063

llvm-svn: 285507
2016-10-29 21:31:48 +00:00
Peter Collingbourne 310474f576 IR: Remove a no longer needed assert.
This assert was checking for a miscompile in a version of GCC that
we no longer support.

llvm-svn: 285506
2016-10-29 20:57:12 +00:00
Craig Topper defe9ffbb5 [X86] Use intrinsics table for VPMULHRSW intrinsics so that the legacy intrinsics can select EVEX encoded instructions when available.
This requires a minor rename of the instructions due to the use of different tablegen classes and how the names are concatenated.

llvm-svn: 285501
2016-10-29 18:41:45 +00:00
Sanjay Patel 36eeb6d6f6 [ValueTracking] recognize more variants of smin/smax
Try harder to detect obfuscated min/max patterns: the initial pattern was added with D9352 / rL236202. 
There was a bug fix for PR27137 at rL264996, but I think we can do better by folding the corresponding
smax pattern and commuted variants.

The codegen tests demonstrate the effect of ValueTracking on the backend via SelectionDAGBuilder. We
can't expose these differences minimally in IR because we don't have smin/smax intrinsics for IR.

Differential Revision: https://reviews.llvm.org/D26091

llvm-svn: 285499
2016-10-29 16:21:19 +00:00
Sanjay Patel e9fa95e572 [x86] add tests for smin/smax matchSelPattern (D26091)
llvm-svn: 285498
2016-10-29 16:02:57 +00:00
Sanjay Patel 978f827d12 [InstCombine] re-use bitcasted compare operands in selects (PR28001)
These mixed bitcast patterns show up with SSE/AVX intrinsics because we bitcast function parameters to <2 x i64>.

The bitcasts obfuscate the expected min/max forms as shown in PR28001:
https://llvm.org/bugs/show_bug.cgi?id=28001#c6

Differential Revision: https://reviews.llvm.org/D25943

llvm-svn: 285495
2016-10-29 15:22:04 +00:00
Simon Pilgrim 75a697a17e [DAGCombiner] (REAPPLIED) Add vector demanded elements support to computeKnownBits
Currently computeKnownBits returns the common known zero/one bits for all elements of vector data, when we may only be interested in one/some of the elements.

This patch adds a DemandedElts argument that allows us to specify the elements we actually care about. The original computeKnownBits implementation calls with a DemandedElts demanding all elements to match current behaviour. Scalar types set this to 1.

The approach was found to be easier than trying to add a per-element known bits solution, for a similar usefulness given the combines where computeKnownBits is typically used.

I've only added support for a few opcodes so far (the ones that have proven straightforward to test), all others will default to demanding all elements but can be updated in due course.

DemandedElts support could similarly be added to computeKnownBitsForTargetNode in a future commit.

It looked like this had caused compile time regressions on some buildbots (and was reverted in rL285381), but it appears to have just been a harmless bystander!

Differential Revision: https://reviews.llvm.org/D25691

llvm-svn: 285494
2016-10-29 11:29:39 +00:00
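Illustrative sketch for the computeKnownBits commit above (75a697a17e): a small Python model of what a demanded-elements mask buys. It is not the SelectionDAG code; the function name `known_bits`, the constant-lane input and the mask layout are assumptions made for illustration only. The early exit mirrors the idea that once no common bits remain, the other lanes need not be inspected.

```python
def known_bits(elements, bit_width, demanded_elts):
    """Return (known_zero, known_one) masks common to the demanded lanes.

    elements: constant integer lane values (a BUILD_VECTOR-like input).
    demanded_elts: bitmask; bit i set means lane i is demanded.
    Hypothetical helper, not an LLVM API.
    """
    assert demanded_elts != 0, "at least one lane must be demanded"
    mask = (1 << bit_width) - 1
    known_zero, known_one = mask, mask
    for i, value in enumerate(elements):
        if not (demanded_elts >> i) & 1:
            continue  # lane not demanded: it cannot weaken the result
        known_one &= value & mask
        known_zero &= ~value & mask
        if known_zero == 0 and known_one == 0:
            break  # no common known bits remain, stop early
    return known_zero, known_one


lanes = [0b0001, 0b0011, 0b1000, 0b0101]
all_lanes = (1 << len(lanes)) - 1

# Demanding every lane leaves nothing in common; demanding only
# lanes 0 and 1 keeps bit 0 known-one and bits 2..3 known-zero.
everything = known_bits(lanes, 4, all_lanes)
first_two = known_bits(lanes, 4, 0b0011)
```

Restricting the query to the lanes a combine actually reads is what lets it keep facts that the all-lanes query has to discard.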
Elena Demikhovsky 519b4ccd70 Fixed FMA + FNEG combine.
Masked form of FMA should be omitted in this optimization.

Differential Revision: https://reviews.llvm.org/D25984

llvm-svn: 285492
2016-10-29 08:44:46 +00:00
Matt Arsenault c88ba36eab AMDGPU: Use 1/2pi inline imm on VI
I'm guessing at how it is supposed to be printed

llvm-svn: 285490
2016-10-29 04:05:06 +00:00
Matthias Braun 7d78614ae9 AArch64DeadRegisterDefinitionsPass: Cleanup; NFC
- Fix doxygen file comment
- reduce indentation in loop
- Factor out some common subexpressions
- Move independent helper function out of class
- Fix Changed flag (this is not strictly NFC but a bugfix, but the flag
  seems ignored anyway)

llvm-svn: 285488
2016-10-29 01:03:41 +00:00
Rui Ueyama 77be2403f6 Define calculateDbgStreamSize for consistency.
llvm-svn: 285487
2016-10-29 00:56:44 +00:00
Tim Shen 1bab9cfbe5 [APFloat] Remove the redundant function body of uninitialized ctor, which should have been done in r285468
llvm-svn: 285486
2016-10-29 00:51:41 +00:00
Zachary Turner 5b2243e884 Resubmit "Add support for advanced number formatting."
This resubmits r284436 and r284437, which were reverted in
r284462 as they were breaking the AArch64 buildbot.

The breakage on AArch64 turned out to be a miscompile which is
still not fixed, but is actively tracked at llvm.org/pr30748.

This resubmission re-writes the code in a way so as to make the
miscompile not happen.

llvm-svn: 285483
2016-10-29 00:27:22 +00:00
Rui Ueyama c95b46449a Do not print out Flags field twice.
llvm-svn: 285481
2016-10-28 23:57:37 +00:00
Davide Italiano 86168b23cf [DAGCombiner] Fix a crash visiting `AND` nodes.
Instead of asserting that the shift count is != 0 we just bail out
as it's not profitable trying to optimize a node which will be
removed anyway.

Differential Revision:  https://reviews.llvm.org/D26098

llvm-svn: 285480
2016-10-28 23:55:32 +00:00
Tom Stellard 6695ba0440 AMDGPU/SI: Don't use non-0 waitcnt values when waiting on Flat instructions
Summary:
Flat instructions can return out of order, so we always need to wait
for all the outstanding flat operations.

Reviewers: tony-tye, arsenm

Subscribers: kzhuravl, wdng, nhaehnle, llvm-commits, yaxunl

Differential Revision: https://reviews.llvm.org/D25998

llvm-svn: 285479
2016-10-28 23:53:48 +00:00
Matt Arsenault 4e9c1e3a79 AMDGPU: Fix instruction flags for s_endpgm
Set isReturn, remove hasSideEffects. Also remove
hasCtrlDep, I'm not really sure what that does.

llvm-svn: 285476
2016-10-28 23:00:38 +00:00
Adrian Prantl 3cd37d0aeb Refactor DW_LNE_* into Dwarf.def
llvm-svn: 285475
2016-10-28 22:57:02 +00:00
Adrian Prantl 79deba6446 Refactor DW_LNS_* into Dwarf.def
llvm-svn: 285474
2016-10-28 22:56:59 +00:00
Adrian Prantl 8580d3f3d3 Refactor DW_APPLE_PROPERTY_* into Dwarf.def
llvm-svn: 285473
2016-10-28 22:56:56 +00:00
Adrian Prantl 44a4461b16 Refactor DW_CFA_* into Dwarf.def
llvm-svn: 285472
2016-10-28 22:56:53 +00:00
Adrian Prantl d50e3e0593 Remove whitespace
llvm-svn: 285471
2016-10-28 22:56:50 +00:00
Adrian Prantl 23865816d5 Refactor all DW_FORM_* constants into Dwarf.def
llvm-svn: 285470
2016-10-28 22:56:45 +00:00
Tim Shen b4991548c8 [APFloat] Fix memory bugs revealed by MSan
Reviewers: eugenis, hfinkel, kbarton, iteratee, echristo

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D26102

llvm-svn: 285468
2016-10-28 22:45:33 +00:00
Justin Bogner db6b6a7f0c SDAG: Make sure we use an allocatable reg class when we create this vreg
As per the discussion on r280783, if constrainRegClass fails we need
to call getAllocatableClass like we did before that commit.

llvm-svn: 285467
2016-10-28 22:42:54 +00:00
Kostya Serebryany 8550238f4a [libFuzzer] mention one more trophy
llvm-svn: 285465
2016-10-28 22:03:54 +00:00
Justin Lebar 1535a5e9df Add missing lit.local.cfg to llvm/test/Transforms/CodeGenPrepare/NVPTX.
llvm-svn: 285464
2016-10-28 21:56:07 +00:00
Matt Arsenault 7b6475568d AMDGPU: Add definitions for scalar store instructions
Also add glc bit to the scalar loads since they exist on VI
and change the caching behavior.

This currently has an assembler bug where the glc bit is incorrectly
accepted on SI/CI which do not have it.

llvm-svn: 285463
2016-10-28 21:55:15 +00:00
Matt Arsenault 4b6a6cc8e9 AMDGPU: Rename glc operand type
While trying to add the glc bit to SMEM instructions on VI
with the new refactoring I ran into some kind of shadowing
problem for the glc operand when using the pseudoinstruction
as a multiclass parameter.

Everywhere that currently uses it defines the operand to have the same
name as its type, i.e. glc:$glc which works. For some reason now it
conflicts, and it ends up evaluating to the wrong thing. For the
real encoding classes,

let Inst{16} = !if(ps.has_glc, glc, ?); was not being evaluated
and still visible in the Inst initializer in the expanded td file.
In other cases I got a different error about an illegal operand
where this was using { 0 } initializer from the bits<1> glc initializer
instead of evaluating it as false in the if.

For consistency all of the operand types should probably
be capitalized to avoid conflicting with the variable names
unless somebody has a better idea of how to fix this.

llvm-svn: 285462
2016-10-28 21:55:08 +00:00
Justin Lebar f0a80ba385 [NVPTX] Compute 'rem' using the result of 'div', if possible.
Summary:
In isel, transform

  Num % Den

into

  Num - (Num / Den) * Den

if the result of Num / Den is already available.

Reviewers: tra

Subscribers: hfinkel, llvm-commits, jholewinski

Differential Revision: https://reviews.llvm.org/D26090

llvm-svn: 285461
2016-10-28 21:44:00 +00:00
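Illustrative sketch for the NVPTX commit above (f0a80ba385): the identity Num % Den = Num - (Num / Den) * Den, checked in Python. Python's `//` floors, so the hypothetical helper `sdiv` emulates the truncate-toward-zero division that C and LLVM's `sdiv` use; both helper names are assumptions for this example and not part of the patch.

```python
def sdiv(num, den):
    """Signed division truncating toward zero (C / LLVM sdiv semantics)."""
    quotient = abs(num) // abs(den)
    return quotient if (num < 0) == (den < 0) else -quotient


def rem_from_div(num, den, quotient):
    """Recover the remainder from an already computed quotient."""
    return num - quotient * den


# One multiply and one subtract replace a second division.
q = sdiv(-7, 2)                 # -3
r = rem_from_div(-7, 2, q)      # -7 - (-3 * 2) = -1
```

The remainder takes the sign of the dividend, which matches `srem`; the rewrite only pays off when the quotient is already needed elsewhere.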
Justin Lebar 0ede5fb1bb Don't leave unused divs/rems sitting around in BypassSlowDivision.
Summary:
This "pass" eagerly creates div and rem instructions even when only one
is needed -- it relies on a later pass (machine DCE?) to clean them up.

This is problematic not just from a cleanliness perspective (this pass
is running during CodeGenPrepare, so should leave the IR in a better
state), but it also creates a problem for instruction selection.  If we
always have a div+rem, isel will always select a divrem instruction (if
possible), even when a single div or rem would do.

Specifically, in NVPTX, we want to compute rem from the output of div,
if available.  But if a div is not available, we want to leave the rem
alone.  This transformation is overeager if div is always available.

Because this code runs as part of CodeGenPrepare, it's nontrivial to
write a test for this change.  But this will effectively be tested by
a later patch which adds the aforementioned change to NVPTX isel.

Reviewers: tra

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26088

llvm-svn: 285460
2016-10-28 21:43:54 +00:00
Justin Lebar 468bf73209 Don't claim the udiv created in BypassSlowDivision is exact.
Summary:
In BypassSlowDivision's short-dividend path, we would create e.g.

  udiv exact i32 %a, %b

"exact" here means that we are asserting that %a is a multiple of %b.
But we have no reason to believe this must be true -- this is just a
bug, as far as I can tell.

Reviewers: tra

Subscribers: jholewinski, llvm-commits

Differential Revision: https://reviews.llvm.org/D26097

llvm-svn: 285459
2016-10-28 21:43:51 +00:00
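Illustrative sketch for the BypassSlowDivision commit above (468bf73209): a toy Python model of why `exact` is a promise and not a hint. `POISON` and the `udiv` function are hypothetical stand-ins; they only mimic the LangRef rule that `udiv exact` yields a poison value when the first operand is not a multiple of the second.

```python
POISON = object()  # stand-in for an LLVM poison value


def udiv(a, b, exact=False):
    """Toy model of LLVM's unsigned division on non-negative ints."""
    if b == 0:
        raise ZeroDivisionError("division by zero is undefined behaviour")
    if exact and a % b != 0:
        return POISON  # the 'exact' promise was broken
    return a // b


plain = udiv(13, 4)                 # ordinary udiv rounds down: 3
promised = udiv(12, 4, exact=True)  # 12 is a multiple of 4: 3
broken = udiv(13, 4, exact=True)    # 13 is not: poison
```

Because the short-dividend path knows nothing about divisibility, dropping the flag is the only safe choice.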
Justin Bogner 1b05f6c66d cmake: Enable the lto cache when building with -flto=thin on darwin
llvm-svn: 285450
2016-10-28 20:48:47 +00:00
Matt Arsenault b5f2bb1a88 AMDGPU: Change check prefix in test
llvm-svn: 285449
2016-10-28 20:33:01 +00:00
Adrian Prantl 71385ed8b6 Fix a copy&paste error in the macro definition for HANDLE_DW_MACRO and
HANDLE_DW_RLE. Caught by the LLDB build bot.

llvm-svn: 285448
2016-10-28 20:32:17 +00:00
Matt Arsenault 4eae301995 AMDGPU: Diagnose using too many SGPRs
This is possible when using inline asm.

llvm-svn: 285447
2016-10-28 20:31:47 +00:00
Adrian Prantl 2cdd532532 Remove redundant prefixes from constants and unbreak the LLDB bots.
llvm-svn: 285444
2016-10-28 20:18:26 +00:00
Tim Shen 717d9a1a7a [APFloat] Use std::move() in move assignment operator
llvm-svn: 285442
2016-10-28 20:13:06 +00:00
Krzysztof Parzyszek 2717175c99 Handle non-~0 lane masks on live-in registers in LivePhysRegs
When LivePhysRegs adds live-in registers, it recognizes ~0 as a special
lane mask indicating the entire register. If the lane mask is not ~0,
it will only add the subregisters that overlap the specified lane mask.

The problem is that if a live-in register does not have subregisters,
and the lane mask is not ~0, it will not be added to the live set.
(The given lane mask may simply be the lane mask of its register class.)

If a register does not have subregisters, add it to the live set if
the lane mask is non-zero.

Differential Revision: https://reviews.llvm.org/D26094

llvm-svn: 285440
2016-10-28 20:06:37 +00:00
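Illustrative sketch for the LivePhysRegs commit above (2717175c99): a simplified Python model of the live-in rule the message describes. Register names, the `subregs` table and the function `add_live_ins` are assumptions for this example; the real code walks MC register info instead.

```python
ALL_LANES = ~0  # special mask meaning "the entire register"


def add_live_ins(live_ins, subregs):
    """Compute the live set from (register, lane_mask) live-ins.

    subregs maps a register to a list of (subregister, lane_mask) pairs;
    registers missing from the map have no subregisters.
    """
    live = set()
    for reg, mask in live_ins:
        subs = subregs.get(reg, [])
        if mask == ALL_LANES or (not subs and mask != 0):
            # Whole register, or a register without subregisters whose
            # mask is non-zero (the case the commit fixes).
            live.add(reg)
            continue
        for sub, sub_mask in subs:
            if sub_mask & mask:
                live.add(sub)  # only the overlapping subregisters
    return live


subregs = {"D0": [("S0", 0b01), ("S1", 0b10)]}
whole = add_live_ins([("D0", ALL_LANES)], subregs)
partial = add_live_ins([("D0", 0b10)], subregs)
no_subregs = add_live_ins([("R5", 0b1)], subregs)
```

Without the second condition, `R5` would be silently dropped because the loop over its (empty) subregister list adds nothing.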
Matt Arsenault ef00283425 SpeculativeExecution: Allow speculating more inst types
Partial step towards removing the whitelist and only
using TTI's cost.

llvm-svn: 285438
2016-10-28 20:00:33 +00:00
Matt Arsenault 08906a3c62 AMDGPU: Fix using incorrect private resource with no allocation
It's possible to have a use of the private resource descriptor or
scratch wave offset registers even though there are no allocated
stack objects. This would result in continuing to use the maximum
number of reserved registers. This could go over the number of SGPRs
available on VI, or violate the SGPR limit requested by
the function attributes.

llvm-svn: 285435
2016-10-28 19:43:31 +00:00
Nemanja Ivanovic e28a0fc72a Implement vector count leading/trailing bytes with zero lsb and vector parity
builtins - llvm portion

This patch corresponds to review https://reviews.llvm.org/D26003.
Committing on behalf of Zaara Syeda.

llvm-svn: 285434
2016-10-28 19:38:24 +00:00
Teresa Johnson 7c31cb1665 [ThinLTO] Use flags from summary when writing variable summary (NFC)
We already read the flags out of the summary when writing the summary
records for functions and aliases, do the same for variables.

This is an NFC change for now since the flags computed on the fly from
the GlobalValue currently will always match those in the summary
already, but once I send a follow-on patch to set the NoRename flag for
locals in the llvm.used set this becomes a necessary change.

llvm-svn: 285433
2016-10-28 19:36:00 +00:00
George Burgess IV 013fd7315f [MemorySSA] Add const to getClobberingMemoryAccess.
Thanks to bryant for the patch!

Differential Revision: https://reviews.llvm.org/D26086

llvm-svn: 285432
2016-10-28 19:22:46 +00:00
Arnold Schwaighofer 6200b2b67e Make swift calling convention test specific to armv7
llvm-svn: 285431
2016-10-28 19:18:09 +00:00
Sanjay Patel 03a585e882 [x86] add tests for missed umin/umax
This is actually a deficiency in ValueTracking's matchSelectPattern(),
but a codegen test is the simplest way to expose the bug.

llvm-svn: 285429
2016-10-28 19:08:20 +00:00
Lang Hames a9682caf96 [Error] Unify +Asserts/-Asserts behavior for checked flags in Error/Expected<T>.
(1) Switches to raw pointer and bitmasking operations for Error payload.
(2) Always includes the 'unchecked' bitfield in Expected<T>, even in -Asserts.
(3) Always propagates checked bit status in move-ops for both classes, even in
    -Asserts.

This should allow debug programs to link against release libraries without
encountering spurious 'unchecked error' terminations.

Error checks still aren't verified in release mode so this doesn't introduce
any new control flow, but it does require new bit-masking ops in release mode
to preserve the flag values during move ops. I expect the overhead to be
minimal, but if we discover any corner cases where it matters we could fix
this by making flag propagation conditional on a new build option.

llvm-svn: 285426
2016-10-28 18:24:15 +00:00
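Illustrative sketch for the Error/Expected<T> commit above (a9682caf96): a simplified Python model of keeping a flag in the low bit of an aligned pointer and carrying it through a move. The mask name, the integer "addresses" and the helper functions are assumptions for illustration; the real class layout and the exact state left in a moved-from object may differ.

```python
UNCHECKED_BIT = 0x1  # assumed: low bit set means "not yet checked"


def make_error(payload_addr):
    """Pack an aligned payload address together with the unchecked flag."""
    assert payload_addr & UNCHECKED_BIT == 0, "alignment frees the low bit"
    return payload_addr | UNCHECKED_BIT


def payload(word):
    return word & ~UNCHECKED_BIT


def is_unchecked(word):
    return bool(word & UNCHECKED_BIT)


def mark_checked(word):
    return word & ~UNCHECKED_BIT


def move(word):
    """Move: the destination keeps payload and flag, the source is emptied."""
    return word, 0


err = make_error(0x1000)
dst, src = move(err)  # flag travels with the payload
```

The extra mask operations are the release-mode cost the message refers to; they keep the flag consistent whichever build configuration produced the caller.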
Adrian Prantl 7e55f17825 Move the DWARF attribute constants into Dwarf.def and delete 300 lines of silly code.
llvm-svn: 285425
2016-10-28 18:21:39 +00:00
Matthias Braun de8c1b3433 MachineRegisterInfo: Remove unused arg from isConstantPhysReg(); NFC
llvm-svn: 285423
2016-10-28 18:05:09 +00:00
Matthias Braun 35a024fe0f TargetPassConfig: Move addPass of IPRA RegUsageInfoProp down.
TargetPassConfig::addMachinePasses() does some housekeeping first:
Handling the -print-machineinstrs flag and doing an initial printing
"After Instruction Selection". There is no reason for RegUsageInfoProp
to run before those two steps.

llvm-svn: 285422
2016-10-28 18:05:05 +00:00
Adrian Prantl c4fbbcf9ed Import/update constants from the DWARF 5 public review draft document.
https://reviews.llvm.org/D26051

llvm-svn: 285421
2016-10-28 17:59:50 +00:00
Arnold Schwaighofer 7f4b31c057 More swift calling convention tests
llvm-svn: 285417
2016-10-28 17:21:05 +00:00
Kostya Serebryany 82ff4e7e90 [libFuzzer] a bit more docs
llvm-svn: 285415
2016-10-28 16:55:29 +00:00
Sanjay Patel 19ace1d548 [InstCombine] move/add tests for smin/smax folds
llvm-svn: 285414
2016-10-28 16:54:03 +00:00
Lang Hames 1a2e656c67 [lli] Pass command line arguments in to the orc-lazy JIT.
This brings the LLI orc-lazy JIT's behavior more closely in-line with LLI's
mcjit behavior.

llvm-svn: 285413
2016-10-28 16:52:34 +00:00
Krzysztof Parzyszek 87a47be039 [Hexagon] Maintain kill flags through splitting in expand-condsets
Do not use LiveIntervals to recalculate kills, because that cannot be
done accurately without implicit uses on predicated instructions.

llvm-svn: 285409
2016-10-28 15:50:22 +00:00
Tom Stellard 13068995b9 [Loads] Fix crash in isDereferenceableAndAlignedPointer()
Summary:
We were trying to add APInt values with different bit sizes after
visiting an addrspacecast instruction which changed the bit width
of the pointer.

Reviewers: majnemer, hfinkel

Subscribers: hfinkel, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D24774

llvm-svn: 285407
2016-10-28 15:32:28 +00:00
Teresa Johnson d01fcc7159 [cmake] Temporarily revert enforcement of minimum GCC version increase
Summary:
This is temporary, until the bot that builds public facing LLVM
documentation is upgraded. It reverts only the cmake change in r284497,
but leaves the doc changes in place to preserve intent.

Reviewers: aaron.ballman

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D26078

llvm-svn: 285406
2016-10-28 15:30:27 +00:00
Matthew Simpson 9b6755362b [LV] Correct misleading comments in test (NFC)
llvm-svn: 285402
2016-10-28 14:27:45 +00:00
Simon Pilgrim d9189891fc [SelectionDAG] computeKnownBits - early-out if any BUILD_VECTOR element has no known bits
No need to check the remaining elements - no common known bits are available.

llvm-svn: 285399
2016-10-28 14:07:44 +00:00
Simon Pilgrim 8c043061e5 [SelectionDAG] Tidyup UDIV computeKnownBits implementation
No need to clear KnownOne2/KnownZero2 bits as the next call to computeKnownBits will overwrite them anyway

llvm-svn: 285398
2016-10-28 13:42:23 +00:00
Simon Pilgrim 755cef1ba8 [SelectionDAG] Increment computeKnownBits recursion depth for SMIN/SMAX/UMIN/UMAX like all other ops
llvm-svn: 285397
2016-10-28 13:13:16 +00:00
Igor Laevsky c3ccf5d77b [LCSSA] Perform LCSSA verification only for the current loop nest.
Now LPPassManager will run LCSSA verification only for the top-level loop
which was processed on the current iteration.

Differential Revision: https://reviews.llvm.org/D25873

llvm-svn: 285394
2016-10-28 12:57:20 +00:00
Juergen Ributzka 5cee232be4 Revert "[DAGCombiner] Add vector demanded elements support to computeKnownBits"
This seems to have increased LTO compile time beyond 2x of previous builds.
See http://lab.llvm.org:8080/green/job/clang-stage2-configure-Rlto/10676/

llvm-svn: 285381
2016-10-28 04:01:12 +00:00
Davide Italiano 631cd27f29 [Reassociate] Removing instructions mutates the IR.
Fixes PR 30784. Discussed with Justin, who pointed out that
in the new PassManager infrastructure we can have more fine-grained
control on which analyses we want to preserve, but this is the
best we can do with the current infrastructure.

llvm-svn: 285380
2016-10-28 02:47:09 +00:00
Teresa Johnson 02563cd3a6 [ThinLTO] Create AliasSummary when building index
Summary:
Previously we were creating the alias summary on the fly while writing
the summary to bitcode. This moves the creation of these summaries to
the module summary index builder where we build the rest of the summary
index.

This is going to be necessary for setting the NoRename flag for values
possibly used in inline asm or module level asm.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26049

llvm-svn: 285379
2016-10-28 02:39:38 +00:00
Teresa Johnson 58fbc916a0 [ThinLTO] Rename HasSection to NoRename (NFC)
Summary:
This is in preparation for a change to utilize this flag for symbols
referenced/defined in either inline or module level assembly.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26048

llvm-svn: 285376
2016-10-28 02:24:59 +00:00
Davide Italiano 6231a7e4d1 [IR] Clang-format my previous commit. NFCI.
llvm-svn: 285375
2016-10-28 01:41:56 +00:00
Davide Italiano 30665147f9 [ConstantFold] Get the correct vector type when folding a getelementptr.
Differential Revision:  https://reviews.llvm.org/D26014

llvm-svn: 285371
2016-10-28 00:53:16 +00:00
Tom Stellard aea899e2a0 AMDGPU/SI: Handle hazard with s_rfe_b64
Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D25638

llvm-svn: 285368
2016-10-27 23:50:21 +00:00
Tom Stellard 04051b5fad AMDGPU/SI: Handle hazard with sgpr lane selects for v_{read,write}lane
Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D25637

llvm-svn: 285367
2016-10-27 23:42:29 +00:00
Davide Italiano e4146714ca Remove accidentally committed test.
llvm-svn: 285366
2016-10-27 23:40:19 +00:00
Davide Italiano e865a525df [IR] Reintroduce getGEPReturnType(), it will be used in a later patch.
llvm-svn: 285365
2016-10-27 23:38:51 +00:00
Tom Stellard 6b9c1be4ea AMDGPU/SI: Fix unused variable warning on non-debug builds
llvm-svn: 285363
2016-10-27 23:28:03 +00:00
Ekaterina Romanova b7f96d1241 Reverting back r285355: "Update .debug_line section version information to match DWARF version", while I'm investigating a test failure.
llvm-svn: 285362
2016-10-27 23:20:19 +00:00
Vedant Kumar 75f1de0c1a [Coverage] Darwin: Move __llvm_covmap from __DATA to __LLVM_COV
Programs with very large __llvm_covmap sections may fail to link on
Darwin because of out-of-range 32-bit RIP relative references.
It isn't possible to work around this by using the large code model
because it isn't supported on Darwin. One solution is to move the
__llvm_covmap section past the end of the __DATA segment.

=== Testing ===

In addition to check-{llvm,clang,profile}, I performed a link test on a
simple object after injecting ~4GB of padding into __llvm_covmap:

  @__llvm_coverage_padding = internal constant [4000000000 x i8] zeroinitializer, section "__LLVM_COV,__llvm_covmap", align 8

(This test is too expensive to check-in.)

=== Backwards Compatibility ===

This patch should not pose any backwards-compatibility concerns. LLVM
is expected to scan all of the sections in a binary for __llvm_covmap,
so changing its segment shouldn't affect anything. I double-checked this
by loading coverage produced by an unpatched compiler with a patched
llvm-cov.

Suggested by Nick Kledzik.

llvm-svn: 285360
2016-10-27 23:17:51 +00:00
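Illustrative sketch for the coverage commit above (75f1de0c1a): a Python check of whether a displacement fits the signed 32-bit field of a RIP-relative reference. The addresses and the function `fits_rel32` are made up for this example; only the 4000000000-byte padding figure comes from the message.

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def fits_rel32(target_addr, next_instr_addr):
    """True if target is reachable with a signed 32-bit displacement."""
    displacement = target_addr - next_instr_addr
    return INT32_MIN <= displacement <= INT32_MAX


code_addr = 0x1000
padding = 4_000_000_000  # size of the injected __llvm_covmap padding

# A symbol placed after the huge section is out of reach ...
behind_covmap = fits_rel32(code_addr + padding, code_addr)
# ... while a symbol placed before it is still reachable.
before_covmap = fits_rel32(code_addr + 1_000_000, code_addr)
```

Moving the large section past everything that code refers to keeps those references short, which is the effect the segment change aims for.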
Tom Stellard b133fbb9a4 AMDGPU/SI: Handle hazard with > 8 byte VMEM stores
Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D25577

llvm-svn: 285359
2016-10-27 23:05:31 +00:00
Tim Shen 139a58f75e Reapply r285351 "[APFloat] Add DoubleAPFloat mode to APFloat. NFC." with
a workaround for old clang.

llvm-svn: 285358
2016-10-27 22:52:40 +00:00
Ekaterina Romanova 0b82459c6c Update .debug_line section version information to match DWARF version.
In the past the compiler always emitted .debug_line version 2, though some opcodes from DWARF 3 (e.g. DW_LNS_set_prologue_end, DW_LNS_set_epilogue_begin or DW_LNS_set_isa) and from DWARF 4 could be emitted by the compiler. 

This patch changes version information of .debug_line to exactly match the DWARF version. For .debug_line version 4, a new field maximum_operations_per_instruction is emitted. 

Differential Revision: https://reviews.llvm.org/D16697

llvm-svn: 285355
2016-10-27 22:37:25 +00:00
Tim Shen 414b0155c4 Revert "[APFloat] Add DoubleAPFloat mode to APFloat. NFC."
This reverts r285351, since it breaks the build.

llvm-svn: 285354
2016-10-27 21:54:29 +00:00
Kostya Serebryany bcfb0802e2 [libFuzzer] enable use_cmp by default
llvm-svn: 285353
2016-10-27 21:44:37 +00:00
Tim Shen f38e87fa48 [APFloat] Add DoubleAPFloat mode to APFloat. NFC.
Summary:
This patch adds DoubleAPFloat mode to APFloat.

Now, an APFloat with semantics PPCDoubleDouble will have DoubleAPFloat layout
(APFloat.U.Double), which contains two underlying APFloats as
PPCDoubleDoubleImpl and IEEEdouble semantics. Currently the IEEEdouble APFloat
is not used, and the first APFloat behaves exactly the same as before this change.

This patch consists of three kinds of logic:
1) Construction and destruction of APFloat. Now the ctors, dtor, assign
   operators and factory functions construct different underlying layout
   based on the semantics passed in.
2) s/IEEE/getIEEE()/ for normal, lifetime-unrelated computation functions.
   These functions only access Floats[0] in DoubleAPFloat, which is the
   same as today's semantic.
3) A "Double dispatch" function, APFloat::convert. Converting between two
   different layouts requires appropriate logic.

None of these changes the external behavior.

Reviewers: hfinkel, kbarton, echristo, iteratee

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D25977

llvm-svn: 285351
2016-10-27 21:39:51 +00:00
Peter Collingbourne fc0a99bfda BitcodeReader: Require clients to read the block info block at most once.
This change makes it the client's responsibility to call ReadBlockInfoBlock()
at most once. This is in preparation for a future change that will allow
there to be multiple block info blocks.

See also: http://lists.llvm.org/pipermail/llvm-dev/2016-October/106512.html

Differential Revision: https://reviews.llvm.org/D26016

llvm-svn: 285350
2016-10-27 21:39:28 +00:00
Kyle Butt ab9cca7b0c CodeGen: Handle missed case of block removal during BlockPlacement.
There is a use after free bug in the existing code. Loop layout selects
a preferred exit block, and then lays out the loop. If this block is
removed during layout, it needs to be invalidated to prevent a use after
free.

llvm-svn: 285348
2016-10-27 21:37:20 +00:00
Sanjay Patel c0de9c9e40 [InstCombine] fix foldSPFofSPF() to handle vector splats
llvm-svn: 285345
2016-10-27 21:19:40 +00:00
Kostya Serebryany c1708b0d99 [libFuzzer] docs: update the examples
llvm-svn: 285344
2016-10-27 21:03:48 +00:00
Kevin Enderby bc5c29a65f Another additional error check for invalid Mach-O files for the
obsolete load commands.

As before, the idea behind the error checking in libObject for
Mach-O files is that we will never return a Mach-O file out of
libObject that contains unknown things the library code can’t
operate on.  So known obsolete load commands will cause a hard
error.

Also to make things clear I have added comments to the
values and structures in Support/Mach-O.h and
Support/MachO.def as to what is obsolete.

As noted in a TODO in the code, there may need to be a
non-default mode to allow some unknown values for well
structured Mach-O files with things like unknown
load commands.  So things like using an old lldb on a newer
Mach-O file could still provide some limited functionality.

llvm-svn: 285342
2016-10-27 20:59:10 +00:00
Sanjay Patel 923f74b27c [InstCombine] add vector tests for foldSPFofSPF to show missing folds
llvm-svn: 285340
2016-10-27 20:51:03 +00:00
Kostya Serebryany cbefff7320 [libFuzzer] docs: separate section for fuzz target
llvm-svn: 285339
2016-10-27 20:45:35 +00:00
Tom Stellard 30d30824b4 AMDGPU/SI: Handle s_setreg hazard in GCNHazardRecognizer
Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D25528

llvm-svn: 285338
2016-10-27 20:39:09 +00:00
Kostya Serebryany af67fd1dbd [libFuzzer] remove large examples from the libFuzzer docs and link to the libFuzzer tutorial instead; also fix a build error in another file
llvm-svn: 285337
2016-10-27 20:14:03 +00:00
Sanjay Patel cc8e0af3e5 [InstCombine] auto-generate checks for min/max tests
llvm-svn: 285336
2016-10-27 19:54:15 +00:00
Ehsan Amiri 2492721c36 [PPC] Adding the removed testcase again
This testcase was originally part of r284995, but I put it in a wrong directory.
So I removed it. Before adding it back I made some small enhancements. Also I
changed the assertions a little bit, to take into account the impact of some
changes made since the code review was done.

This is similar to changes done for another testcase in the original commit.
See: https://reviews.llvm.org/D23614#577749
Basically, instead of vxor we now generate xxlxor in some cases, which is
better.

llvm-svn: 285333
2016-10-27 19:10:09 +00:00
Haicheng Wu 430b3e4893 [LoopUnroll] Check partial unrolling is enabled before initialization. NFC.
Differential Revision: https://reviews.llvm.org/D23891

llvm-svn: 285330
2016-10-27 18:40:02 +00:00
Simon Pilgrim d23219b9ee [X86][AVX512] Fix MUL v8i64 costs on non-AVX512DQ targets
llvm-svn: 285329
2016-10-27 18:32:06 +00:00
Sanjay Patel 611f9f92fc [InstCombine] handle simple vector integer constants in IsFreeToInvert
llvm-svn: 285318
2016-10-27 17:30:50 +00:00
Simon Pilgrim 47c1ff7a43 [X86][AVX512DQ] Move v2i64 and v4i64 MUL lowering to tablegen
As suggested by @igorb on D26011

llvm-svn: 285313
2016-10-27 17:07:40 +00:00
Saleem Abdulrasool 075d2e3c59 ARM: ensure that the Windows DBZ check is in range
The Windows ARM target expects the compiler to emit a division-by-zero check.
The check would use the form of:

    cmp r?, #0
    cbz .Ltrap
    b .Lbody
  .Lbody:
    ...
  .Ltrap:
    udf #249 @ __brkdiv0

This works great most of the time.  However, if the body of the function is
greater than 127 bytes, the branch target limitation of cbz becomes an issue.
This occurs in the unoptimized code generation cases sometimes (like in
compiler-rt).

Since this is a matter of correctness, possibly pay a small penalty instead.  We
now form this slightly differently:

    cbnz .Lbody
    udf #249 @ __brkdiv0
  .Lbody:
    ...

The positive case is through the branch instead of being the next instruction.
However, because of the basic block layout, the negated branch is going to be
a short distance always (2 bytes away, after the inserted __brkdiv0).

The new t__brkdiv0 instruction is required to explicitly mark the instruction as
a terminator as the generic UDF instruction is not a terminator.

Addresses PR30532!

llvm-svn: 285312
2016-10-27 16:59:22 +00:00
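Illustrative sketch for the Windows ARM commit above (075d2e3c59): a Python estimate of the branch offset in both layouts. It assumes the Thumb-2 CBZ/CBNZ encoding (forward only, even offsets from 0 to 126, measured from the branch address plus 4) and 2-byte encodings for the `b` and `udf` instructions; the function names are hypothetical.

```python
CBZ_MAX_OFFSET = 126  # Thumb-2 CBZ/CBNZ: forward only, even, 0..126


def cbz_in_range(offset):
    return 0 <= offset <= CBZ_MAX_OFFSET and offset % 2 == 0


def old_layout_offset(body_size, branch_size=2):
    """cbz .Ltrap; b .Lbody; <body>; .Ltrap: the trap sits after the body."""
    cbz_addr = 0
    trap_addr = cbz_addr + 2 + branch_size + body_size
    return trap_addr - (cbz_addr + 4)


def new_layout_offset(udf_size=2):
    """cbnz .Lbody; udf; .Lbody: the target sits right after the trap."""
    cbnz_addr = 0
    body_addr = cbnz_addr + 2 + udf_size
    return body_addr - (cbnz_addr + 4)


small_body_ok = cbz_in_range(old_layout_offset(100))
large_body_ok = cbz_in_range(old_layout_offset(200))
new_form_ok = cbz_in_range(new_layout_offset())
```

In the old layout the offset grows with the body, so large unoptimized functions fall out of range; in the new layout it is a constant that does not depend on the body size.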
Greg Clayton 6c273763a3 Switch all DWARF variables for tags, attributes and forms over to use the llvm::dwarf enumerations instead of using raw uint16_t values. This allows easier debugging as users can see the values of the enumerations in the variables view that will show the enumeration string instead of just a number.
https://reviews.llvm.org/D26013

llvm-svn: 285309
2016-10-27 16:32:04 +00:00
Dehao Chen b94c09baa0 Add Loop Sink pass to reverse the LICM based on basic block frequency.
Summary: LICM may hoist instructions to the preheader speculatively. Before code generation, we need to sink the hoisted instructions back into the loop if it's beneficial. This pass is the reverse of LICM: it looks at instructions in the preheader and sinks each instruction to basic blocks inside the loop body if the basic block frequency is smaller than the preheader frequency.

Reviewers: hfinkel, davidxl, chandlerc

Subscribers: anna, modocache, mgorny, beanz, reames, dberlin, chandlerc, mcrosier, junbuml, sanjoy, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D22778

llvm-svn: 285308
2016-10-27 16:30:08 +00:00
Vasileios Kalintiris cfb005a0ee [mips] Do not allow -opt-bisect-limit to skip the PIC call optimization pass.
r282428 added the MipsOptimizePICCall as an opt-in pass that can be
skipped when using the -opt-bisect-limit option. However, this pass is
needed because it generates code that conforms to the o32 ABI
specification by using the $t9 register for PIC calls with JALR
instructions.

This bug was exposed by the fact that skipFunction() also checks for
the "optnone" attribute. This caused functions with that attribute to
break the requirements of the o32 ABI.

llvm-svn: 285305
2016-10-27 15:50:36 +00:00
Simon Pilgrim 820e1326d7 [X86][AVX512DQ] Improve lowering of MUL v2i64 and v4i64
With DQI but without VLX, lower v2i64 and v4i64 MUL operations with v8i64 MUL (vpmullq).

Updated cost table accordingly.

Differential Revision: https://reviews.llvm.org/D26011

llvm-svn: 285304
2016-10-27 15:27:00 +00:00
Sanjay Patel e372aecb8a [ValueTracking] fix matchSelectPattern to allow vector splat folds of min/max/abs/nabs
llvm-svn: 285303
2016-10-27 15:26:10 +00:00
Benjamin Kramer 0eae9eccdf Remove duplicated default move ctors/move assign. No functional change.
llvm-svn: 285302
2016-10-27 15:23:44 +00:00
Sanjay Patel d5b8d64d4b [InstCombine] add tests for missing folds of vector abs/nabs/min/max
llvm-svn: 285299
2016-10-27 15:02:45 +00:00
Bjorn Pettersson 807f732ce8 Fix memory issue in AttrBuilder::removeAttribute uses.
Summary:
Found when running Valgrind.

This removes two unnecessary assignments when using
AttrBuilder::removeAttribute.

AttrBuilder::removeAttribute returns a reference to the object.
As the LHSes were the same as the callees, the assignments
resulted in memcpy calls where dst = src.

Committed on behalf of: dstenb (David Stenberg)

Reviewers: mkuper, rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25460

llvm-svn: 285298
2016-10-27 14:48:09 +00:00
Krzysztof Parzyszek 046da74699 [Hexagon] Do not expand ISD::SELECT for HVX vectors
llvm-svn: 285297
2016-10-27 14:30:16 +00:00
Simon Pilgrim 01e755eab1 [DAGCombiner] Add vector demanded elements support to computeKnownBits
Currently computeKnownBits returns the common known zero/one bits for all elements of vector data, when we may only be interested in one/some of the elements.

This patch adds a DemandedElts argument that allows us to specify the elements we actually care about. The original computeKnownBits implementation calls with a DemandedElts demanding all elements to match current behaviour. Scalar types set this to 1.

The approach was found to be easier than trying to add a per-element known bits solution, for a similar usefulness given the combines where computeKnownBits is typically used.

I've only added support for a few opcodes so far (the ones that have proven straightforward to test), all others will default to demanding all elements but can be updated in due course.

DemandedElts support could similarly be added to computeKnownBitsForTargetNode in a future commit.

Differential Revision: https://reviews.llvm.org/D25691

llvm-svn: 285296
2016-10-27 14:29:28 +00:00
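
The idea behind the DemandedElts argument in r285296 can be illustrated on a vector of constants. This is only a sketch under stated assumptions: `known_bits` is a hypothetical helper, not the SelectionDAG API, and it models 8-bit elements with a bitmask in which bit i demands element i.

```python
def known_bits(elements, demanded_elts, bit_width=8):
    """Return (known_zero, known_one) common to all demanded constant elements."""
    if demanded_elts == 0:
        raise ValueError("at least one element must be demanded")
    mask = (1 << bit_width) - 1
    known_zero = mask
    known_one = mask
    for i, value in enumerate(elements):
        if not (demanded_elts >> i) & 1:
            continue                      # element i is not demanded; ignore it
        known_one &= value & mask         # bits set in every demanded element
        known_zero &= ~value & mask       # bits clear in every demanded element
    return known_zero, known_one


vec = [0x0F, 0x03, 0xF0, 0x01]
all_elts = known_bits(vec, 0b1111)        # demanding everything: nothing is known
low_two = known_bits(vec, 0b0011)         # demanding elements 0 and 1 only
```

Demanding all four elements leaves no common known bits, whereas demanding only elements 0 and 1 shows that the top four bits are zero and the low two bits are one.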
Sanjay Patel f21dd2648f [InstCombine] auto-generate better checks; NFC
llvm-svn: 285293
2016-10-27 13:55:37 +00:00
George Rimar b49a3d3390 Revert r285285 "[Object/ELF] - Fixed behavior when SectionHeaderTable->sh_size is too large."
It broke BB.

llvm-svn: 285288
2016-10-27 12:18:50 +00:00
Alexey Bataev 46c0278e7d [SLP] Fix for PR30626: Compiler crash inside SLP Vectorizer.
After a successful horizontal reduction vectorization attempt for a PHI node,
the vectorizer tries to update the root binary op by combining the vectorized tree
and the ReductionPHI node. But during vectorization this ReductionPHI
can be vectorized itself and replaced by the `undef` value, while the
instruction itself is marked for deletion. This 'marked for deletion'
PHI node then can be used in new binary operation, causing "Use still
stuck around after Def is destroyed" crash upon PHI node deletion.

Also the test is fixed to make it perform actual testing.

Differential Revision: https://reviews.llvm.org/D25671

llvm-svn: 285286
2016-10-27 12:02:28 +00:00
George Rimar 447d1a1986 [Object/ELF] - Fixed behavior when SectionHeaderTable->sh_size is too large.
Elf.h already has code checking that section table does not go past end of file.
Problem is that this check may not work on values greater than UINT64_MAX / Header->e_shentsize
because of calculation overflow.

Patch fixes the issue.

Differential revision: https://reviews.llvm.org/D25432

llvm-svn: 285285
2016-10-27 11:50:04 +00:00
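
The overflow described in r285285 can be reproduced with a small model. This is a sketch, not the code in Elf.h: Python integers do not wrap, so 64-bit arithmetic is emulated with a mask, and the function and parameter names are hypothetical.

```python
U64_MAX = 2**64 - 1


def table_fits_wrapping(shoff, num_sections, shentsize, file_size):
    """Naive check: the multiplication wraps modulo 2**64, as uint64_t would."""
    end = (shoff + num_sections * shentsize) & U64_MAX
    return end <= file_size


def table_fits_checked(shoff, num_sections, shentsize, file_size):
    """Reject inputs whose size computation would overflow before comparing."""
    if shentsize != 0 and num_sections > U64_MAX // shentsize:
        return False                      # num_sections * shentsize would overflow
    size = num_sections * shentsize
    if shoff > U64_MAX - size:
        return False                      # shoff + size would overflow
    return shoff + size <= file_size


# A huge section count makes the product wrap around to a small value.
huge = 2**58 + 1
naive = table_fits_wrapping(0x40, huge, 64, 4096)
checked = table_fits_checked(0x40, huge, 64, 4096)
```

With a 64-byte entry size, the wrapped product is only 64 bytes, so the naive check accepts a table that cannot possibly fit in a 4096-byte file; the checked version rejects it.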
George Rimar 7aa1626898 [Object/ELF] - Do not allow overflow when checking section size/offset.
Overflow caused the check to pass incorrectly;
this patch fixes that case.

Differential revision: https://reviews.llvm.org/D25514

llvm-svn: 285284
2016-10-27 11:44:56 +00:00
George Rimar 3fb09b3a9e [Object/ELF] - Do not crash if string table sh_size is equal to zero.
Revealed using "id_000038,sig_11,src_000015,op_havoc,rep_16" from PR30540,
when sh_size was 0, crash happened.

Differential revision: https://reviews.llvm.org/D25091

llvm-svn: 285282
2016-10-27 11:41:57 +00:00
Sam Parker 09947a3155 [ARM] Add newline char to test.
Missed a newline in the previous commit.

Differential Revision: https://reviews.llvm.org/D26027

llvm-svn: 285280
2016-10-27 10:43:02 +00:00
Sam Parker e7d9505c08 [ARM] Predicate UMAAL selection on hasDSP.
UMAAL is a DSP instruction and it is not available on thumbv7m
(Cortex-M3) and thumbv6m (Cortex-M0+1) targets. Also fix wrong
CHECK prefix in longMAC.ll test.

Patch by Vadzim Dambrouski.

Differential Revision: https://reviews.llvm.org/D25890

llvm-svn: 285278
2016-10-27 09:47:10 +00:00
Dylan McKay dd680cc753 [AVR] Generate all of the TableGen files we need
This enables generation of all of the TableGen files that are used
downstream.

llvm-svn: 285274
2016-10-27 08:20:47 +00:00
Nicolai Haehnle 7b0e25b7ad AMDGPU: Fix SILoadStoreOptimizer when writes cannot be merged due to register dependencies
Summary:
When finding a match for a merge and collecting the instructions that must
be moved, keep in mind that the instruction we merge might actually use one
of the defs that are being moved.

Fixes piglit spec/arb_enhanced_layouts/execution/component-layout/vs-tcs-load-output[-indirect].

The fact that the ds_read in the test case is not eliminated suggests that
there might be another problem related to alias analysis, but that's a
separate problem: this pass should still work correctly even when earlier
optimization passes missed something or were disabled.

Reviewers: tstellarAMD, arsenm

Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D25829

llvm-svn: 285273
2016-10-27 08:15:07 +00:00
Dylan McKay 00009d4824 [AVR] Compile the disassembler
This also updates references of 'TheAVRTarget' to the new
'getTheAVRTarget()' method.

llvm-svn: 285272
2016-10-27 08:09:15 +00:00
Dylan McKay ec47065795 [AVR] Add AVRISelDAGToDAG.cpp
Summary: This pulls the AVR instruction selector in-tree.

Reviewers: arsenm, kparzysz

Subscribers: llvm-commits, wdng, beanz, japaric, mgorny

Differential Revision: https://reviews.llvm.org/D25278

llvm-svn: 285270
2016-10-27 07:03:47 +00:00
Dylan McKay 6eaa4e4bcc [AVR] Add the machine code emitter
Reviewers: arsenm, kparzysz

Subscribers: wdng, beanz, japaric, llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D25388

llvm-svn: 285269
2016-10-27 06:56:46 +00:00
Nemanja Ivanovic 32b5fed639 [PowerPC] - No SExt/ZExt needed for count trailing zeros
This patch corresponds to review:
https://reviews.llvm.org/D25896

It just eliminates the redundant ZExt after a count trailing zeros instruction.

llvm-svn: 285267
2016-10-27 05:17:58 +00:00
Kostya Serebryany 94c427c23e [libFuzzer] speculatively trying to fix the Mac build; second attempt
llvm-svn: 285262
2016-10-27 00:36:38 +00:00
NAKAMURA Takumi 3f286ad1fe xray-extract.cc: Quick fix for mingw, to avoid errc::protocol_error.
errc::protocol_error is winsock-oriented and mingw doesn't include it. (MS does, though)

llvm-svn: 285261
2016-10-27 00:34:24 +00:00
Kostya Serebryany 3d945f6247 [libFuzzer] revert 285259 -- hit commit too soon
llvm-svn: 285260
2016-10-27 00:24:34 +00:00
Kostya Serebryany 15cd6b4b10 [libFuzzer] speculatively trying to fix the Mac build
llvm-svn: 285259
2016-10-27 00:22:39 +00:00
Davide Italiano 48ef6ca0c3 [IR] Retire unused getGEPReturnType overload. NFCI.
llvm-svn: 285257
2016-10-26 23:46:16 +00:00
Tim Shen 7527752353 [APFloat] Fix APFloat::getExactInverse when the input is nullptr. This is a regression introduced by r285105.
Reviewers: kbarton, echristo, iteratee, eugenis

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D26017

llvm-svn: 285256
2016-10-26 23:31:41 +00:00
Justin Bogner 31d8b7d21d llvm-objdump: Make some error messages more consistent
Most of the versions of report_error were quoting the filename and
printing a colon between the file name and the error message, but this
one wasn't doing either of those. Fix the output to be more
consistent.

llvm-svn: 285252
2016-10-26 22:37:52 +00:00
Vedant Kumar f4df0edf3e [utils] Add a '--unified-report' option to the code coverage prep script
In --unified-report mode, a single coverage report is prepared for all
specified binaries and written to *report_dir*. This mode is compatible
with all existing script options, including the --restrict mode which is
used to limit coverage reporting to certain files or directories.

This should not break any existing users of the script.

llvm-svn: 285249
2016-10-26 22:07:39 +00:00
Vedant Kumar 7466987cb7 [utils] Use print_function in the code coverage prep script, NFC.
llvm-svn: 285248
2016-10-26 22:07:37 +00:00
Vedant Kumar 273b6dc3e2 [utils] Add an '--only-merge' option to the code coverage prep script
In --only-merge mode, the script terminates after the profile merging
step.  This makes the script less stateful: it's more natural to split
the merge out into a separate step instead of relying on the first
invocation of the script to do it.

This should not break any existing users of the script.

llvm-svn: 285247
2016-10-26 22:07:35 +00:00
Evandro Menezes ca8370396a [AArch64] Create feature set for Samsung Exynos-M2
Since Exynos-M2 improved the FP square root unit a bit over the one in
Exynos-M1, it does not benefit from using the Newton series for such
operations.

llvm-svn: 285246
2016-10-26 22:06:20 +00:00
Victor Leschuk a37660c669 DebugInfo: fix incorrect alignment type (NFC)
Change type of some missed DebugInfo-related alignment variables,
that are still uint64_t, to uint32_t.

Original change introduced in r284482.

llvm-svn: 285242
2016-10-26 21:32:29 +00:00
Reid Kleckner 4500f74858 [lit] Work around Windows MSys command line tokenization bug
Summary:
This will allow us to revert LLD r284768, which added spaces to get MSys
echo to print what we want.

Reviewers: ruiu, inglorion, rafael

Subscribers: modocache, llvm-commits

Differential Revision: https://reviews.llvm.org/D26009

llvm-svn: 285237
2016-10-26 20:29:27 +00:00
Ehsan Amiri 00b6e3bcc5 [PPC] Remove testcase from incorrect directory
During my last commit this testcase was put in an incorrect directory. Removing
it. Will put it in the right directory when I can verify everything is correct.

llvm-svn: 285233
2016-10-26 20:16:59 +00:00
Tim Northover a9cc385664 ARM: don't rely on push/pop reglists being in order when folding SP adjust.
It would be a very nice invariant to rely on, but unfortunately it doesn't
necessarily hold (and the causes of mis-sorted reglists appear to be quite
varied) so to be robust the frame lowering code can't assume that the first
register in the list is also the first one that actually gets pushed.

Should fix an issue where we were turning something like:

    push {r8, r4, r7, lr}
    sub sp, #24

into nonsense like:

    push {r2, r3, r4, r5, r6, r7, r8, r4, r7, lr}

llvm-svn: 285232
2016-10-26 20:01:00 +00:00
Nemanja Ivanovic 275853e777 Do not assume that FP vector operands are never legalized by expanding
This patch ensures that if a floating point vector operand is legalized by
expanding, it is legalized through the stack rather than by calling
DAGTypeLegalizer::IntegerToVector which will cause a failure since the operand
is a non-integer type.

This fixes PR 30715.

llvm-svn: 285231
2016-10-26 19:51:35 +00:00
Sanjoy Das 01969218a4 Simplify `x >=u x >> y` and `x >=u x udiv y`
Summary:
Extends InstSimplify to handle both `x >=u x >> y` and `x >=u x udiv y`.

This is a follow-up of rL258422 and
https://github.com/rust-lang/rust/pull/30917 where llvm failed to
optimize away the bounds checking in a binary search.

Patch by Arthur Silva!

Reviewers: sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25941

llvm-svn: 285228
2016-10-26 19:18:43 +00:00
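
The two identities folded in r285228 can be checked exhaustively at a small bit width. The sketch below assumes 8-bit unsigned values, a logical right shift with a shift amount below the bit width, and a nonzero divisor; `folds_always_true` is a hypothetical name.

```python
def folds_always_true(bits=8):
    """Check x >=u (x >> y) and x >=u (x udiv y) for every x at the given width."""
    limit = 1 << bits
    for x in range(limit):
        for y in range(bits):             # in-range shift amounts only
            if not x >= (x >> y):         # Python ints are non-negative here, so >> is logical
                return False
        for y in range(1, limit):         # division by zero is excluded
            if not x >= x // y:
                return False
    return True
```

Both comparisons hold for every input in this model, which is why they can be simplified to true.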
Chad Rosier 4447d7a816 Revert "[AliasSetTracker] Make AST smarter about intrinsics that don't actually affect memory."
This reverts commit r285191.

LICM appears to rely on the Alias Set Tracker hitting lifetime markers to prevent
code from being moved outside of the original scope.

llvm-svn: 285227
2016-10-26 19:18:19 +00:00
Nemanja Ivanovic 0f45998bc6 [PowerPC] Implement vec_insert_exp builtins - llvm portion
This revision corresponds to review: https://reviews.llvm.org/D25957.
Committing on behalf of Zaara Syeda.

llvm-svn: 285225
2016-10-26 19:03:40 +00:00
Kostya Serebryany 2fabecaee3 [libFuzzer] simplify TracePC::HandleTrace even further. Also, when dealing with -exit_on_src_pos, symbolize every PC only once
llvm-svn: 285223
2016-10-26 18:52:04 +00:00
Chad Rosier 96e5e16acb Fix test from r285217.
llvm-svn: 285222
2016-10-26 18:49:16 +00:00
Chad Rosier 0c621fda0d [AArch64] Avoid materializing constant 1 when generating cneg instructions.
Instead of

 cmp w0, #1
 orr w8, wzr, #0x1
 cneg w0, w8, ne

we now generate

 cmp w0, #1
 csinv w0, w0, wzr, eq

PR28965

llvm-svn: 285217
2016-10-26 18:15:32 +00:00
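
The equivalence of the two sequences in r285217 can be checked with a small model of the instructions. This sketch assumes the usual AArch64 semantics (cneg Wd, Wn, cond yields -Wn when cond holds and Wn otherwise; csinv Wd, Wn, Wm, cond yields Wn when cond holds and ~Wm otherwise) on 32-bit registers. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def old_sequence(w0):
    """cmp w0, #1 ; orr w8, wzr, #0x1 ; cneg w0, w8, ne"""
    ne = w0 != 1
    w8 = 1                                 # materialized constant
    return (-w8) & MASK32 if ne else w8


def new_sequence(w0):
    """cmp w0, #1 ; csinv w0, w0, wzr, eq"""
    eq = w0 == 1
    wzr = 0
    return w0 if eq else (~wzr) & MASK32   # when eq holds, w0 is already 1


samples = [0, 1, 2, 0x7FFFFFFF, 0x80000000, MASK32]
same = all(old_sequence(v) == new_sequence(v) for v in samples)
```

The new form works because the only input for which the condition selects w0 is the one where w0 already equals the constant.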
Dan Gohman 68a423bf84 [WebAssembly] Update the README.txt.
Update the README.txt with newer information, add a link to the Emscripten
page explaining the current easiest way to use the LLVM wasm backend, and
mention that other ways of using the LLVM wasm backend are in development.

llvm-svn: 285215
2016-10-26 17:44:09 +00:00
Nirav Dave e2369aafb9 [MC] Fix comma typo in .loc parsing
llvm-svn: 285214
2016-10-26 17:28:58 +00:00
Robert Lougher 660f2f9560 Reapply: "Remove debug location from common tail when tail-merging"
This reapplies revision 285093.  Original commit message:

The branch folding pass tail merges blocks into a common-tail.  However, the
tail retains the debug information from one of the original inputs to the
merge (chosen randomly).  This is a problem for sample-based PGO, as hits
on the common-tail will be attributed to whichever block was chosen,
irrespective of which path was actually taken to the common-tail.

This patch fixes the issue by nulling the debug location for the common-tail.

Differential Revision: https://reviews.llvm.org/D25742

llvm-svn: 285212
2016-10-26 17:01:47 +00:00
Yaxun Liu 94add85adb AMDGPU: Refactor processor definition to use ISA version features
Add missing ISA versions 7.0.2/8.0.4/8.1.0 to backend.

Refactor processor definition to use ISA version features.

Fixed ISA version for stoney.

Based on Laurent Morichetti's patch.

Differential Revision: https://reviews.llvm.org/D25919

llvm-svn: 285210
2016-10-26 16:37:56 +00:00
Dehao Chen e713000eb6 Introduce updateDiscriminator interface to DILocation to make assigning discriminators cleaner.
Summary: This patch introduces updateDiscriminator to DILocation so that it can be directly called by AddDiscriminator. It also makes it easier to update the discriminator later.

Reviewers: dnovillo, dblaikie, aprantl, echristo

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D25959

llvm-svn: 285207
2016-10-26 15:48:45 +00:00
Matt Arsenault 39787bdcbb Reapply "AMDGPU: Don't use offen if it is 0"
This reverts r283003

llvm-svn: 285203
2016-10-26 15:08:16 +00:00
Matt Arsenault 1110f14b42 AMDGPU: Fix counting si_mask_branch as 4 bytes
llvm-svn: 285202
2016-10-26 14:53:54 +00:00
Matt Arsenault 8fac501602 Fix nondeterministic output in local stack slot alloc pass
This finds all of the references to a frame index in a function, and
sorts by the offset. If multiple instructions use the same offset,
nothing was breaking the tie for sorting.

This avoids the test failures the reverted r282999 introduced.

llvm-svn: 285201
2016-10-26 14:53:50 +00:00
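
The tie-breaking problem described in r285201 can be sketched as a sort with a composite key. The record layout below is an assumption made for illustration (a hypothetical `FrameRef` with an offset, a frame index, and the order in which the reference was seen), not necessarily the structure used in the pass.

```python
from collections import namedtuple

# Hypothetical record: byte offset, frame index, and discovery order.
FrameRef = namedtuple("FrameRef", ["offset", "frame_index", "order"])


def sort_refs(refs):
    """Sort by offset, breaking ties with frame index and then discovery order."""
    return sorted(refs, key=lambda r: (r.offset, r.frame_index, r.order))


a = FrameRef(offset=8, frame_index=1, order=0)
b = FrameRef(offset=8, frame_index=0, order=1)
c = FrameRef(offset=0, frame_index=2, order=2)

# The result is the same regardless of the order in which references arrive.
first = sort_refs([a, b, c])
second = sort_refs([c, b, a])
```

With only the offset as the key, the relative order of `a` and `b` would depend on the input order (or on the sort algorithm, for an unstable sort); the extra key fields make the result a function of the data alone.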
Sanjay Patel 8d7196bfde [InstCombine] clean up commonCastTransforms; NFC
1. Use 'auto' with dyn_cast.
2. Variables start with a capital letter.
3. Use proper punctuation in comments.

llvm-svn: 285200
2016-10-26 14:52:35 +00:00
Tom Stellard 284cf32ab4 LegalizeDAG: Support promoting [US]DIV and [US]REM operations
Summary:
AMDGPU will need this once i16 is added as a legal type.  This is tested by:

test/CodeGen/AMDGPU/sdiv.ll
test/CodeGen/AMDGPU/sdivrem24.ll
test/CodeGen/AMDGPU/udiv.ll
test/CodeGen/AMDGPU/udivrem24.ll

Reviewers: bogner, efriedma

Subscribers: efriedma, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D25699

llvm-svn: 285199
2016-10-26 14:52:25 +00:00
Tom Stellard f8e6eaff6e AMDGPU/SI: Don't emit multi-dword flat memory ops when they might access scratch
Summary:
A single flat memory operation that might access the scratch buffer
can only access MaxPrivateElementSize bytes.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D25788

llvm-svn: 285198
2016-10-26 14:38:47 +00:00
Tom Stellard 9daed22b04 AMDGPU/SI: Remove unnecessary run lines from test
Summary:
This test had run lines disabling/enabling the promote alloca pass, but
enabling/disabling promote alloca had no impact on the output.

Reviewers: arsenm

Subscribers: mgrang, kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D25787

llvm-svn: 285197
2016-10-26 14:21:09 +00:00
Zvi Rackover aa3402b41e [X86] AVX512 fallback for floating-point scalar selects
Summary:
For 'select i1, f32, f32' or 'select i1, f64, f64', prefer lowering to masked moves over branches.

Fixes pr30561

Reviewers: igorb, aymanmus, delena

Differential Revision: https://reviews.llvm.org/D25310

llvm-svn: 285196
2016-10-26 14:12:46 +00:00
Sanjay Patel 0bacecfb32 [InstCombine] consolidate zext tests and auto-generate checks; NFC
llvm-svn: 285195
2016-10-26 14:08:49 +00:00
Sanjay Patel 0b756ff1d1 [InstCombine] auto-generate better checks; NFC
llvm-svn: 285194
2016-10-26 13:58:22 +00:00
Chad Rosier 1408628ffa [AliasSetTracker] Make AST smarter about intrinsics that don't actually affect memory.
Differential Revision: https://reviews.llvm.org/D25969

llvm-svn: 285191
2016-10-26 12:42:11 +00:00
Victor Leschuk 3c9899842b DebugInfo: support for DWARFv5 DW_AT_alignment attribute
* Assume that clang passes non-zero alignment value to DIBuilder
only in case when it was forced by C++11 'alignas', C11 '_Alignas'
or compiler attribute '__attribute__((aligned (N)))'.

* Emit DW_AT_alignment if alignment is specified for type/object.

Differential Revision: https://reviews.llvm.org/D24425

llvm-svn: 285189
2016-10-26 11:59:03 +00:00
Andrea Di Biagio 9bcb064f19 [IndVarSimplify][DebugLoc] When widening the exit loop condition, correctly reuse the debug location of the original comparison.
When the loop exit condition is canonicalized as a != comparison, reuse the
debug location of the original (non canonical) comparison.

Before this patch, the debug location of the new icmp was obtained from the
loop latch terminator. This patch fixes the issue by correctly setting the
IRBuilder's "current debug location" to the location of the original compare.

Differential Revision: https://reviews.llvm.org/D25953

llvm-svn: 285185
2016-10-26 10:28:32 +00:00
Vassil Vassilev df5042ab61 Revert r285181 "DebugInfo: support for DWARFv5 DW_AT_alignment attribute".
The commit broke the builds.

llvm-svn: 285183
2016-10-26 10:13:47 +00:00
Victor Leschuk e398c6afa9 DebugInfo: support for DWARFv5 DW_AT_alignment attribute
* Assume that clang passes non-zero alignment value to DIBuilder
only in case when it was forced by C++11 'alignas', C11 '_Alignas'
or compiler attribute '__attribute__((aligned (N)))'.

* Emit DW_AT_alignment if alignment is specified for type/object.

Differential Revision: https://reviews.llvm.org/D24425

llvm-svn: 285181
2016-10-26 08:55:27 +00:00
Victor Leschuk 83d3f62120 DebugInfo: add bitcode upgrade test for alignment
Bitcode format was changed in D25073, this adds bitcode upgrade test.

llvm-svn: 285179
2016-10-26 08:34:19 +00:00
Dean Michael Berris 6cbd65e4df [XRay] Be case-insensitive for error strings
On Windows, "no such file or directory" is the default error translation
as opposed to the capitalized form on Linux.

llvm-svn: 285174
2016-10-26 05:10:39 +00:00
Craig Topper 812d3d30ae [AVX-512] Add scalar vfmsub/vfnmsub mask3 intrinsics
Summary: Clang's intrinsic header currently tries to negate the third operand of a vfmadd mask3 in order to create vfmsub, but this fails isel. This patch adds scalar vfmsub and vfnmsub mask3 that we can use instead to avoid the negate. This is consistent with the packed instructions.

Reviewers: igorb, delena

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25933

llvm-svn: 285173
2016-10-26 04:59:58 +00:00
Dean Michael Berris 2693245fc1 [XRay] Remove unnecessary include of <unistd.h>
llvm-svn: 285171
2016-10-26 04:46:50 +00:00
Dean Michael Berris 4fc2529d9d [XRay] Remove unnecessary include of <unistd.h>
llvm-svn: 285170
2016-10-26 04:36:31 +00:00
Dean Michael Berris 2661f25781 [XRay] Move specialisations into correct namespace
llvm-svn: 285168
2016-10-26 04:26:53 +00:00
Dean Michael Berris b278c21629 [XRay] Remove extra `;` to make -Wpedantic happy
llvm-svn: 285167
2016-10-26 04:21:17 +00:00
Dean Michael Berris 8b327a1987 [XRay] Add llvm-xray as a dependency to test/CMakeLists.txt
llvm-svn: 285166
2016-10-26 04:16:05 +00:00
Dean Michael Berris c92bfb5a04 [XRay] Implement `llvm-xray extract`, start of the llvm-xray tool
Usage:

  llvm-xray extract <object file> [-o <filename or '-'>]

The tool gets the XRay instrumentation map from an object file and turns
it into YAML.  We first support ELF64 sleds on x86_64 binaries, with
provision for supporting other supported platforms and formats later.

This is the first of a many-part change to fully implement the
`llvm-xray` tool.

We also define a subcommand registration and dispatch mechanism to be
used by other further subcommand implementations for llvm-xray.

Differential Revision: https://reviews.llvm.org/D21987

llvm-svn: 285165
2016-10-26 04:14:34 +00:00
Peter Collingbourne 7b7bac367c Cloning: Also clone global variable attached metadata.
llvm-svn: 285161
2016-10-26 02:57:33 +00:00
Kostya Serebryany 8b6af7a9d3 [libFuzzer] refresh docs
llvm-svn: 285157
2016-10-26 01:55:17 +00:00
Dean Michael Berris f7bdbbcc58 Revert "[XRay] Implement `llvm-xray extract`, start of the llvm-xray tool"
Reverts r285155 -- misconfigured tests.

llvm-svn: 285156
2016-10-26 01:50:59 +00:00
Dean Michael Berris d21e0a7ba7 [XRay] Implement `llvm-xray extract`, start of the llvm-xray tool
Usage:

  llvm-xray extract <object file> [-o <filename or '-'>]

The tool gets the XRay instrumentation map from an object file and turns
it into YAML.  We first support ELF64 sleds on x86_64 binaries, with
provision for supporting other supported platforms and formats later.

This is the first of a many-part change to fully implement the
`llvm-xray` tool.

We also define a subcommand registration and dispatch mechanism to be
used by other further subcommand implementations for llvm-xray.

llvm-svn: 285155
2016-10-26 01:42:59 +00:00
Rui Ueyama 2241b84f48 Use printf instead of "echo -e" or "echo -n".
Not all echo commands support "-e". On the other hand, printf
command is in POSIX, so it's more portable than "echo -e".

llvm-svn: 285151
2016-10-26 01:07:26 +00:00
Kostya Serebryany 06b8757b57 [libFuzzer] simplify the code in TracePC::HandleTrace a bit more
llvm-svn: 285147
2016-10-26 00:42:52 +00:00
Kostya Serebryany a5b2e54fcb [libFuzzer] simplify the code to print new PCs
llvm-svn: 285145
2016-10-26 00:20:51 +00:00
Evgeniy Stepanov ea6d49d3ee Utility functions for appending to llvm.used/llvm.compiler.used.
llvm-svn: 285143
2016-10-25 23:53:31 +00:00
Kostya Serebryany 275e260258 [libFuzzer] simplify the code in TracePC::HandleTrace
llvm-svn: 285142
2016-10-25 23:52:25 +00:00
Lang Hames 8009f61c3d [docs] Avoid repetition of 'considerable' in Error docs.
llvm-svn: 285141
2016-10-25 23:08:32 +00:00
Lang Hames 497fd94109 [docs] Use consistent style for "do more stuff" in Error docs examples.
llvm-svn: 285138
2016-10-25 22:41:54 +00:00
Lang Hames ca20d9eb95 [docs] Fix yet another Error docs formatting issue...
llvm-svn: 285137
2016-10-25 22:38:50 +00:00
Lang Hames 4f8a9604d0 [docs] Fix a few more Error docs formatting issues.
Thanks to Pete Cooper for the review.

llvm-svn: 285136
2016-10-25 22:35:55 +00:00
Kostya Serebryany 117976818e [libFuzzer] add StandaloneFuzzTargetMain.c and a test for it
llvm-svn: 285135
2016-10-25 22:30:34 +00:00
Lang Hames 7a9ca33378 [docs] Fix a missing code-block in the new Error docs.
llvm-svn: 285134
2016-10-25 22:25:07 +00:00
Lang Hames 6b19ce6adb [docs] Fix a couple of typos in the new Error docs.
llvm-svn: 285133
2016-10-25 22:22:48 +00:00
James Y Knight 2e64b8b79e [Sparc] Don't overlap variable-sized allocas with other stack variables.
On SparcV8, it was previously the case that a variable-sized alloca
might overlap by 4-bytes the last fixed stack variable, effectively
because 92 (the number of bytes reserved for the register spill area) !=
96 (the offset added to SP for where to start a DYNAMIC_STACKALLOC).

It's not as simple as changing 96 to 92, because variables that should
be 8-byte aligned would then be misaligned.

For now, simply increase the allocation size by 8 bytes for each dynamic
allocation -- wastes space, but at least doesn't overlap. As the large
comment says, doing this more efficiently will require larger changes in
llvm.

Also adds some test cases showing that we continue to not support
dynamic stack allocation and over-alignment in the same function.

llvm-svn: 285131
2016-10-25 22:13:28 +00:00
Bob Haarman 26a87bd030 [codeview] support emitting indirect virtual base class information
Summary:
Fixes PR28281.

MSVC lists indirect virtual base classes in the field list of a class,
using LF_IVBCLASS records. This change makes LLVM emit such records
when processing DW_TAG_inheritance tags with the DIFlagVirtual and
(newly introduced) DIFlagIndirect tags.

Reviewers: rnk, ruiu, zturner

Differential Revision: https://reviews.llvm.org/D25578

llvm-svn: 285130
2016-10-25 22:11:52 +00:00
Simon Pilgrim de86241a09 [DAGCombiner] Enable (urem x, (shl pow2, y)) -> (and x, (add (shl pow2, y), -1)) combine for splatted vectors
llvm-svn: 285129
2016-10-25 22:01:09 +00:00
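
The identity behind the combine in r285129 can be verified exhaustively for a single lane. This sketch assumes 8-bit unsigned values and only considers shifts that keep the divisor nonzero and within the bit width; `urem_matches_mask` is a hypothetical name.

```python
def urem_matches_mask(bits=8):
    """Check (urem x, (shl pow2, y)) == (and x, (add (shl pow2, y), -1))."""
    limit = 1 << bits
    for k in range(bits):
        pow2 = 1 << k
        for y in range(bits - k):          # keep pow2 << y representable
            divisor = pow2 << y
            mask = (divisor - 1) % limit   # add -1, wrapped to the bit width
            for x in range(limit):
                if x % divisor != x & mask:
                    return False
    return True
```

A shifted power of two is still a power of two, so the remainder is simply the low bits of x; for splatted vectors the same reasoning applies to every lane.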
Rong Xu 33308f92eb [PGO] Fix select instruction annotation
Summary:
Select instruction annotation in IR PGO uses the edge count to infer the
branch count. It's currently placed in setInstrumentedCounts() where
not all the BB counts have been computed. This leads to wrong branch weights.
Move the annotation after all BB counts are populated.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25961

llvm-svn: 285128
2016-10-25 21:47:24 +00:00
Simon Pilgrim daf82f5f33 [X86][SSE] Regenerated known-bits test with srem->urem fix
llvm-svn: 285124
2016-10-25 21:24:33 +00:00
Simon Pilgrim f534573e8c [DAGCombiner] Enable srem(x,y) -> urem(x,y) combine for vectors
SelectionDAG::SignBitIsZero (via SelectionDAG::computeKnownBits) has supported vectors since rL280927

llvm-svn: 285123
2016-10-25 21:20:18 +00:00
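
The precondition for the srem to urem combine in r285123 is that the sign bit of both operands is known to be zero. The sketch below checks this for one 8-bit lane; it models srem as a remainder whose sign follows the dividend, and all helper names are hypothetical.

```python
BITS = 8
SIGN = 1 << (BITS - 1)
MASK = (1 << BITS) - 1


def to_signed(v):
    """Interpret an 8-bit pattern as a two's complement integer."""
    return v - (1 << BITS) if v & SIGN else v


def srem(x, y):
    """Signed remainder on bit patterns; the result takes the sign of the dividend."""
    a, b = to_signed(x), to_signed(y)
    r = abs(a) % abs(b)
    return (-r if a < 0 else r) & MASK


def urem(x, y):
    """Unsigned remainder on bit patterns."""
    return x % y


def fold_is_safe():
    """srem equals urem whenever both sign bits are zero (divisor nonzero)."""
    return all(srem(x, y) == urem(x, y)
               for x in range(SIGN) for y in range(1, SIGN))
```

When a sign bit is set the two operations can differ: for the pattern 0xFF (that is, -1) and divisor 3, the signed remainder is 0xFF while the unsigned remainder is 0.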
Lang Hames 03a88ccba3 [docs] Add more Error documentation to the Programmer's Manual.
This patch updates some of the existing Error examples, expands on the
documentation for handleErrors, and includes new sections that cover
a number of helpful utilities and common error usage idioms.

llvm-svn: 285122
2016-10-25 21:19:30 +00:00
Simon Pilgrim efd5643419 [X86][SSE] Added vector srem combine tests
llvm-svn: 285121
2016-10-25 21:14:11 +00:00