Commit Graph

98599 Commits

Author SHA1 Message Date
Matt Arsenault 3b99f12a4e AMDGPU: Remove modifiers from v_div_scale_*
They seem to produce nonsense results when used.

This should be applied to the release branch.

llvm-svn: 292472
2017-01-19 06:04:12 +00:00
Craig Topper c227529105 [X86] Merge LowerADD and LowerSUB into a single LowerADD_SUB since they are identical.
llvm-svn: 292469
2017-01-19 03:49:29 +00:00
Craig Topper b561e66384 [AVX-512] Use VSHUF instructions instead of two inserts as fallback for subvector broadcasts that can't fold the load.
llvm-svn: 292466
2017-01-19 02:34:29 +00:00
Michael Kuperstein 8ecc38ef85 [PM] Add LoopVectorize to the default module pipeline
LV no longer "requires" LCSSA and LoopSimplify, and instead forms
them internally as required. So, there's nothing preventing it from
being enabled.

llvm-svn: 292464
2017-01-19 02:21:54 +00:00
Peter Collingbourne 22d9d3cdce LowerTypeTests: Implement exporting of type identifiers.
Type identifiers are exported by:
- Adding coarse-grained information about how to test the type
  identifier to the summary.
- Creating symbols in the object file (aliases and absolute symbols)
  containing fine-grained information about the type identifier.

Differential Revision: https://reviews.llvm.org/D28424

llvm-svn: 292462
2017-01-19 01:20:11 +00:00
Justin Bogner d09c3ce6c0 GlobalISel: Implement narrowing for G_LOAD
llvm-svn: 292461
2017-01-19 01:05:48 +00:00
Justin Bogner 1f5c505437 GlobalISel: Fix text wrapping in a comment. NFC
llvm-svn: 292460
2017-01-19 01:04:46 +00:00
Dehao Chen 1ce8d6ca59 Add -debug-info-for-profiling to emit more debug info for sample pgo profile collection
Summary:
SamplePGO binaries built with -gmlt to collect profile. The current -gmlt debug info is limited, and we need some additional info:

* start line of all subprograms
* linkage name of all subprograms
* standalone subprograms (functions that has neither inlined nor been inlined)

This patch adds these information to the -gmlt binary. The impact on speccpu2006 binary size (size increase comparing with -g0 binary, also includes data for -g binary, which does not change with this patch):

               -gmlt(orig) -gmlt(patched) -g
433.milc       4.68%       5.40%          19.73%
444.namd       8.45%       8.93%          45.99%
447.dealII     97.43%      115.21%        374.89%
450.soplex     27.75%      31.88%         126.04%
453.povray     21.81%      26.16%         92.03%
470.lbm        0.60%       0.67%          1.96%
482.sphinx3    5.77%       6.47%          26.17%
400.perlbench  17.81%      19.43%         73.08%
401.bzip2      3.73%       3.92%          12.18%
403.gcc        31.75%      34.48%         122.75%
429.mcf        0.78%       0.88%          3.89%
445.gobmk      6.08%       7.92%          42.27%
456.hmmer      10.36%      11.25%         35.23%
458.sjeng      5.08%       5.42%          14.36%
462.libquantum 1.71%       1.96%          6.36%
464.h264ref    15.61%      16.56%         43.92%
471.omnetpp    11.93%      15.84%         60.09%
473.astar      3.11%       3.69%          14.18%
483.xalancbmk  56.29%      81.63%         353.22%
geomean        15.60%      18.30%         57.81%

Debug info size change for -gmlt binary with this patch:

433.milc       13.46%
444.namd       5.35%
447.dealII     18.21%
450.soplex     14.68%
453.povray     19.65%
470.lbm        6.03%
482.sphinx3    11.21%
400.perlbench  8.91%
401.bzip2      4.41%
403.gcc        8.56%
429.mcf        8.24%
445.gobmk      29.47%
456.hmmer      8.19%
458.sjeng      6.05%
462.libquantum 11.23%
464.h264ref    5.93%
471.omnetpp    31.89%
473.astar      16.20%
483.xalancbmk  44.62%
geomean        16.83%

Reviewers: davidxl, echristo, dblaikie

Reviewed By: echristo, dblaikie

Subscribers: aprantl, probinson, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D25434

llvm-svn: 292457
2017-01-19 00:44:11 +00:00
Michael Kuperstein 230867e583 [LV] Run loop-simplify and LCSSA explicitly instead of "requiring" them
This changes the vectorizer to explicitly use the loopsimplify and lcssa utils,
instead of "requiring" the transformations as if they were analyses.

This is not NFC, since it changes the LCSSA behavior - we no longer run LCSSA
for all loops, but rather only for the loops we expect to modify.

Differential Revision: https://reviews.llvm.org/D28868

llvm-svn: 292456
2017-01-19 00:42:28 +00:00
Matthias Braun 9f21a8d787 LiveIntervalAnalysis: Cleanup; NFC
- Fix doxygen comments: Do not repeat name, remove duplicated doxygen
  comment (on declaration + implementation), etc.
- Use more range based for

llvm-svn: 292455
2017-01-19 00:32:13 +00:00
Artem Belevich 3d3f6190ab [NVPTX] Fix lowering of fp16 ISD::FNEG.
There's no neg.f16 instruction, so negation has to
be done via subtraction from zero.

Differential Revision: https://reviews.llvm.org/D28876

llvm-svn: 292452
2017-01-19 00:14:45 +00:00
Eli Friedman f1f49c8265 [SCEV] Make getUDivExactExpr handle non-nuw multiplies correctly.
To avoid regressions, make ScalarEvolution::createSCEV a bit more
clever.

Also get rid of some useless code in ScalarEvolution::howFarToZero
which was hiding this bug.

No new testcase because it's impossible to actually expose this bug:
we don't have any in-tree users of getUDivExactExpr besides the two
functions I just mentioned, and they both dodged the problem. I'll
try to add some interesting users in a followup.

Differential Revision: https://reviews.llvm.org/D28587

llvm-svn: 292449
2017-01-18 23:56:42 +00:00
Eli Friedman 0a2174533e Preserve domtree and loop-simplify for runtime unrolling.
Mostly straightforward changes; we just didn't do the computation before.
One sort of interesting change in LoopUnroll.cpp: we weren't handling
dominance for children of the loop latch correctly, but
foldBlockIntoPredecessor hid the problem for complete unrolling.

Currently punting on loop peeling; made some minor changes to isolate
that problem to LoopUnrollPeel.cpp.

Adds a flag -unroll-verify-domtree; it verifies the domtree immediately
after we finish updating it. This is on by default for +Asserts builds.

Differential Revision: https://reviews.llvm.org/D28073

llvm-svn: 292447
2017-01-18 23:26:37 +00:00
Krzysztof Parzyszek de44c9d857 Treat segment [B, E) as not overlapping block with boundaries [A, B)
llvm-svn: 292446
2017-01-18 23:12:19 +00:00
Krzysztof Parzyszek 954dd8d9ba [Hexagon] Remove dead defs from the live set when expanding wstores
llvm-svn: 292445
2017-01-18 23:11:40 +00:00
Michael Kuperstein d3d2925933 Revert r291670 because it introduces a crash.
r291670 doesn't crash on the original testcase from PR31589,
but it crashes on a slightly more complex one.

PR31589 has the new reproducer.

llvm-svn: 292444
2017-01-18 23:05:58 +00:00
Mehdi Amini 062b3fed4c Improve the `-filter-print-funcs` option to skip the banner for CGSCC pass when nothing is to be printed
Before, it would print a sequence of:

  *** IR Dump After Function Integration/Inlining ******
  *** IR Dump After Function Integration/Inlining ******
  *** IR Dump After Function Integration/Inlining ******
  ...

for every single function in the module.

llvm-svn: 292442
2017-01-18 21:37:11 +00:00
Sanjay Patel ae23d65a7d [InstCombine] add an assert to make a shl+icmp transform assumption explicit; NFCI
llvm-svn: 292440
2017-01-18 21:16:12 +00:00
Haicheng Wu 8ce2d14356 [CodeGenPrepare] Fix a typo in the comment. NFC.
encode => endcode.

Differential Revision: https://reviews.llvm.org/D28866

llvm-svn: 292438
2017-01-18 21:12:10 +00:00
Sanjay Patel 589de5ea4e [InstCombine] remove a redundant check; NFCI
I missed deleting this check when I refactored this chunk in:
https://reviews.llvm.org/rL292260

llvm-svn: 292433
2017-01-18 20:09:59 +00:00
Peter Collingbourne 20a00933fb ThinLTOBitcodeWriter: Clear comdats on filtered globals.
Differential Revision: https://reviews.llvm.org/D28839

llvm-svn: 292431
2017-01-18 20:03:02 +00:00
Peter Collingbourne 10e3b12c7a Cloning: Copy comdats when cloning globals.
Differential Revision: https://reviews.llvm.org/D28838

llvm-svn: 292430
2017-01-18 20:02:31 +00:00
Michael Kuperstein 0de990da16 Fix up a comment. NFC.
llvm-svn: 292425
2017-01-18 19:05:48 +00:00
Michael Kuperstein 7cefb409b0 [LV] Allow reductions that have several uses outside the loop
We currently check whether a reduction has a single outside user. We don't
really need to require that - we just need to make sure a single value is
used externally. The number of external users of that value shouldn't actually
matter.

Differential Revision: https://reviews.llvm.org/D28830

llvm-svn: 292424
2017-01-18 19:02:52 +00:00
Evandro Menezes 7960b2e19a [AArch64] Generate literals by the little end
ARM seems to prefer that long literals be formed from their little end in
order to promote the fusion of the instrs pairs MOV/MOVK and MOVK/MOVK on
Cortex A57 and others (v.  "Cortex A57 Software Optimisation Guide", section
4.14).

Differential revision: https://reviews.llvm.org/D28697

llvm-svn: 292422
2017-01-18 18:57:08 +00:00
Davide Italiano bca9d73309 [NewGVN] We don't use postdom info anymore. Update.
Differential Revision:  https://reviews.llvm.org/D28842

llvm-svn: 292421
2017-01-18 18:42:28 +00:00
Mehdi Amini 67d2cc1fad [ThinLTO] Add a recursive step in Metadata lazy-loading
Summary:
Without this, we're stressing the RAUW of unique nodes,
which is a costly operation. This is intended to limit
the number of RAUW, and is very effective on the total
link-time of opt with ThinLTO, before:

  real 4m4.587s  user 15m3.401s  sys 0m23.616s

after:

  real 3m25.261s user 12m22.132s sys 0m24.152s

Reviewers: tejohnson, pcc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28751

llvm-svn: 292420
2017-01-18 18:36:21 +00:00
Stanislav Mekhanoshin a4e63ead4b [AMDGPU] Do not allow register coalescer to create big superregs
Limit register coalescer by not allowing it to artificially increase
size of registers beyond dword. Such super-registers are in fact
register sequences and not distinct HW registers.

With more super-regs we would need to allocate adjacent registers
and constraint regalloc more than needed. Moreover, our super
registers are overlapping. For instance we have VGPR0_VGPR1_VGPR2,
VGPR1_VGPR2_VGPR3, VGPR2_VGPR3_VGPR4 etc, which complicates registers
allocation even more, resulting in excessive spilling.

Differential Revision: https://reviews.llvm.org/D28782

llvm-svn: 292413
2017-01-18 17:30:05 +00:00
Justin Bogner fde0104649 GlobalISel: Implement narrowing for G_STORE
Legalize stores of types that are too wide by breaking them up into
sequences of smaller stores.

llvm-svn: 292412
2017-01-18 17:29:54 +00:00
Teresa Johnson 2d384ac381 Don't create a comdat group for a dropped def with initializer
Non-prevailing weak/linkonce odr symbols will be dropped by ThinLTO to
available_externally when possible. If they had an initializer in the
global_ctors list, a comdat group was being created. This code
already had logic to skip available_externally defs, but now the
EliminateAvailableExternally pass will drop these symbols to
declarations earlier. Change the check to skip all declarations for
linker (which includes available_externally along with declarations).

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28737

llvm-svn: 292408
2017-01-18 16:58:43 +00:00
Kirill Bobyrev 6afbaf0944 Revert 292404 due to buildbot failures.
llvm-svn: 292407
2017-01-18 16:34:25 +00:00
Kirill Bobyrev 9ad06dbe17 [X86] Minor code cleanup to fix several clang-tidy warnings. NFC
llvm-svn: 292404
2017-01-18 16:15:47 +00:00
Sam Parker b0de00d545 [ARM] Create SubtargetFeatures from build attrs
An ELFObjectFile can now create SubtargetFeatures from the available
ARM build attributes, in a similar manner to MIPS. I've moved the
MIPS code into its own function and the ARM handler also has a
separate function.

Differential Revision: https://reviews.llvm.org/D28291

llvm-svn: 292403
2017-01-18 15:52:11 +00:00
Pavel Labath 97c7cf1d5c raw_fd_ostream: Make file handles non-inheritable by default
Summary:
This makes the file descriptors on unix platform non-inheritable (O_CLOEXEC).

There is no change in behavior on windows, as the handles were already
non-inheritable there.

Reviewers: rnk, rafael

Subscribers: llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D28854

llvm-svn: 292401
2017-01-18 15:46:50 +00:00
Chad Rosier 771db6f895 [Assembler] Fix crash when assembling .quad for AArch32.
A 64-bit relocation does not exist in 32-bit ARMELF. Report an error
instead of crashing.

PR23870
Patch by Sanne Wouda (sanwou01).
Differential Revision: https://reviews.llvm.org/D28851

llvm-svn: 292373
2017-01-18 15:02:54 +00:00
Florian Hahn 8485cecd3f [thumb,framelowering] Reset NoVRegs in Thumb1FrameLowering::emitPrologue.
Summary:
In this function, virtual registers can be introduced (for example
through calls to emitThumbRegPlusImmInReg). doScavengeFrameVirtualRegs
will replace those virtual registers with concrete registers later on
in PrologEpilogInserter, which sets NoVRegs again.

This patch fixes the Codegen/Thumb/segmented-stacks.ll test case which
failed with expensive checks.
https://llvm.org/bugs/show_bug.cgi?id=27484


Reviewers: rnk, bkramer, olista01

Reviewed By: olista01

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D28829

llvm-svn: 292372
2017-01-18 15:01:22 +00:00
Simon Pilgrim fe2c0ed4cf [InstCombine][AVX2] Add DemandedElts support for VPERMD/VPERMPS shuffles
Simplify a vpermv shuffle mask based on the elements of the mask that are actually demanded.

llvm-svn: 292371
2017-01-18 14:47:49 +00:00
Daniel Sanders af76f989b5 Re-revert: [globalisel] Tablegen-erate current Register Bank Information
More missing guards. My build didn't notice it due to a stale file left over
from a Global ISel build.

llvm-svn: 292369
2017-01-18 14:26:12 +00:00
Daniel Sanders 517b61cb69 Re-commit: [globalisel] Tablegen-erate current Register Bank Information
Summary:
Adds a RegisterBank tablegen class that can be used to declare the register
banks and an associated tablegen pass to generate the necessary code.

Changes since last commit:
The new tablegen pass is now correctly guarded by LLVM_BUILD_GLOBAL_ISEL and
this should fix the buildbots however it may not be the whole fix. The previous
buildbot failures suggest there may be a memory bug lurking that I'm unable to
reproduce (including when using asan) or spot in the source. If they re-occur
on this commit then I'll need assistance from the bot owners to track it down.

Reviewers: t.p.northover, ab, rovka, qcolombet

Reviewed By: qcolombet

Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka

Differential Revision: https://reviews.llvm.org/D27338

llvm-svn: 292367
2017-01-18 14:17:50 +00:00
Sam Parker df7c6ef96f [ARM] Create objdump subtarget from build attrs
Enable an ELFObjectFile to read the its arm build attributes to
produce a target triple with a specific ARM architecture.
llvm-objdump now uses this functionality to automatically produce
a more accurate target.

Differential Revision: https://reviews.llvm.org/D28769

llvm-svn: 292366
2017-01-18 13:52:12 +00:00
Simon Pilgrim a22c3a1c0f [InstCombine] Remove unnecessary intrinsics demanded elts handling
As discussed on D28777 - we don't need to handle 'all element' shuffles inside InstCombiner::visitCallInst as InstCombiner::SimplifyDemandedVectorElts will do everything we need.

llvm-svn: 292365
2017-01-18 13:44:04 +00:00
Michael Zuckerman 0c0240ce84 [X86] Improve mul combine for negative multiplayer (2^c - 1)
This patch improves the mul instruction combine function (combineMul) 
by adding new layer of logic. 
In this patch, we are adding the ability to fold (mul x, -((1 << c) -1)) 
or (mul x, -((1 << c) +1)) into (neg(X << c) -x) or (neg((x << c) + x) respective.

Differential Revision: https://reviews.llvm.org/D28232

llvm-svn: 292358
2017-01-18 09:31:13 +00:00
Renato Golin 03c5e69d07 Revert "[XRay][Arm] Repair XRay table emission on Arm32 and add tests to identify such problem earlier"
This reverts commit r292210, as it broke the Thumb buldbot with:

clang-5.0: error: the clang compiler does not support '-fxray-instrument
on thumbv7-unknown-linux-gnueabihf'.

llvm-svn: 292357
2017-01-18 09:08:43 +00:00
Jonas Paulsson a9bb00d82b [SystemZ] Proper handling of undef flag while expanding pseudo.
During post-RA pseudo expansion, an 'undef' flag of the source operand should
be propagated by emitGRX32Move().

Review: Ulrich Weigand
llvm-svn: 292353
2017-01-18 08:32:54 +00:00
Marina Yatsina 197db00e3e [X86] Fix for bugzilla 31576 - add support for "data32" instruction prefix
This patch fixes bugzilla 31576 (https://llvm.org/bugs/show_bug.cgi?id=31576).

"data32" instruction prefix was not defined in the llvm.
An exception had to be added to the X86 tablegen and AsmPrinter because both "data16" and "data32" are encoded to 0x66 (but in different modes).

Differential Revision: https://reviews.llvm.org/D28468

llvm-svn: 292352
2017-01-18 08:07:51 +00:00
Chandler Carruth 8aaad7c4d9 [LoopDeletion] (cleanup, NFC) Fix one more local variable that didn't
follow LLVM's naming conventions while I'm here.

Again, sorry I didn't spot this earlier to coalesce with other cleanup
changes.

llvm-svn: 292333
2017-01-18 02:43:01 +00:00
Chandler Carruth d50c5fb13f [PM] Teach LoopDeletion to correctly update the LPM when loops are
deleted.

I've expanded its test coverage a bit including adding one test that
will crash clearly without this change.

llvm-svn: 292332
2017-01-18 02:41:26 +00:00
Matt Arsenault f411071d63 DAG: Consider nnan in isKnownNeverNaN
llvm-svn: 292328
2017-01-18 02:10:08 +00:00
Wei Mi ce9d04ce58 Revert rL292292 since it causes a SEGV on sanitizer-x86_64-linux-fuzzer build bot.
llvm-svn: 292327
2017-01-18 01:53:53 +00:00
Kostya Serebryany bb91170cb5 [libFuzzer] remove stale code
llvm-svn: 292325
2017-01-18 01:10:18 +00:00