106963 Commits

Author SHA1 Message Date
Francis Ricci b4e77d98ed Revert "[llvm-dsymutil] Add support for __swift_ast MachO DWARF section"
Breaks aarch64 builders

This reverts commit r315014.

llvm-svn: 315034
2017-10-05 23:09:17 +00:00
Xin Tong 27e66fb579 [MBP] Remove an invalid assert.
The patch that this assert comes with is fixing a bug in MBP. The assert is
invalid however.

Thanks to @sergey.k.okunev for finding this

Currently this fails SPECCPU2006 LTO. I will add a test case when I do more
investigation and have one.

llvm-svn: 315032
2017-10-05 23:00:04 +00:00
Peter Collingbourne 715bcfe0c9 ModuleUtils: Stop using comdat members to generate unique module ids.
It is possible for two modules to define the same set of external
symbols without causing a duplicate symbol error at link time,
as long as each of the symbols is a comdat member. So we cannot
use them as part of a unique id for the module.
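
A minimal sketch of the idea, not LLVM's implementation: the module id is
derived only from external symbols that are not comdat members, and no id is
produced when none qualify. The tuple layout, the function name
unique_module_id, the "$" prefix and the choice of SHA-256 are assumptions made
for illustration.

```python
import hashlib


def unique_module_id(symbols):
    """Return a hash-based id, or None if no symbol can identify the module.

    symbols: iterable of (name, is_external, is_comdat_member) tuples.
    This layout is hypothetical and only models the rule described above.
    """
    names = sorted(
        name
        for name, is_external, is_comdat_member in symbols
        if is_external and not is_comdat_member
    )
    if not names:
        # Only comdat members (or nothing) are defined: two such modules can
        # link together without a duplicate symbol error, so no id is safe.
        return None
    digest = hashlib.sha256()
    for name in names:
        digest.update(name.encode("utf-8") + b"\0")
    return "$" + digest.hexdigest()


# Two modules that both define the same comdat-only symbol get no id at all.
module_a = [("inline_fn", True, True)]
module_b = [("inline_fn", True, True)]
# Modules with distinct non-comdat external symbols get distinct ids.
module_c = [("main", True, False), ("inline_fn", True, True)]
module_d = [("other", True, False), ("inline_fn", True, True)]
```

With comdat members included, module_a and module_b would have hashed to the
same value even though they can legally be linked into one program.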

Differential Revision: https://reviews.llvm.org/D38602

llvm-svn: 315026
2017-10-05 21:54:53 +00:00
Reid Kleckner 676941909d [X86] Extract CATCHRET handling from emitEpilogue, NFC
llvm-svn: 315023
2017-10-05 21:37:39 +00:00
Derek Schuff 885dc59297 [WebAssembly] Add the rest of the atomic loads
Add extending loads and constant offset patterns
A bit more refactoring of the tablegen to make the patterns fairly nice and
uniform between the regular and atomic loads.

Differential Revision: https://reviews.llvm.org/D38523

llvm-svn: 315022
2017-10-05 21:18:42 +00:00
Sanjay Patel 7ac2db6a48 [InstCombine] improve folds for icmp gt/lt (shr X, C1), C2
We can always eliminate the shift in: icmp gt/lt (shr X, C1), C2 --> icmp gt/lt X, C'
This patch was supposed to just be an efficiency improvement because we were doing this 3-step process to fold:

IC: Visiting:   %c = icmp ugt i4 %s, 1
IC: ADD:   %s = lshr i4 %x, 1
IC: ADD:   %1 = udiv i4 %x, 2
IC: Old =   %c = icmp ugt i4 %1, 1
    New =   <badref> = icmp uge i4 %x, 4
IC: ADD:   %c = icmp uge i4 %x, 4
IC: ERASE   %2 = icmp ugt i4 %1, 1
IC: Visiting:   %c = icmp uge i4 %x, 4
IC: Old =   %c = icmp uge i4 %x, 4
    New =   <badref> = icmp ugt i4 %x, 3
IC: ADD:   %c = icmp ugt i4 %x, 3
IC: ERASE   %2 = icmp uge i4 %x, 4
IC: Visiting:   %c = icmp ugt i4 %x, 3
IC: DCE:   %1 = udiv i4 %x, 2
IC: ERASE   %1 = udiv i4 %x, 2
IC: DCE:   %s = lshr i4 %x, 1
IC: ERASE   %s = lshr i4 %x, 1
IC: Visiting:   ret i1 %c

When we could go directly to canonical icmp form:

IC: Visiting:   %c = icmp ugt i4 %s, 1
IC: Old =   %c = icmp ugt i4 %s, 1
    New =   <badref> = icmp ugt i4 %x, 3
IC: ADD:   %c = icmp ugt i4 %x, 3
IC: ERASE   %1 = icmp ugt i4 %s, 1
IC: ADD:   %s = lshr i4 %x, 1
IC: DCE:   %s = lshr i4 %x, 1
IC: ERASE   %s = lshr i4 %x, 1
IC: Visiting:   %c = icmp ugt i4 %x, 3

...but then I noticed that the folds were incomplete too:
https://godbolt.org/g/aB2hLE

Here are attempts to prove the logic with Alive:
https://rise4fun.com/Alive/92o

Name: lshr_ult
Pre: ((C2 << C1) u>> C1) == C2
%sh = lshr i8 %x, C1
%r = icmp ult i8 %sh, C2
  =>
%r = icmp ult i8 %x, (C2 << C1)

Name: ashr_slt
Pre: ((C2 << C1) >> C1) == C2
%sh = ashr i8 %x, C1
%r = icmp slt i8 %sh, C2
  =>
%r = icmp slt i8 %x, (C2 << C1)

Name: lshr_ugt
Pre: (((C2+1) << C1) u>> C1) == (C2+1)
%sh = lshr i8 %x, C1
%r = icmp ugt i8 %sh, C2
  =>
%r = icmp ugt i8 %x, ((C2+1) << C1) - 1

Name: ashr_sgt
Pre: (C2 != 127) && ((C2+1) << C1 != -128) && (((C2+1) << C1) >> C1) == (C2+1)
%sh = ashr i8 %x, C1
%r = icmp sgt i8 %sh, C2
  =>
%r = icmp sgt i8 %x, ((C2+1) << C1) - 1

Name: ashr_exact_sgt
Pre: ((C2 << C1) >> C1) == C2
%sh = ashr exact i8 %x, C1
%r = icmp sgt i8 %sh, C2
  =>
%r = icmp sgt i8 %x, (C2 << C1)

Name: ashr_exact_slt
Pre: ((C2 << C1) >> C1) == C2
%sh = ashr exact i8 %x, C1
%r = icmp slt i8 %sh, C2
  =>
%r = icmp slt i8 %x, (C2 << C1)

Name: lshr_exact_ugt
Pre: ((C2 << C1) u>> C1) == C2
%sh = lshr exact i8 %x, C1
%r = icmp ugt i8 %sh, C2
  =>
%r = icmp ugt i8 %x, (C2 << C1)

Name: lshr_exact_ult
Pre: ((C2 << C1) u>> C1) == C2
%sh = lshr exact i8 %x, C1
%r = icmp ult i8 %sh, C2
  =>
%r = icmp ult i8 %x, (C2 << C1)

We did something similar for 'shl' in D28406.
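
As a cross-check on the Alive scripts above, the non-exact folds can also be
tested exhaustively for i8. This is a brute-force sketch, not part of the
patch; the helper names (signed, count_checked, check_*) are made up for
illustration, and the preconditions are transcribed from the scripts.

```python
BITS = 8
MASK = (1 << BITS) - 1


def signed(v):
    """Interpret an i8 bit pattern as a two's-complement integer."""
    v &= MASK
    return v - (1 << BITS) if v >> (BITS - 1) else v


def shl(v, c):
    return (v << c) & MASK


def lshr(v, c):
    return (v & MASK) >> c


def ashr(v, c):
    return (signed(v) >> c) & MASK


def inc(v):
    """C2 + 1 with i8 wraparound."""
    return (v + 1) & MASK


def count_checked(pre, before, after):
    """Assert before == after for every (x, C1, C2) that satisfies pre."""
    checked = 0
    for c1 in range(BITS):
        for c2 in range(1 << BITS):
            if not pre(c1, c2):
                continue
            for x in range(1 << BITS):
                assert before(x, c1, c2) == after(x, c1, c2), (x, c1, c2)
                checked += 1
    return checked


def check_lshr_ult():
    return count_checked(
        lambda c1, c2: lshr(shl(c2, c1), c1) == c2,
        lambda x, c1, c2: lshr(x, c1) < c2,
        lambda x, c1, c2: x < shl(c2, c1),
    )


def check_lshr_ugt():
    return count_checked(
        lambda c1, c2: lshr(shl(inc(c2), c1), c1) == inc(c2),
        lambda x, c1, c2: lshr(x, c1) > c2,
        lambda x, c1, c2: x > ((shl(inc(c2), c1) - 1) & MASK),
    )


def check_ashr_slt():
    return count_checked(
        lambda c1, c2: ashr(shl(c2, c1), c1) == c2,
        lambda x, c1, c2: signed(ashr(x, c1)) < signed(c2),
        lambda x, c1, c2: signed(x) < signed(shl(c2, c1)),
    )


def check_ashr_sgt():
    return count_checked(
        lambda c1, c2: (
            c2 != 127
            and shl(inc(c2), c1) != 0x80  # bit pattern of -128
            and ashr(shl(inc(c2), c1), c1) == inc(c2)
        ),
        lambda x, c1, c2: signed(ashr(x, c1)) > signed(c2),
        lambda x, c1, c2: signed(x) > signed((shl(inc(c2), c1) - 1) & MASK),
    )
```

Each check_* function returns the number of (x, C1, C2) combinations it
compared and raises AssertionError on the first mismatch.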

Differential Revision: https://reviews.llvm.org/D38514

llvm-svn: 315021
2017-10-05 21:11:49 +00:00
Krzysztof Parzyszek a114941fa8 [Hexagon] Make PS_fi and PS_fia extendable (they both expand to A2_addi)
llvm-svn: 315019
2017-10-05 20:20:06 +00:00
Dehao Chen 16f01fb1db Annotate VP prof on indirect call if it is ICPed in the profiled binary.
Summary: In SamplePGO, when an indirect call is promoted in the profiled binary, before profile annotation, it will be promoted and inlined. For the original indirect call, the current implementation will not mark VP profile on it. This is an issue when profile becomes stale. This patch annotates VP prof on indirect calls during annotation.

Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D38477

llvm-svn: 315016
2017-10-05 20:15:29 +00:00
Francis Ricci 2b513b5c99 [llvm-dsymutil] Add support for __swift_ast MachO DWARF section
Summary:
Xcode's dsymutil emits a __swift_ast DWARF section, which is required for debugging,
and which contains a byte-for-byte dump of the swiftmodule file.
Add this feature to llvm-dsymutil.

Tested with `gobjdump --dwarf=info -s`, by verifying that the contents of
`__DWARF.__swift_ast` match between Xcode's dsymutil and llvm-dsymutil
(Xcode's dwarfdump and llvm-dwarfdump don't currently recognize the
__swift_ast section).

Reviewers: aprantl, friss

Subscribers: llvm-commits, JDevlieghere

Differential Revision: https://reviews.llvm.org/D38504

llvm-svn: 315014
2017-10-05 20:03:01 +00:00
Krzysztof Parzyszek 7ae3ae9ef4 [Hexagon] Give uniform names to functions changing addressing modes, NFC
The new format is changeAddrMode_xx_yy, where xx is the current mode,
and yy is the new one.

Old name:               New name:
getBaseWithImmOffset    changeAddrMode_abs_io
getAbsoluteForm         changeAddrMode_io_abs
getBaseWithRegOffset    changeAddrMode_io_rr
xformRegToImmOffset     changeAddrMode_rr_io
getBaseWithLongOffset   changeAddrMode_rr_ur
getRegShlForm           changeAddrMode_ur_rr

llvm-svn: 315013
2017-10-05 20:01:38 +00:00
Francis Ricci 5f689d0db3 Revert "[llvm-dsymutil] Add support for __swift_ast MachO DWARF section"
This reverts commit r315004, because of a failing test on non-apple platforms

llvm-svn: 315009
2017-10-05 19:47:13 +00:00
Francis Ricci 7767277639 [llvm-dsymutil] Add support for __swift_ast MachO DWARF section
Summary:
Xcode's dsymutil emits a __swift_ast DWARF section, which is required for debugging,
and which contains a byte-for-byte dump of the swiftmodule file.
Add this feature to llvm-dsymutil.

Tested with `gobjdump --dwarf=info -s`, by verifying that the contents of
`__DWARF.__swift_ast` match between Xcode's dsymutil and llvm-dsymutil
(Xcode's dwarfdump and llvm-dwarfdump don't currently recognize the
__swift_ast section).

Reviewers: aprantl, friss

Subscribers: llvm-commits, JDevlieghere

Differential Revision: https://reviews.llvm.org/D38504

llvm-svn: 315004
2017-10-05 19:17:28 +00:00
Davide Italiano e070721308 [NewPassManager] Run global dead code elimination after the inliner.
This is the same exact change we did for the current pass manager
in rL314997, but the new pass manager pipeline already happened
to run GlobalOpt after the inliner, so we just insert a run of
GDCE here.

llvm-svn: 315003
2017-10-05 18:36:01 +00:00
Reid Kleckner 7344282c36 [X86] Simplify X86 epilogue frame size calculation, NFC
Sink the insertion of "pop ebp" out of the frame size calculation
branches. They all check for HasFP.

Our handling of CLEANUPRET and CATCHRET was equivalent, both are
funclets and use the same frame size. We can eliminate the CLEANUPRET
case.

Hoist the hasFP(MF) query into a local bool.

Rename TargetMBB to CatchRetTarget to be more descriptive.

Eliminate the Optional<unsigned> RetOpcode local, now that it has one
use.

It's only a net savings of 10 lines, but hopefully it's *slightly* more
readable.

llvm-svn: 315000
2017-10-05 18:27:08 +00:00
Davide Italiano c8708e59e8 [PassManager] Improve the interaction between -O2 and ThinLTO.
Run GDCE slightly later so that we don't have to repeat it
twice when preparing for Thin. Thanks to Mehdi for the suggestion.

llvm-svn: 314999
2017-10-05 18:23:25 +00:00
Davide Italiano ff829cea8b [PassManager] Run global optimizations after the inliner.
The inliner performs some kind of dead code elimination as it goes,
but there are cases that are not really caught by it. We might
at some point consider teaching the inliner about them, but it
is OK for now to run GlobalOpt + GlobalDCE in tandem as their
benefits generally outweigh the cost, making the whole pipeline
faster.
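
A toy model of why a global DCE run after inlining pays off: once every call
site of an internal function has been inlined, nothing references it and a
reachability sweep can drop it. This is only a sketch over a dict-based call
graph; the function name global_dce and the graph layout are hypothetical and
do not reflect LLVM's GlobalDCE pass.

```python
def global_dce(references, roots):
    """Keep only the globals reachable from the externally visible roots.

    references: dict mapping each global's name to the names it references.
    roots: names that must be kept (for example, externally visible symbols).
    """
    live = set()
    worklist = list(roots)
    while worklist:
        name = worklist.pop()
        if name in live:
            continue
        live.add(name)
        worklist.extend(references.get(name, ()))
    return {name: refs for name, refs in references.items() if name in live}


# After inlining, "main" no longer calls "small_helper", so it becomes dead,
# together with the table that only "small_helper" referenced.
after_inlining = {
    "main": ["print_result"],
    "print_result": [],
    "small_helper": ["lookup_table"],
    "lookup_table": [],
}
remaining = global_dce(after_inlining, roots=["main"])
```

Dropping dead globals early means later passes have less IR to process, which
is the cost/benefit trade-off the message refers to.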

This fixes PR34652.

Differential Revision: https://reviews.llvm.org/D38154

llvm-svn: 314997
2017-10-05 18:06:37 +00:00
Matthew Simpson 49ee814996 [SparsePropagation] Move member definitions to header (NFC)
AbstractLatticeFunction and SparseSolver are class templates parameterized by a
lattice value, so we need to move these member functions over to the header.

Differential Revision: https://reviews.llvm.org/D38561

llvm-svn: 314996
2017-10-05 18:03:30 +00:00
Petar Jovanovic 65f10246bb [mips] implement .set dspr2 directive
Implement .set dspr2 directive with appropriate feature bits. This
directive is a counterpart of -mattr=dspr2 command line option with the
exception that it does not influence elf header flags.

Patch by Milos Stojanovic.

Differential Revision: https://reviews.llvm.org/D38537

llvm-svn: 314994
2017-10-05 17:40:32 +00:00
Matt Arsenault 2d3f8f333d AMDGPU: Set v2i32 any_extend to expand
llvm-svn: 314993
2017-10-05 17:38:30 +00:00
Krzysztof Parzyszek 9f3e88ae64 [RDF] Simplify construction of maximal registers
The old algorithm was not correct, although it worked most of the time.
Avoid the complex reachability analysis and simply calculate the maximal
registers out of the set of all referenced registers.

llvm-svn: 314991
2017-10-05 17:12:49 +00:00
Rong Xu 289da65698 [ProfileData] Fix data racing in merging indexed profiles
There is a data race on the static variable RecordIndex in the indexed profile
reader when merging in multiple threads. Make it a member variable in
IndexedInstrProfReader to fix this.
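
The shape of the fix can be sketched with a toy reader whose cursor lives on
the instance instead of in shared static state, so readers running in different
threads cannot disturb each other. ToyIndexedReader, drain and merge_in_threads
are hypothetical names; this is not the IndexedInstrProfReader API.

```python
import threading


class ToyIndexedReader:
    """Hypothetical stand-in for a profile reader."""

    def __init__(self, records):
        self._records = list(records)
        # Per-instance cursor. A cursor shared by all readers (the analogue of
        # a function-local static) would be advanced by every thread at once.
        self._record_index = 0

    def read_next(self):
        if self._record_index >= len(self._records):
            return None
        record = self._records[self._record_index]
        self._record_index += 1
        return record


def drain(reader, out):
    """Read every record from one reader into its own output list."""
    while True:
        record = reader.read_next()
        if record is None:
            break
        out.append(record)


def merge_in_threads(inputs):
    """Read each input on its own thread and return the per-input results."""
    outputs = [[] for _ in inputs]
    threads = [
        threading.Thread(target=drain, args=(ToyIndexedReader(records), out))
        for records, out in zip(inputs, outputs)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return outputs
```

Because each thread owns its reader and its output list, no locking is needed
in this sketch.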

Differential Revision: https://reviews.llvm.org/D38431

llvm-svn: 314990
2017-10-05 17:05:20 +00:00
Artur Pilipenko 7b15254c8f [X86] Fix chains update when lowering BUILD_VECTOR to a vector load
The code which lowers BUILD_VECTOR of consecutive loads into a single vector
load doesn't update chains properly. As a result the vector load can be
reordered with the store to the same location.

The current code in EltsFromConsecutiveLoads only updates the chain following
the first load. The fix is to update the chains following all the loads
comprising the vector.

This is a fix for PR10114.

Reviewed By: niravd

Differential Revision: https://reviews.llvm.org/D38547

llvm-svn: 314988
2017-10-05 16:28:21 +00:00
Konstantin Zhuravlyov aa0835a7ab AMDGPU: Add and set AMDGPU-specific e_flags
Differential Revision: https://reviews.llvm.org/D38556

llvm-svn: 314987
2017-10-05 16:19:18 +00:00
Ayal Zaks c9e0f886e5 [LV] Fix PR34743 - handle casts that sink after interleaved loads
When ignoring a load that participates in an interleaved group, make sure to
move a cast that needs to sink after it.

Testcase derived from reproducer of PR34743.

Differential Revision: https://reviews.llvm.org/D38338

llvm-svn: 314986
2017-10-05 15:45:14 +00:00
Clement Courbet 922e5bc698 Revert "Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion."""
broken test on windows

This reverts commit c91479518344fd1fc071c5bd5848f6eb83e53dca.

llvm-svn: 314985
2017-10-05 14:42:06 +00:00
Sanjay Patel f11b5b4f87 revert r314698 - [InstCombine] remove one-use restriction for icmp (shr exact X, C1), C2 --> icmp X, (C2<<C1)
There is a bot failure that appears to be related to this change:
http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost-neon/builds/2117

...so reverting to confirm that and attempting to keep the bot green while investigating.

llvm-svn: 314984
2017-10-05 14:26:15 +00:00
Ayal Zaks fc3f7a4f0c [LV] Fix PR34711 - widen instruction ranges when sinking casts
Instead of trying to keep LastWidenRecipe updated after creating each recipe,
have tryToWiden() retrieve the last recipe of the current VPBasicBlock and check
if it's a VPWidenRecipe when attempting to extend its range. This ensures that
such extensions, optimized to maintain the original instruction order, do so
only when the instructions are to maintain their relative order. The latter does
not always hold, e.g., when a cast needs to sink to unravel first order
recurrence (r306884).

Testcase derived from reproducer of PR34711.

Differential Revision: https://reviews.llvm.org/D38339

llvm-svn: 314981
2017-10-05 12:41:49 +00:00
Clement Courbet 4cafbb9b5e Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion.""
llvm-svn: 314980
2017-10-05 12:39:57 +00:00
Simon Dardis 51a7ae2a29 [mips] Place certain 64 bit FPU instructions in their own decoder namespace
Previously, instructions that were defined to use the FGR64 register class
were associated with the Mips64 table which was incorrect.

Reviewers: nitesh.jain, atanasyan

Differential Revision: https://reviews.llvm.org/D38454

llvm-svn: 314976
2017-10-05 10:27:37 +00:00
Karl-Johan Karlsson 8d8d201c17 [DebugInfo] Insert DEBUG_VALUEs after each register redefinition
Summary:
When reinserting debug values after register allocation, make sure to
insert debug values after each redefinition of debug value register in
the slot index range. The reason for this is that DwarfDebug will end
the range of a debug variable when the physical reg is defined. For
instructions with e.g. tied operands this results in a prematurely ended
debug range.

This resolves pr34545

Patch by Karl-Johan Karlsson and Bjorn Pettersson

Reviewers: rnk, aprantl

Reviewed By: rnk

Subscribers: bjope, llvm-commits

Differential Revision: https://reviews.llvm.org/D38229

llvm-svn: 314974
2017-10-05 08:37:31 +00:00
George Rimar b074fbcb48 [MC] - llvm-mc hangs on non-english characters.
Currently llvm-mc just hangs in an infinite loop
while trying to parse a file that contains ".section .с",
where the section name is a non-English character.
This patch fixes the issue.

In this patch I also moved the content of non-english-characters.s
to the test/MC/AsmParser/Inputs folder so that non-english-characters.s
becomes a single testcase for all invalid inputs containing non-English
symbols. That is convenient because llvm-mc otherwise tries
to parse and tokenize the whole testcase file, including the tool
invocations, which makes it harder to isolate the issue.

Differential revision: https://reviews.llvm.org/D38545

llvm-svn: 314973
2017-10-05 08:15:55 +00:00
Clement Courbet 6603fc0e7b Revert "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion."
Breaks
clang-stage1-cmake-RA-incremental/llvm/test/Transforms/MergeICmps/X86/tuple-four-int8.ll

This reverts commit 3038c459d67f8898ffa295d54a013b280690abfa.

llvm-svn: 314972
2017-10-05 08:03:39 +00:00
Craig Topper 17b0c78447 [InstCombine] Fix a vector splat handling bug in transformZExtICmp.
We were using an i1 type and then zero extending to a vector. Instead just create the 0/1 directly as a ConstantInt with the correct type. No need to ask ConstantExpr to zero extend for us.

This bug is a bit tricky to hit because it requires us to visit a zext of an icmp that would normally be simplified to true/false, but that icmp hasn't been visited yet. In the test case this zext and icmp were created by visiting a udiv and due to worklist ordering we got to the zext first.

Fixes PR34841.

llvm-svn: 314971
2017-10-05 07:59:11 +00:00
Clement Courbet 902eef32eb [MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion.
Summary: This is to avoid e.g. merging two cheap icmps if the target is not going to expand to something nice later.

Reviewers: dberlin, spatel

Subscribers: davide, nemanjai

Differential Revision: https://reviews.llvm.org/D38232

llvm-svn: 314970
2017-10-05 07:49:09 +00:00
Mikael Holmen 0ec1d25d33 Minor refactoring regarding Cast::isNoopCast(), NFC
Summary:
FastISel::hasTrivialKill() was the only user of the "IntPtrTy" version of
Cast::isNoopCast(). According to review comments in D37894 we could instead
use the "DataLayout" version of the method, and thus get rid of the
"IntPtrTy" versions of isNoopCast() completely.

With the above done, the remaining isNoopCast() could then be simplified
a bit more.

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D38497

llvm-svn: 314969
2017-10-05 07:07:09 +00:00
Dean Michael Berris 0a465d7a01 [XRay][tools] Support arg1 logging entries in the basic logging mode
Summary:
The arg1 logging handler changed in compiler-rt to start writing a
different type for entries encountered when logging the first argument
of XRay-instrumented functions. This change allows the trace loader to
support reading these record types as well as prepare for when the
basic (naive) mode implementation starts writing down the argument
payloads.

Without this change, binaries with arg1 logging support enabled start
writing unreadable logs for any of the XRay tracing tools.

Reviewers: pelikan

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38550

llvm-svn: 314967
2017-10-05 05:18:17 +00:00
Xinliang David Li 04ab11a08a Revert r314928 to investigate thinLTO bootstrap failure
llvm-svn: 314961
2017-10-05 01:40:13 +00:00
Eugene Zelenko 60433b682f [X86] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 314953
2017-10-05 00:33:50 +00:00
Matt Arsenault f48e5c9ce5 AMDGPU: Add comment about clamps
llvm-svn: 314952
2017-10-05 00:13:20 +00:00
Matt Arsenault aafff87dda AMDGPU: Do not fold clamp instructions when sources are different
Patch by hakzsam (Samuel Pitoiset)

llvm-svn: 314951
2017-10-05 00:13:17 +00:00
Craig Topper 7a93092399 [InstCombine] Improve support for ashr in foldICmpAndShift
We can support ashr similar to lshr, if we know that none of the shifted in bits are used. In that case SimplifyDemandedBits would normally convert it to lshr. But that conversion doesn't happen if the shift has additional users.
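
The key observation can be checked exhaustively for i8: ashr and lshr differ
only in the bits that are shifted in at the top, so masking those bits away
makes the two shifts indistinguishable. The sketch below is an illustration,
not the InstCombine code; all helper names are made up.

```python
BITS = 8
MASK = (1 << BITS) - 1


def lshr(x, c):
    return (x & MASK) >> c


def ashr(x, c):
    x &= MASK
    as_signed = x - (1 << BITS) if x >> (BITS - 1) else x
    return (as_signed >> c) & MASK


def shifted_in_bits(c):
    """Mask of the top c bit positions, the ones a right shift by c fills in."""
    return (MASK << (BITS - c)) & MASK


def count_agreeing_cases():
    """Compare (ashr x, c) & m with (lshr x, c) & m whenever m ignores the
    shifted-in bits; raise AssertionError on the first disagreement."""
    checked = 0
    for c in range(BITS):
        for and_mask in range(1 << BITS):
            if and_mask & shifted_in_bits(c):
                continue  # the mask uses a shifted-in bit, so the fold is off
            for x in range(1 << BITS):
                assert ashr(x, c) & and_mask == lshr(x, c) & and_mask
                checked += 1
    return checked
```

When the mask does overlap the shifted-in bits the two shifts can disagree,
for example x = 0x80, shift 1, mask 0x80.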

Differential Revision: https://reviews.llvm.org/D38521

llvm-svn: 314945
2017-10-04 23:06:13 +00:00
Matt Arsenault 9ab1fa6803 AMDGPU: Fix not accounting for instruction size in bundles
These were counted as 0. Fixes branch limit exceeded errors
in some large programs.
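
A toy illustration of the accounting problem, assuming a bundle is represented
as a header entry that wraps its member instructions. The dict layout and
function names are hypothetical and unrelated to LLVM's MachineInstr API.

```python
def instr_size(instr):
    """Size in bytes of one entry; a bundle is as large as its members."""
    members = instr.get("bundled")
    if members:
        return sum(instr_size(member) for member in members)
    return instr["size"]


def naive_instr_size(instr):
    """The buggy accounting: a bundle header reports its own size, which is 0."""
    return instr["size"]


def block_size(block, size_of=instr_size):
    return sum(size_of(instr) for instr in block)


block = [
    {"size": 4},
    {"size": 0, "bundled": [{"size": 8}, {"size": 4}]},
    {"size": 8},
]
correct = block_size(block)                       # counts the bundled bytes
underestimate = block_size(block, naive_instr_size)  # treats the bundle as empty
```

An underestimated block size makes a branch look closer to its target than it
is, which is how a branch can end up exceeding its range without being relaxed.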

llvm-svn: 314944
2017-10-04 22:59:12 +00:00
Konstantin Zhuravlyov 8684f7b4f9 AMDGPU: Correctly set EI_OSABI based on the os
Differential Revision: https://reviews.llvm.org/D38555

llvm-svn: 314943
2017-10-04 22:44:13 +00:00
Adrian Prantl b4a67907b7 clang-format file.
llvm-svn: 314942
2017-10-04 22:26:19 +00:00
Adrian Prantl 617a007b7c delete commented out code.
llvm-svn: 314941
2017-10-04 22:26:19 +00:00
Sanjoy Das 005b88c0a6 Do not call Loop::getName on possibly dead loops
This fixes PR34832.

llvm-svn: 314938
2017-10-04 22:02:27 +00:00
Xin Tong d8d97972de [MachineBlockPlacement] Make sure PreferredLoopExit is cleared every time a new loop is processed
Summary: Rotate on exit that actually exits the current loop.

Reviewers: davidxl, danielcdh, iteratee, chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38563

llvm-svn: 314937
2017-10-04 21:39:25 +00:00
Hans Wennborg 899809d531 Fix a -Wparentheses warning. NFC.
llvm-svn: 314936
2017-10-04 21:14:07 +00:00
Marcello Maggioni df3e71e037 [LoopDeletion] Move deleteDeadLoop to LoopUtils. NFC
llvm-svn: 314934
2017-10-04 20:42:46 +00:00
Rafael Espindola 8c0ff9508d Bring r314809 back.
But now include a check for CPU_COUNT so we still build on 10 year old
versions of glibc.

Original message:

Use sched_getaffinity instead of std::thread::hardware_concurrency.

The issue with std::thread::hardware_concurrency is that it forwards
to libc and some implementations (like glibc) don't take thread
affinity into consideration.

With this change a llvm program that can execute in only 2 cores will
use 2 threads, even if the machine has 32 cores.

This makes benchmarking a lot easier, but should also help if someone
doesn't want to use all cores for compilation for example.
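
The same distinction exists outside C++. As a rough sketch of the behaviour
described above (not the code in this commit), Python exposes both numbers:
os.cpu_count() reports the CPUs in the machine, while os.sched_getaffinity(0)
reports the CPUs the current process may run on. The function name
usable_cpu_count is made up for illustration.

```python
import os


def usable_cpu_count():
    """Number of CPUs this process is allowed to run on.

    os.sched_getaffinity is only available on some platforms (notably Linux),
    so fall back to the machine-wide count elsewhere.
    """
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


if __name__ == "__main__":
    print("CPUs in the machine:", os.cpu_count())
    print("CPUs usable by this process:", usable_cpu_count())
```

Running the script under something like `taskset -c 0,1` on Linux should show
the second number drop to 2 while the first stays at the machine total.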

llvm-svn: 314931
2017-10-04 20:27:01 +00:00