Commit Graph

45351 Commits

Author SHA1 Message Date
Jinsong Ji 9ddb625890 [NFC] Update renamed option in comments
c98ebda325 Rename fp-op fusion option (yet
again) for compatibility with GCC option.

The comment in the header should be updated too to avoid confusion.
2021-06-15 19:44:31 +00:00
Duncan P. N. Exon Smith 3302af9d4c Support: Remove F_{None,Text,Append} compatibility synonyms, NFC
Remove the compatibility spellings of `OF_{None,Text,Append}` that
were left behind by 1f67a3cba9.

No functionality change here, just an API cleanup.

Differential Revision: https://reviews.llvm.org/D101506
2021-06-15 12:04:09 -07:00
Roman Lebedev e52364532a
[NewPM] Remove SpeculateAroundPHIs pass
Addition of this pass has been botched.
There is no particular reason why it had to be sold as an inseparable part
of new-pm transition. It was added when old-pm was still the default,
and very *very* few users were actually tracking new-pm,
so it's effects weren't measured.

Which means, some of the turnoil of the new-pm transition
are actually likely regressions due to this pass.

Likewise, there has been a number of post-commit feedback
(post new-pm switch), namely
* https://reviews.llvm.org/D37467#2787157 (regresses HW-loops)
* https://reviews.llvm.org/D37467#2787259 (should not be in middle-end, should run after LSR, not before)
* https://reviews.llvm.org/D95789 (an attempt to fix bad loop backedge metadata)
and in the half year past, the pass authors (google) still haven't found time to respond to any of that.

Hereby it is proposed to backout the pass from the pipeline,
until someone who cares about it can address the issues reported,
and properly start the process of adding a new pass into the pipeline,
with proper performance evaluation.

Furthermore, neither google nor facebook reports any perf changes
from this change, so i'm dropping the pass completely.
It can always be re-reverted should/if anyone want to pick it up again.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D104099
2021-06-15 20:35:55 +03:00
Lang Hames 0672d5d104 [ORC] Fix missing std::move. 2021-06-15 21:42:58 +10:00
Lang Hames 48fb8ecf44 [ORC] Fix narrowing-in-initializer-list warnings. 2021-06-15 21:39:16 +10:00
Lang Hames 5a28bdeeb6 [ORC] Fix missing function in unit test. 2021-06-15 21:39:00 +10:00
Lang Hames 5188b9af84 [ORC] Make WrapperFunctionResult's ValuePtr member non-const.
The const qualifier was a hangover from an earlier iteration that allowed
wrapper functions to return pointers to const memory. This feature has
been removed, so there's no reason for this to be const any more, and
removing it eliminates const-cast warnings.
2021-06-15 21:24:12 +10:00
Lang Hames 4eb9fe2e1a [ORC] Port WrapperFunctionUtils and SimplePackedSerialization from ORC runtime.
Replace the existing WrapperFunctionResult type in
llvm/include/ExecutionEngine/Orc/Shared/TargetProcessControlTypes.h with a
version adapted from the ORC runtime's implementation.

Also introduce the SimplePackedSerialization scheme (also adapted from the ORC
runtime's implementation) for wrapper functions to avoid manual serialization
and deserialization for calls to runtime functions involving common types.
2021-06-15 21:13:57 +10:00
Neil Henning 1540da3b78 ABI breaking changes fixes.
This commit mostly just replaces bad uses of `NDEBUG` with uses of
`LLVM_ENABLE_ABI_BREAKING_CHANGES` - the safe way to include ABI
breaking changes (normally extra struct elements in headers).

Differential Revision: https://reviews.llvm.org/D104216
2021-06-15 11:08:13 +01:00
Jay Foad f5dc511c53 [IR] Remove forward declaration of GraphTraits from Type.h
This has been unnecessary since r352353 removed GraphTraits
specializations for Type, except that a couple of other headers were
accidentally relying on this declaration.

Differential Revision: https://reviews.llvm.org/D104119
2021-06-15 09:23:45 +01:00
Adrian Prantl 035217ff51 Allow signposts to take advantage of deferred string substitution
One nice feature of the os_signpost API is that format string
substitutions happen in the consumer, not the logging
application. LLVM's current Signpost class doesn't take advantage of
this though and instead always uses a static "Begin/End %s" format
string.

This patch uses variadic macros to allow the API to be used as
intended. Unfortunately, the primary use-case I had in mind (the
LLDB_SCOPED_TIMER() macro) does not get much better from this, because
__PRETTY_FUNCTION__ is *not* a macro, but a static string, so
signposts created by LLDB_SCOPED_TIMER() still use a static "%s"
format string. At least LLDB_SCOPED_TIMERF() works as intended.

This reapplies the previously reverted patch with additional include
order fixes for non-modular builds of LLDB.

Differential Revision: https://reviews.llvm.org/D103575
2021-06-14 16:53:41 -07:00
Adrian Prantl 7a7c00761f Revert "Allow signposts to take advantage of deferred string substitution"
This reverts commit 03841edde7.

Unfortunately this still breaks the LLDB standalone bot.
2021-06-14 16:09:04 -07:00
Adrian Prantl 03841edde7 Allow signposts to take advantage of deferred string substitution
One nice feature of the os_signpost API is that format string
substitutions happen in the consumer, not the logging
application. LLVM's current Signpost class doesn't take advantage of
this though and instead always uses a static "Begin/End %s" format
string.

This patch uses variadic macros to allow the API to be used as
intended. Unfortunately, the primary use-case I had in mind (the
LLDB_SCOPED_TIMER() macro) does not get much better from this, because
__PRETTY_FUNCTION__ is *not* a macro, but a static string, so
signposts created by LLDB_SCOPED_TIMER() still use a static "%s"
format string. At least LLDB_SCOPED_TIMERF() works as intended.

This reapplies the previsously reverted patch with additional MachO.h
macro #undefs.

Differential Revision: https://reviews.llvm.org/D103575
2021-06-14 14:19:41 -07:00
wlei 863184dd69 [CSSPGO] Aggregation by the last K context frames for cold profiles
This change provides the option to merge and aggregate cold context by the last k frames instead of context-less name. By default K = 1 means the context-less one.

This is for better perf tuning. The more selective merging and trimming will rely on llvm-profgen's preinliner.

Reviewed By: wenlei, hoy

Differential Revision: https://reviews.llvm.org/D104131
2021-06-14 10:33:43 -07:00
zhijian 7ed515d168 [AIX][XCOFF] emit vector info of traceback table.
Summary:

emit vector info of traceback table.

Reviewers: Jason Liu,Hubert Tong
Differential Revision: https://reviews.llvm.org/D93659
2021-06-14 11:15:22 -04:00
Florian Hahn d767d1dd2c
[ADT] Use unnamed argument for unused arg in StringMapEntryStorage.
This silences an 'unsused argument' warning.

Similar to c2006f857d.
2021-06-14 15:54:57 +01:00
Jeroen Dobbelaere bb8ce25e88 Intrinsic::getName: require a Module argument
Ensure that we provide a `Module` when checking if a rename of an intrinsic is necessary.

This fixes the issue that was detected by https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=32288
(as mentioned by @fhahn), after committing D91250.

Note that the `LLVMIntrinsicCopyOverloadedName` is being deprecated in favor of `LLVMIntrinsicCopyOverloadedName2`.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D99173
2021-06-14 14:52:29 +02:00
Guillaume Chatelet 1d49e5352f [llvm] remove Sequence::asSmallVector()
There's no need for `toSmallVector()` as `SmallVector.h` already provides a `to_vector` free function that takes a range.

Reviewed By: Quuxplusone

Differential Revision: https://reviews.llvm.org/D104024
2021-06-14 08:28:05 +00:00
Simon Moll 74d45b884c [VP] Binary floating-point intrinsics.
This patch implements vector-predicated intrinsics on IR level for fadd,
fsub, fmul, fdiv and frem.  There operate in the default floating-point
environment. We will use constrained fp operand bundles for constrained
vector-predicated fp math (D93455).

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D93470
2021-06-14 08:51:41 +02:00
Xuanda Yang e0bb502064 [LLParser] Remove outdated deplibs
The comment mentions deplibs should be removed in 4.0. Removing it in this patch.

Reviewed By: compnerd, dexonsmith, lattner

Differential Revision: https://reviews.llvm.org/D102763
2021-06-14 12:46:12 +08:00
RamNalamothu 167e7afcd5 Implement DW_CFA_LLVM_* for Heterogeneous Debugging
Add support in MC/MIR for writing/parsing, and DebugInfo.

This is part of the Extensions for Heterogeneous Debugging defined at
https://llvm.org/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.html

Specifically the CFI instructions implemented here are defined at
https://llvm.org/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.html#cfa-definition-instructions

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D76877
2021-06-14 08:51:50 +05:30
Simon Pilgrim 4089e0bbfa RawError.h - remove unused <string> include. NFCI. 2021-06-13 17:32:57 +01:00
Simon Pilgrim 033e594c59 DIPrinter.h - tidy implicit header dependencies. NFCI.
We don't use <string> but we do use std::unique_ptr (<memory>) and llvm::Optional<>
2021-06-13 17:00:15 +01:00
Simon Pilgrim 3dc727e81b ProfiledCallGraph.h - remove unused <string> include. NFCI. 2021-06-13 15:19:25 +01:00
Florian Hahn b4583a5ad7
Revert "Allow signposts to take advantage of deferred string substitution"
This reverts commit 4fc93a3a1f because it
breaks LLDB builds on certain macOS platform & SDK combinations, e.g.
http://green.lab.llvm.org/green/job/lldb-cmake-standalone/3288/consoleFull#-195476041949ba4694-19c4-4d7e-bec5-911270d8a58c
2021-06-12 12:08:25 +01:00
Florian Hahn 5cd66420cc
Revert "[X86FixupLEAs] Transform the sequence LEA/SUB to SUB/SUB"
This reverts commit 1b748faf2b because it
breaks building the llvm-test-suite with -verify-machineinstrs on X86:
http://green.lab.llvm.org/green/job/test-suite-verify-machineinstrs-x86_64-O3/9585/

Running llc -verify-machineinstr on X86 crashes on the IR below:

    target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"

    %struct.widget = type { i32, i32, i32, i32, i32*, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, [16 x [16 x i16]], [6 x [32 x i32]], [16 x [16 x i32]], [4 x [12 x [4 x [4 x i32]]]], [16 x i32], i8**, i32*, i32***, i32**, i32, i32, i32, i32, %struct.baz*, %struct.wobble.1*, i32, i32, i32, i32, i32, i32, %struct.quux.2*, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, [3 x i32], i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32***, i32***, i32****, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, [3 x [2 x i32]], [3 x [2 x i32]], i32, i32, i64, i64, %struct.zot.3, %struct.zot.3, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32 }
    %struct.baz = type { i32, i32, i32, i32, i32, i32, i32, i32, i32, %struct.snork*, %struct.wombat.0*, %struct.wobble*, i32, i32*, i32*, i32*, i32, i32*, i32*, i32*, i32 (%struct.widget*, %struct.eggs*)*, i32, i32, i32, i32 }
    %struct.snork = type { %struct.spam*, %struct.zot, i32 (%struct.wombat*, %struct.widget*, %struct.snork*)* }
    %struct.spam = type { i32, i32, i32, i32, i8*, i32 }
    %struct.zot = type { i32, i32, i32, i32, i32, i8*, i32* }
    %struct.wombat = type { i32, i32, i32, i32, i32, i32, i32, i32, void (i32, i32, i32*, i32*)*, void (%struct.wombat*, %struct.widget*, %struct.zot*)* }
    %struct.wombat.0 = type { [4 x [11 x %struct.quux]], [2 x [9 x %struct.quux]], [2 x [10 x %struct.quux]], [2 x [6 x %struct.quux]], [4 x %struct.quux], [4 x %struct.quux], [3 x %struct.quux] }
    %struct.quux = type { i16, i8 }
    %struct.wobble = type { [2 x %struct.quux], [4 x %struct.quux], [3 x [4 x %struct.quux]], [10 x [4 x %struct.quux]], [10 x [15 x %struct.quux]], [10 x [15 x %struct.quux]], [10 x [5 x %struct.quux]], [10 x [5 x %struct.quux]], [10 x [15 x %struct.quux]], [10 x [15 x %struct.quux]] }
    %struct.eggs = type { [1000 x i8], [1000 x i8], [1000 x i8], i32, i32, i32, i32, i32, i32, i32, i32 }
    %struct.wobble.1 = type { i32, [2 x i32], i32, i32, %struct.wobble.1*, %struct.wobble.1*, i32, [2 x [4 x [4 x [2 x i32]]]], i32, i64, i64, i32, i32, [4 x i8], [4 x i8], i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32 }
    %struct.quux.2 = type { i32, i32, i32, i32, i32, %struct.quux.2* }
    %struct.zot.3 = type { i64, i16, i16, i16 }

    define void @blam(%struct.widget* %arg, i32 %arg1) local_unnamed_addr {
    bb:
      %tmp = load i32, i32* undef, align 4
      %tmp2 = sdiv i32 %tmp, 6
      %tmp3 = sdiv i32 undef, 6
      %tmp4 = load i32, i32* undef, align 4
      %tmp5 = icmp eq i32 %tmp4, 4
      %tmp6 = select i1 %tmp5, i32 %tmp3, i32 %tmp2
      %tmp7 = getelementptr inbounds [4 x [4 x i32]], [4 x [4 x i32]]* undef, i64 0, i64 0, i64 0
      %tmp8 = zext i16 undef to i32
      %tmp9 = zext i16 undef to i32
      %tmp10 = load i16, i16* undef, align 2
      %tmp11 = zext i16 %tmp10 to i32
      %tmp12 = zext i16 undef to i32
      %tmp13 = zext i16 undef to i32
      %tmp14 = zext i16 undef to i32
      %tmp15 = load i16, i16* undef, align 2
      %tmp16 = zext i16 %tmp15 to i32
      %tmp17 = zext i16 undef to i32
      %tmp18 = sub nsw i32 %tmp8, %tmp9
      %tmp19 = shl nsw i32 undef, 1
      %tmp20 = add nsw i32 %tmp19, %tmp18
      %tmp21 = sub nsw i32 %tmp11, %tmp12
      %tmp22 = shl nsw i32 undef, 1
      %tmp23 = add nsw i32 %tmp22, %tmp21
      %tmp24 = sub nsw i32 %tmp13, %tmp14
      %tmp25 = shl nsw i32 undef, 1
      %tmp26 = add nsw i32 %tmp25, %tmp24
      %tmp27 = sub nsw i32 %tmp16, %tmp17
      %tmp28 = shl nsw i32 undef, 1
      %tmp29 = add nsw i32 %tmp28, %tmp27
      %tmp30 = sub nsw i32 %tmp20, %tmp29
      %tmp31 = sub nsw i32 %tmp23, %tmp26
      %tmp32 = shl nsw i32 %tmp30, 1
      %tmp33 = add nsw i32 %tmp32, %tmp31
      store i32 %tmp33, i32* undef, align 4
      %tmp34 = mul nsw i32 %tmp31, -2
      %tmp35 = add nsw i32 %tmp34, %tmp30
      store i32 %tmp35, i32* undef, align 4
      %tmp36 = select i1 %tmp5, i32 undef, i32 undef
      br label %bb37

    bb37:                                             ; preds = %bb
      %tmp38 = load i32, i32* undef, align 4
      %tmp39 = ashr i32 %tmp38, %tmp6
      %tmp40 = load i32, i32* undef, align 4
      %tmp41 = sdiv i32 %tmp39, %tmp40
      store i32 %tmp41, i32* undef, align 4
      ret void
    }
2021-06-12 11:41:38 +01:00
spupyrev 0a0800c4d1 A post-processing for BFI inference
The current implementation for computing relative block frequencies does
not handle correctly control-flow graphs containing irreducible loops. This
results in suboptimally generated binaries, whose perf can be up to 5%
worse than optimal.

To resolve the problem, we apply a post-processing step, which iteratively
updates block frequencies based on the frequencies of their predesessors.
This corresponds to finding the stationary point of the Markov chain by
an iterative method aka "PageRank computation". The algorithm takes at
most O(|E| * IterativeBFIMaxIterations) steps but typically converges faster.

It is turned on by passing option `use-iterative-bfi-inference`
and applied only for functions containing profile data and irreducible loops.

Tested on SPEC06/17, where it is helping to get correct profile counts for one of
the binaries (403.gcc). In prod binaries, we've seen a speedup of up to 2%-5%
for binaries containing functions with hot irreducible loops.

Reviewed By: hoy, wenlei, davidxl

Differential Revision: https://reviews.llvm.org/D103289
2021-06-11 21:46:04 -07:00
Adrian Prantl 4fc93a3a1f Allow signposts to take advantage of deferred string substitution
One nice feature of the os_signpost API is that format string
substitutions happen in the consumer, not the logging
application. LLVM's current Signpost class doesn't take advantage of
this though and instead always uses a static "Begin/End %s" format
string.

This patch uses variadic macros to allow the API to be used as
intended. Unfortunately, the primary use-case I had in mind (the
LLDB_SCOPED_TIMER() macro) does not get much better from this, because
__PRETTY_FUNCTION__ is *not* a macro, but a static string, so
signposts created by LLDB_SCOPED_TIMER() still use a static "%s"
format string. At least LLDB_SCOPED_TIMERF() works as intended.

This reapplies the previsously reverted patch with support for
platforms where signposts are unavailable.

Differential Revision: https://reviews.llvm.org/D103575
2021-06-11 16:52:34 -07:00
Adrian Prantl b90f9bea96 Revert "Allow signposts to take advantage of deferred string substitution"
I forgot to make the LLDB macro conditional on Linux.

This reverts commit 541ccd1c1b.
2021-06-11 16:46:34 -07:00
Andrew Litteken f6dea2e732 [IRSim] Strip out the findSimilarity call from the constructor
Both doInitialize and runOnModule were running the entire analysis
due to the actual work being done in the constructor. Strip it out here
and only get the similarity during runOnModule.

Author: lanza
Reviewers: AndrewLitteken, paquette, plofti

Differential Revision: https://reviews.llvm.org/D92524
2021-06-11 18:41:28 -05:00
Adrian Prantl 541ccd1c1b Allow signposts to take advantage of deferred string substitution
One nice feature of the os_signpost API is that format string
substitutions happen in the consumer, not the logging
application. LLVM's current Signpost class doesn't take advantage of
this though and instead always uses a static "Begin/End %s" format
string.

This patch uses variadic macros to allow the API to be used as
intended. Unfortunately, the primary use-case I had in mind (the
LLDB_SCOPED_TIMER() macro) does not get much better from this, because
__PRETTY_FUNCTION__ is *not* a macro, but a static string, so
signposts created by LLDB_SCOPED_TIMER() still use a static "%s"
format string. At least LLDB_SCOPED_TIMERF() works as intended.

Differential Revision: https://reviews.llvm.org/D103575
2021-06-11 16:35:43 -07:00
Daniil Fukalov 79ffbc9c9f [NFC][CostModel] Fixed comment that comparisons work regardless of the state.
Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D104068
2021-06-11 23:48:49 +03:00
Kevin Athey e0b469ffa1 [clang-cl][sanitizer] Add -fsanitize-address-use-after-return to clang.
Also:
  - add driver test (fsanitize-use-after-return.c)
  - add basic IR test (asan-use-after-return.cpp)
  - (NFC) cleaned up logic for generating table of __asan_stack_malloc
    depending on flag.

for issue: https://github.com/google/sanitizers/issues/1394

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D104076
2021-06-11 12:07:35 -07:00
Matt Arsenault 93f3c7cc3e CodeGen: Fix missing const 2021-06-11 13:45:24 -04:00
Simon Pilgrim 0fc4016d91 APInt.h - add missing <utility> header.
Some buildbots are complaining about std::move() after rG61cdaf66fe22be2b5942ddee4f46a998b4f3ee29
2021-06-11 13:35:12 +01:00
Simon Pilgrim 61cdaf66fe [ADT] Remove APInt/APSInt toString() std::string variants
<string> is currently the highest impact header in a clang+llvm build:

https://commondatastorage.googleapis.com/chromium-browser-clang/llvm-include-analysis.html

One of the most common places this is being included is the APInt.h header, which needs it for an old toString() implementation that returns std::string - an inefficient method compared to the SmallString versions that it actually wraps.

This patch replaces these APInt/APSInt methods with a pair of llvm::toString() helpers inside StringExtras.h, adjusts users accordingly and removes the <string> from APInt.h - I was hoping that more of these users could be converted to use the SmallString methods, but it appears that most end up creating a std::string anyhow. I avoided trying to use the raw_ostream << operators as well as I didn't want to lose having the integer radix explicit in the code.

Differential Revision: https://reviews.llvm.org/D103888
2021-06-11 13:19:15 +01:00
Fraser Cormack 71a02ddda1 [VP][NFC] Format comment to 80 columns 2021-06-11 12:53:48 +01:00
Bing1 Yu 56d5c46b49 [X86] Support __tile_stream_loadd intrinsic for new AMX interface
Adding support for __tile_stream_loadd intrinsic.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D103784
2021-06-11 17:28:43 +08:00
Simon Pilgrim f0a68bbc96 SampleProf.h - fix spelling mistake in assert message. NFC. 2021-06-11 10:24:14 +01:00
Simon Pilgrim 5e6bfb661e [Analysis] Pass RecurrenceDescriptor as const reference. NFCI.
We were passing the RecurrenceDescriptor by value to most of the reduction analysis methods, despite it being rather bulky with TrackingVH members (that can be costly to copy). In all these cases we're only using the RecurrenceDescriptor for rather basic purposes (access to types/kinds etc.).

Differential Revision: https://reviews.llvm.org/D104029
2021-06-11 10:24:14 +01:00
Sjoerd Meijer c4a0969b9c Function Specialization Pass
This adds a function specialization pass to LLVM. Constant parameters
like function pointers and constant globals are propagated to the callee by
specializing the function.

This is a first version with a number of limitations:
- The pass is off by default, so needs to be enabled on the command line,
- It does not handle specialization of recursive functions,
- It does not yet handle constants and constant ranges,
- Only 1 argument per function is specialised,
- The cost-model could be further looked into, and perhaps related,
- We are not yet caching analysis results.

This is based on earlier work by Matthew Simpson (D36432) and Vinay Madhusudan.
More recently this was also discussed on the list, see:

https://lists.llvm.org/pipermail/llvm-dev/2021-March/149380.html.

The motivation for this work is that function specialisation often comes up as
a reason for performance differences of generated code between LLVM and GCC,
which has this enabled by default from optimisation level -O3 and up. And while
this certainly helps a few cpu benchmark cases, this also triggers in real
world codes and is thus a generally useful transformation to have in LLVM.

Function specialisation has great potential to increase compile-times and
code-size.  The summary from some investigations with this patch is:
- Compile-time increases for short compile jobs is high relatively, but the
  increase in absolute numbers still low.
- For longer compile-jobs, the extra compile time is around 1%, and very much
  in line with GCC.
- It is difficult to blame one thing for compile-time increases: it looks like
  everywhere a little bit more time is spent processing more functions and
  instructions.
- But the function specialisation pass itself is not very expensive; it doesn't
  show up very high in the profile of the optimisation passes.

The goal of this work is to reach parity with GCC which means that eventually
we would like to get this enabled by default. But first we would like to address
some of the limitations before that.

Differential Revision: https://reviews.llvm.org/D93838
2021-06-11 09:11:29 +01:00
Carl Ritson 2c2d2922a2 [ValueTypes] Define MVTs for v6i32, v6f32, v7i32, v7f32
For use in AMDGPU selection DAG.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D103881
2021-06-11 08:58:16 +09:00
Nick Desaulniers fc018ebb60 [IR] make -warn-frame-size into a module attr
-Wframe-larger-than= is an interesting warning; we can't know the frame
size until PrologueEpilogueInsertion (PEI); very late in the compilation
pipeline.

-Wframe-larger-than= was propagated through CC1 as an -mllvm flag, then
was a cl::opt in LLVM's PEI pass; this meant it was dropped during LTO
and needed to be re-specified via -plugin-opt.

Instead, make it part of the IR proper as a module level attribute,
similar to D103048. Introduce -fwarn-stack-size CC1 option.

Reviewed By: rsmith, qcolombet

Differential Revision: https://reviews.llvm.org/D103928
2021-06-10 16:15:27 -07:00
Jessica Paquette 933df6ca79 [AArch64][GlobalISel] Legalize scalar G_CTTZ + G_CTTZ_ZERO_UNDEF
This adds legalization for scalar G_CTTZ and G_CTTZ_ZERO_UNDEF. Vector support
requires handling vector G_BITREVERSE, which I haven't gotten around to yet.

For G_CTTZ_ZERO_UNDEF, we just lower it to G_CTTZ.

For G_CTTZ, we match SelectionDAG's lowering to a G_BITREVERSE + G_CTLZ.

e.g. https://godbolt.org/z/nPEseYh1s

(With this patch, we have slightly worse codegen than SDAG for types smaller
than s32; it seems like we're missing a combine.)

Also, this adds in a function to build G_BITREVERSE to MachineIRBuilder.

Differential Revision: https://reviews.llvm.org/D104065
2021-06-10 15:29:51 -07:00
Joachim Meyer 4f01122c3f [LV] Parallel annotated loop does not imply all loads can be hoisted.
As noted in https://bugs.llvm.org/show_bug.cgi?id=46666, the current behavior of assuming if-conversion safety if a loop is annotated parallel (`!llvm.loop.parallel_accesses`), is not expectable, the documentation for this behavior was since removed from the LangRef again, and can lead to invalid reads.
This was observed in POCL (https://github.com/pocl/pocl/issues/757) and would require similar workarounds in current work at hipSYCL.

The question remains why this was initially added and what the implications of removing this optimization would be.
Do we need an alternative mechanism to propagate the information about legality of if-conversion?
Or is the idea that conditional loads in `#pragma clang loop vectorize(assume_safety)` can be executed unmasked without additional checks flawed in general?
I think this implication is not part of what a user of that pragma (and corresponding metadata) would expect and thus dangerous.

Only two additional tests failed, which are adapted in this patch. Depending on the further direction force-ifcvt.ll should be removed or further adapted.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D103907
2021-06-10 23:37:57 +02:00
Philip Reames 7629b2a09c [LI] Add a cover function for checking if a loop is mustprogress [nfc]
Essentially, the cover function simply combines the loop level check and the function level scope into one call.  This simplifies several callers and is (subjectively) less error prone.
2021-06-10 13:37:32 -07:00
Philip Reames b6ee5f2b1d Move code for checking loop metadata into Analysis [nfc]
I need the mustprogress loop metadata in ScalarEvolution and it makes sense to keep all the accessors for quering loop metadate together.
2021-06-10 13:01:22 -07:00
Michael Kruse a22236120f [OpenMP] Implement '#pragma omp unroll'.
Implementation of the unroll directive introduced in OpenMP 5.1. Follows the approach from D76342 for the tile directive (i.e. AST-based, not using the OpenMPIRBuilder). Tries to use `llvm.loop.unroll.*` metadata where possible, but has to fall back to an AST representation of the outer loop if the partially unrolled generated loop is associated with another directive (because it needs to compute the number of iterations).

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D99459
2021-06-10 14:30:17 -05:00
Guillaume Chatelet e0569033e2 [llvm] Make Sequence reverse-iterable
This is a roll forward of D102679.
This patch simplifies the implementation of Sequence and makes it compatible with llvm::reverse.
It exposes the reverse iterators through rbegin/rend which prevents a dangling reference in std::reverse_iterator::operator++().

Note: Compared to D102679, this patch introduces a `asSmallVector()` member function and fixes compilation issue with GCC 5.

Differential Revision: https://reviews.llvm.org/D103948
2021-06-10 11:15:28 +00:00
Esme-Yi ec43c1213a [NFC][XCOFF] Replace structs FileHeader32/SectionHeader32 with constants.
Summary: Some structs like FileHeader32/SectionHeader32
defined in llvm/include/llvm/BinaryFormat/XCOFF.h seem
unnecessary, because we only need their size. So this
patch removes them and defines size constants directly.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D103901
2021-06-10 11:10:45 +00:00
David Spickett 64de8763aa Revert "Implementation of global.get/set for reftypes in LLVM IR"
This reverts commit 31859f896c.

Causing SVE and RISCV-V test failures on bots.
2021-06-10 10:11:17 +00:00
Simon Pilgrim 4eb47e3cd4 [TargetLowering] getABIAlignmentForCallingConv - pass DataLayout by const reference. NFCI.
Avoid unnecessary copies and match every other method in TargetLowering that takes DataLayout as an argument.
2021-06-10 10:55:24 +01:00
Paulo Matos 31859f896c Implementation of global.get/set for reftypes in LLVM IR
This change implements new DAG notes GLOBAL_GET/GLOBAL_SET, and
lowering methods for load and stores of reference types from IR
globals. Once the lowering creates the new nodes, tablegen pattern
matches those and converts them to Wasm global.get/set.

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D95425
2021-06-10 10:07:45 +02:00
Esme-Yi c8e980ab4a [XCOFF][llvm-objdump] Dump the debug type in `--section-headers` option.
Summary: Add XCOFF recognition of debug section types
under `--section-headers` option.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D103079
2021-06-10 07:08:23 +00:00
Esme-Yi 8a23f74eb7 [llvm-objdump][XCOFF] Enable the -l (--line-numbers) option.
Summary: Add support for dumping line number
information for XCOFF object files in llvm-objdump.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D101272
2021-06-10 04:37:06 +00:00
Sam Powell 5b5ab80e31 Reland "[llvm] llvm-tapi-diff"
This is relanding commit d1d36f7ad2 .
This patch additionally addresses failures found in buildbots due to unstable build ordering & post review comments.

This patch introduces a new tool, llvm-tapi-diff, that compares and returns the diff of two TBD files.

Reviewed By: ributzka, JDevlieghere

Differential Revision: https://reviews.llvm.org/D101835
2021-06-09 21:17:34 -07:00
Jinsong Ji 4a89ed373c [AIX] Add traceback ssp canary bit support
We will need to set the ssp canary bit in traceback table to communicate
with unwinder about the canary.

Reviewed By: #powerpc, shchenz

Differential Revision: https://reviews.llvm.org/D103202
2021-06-10 02:40:02 +00:00
Justin Lebar c962491a41
Save/restore OuterTemplateParams in AbstractManglingParser::parseEncoding.
Previously we were only saving plain TemplateParams.

Differential Revision: https://reviews.llvm.org/D103996
2021-06-09 17:56:23 -07:00
Cyndy Ishida e7b755ecb1 Revert "Reland "[llvm] llvm-tapi-diff""
This reverts commit 20126c9fd4.
The sorting fixes failed to have stable output on different platforms.
2021-06-09 13:48:09 -07:00
Sam Powell 20126c9fd4 Reland "[llvm] llvm-tapi-diff"
This is relanding commit d1d36f7ad2 .
This patch additionally addresses failures found in buildbots & post review comments.

This patch introduces a new tool, llvm-tapi-diff, that compares and returns the diff of two TBD files.

Reviewed By: ributzka, JDevlieghere

Differential Revision: https://reviews.llvm.org/D101835
2021-06-09 10:35:41 -07:00
Fraser Cormack 502edebd9d [ValueTypes][RISCV] Cap RVV fixed-length vectors by size
This patch changes RVV's policy for its supported list of fixed-length
vector types by capping by vector size rather than element count. Now
all 1024-byte vectors (of supported element types) are supported, rather
than all 256-element vectors.

This is a more natural fit for the architecture, and allows us to, for
example, improve the support for vector bitcasts.

This change necessitated the adding of some new simple types to avoid
"regressing" on the number of currently-supported vectors. We round out
the 1024-byte types by adding `v512i8`, `v1024i8`, `v512i16` and
`v512f16`.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D103884
2021-06-09 12:15:37 +01:00
Florian Hahn e978f6bc97
[LTO] Support new PM in ThinLTOCodeGenerator.
This patch adds initial support for using the new pass manager when
doing ThinLTO via libLTO.

Reviewed By: steven_wu

Differential Revision: https://reviews.llvm.org/D102627
2021-06-09 10:05:14 +01:00
Jan Kratochvil 09ac4eca66 Revert "[llvm] Sync DebugInfo.h with DebugInfoFlags.def"
This reverts commit 093750dd0b.

It broke buildbots, goint to investigate it more.
2021-06-09 10:39:57 +02:00
Jan Kratochvil 093750dd0b [llvm] Sync DebugInfo.h with DebugInfoFlags.def
Command to see the differences:
  diff -u <(sed -n 's#^HANDLE_DI_FLAG *([^,]*, *\([^()]*\)) *\(//.*\)\?$#\1#p' <llvm/include/llvm/IR/DebugInfoFlags.def | grep -vw Largest) <(sed -n 's#^ *LLVMDIFlag\([^ ]*\) *= (\?[0-9].*$#\1#p' <llvm/include/llvm-c/DebugInfo.h)

OCaml binding is more seriously out of sync but I have not tried to sync it.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D103910
2021-06-09 10:11:23 +02:00
Guillaume Chatelet 20f571dbff [NFC] Reformat MachineValueType
This is a follow up patch based on https://reviews.llvm.org/D103251#2804016.

Differential Revision: https://reviews.llvm.org/D103893
2021-06-09 07:20:51 +00:00
Sterling Augustine e11b5b87be Add Twine support for std::string_view.
With Twine now ubiquitous after rG92a79dbe91413f685ab19295fc7a6297dbd6c824,
it needs support for string_view when building clang with newer C++ standards.

This is similar to how StringRef is handled.

Differential Revision: https://reviews.llvm.org/D103935
2021-06-08 20:19:04 -07:00
Brendon Cahoon 294efbbd3e Reland "[AMDGPU] Add gfx1013 target"
This reverts commit 211e584fa2.

Fixed a use-after-free error that caused the sanitizers to fail.
2021-06-08 21:15:35 -04:00
Quinn Pham 898e38a3c1 [NFC] In the future, all intrinsics defined for compatibility with the XL
compiler will be placed in this collection.

This patch has no functional changes.

Differential revision: https://reviews.llvm.org/D103921
2021-06-08 17:58:02 -05:00
Whitney Tsang 9b022a679b Revert "Revert "[LoopNest] Fix Wdeprecated-copy warnings""
This reverts commit 07ef5805ab.

The broke of the sanitizer-windows bot:
https://lab.llvm.org/buildbot/#/builders/127/builds/12064
is not caused by the original commit.

Differential Revision: https://reviews.llvm.org/D103752
2021-06-08 21:51:53 +00:00
Whitney Tsang 07ef5805ab Revert "[LoopNest] Fix Wdeprecated-copy warnings"
This reverts commit dee1f0cb34.

It appears that this change broke the sanitizer-windows bot:
https://lab.llvm.org/buildbot/#/builders/127/builds/12064

Differential Revision: https://reviews.llvm.org/D103752
2021-06-08 20:46:12 +00:00
Brendon Cahoon 211e584fa2 Revert "[AMDGPU] Add gfx1013 target"
This reverts commit ea10a86984.

A sanitizer buildbot reports an error.
2021-06-08 16:29:41 -04:00
Abhina Sreeskantharajan 0e8506deba [SystemZ][z/OS] Pass OpenFlags when creating tmp files
This patch https://reviews.llvm.org/D102876 caused some lit regressions on z/OS because tmp files were no longer being opened based on binary/text mode. This patch passes OpenFlags when creating tmp files so we can open files in different modes.

Reviewed By: amccarth

Differential Revision: https://reviews.llvm.org/D103806
2021-06-08 14:45:34 -04:00
Nick Desaulniers 3787ee4571 reland [IR] make -stack-alignment= into a module attr
Relands commit 433c8d950c with fixes for
MIPS.

Similar to D102742, specifying the stack alignment via CodegenOpts means
that this flag gets dropped during LTO, unless the command line is
re-specified as a plugin opt. Instead, encode this information as a
module level attribute so that we don't have to expose this llvm
internal flag when linking the Linux kernel with LTO.

Looks like external dependencies might need a fix:
* https://github.com/llvm-hs/llvm-hs/issues/345
* https://github.com/halide/Halide/issues/6079

Link: https://github.com/ClangBuiltLinux/linux/issues/1377

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D103048
2021-06-08 10:59:46 -07:00
Mehdi Amini a4e2cf712a Revert "[llvm] Make Sequence reverse-iterable"
This reverts commit e772216e70
(and fixup 7f6c878a2c).

The build is broken with gcc5 host compiler:

In file included from
                 from mlir/lib/Dialect/Utils/StructuredOpsUtils.cpp:9:
tools/mlir/include/mlir/IR/BuiltinAttributes.h.inc:424:57: error: type/value mismatch at argument 1 in template parameter list for 'template<class ItTy, class FuncTy, class FuncReturnTy> class llvm::mapped_iterator'
                               std::function<T(ptrdiff_t)>>;
                                                         ^
tools/mlir/include/mlir/IR/BuiltinAttributes.h.inc:424:57: note:   expected a type, got 'decltype (seq<ptrdiff_t>(0, 0))::const_iterator'
2021-06-08 17:03:10 +00:00
Brendon Cahoon ea10a86984 [AMDGPU] Add gfx1013 target
Differential Revision: https://reviews.llvm.org/D103663
2021-06-08 12:49:49 -04:00
Nick Desaulniers a596b54d47 Revert "[IR] make -stack-alignment= into a module attr"
This reverts commit 433c8d950c.

Breaks the MIPS build.
2021-06-08 08:55:50 -07:00
Nick Desaulniers 433c8d950c [IR] make -stack-alignment= into a module attr
Similar to D102742, specifying the stack alignment via CodegenOpts means
that this flag gets dropped during LTO, unless the command line is
re-specified as a plugin opt. Instead, encode this information as a
module level attribute so that we don't have to expose this llvm
internal flag when linking the Linux kernel with LTO.

Looks like external dependencies might need a fix:
* https://github.com/llvm-hs/llvm-hs/issues/345
* https://github.com/halide/Halide/issues/6079

Link: https://github.com/ClangBuiltLinux/linux/issues/1377

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D103048
2021-06-08 08:31:04 -07:00
Whitney Tsang dee1f0cb34 [LoopNest] Fix Wdeprecated-copy warnings
error: definition of implicit copy constructor for 'LoopNest' is
deprecated because it has a user-declared copy assignment operator
[-Werror,-Wdeprecated-copy]
  LoopNest &operator=(const LoopNest &) = delete;

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D103752
2021-06-08 14:48:13 +00:00
Guillaume Chatelet 7f6c878a2c Fix missing header and namespace qualifier in ADT Sequence 2021-06-08 14:11:54 +00:00
Guillaume Chatelet e772216e70 [llvm] Make Sequence reverse-iterable
This patch simplifies the implementation of Sequence and makes it compatible with llvm::reverse.
It exposes the reverse iterators through rbegin/rend which prevents a dangling reference in std::reverse_iterator::operator++().

Differential Revision: https://reviews.llvm.org/D102679
2021-06-08 13:18:57 +00:00
Hans Wennborg 386b66b2fc Revert "3rd Reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands""
> This reapplies c0f3dfb9, which was reverted following the discovery of
> crashes on linux kernel and chromium builds - these issues have since
> been fixed, allowing this patch to re-land.

This reverts commit 36ec97f76a.

The change caused non-determinism in the compiler, see comments on the code
review at https://reviews.llvm.org/D91722.

Reverting to unbreak people's builds until that can be addressed.

This also reverts the follow-up "[DebugInfo] Limit the number of values
that may be referenced by a dbg.value" in
a0bd6105d8.
2021-06-08 14:54:08 +02:00
Simon Moll 0f9d299122 [VP] getDeclarationForParams
`VPIntrinsic::getDeclarationForParams` creates a vp intrinsic
declaration for parameters you want to call it with.  This is in
preparation of a new builder class that makes emitting vp intrinsic code
nearly as convenient as using a plain ir builder (aka `VectorBuilder`,
to be used by D99750).

Reviewed By: frasercrmck, craig.topper, vkmr

Differential Revision: https://reviews.llvm.org/D102686
2021-06-08 14:21:28 +02:00
Timm Bäder 22875b2ce3 [NFC] Remove some include cycles
These files include themselves directly.
2021-06-08 14:00:39 +02:00
maekawatoshiki 09e92c607c [LoopUnrollAndJam] Change LoopUnrollAndJamPass to LoopNest pass
This patch changes LoopUnrollAndJamPass from FunctionPass to LoopNest pass.
The next patch will utilize LoopNest to effectively handle loop nests.

Also, a crash problem on legacy pass manager is fixed.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D99149
2021-06-08 20:30:02 +09:00
Lang Hames 4f16ccdab2 [JITLink] Clarify LinkGraph::splitBlock contract in comment. 2021-06-08 18:51:12 +10:00
Tomasz Miąsko 82b7e822d0 [Demangle][Rust] Parse path backreferences
Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D103459
2021-06-08 10:01:49 +02:00
hsmahesha 3af5f3e692 [IR] Add utility to convert constant expression operands (of an instruction) to instructions.
In the situation where we need to replace a constant operand C from a constant expression CE
by an instruction NI, it not possible without converting CE itself into an instruction. This
utility helps to convert the given set of constant expression operands from an instruction I
into a corresponding set of instructions.

The current use-case for this utility is from the patches - https://reviews.llvm.org/D103225
and https://reviews.llvm.org/D103655.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D103661
2021-06-08 03:22:32 +05:30
Amir Ayupov 8ec73e96b7 [ELF] getRelocatedSection: remove the check for ET_REL object file
getRelocatedSection interface should not check that the object file is
relocatable, as executable files may have relocations preserved with
`--emit-relocs` linker flag. The relocations are useful in context of post-link
binary analysis for function reference identification. For example, BOLT relies
on relocations to perform function reordering.

Reviewed By: MaskRay, jhenderson

Differential Revision: https://reviews.llvm.org/D102296
2021-06-07 13:17:00 -07:00
Harald van Dijk 75521bd9d8
[X32] Add Triple::isX32(), use it.
So far, support for x86_64-linux-gnux32 has been handled by explicit
comparisons of Triple.getEnvironment() to GNUX32. This worked as long as
x86_64-linux-gnux32 was the only X32 environment to worry about, but we
now have x86_64-linux-muslx32 as well. To support this, this change adds
an isX32() function and uses it. It replaces all checks for GNUX32 or
MuslX32 by isX32(), except for the following:

- Triple::isGNUEnvironment() and Triple::isMusl() are supposed to treat
  GNUX32 and MuslX32 differently.
- computeTargetTriple() needs to be able to transform triples to add or
  remove X32 from the environment and needs to map GNU to GNUX32, and
  Musl to MuslX32.
- getMultiarchTriple() completely lacks any Musl support and retains the
  explicit check for GNUX32 as it can only return x86_64-linux-gnux32.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D103777
2021-06-07 20:48:39 +01:00
Philip Reames 38540d71c7 [SCEV] Compute exit counts for unsigned IVs using mustprogress semantics
The motivation here is simple loops with unsigned induction variables w/non-one steps. A toy example would be:
for (unsigned i = 0; i < N; i += 2) { body; }

Given C/C++ semantics, we do not get the nuw flag on the induction variable. Given that lack, we currently can't compute a bound for this loop. We can do better for many cases, depending on the contents of "body".

The basic intuition behind this patch is as follows:
* A step which evenly divides the iteration space must wrap through the same numbers repeatedly. And thus, we can ignore potential cornercases where we exit after the n-th wrap through uint32_max.
* Per C++ rules, infinite loops without side effects are UB. We already have code in SCEV which relies on this.  In LLVM, this is tied to the mustprogress attribute.

Together, these let us conclude that the trip count of this loop must come before unsigned overflow unless the body would form a well defined infinite loop.

A couple notes for those reading along:
* I reused the loop properties code which is overly conservative for this case. I may follow up in another patch to generalize it for the actual UB rules.
* We could cache the n(s/u)w facts. I left that out because doing a pre-patch which cached existing inference showed a lot of diffs I had trouble fully explaining. I plan to get back to this, but I don't want it on the critical path.

Differential Revision: https://reviews.llvm.org/D103118
2021-06-07 11:24:00 -07:00
jasonliu 8e84311a84 [XCOFF][AIX] Enable tooling support for 64 bit symbol table parsing
Add in the ability of parsing symbol table for 64 bit object.

Reviewed By: jhenderson, DiggerLin

Differential Revision: https://reviews.llvm.org/D85774
2021-06-07 17:24:13 +00:00
Raphael Isemann f10b9ca9c6 [NFC] Add missing include to LaneBitmask.h to fix modules build 2021-06-07 18:43:00 +02:00
Sander de Smalen c908196e10 [CostModel] Return Invalid cost in getArithmeticCost instead of crashing for scalable vectors.
This fixes an issue in BasicTTIImpl.h where it tries to do a
cast<FixedVectorType> on a scalable vector type in order to get the
scalarization cost. Because scalarization of scalable vectors is not
supported, we return Invalid instead.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D103798
2021-06-07 17:26:23 +01:00
Tomasz Miąsko 619a65e5e4 [Demangle][Rust] Parse dyn-trait-assoc-binding
Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D103364
2021-06-07 18:18:31 +02:00
Tomasz Miąsko 1499afa09b [Demangle][Rust] Parse dyn-trait
Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D103361
2021-06-07 18:18:31 +02:00
Tomasz Miąsko 89615a5e92 [Demangle][Rust] Parse dyn-bounds
Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D103151
2021-06-07 18:18:30 +02:00
Bradley Smith 60c9b5f35c [AArch64][SVE] Improve codegen for dupq SVE ACLE intrinsics
Use llvm.experimental.vector.insert instead of storing into an alloca
when generating code for these intrinsics. This defers the codegen of
the generated vector to instruction selection, allowing existing
shufflevector style optimizations to apply.

Additionally, introduce a new target transform that can recognise fixed
predicate patterns in the svbool variants of these intrinsics.

Differential Revision: https://reviews.llvm.org/D103082
2021-06-07 12:21:38 +01:00
Guillaume Chatelet 1da2c7d25c [NFC] Fix semantic discrepancy for MVT::LAST_VALUETYPE
Differential Revision: https://reviews.llvm.org/D103251
2021-06-07 10:04:16 +00:00
Jingu Kang a2a0ac42ab [SimpleLoopBoundSplit] Split Bound of Loop which has conditional branch with IV
This pass transforms loops that contain a conditional branch with induction
variable. For example, it transforms left code to right code:

                             newbound = min(n, c)
 while (iv < n) {            while(iv < newbound) {
   A                           A
   if (iv < c)                 B
     B                         C
   C                         }
 }                           if (iv != n) {
                               while (iv < n) {
                                 A
                                 C
                               }
                             }

Differential Revision: https://reviews.llvm.org/D102234
2021-06-07 10:55:25 +01:00
Esme-Yi 50bb1b930d [yaml2obj] Initial the support of yaml2obj for 32-bit XCOFF.
Summary: The patch implements the mapping of the Yaml
information to XCOFF object file to enable the yaml2obj
tool for XCOFF. Currently only 32-bit is supported.

Reviewed By: jhenderson, shchenz

Differential Revision: https://reviews.llvm.org/D95505
2021-06-07 04:14:44 +00:00
maekawatoshiki 0a9d079931 Revert "[LoopUnrollAndJam] Change LoopUnrollAndJamPass to LoopNest pass"
This reverts commit 2165360003.

To fix the crash problem in legacy pass manager
2021-06-07 01:26:47 +09:00
Nikita Popov 1ffa6499ea [TargetLowering] Use IRBuilderBase instead of IRBuilder<> (NFC)
Don't require a specific kind of IRBuilder for TargetLowering hooks.
This allows us to drop the IRBuilder.h include from TargetLowering.h.

Differential Revision: https://reviews.llvm.org/D103759
2021-06-06 16:29:50 +02:00
Nikita Popov 506875c879 [TargetLowering] Move methods out of line (NFC)
Move methods using IRBuilder out of line, so we can drop the
dependency on the header.
2021-06-06 16:02:10 +02:00
Simon Pilgrim 5fc8cdcb03 BreadthFirstIterator.h - fix uninitialized variable warning in default constructor. NFCI. 2021-06-06 14:13:08 +01:00
Simon Pilgrim 6e90192fdf PatternMatch.h - wrap WrapFlags tests inside brackets to stop static analysis warning about & vs && usage. NFCI. 2021-06-06 13:38:39 +01:00
Simon Pilgrim 139a36454f SmallVector.h - remove unused MathExtras.h header. NFCI. 2021-06-06 12:05:39 +01:00
Simon Pilgrim c18df1e156 Fix uninitialized variable warnings. NFCI. 2021-06-06 11:09:55 +01:00
Simon Pilgrim b47a7bb703 Revert rG0b18c4c0ec03f0321ee83b9976da5777d0e4f53f "SmallVector.h - remove unused MathExtras.h header (REAPPLIED). NFCI."
Buildbots still seem to find implicit header dependencies that I can't locally....
2021-06-06 09:39:19 +01:00
Simon Pilgrim 0b18c4c0ec SmallVector.h - remove unused MathExtras.h header (REAPPLIED). NFCI.
Try again to remove this header - I think I've found the implicit dependencies (mainly for <cmath>) on linux builds now.
2021-06-06 09:30:15 +01:00
Simon Pilgrim e3ae4ce66e Revert rG7b839b3542983a313a9bf9f8d8039ceeea35c4d7 - "SmallVector.h - remove unused MathExtras.h header. NFCI."
Breaks on linux buildbots as I seem to have missed some implicit header dependencies....
2021-06-05 20:59:46 +01:00
Simon Pilgrim 7b839b3542 SmallVector.h - remove unused MathExtras.h header. NFCI. 2021-06-05 20:19:58 +01:00
Simon Pilgrim fe6c45dd27 BitstreamWriter.h - add missing implicit MathExtras.h header dependency. NFCI.
Noticed while investigating if we can remove an unnecessary MathExtras.h include from SmallVector.h
2021-06-05 19:20:14 +01:00
Simon Pilgrim 128f5d16ef ELFTypes.h - add missing implicit MathExtras.h header dependency. NFCI.
Noticed while investigating if we can remove an unnecessary MathExtras.h include from SmallVector.h
2021-06-05 19:11:40 +01:00
Simon Pilgrim d118fa2914 [MCA] Support.h - add missing implicit MathExtras.h header dependency. NFCI.
Noticed while investigating if we can remove an unnecessary MathExtras.h include from SmallVector.h
2021-06-05 19:10:49 +01:00
Simon Pilgrim 6ebb28d32e EndianStream.h - add missing implicit MathExtras.h header dependency. NFCI.
Noticed while investigating if we can remove an unnecessary MathExtras.h include from SmallVector.h
2021-06-05 18:05:40 +01:00
Roman Lebedev e350494fb0
[NFC] Promote willNotOverflow() / getStrengthenedNoWrapFlagsFromBinOp() from IndVars into SCEV proper
We might want to use it when creating SCEV proper in createSCEV(),
now that we don't `forgetValue()` in `SimplifyIndvar::strengthenOverflowingOperation()`,
which might have caused us to loose some optimization potential.
2021-06-05 12:17:51 +03:00
Nikita Popov db45746821 [LoopUnroll] Separate peeling from unrolling
Loop peeling is currently performed as part of UnrollLoop().
Outside test scenarios, it is always performed with an unroll
count of 1. This means that unrolling doesn't actually do anything
apart from performing post-unroll simplification.

When testing, it's currently possible to specify both an explicit
peel count and an explicit unroll count. This doesn't perform any
sensible operation and may result in miscompiles, see
https://bugs.llvm.org/show_bug.cgi?id=45939.

This patch moves peeling from UnrollLoop() into tryToUnrollLoop(),
so that peeling does not also perform a susequent unroll. We only
run the post-unroll simplifications. Specifying both an explicit
peel count and unroll count is forbidden.

In the future, we may want to support both (non-PGO) peeling a
loop and unrolling it, but this needs to be done by first performing
the peel and then recalculating unrolling heuristics on a now
possibly analyzable loop.

Differential Revision: https://reviews.llvm.org/D103362
2021-06-05 10:32:00 +02:00
Amir Ayupov 065a9316aa [MC] Add getLSDASection interface
This diff adds getLSDASection method to MCObjectFileInfo.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D102298
2021-06-05 00:28:20 -07:00
Rong Xu 8d581857d7 [SampleFDO] New hierarchical discriminator for FS SampleFDO (llvm-profdata part)
This patch was split from https://reviews.llvm.org/D102246
[SampleFDO] New hierarchical discriminator for Flow Sensitive SampleFDO
This is for llvm-profdata part of change. It sets the bit masks for the
profile reader in llvm-profdata. Also add an internal option
"-fs-discriminator-pass" for show and merge command to process the profile
offline.

This patch also moved setDiscriminatorMaskedBitFrom() to
SampleProfileReader::create() to simplify the interface.

Differential Revision: https://reviews.llvm.org/D103550
2021-06-04 11:22:06 -07:00
Joseph Huber 4a08163c73 [Attributor] Check HeapToStack's state for isKnownHeapToStack
This patch changes the `isKnownHeapToStack` and `isAssumedHeapToStack`
member functions to return if a function call is going to be altered by
HeapToStack.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D103574
2021-06-04 12:38:33 -04:00
Joseph Huber 8bb713207d [Attributor] Allow lookupAAFor to return null on invalid state
This patch adds an option to `lookupAAFor` that allows it to return a
nullptr if the state of the looked up attribute is invalid. This is so
future passes can use this to query other attributes with the guarantee
that they are valid.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D103556
2021-06-04 12:29:15 -04:00
Alexey Bataev c84a5448b5 [OPENMP]Fix PR50129: omp cancel parallel not working as expected.
Need to emit a call for __kmpc_cancel_barrier in the exit block for
__kmpc_cancel function call if cancellation of the parallel block is
requested.

Differential Revision: https://reviews.llvm.org/D103646
2021-06-04 08:24:55 -07:00
Mirko Brkusanin 35ef4c940b [AMDGPU][GlobalISel] Legalize G_ABS
Legalize and select G_ABS so that we can use llvm.abs intrinsic

Differential Revision: https://reviews.llvm.org/D102391
2021-06-04 14:46:43 +02:00
Cyndy Ishida 5337c7550d Revert "[llvm] llvm-tapi-diff"
This reverts commit d1d36f7ad2.
Reverting this patch to investigate linux bot failures
 + fix with author offline
2021-06-03 21:10:51 -07:00
Arthur Eubanks 738abfdbea [NFC] Remove checking pointee type for byval/preallocated type
These currently always require a type parameter. The bitcode reader
already upgrades old bitcode without the type parameter to use the
pointee type.
2021-06-03 19:09:09 -07:00
zero9178 619fa0d7fc [NFC] Add missing includes for LLVM_ENABLE_MODULES builds
Building LLVM with the LLVM_ENABLE_MODULES cmake option fails when the modules are being compiled due to missing includes. This is a side effect of some transitive includes that changed recently.

Differential Revision: https://reviews.llvm.org/D103645
2021-06-03 23:29:03 +02:00
Philip Reames 5c0d1b2f90 [LoopUnroll] Eliminate PreserveCondBr parameter and fix a bug in the process
This builds on D103584. The change eliminates the coupling between unroll heuristic and implementation w.r.t. knowing when the passed in trip count is an exact trip count or a max trip count. In theory the new code is slightly less powerful (since it relies on exact computable trip counts), but in practice, it appears to cover all the same cases. It can also be extended if needed.

The test change shows what appears to be a bug in the existing code around the interaction of peeling and unrolling. The original loop only ran 8 iterations. The previous output had the loop peeled by 2, and then an exact unroll of 8. This meant the loop ran a total of 10 iterations which appears to have been a miscompile.

Differential Revision: https://reviews.llvm.org/D103620
2021-06-03 14:09:16 -07:00
Sam Powell d1d36f7ad2 [llvm] llvm-tapi-diff
This patch introduces a new tool, llvm-tapi-diff, that compares and returns the diff of two TBD files.

Reviewed By: ributzka, JDevlieghere

Differential Revision: https://reviews.llvm.org/D101835
2021-06-03 11:38:00 -07:00
Eli Friedman 44cdf771fe [AtomicExpand] Merge cmpxchg success and failure ordering when appropriate.
If we're not emitting separate fences for the success/failure cases, we
need to pass the merged ordering to the target so it can emit the
correct instructions.

For the PowerPC testcase, we end up with extra fences, but that seems
like an improvement over missing fences.  If someone wants to improve
that, the PowerPC backed could be taught to emit the fences after isel,
instead of depending on fences emitted by AtomicExpand.

Fixes https://bugs.llvm.org/show_bug.cgi?id=33332 .

Differential Revision: https://reviews.llvm.org/D103342
2021-06-03 11:34:35 -07:00
Artur Pilipenko 5a2aec3f27 NFC. Mark DOTFuncInfo getters as const
This is a preparatory refactoring for introducing new
types of hidden blocks.
2021-06-03 11:27:06 -07:00
Artur Pilipenko a06e63fa52 NFC. Refactor DOTGraphTraits::isNodeHidden
Restructure handling of cfg-hide-unreachable-paths and
cfg-hide-deoptimize-paths options so as to make it easier
to introduce new types of hidden blocks.
2021-06-03 11:27:06 -07:00
Philip Reames 44d70d298a [LoopUnroll] Eliminate PreserveOnlyFirst parameter [nfc]
This is a first step towards simplifying the transform interface to be less error prone. The basic idea is that querying SCEV is cheap (since it's cached) and we can just check for properties related to branch folding in the transform method instead of relying on the heuristic part to pass everything in correctly.

Differential Revision: https://reviews.llvm.org/D103584
2021-06-03 10:33:14 -07:00
Nikita Popov b0ab79ee2d [MC] Add missing include (NFC)
Try to fix buildbots after 983565a6fe.
2021-06-03 18:50:00 +02:00
Nikita Popov 983565a6fe [ADT] Move DenseMapInfo for ArrayRef/StringRef into respective headers (NFC)
This is a followup to D103422. The DenseMapInfo implementations for
ArrayRef and StringRef are moved into the ArrayRef.h and StringRef.h
headers, which means that these two headers no longer need to be
included by DenseMapInfo.h.

This required adding a few additional includes, as many files were
relying on various things pulled in by ArrayRef.h.

Differential Revision: https://reviews.llvm.org/D103491
2021-06-03 18:34:36 +02:00
David Spickett f4543dce5d [clang][ARM] Remove arm2/3/6/7m CPU names
These legacy CPUs are known to clang but not llvm.
Their use was ignored by llvm and it would print a
warning saying it did not recognise them.

However because some of them are default CPUs for their
architecture, you would get those warnings even if you didn't
choose a cpu explicitly.
(now those architectures will default to a "generic" CPU)

Information is thin on the ground for these older chips
so this is the best I could find:
https://en.wikichip.org/wiki/acorn/microarchitectures/arm2
https://en.wikichip.org/wiki/acorn/microarchitectures/arm3
https://en.wikichip.org/wiki/arm_holdings/microarchitectures/arm6
https://en.wikichip.org/wiki/arm_holdings/microarchitectures/arm7

Final part of fixing https://bugs.llvm.org/show_bug.cgi?id=50454.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D103028
2021-06-03 08:55:44 +00:00
Min-Yih Hsu 344e919b1a [CodeGen][NFC] Remove unused virtual function
`TargetFrameLowering::emitCalleeSavedFrameMoves` with 4 arguments is not
used anywhere in CodeGen. Thus it shouldn't be exposed as a virtual
function. NFC.

Differential Revision: https://reviews.llvm.org/D103328
2021-06-02 13:11:12 -07:00
Stefan Pintilie 1ed2e9b9a0 [NFC] Remove variable that was set but not used.
The buildbot ppc64le-lld-multistage-test has been failing because the variable
Tag in Waymaking.h is set but not used. This patch removes that varaible.
2021-06-02 13:20:32 -05:00
Rong Xu 6745ffe4fa [SampleFDO] New hierarchical discriminator for FS SampleFDO (ProfileData part)
This patch was split from https://reviews.llvm.org/D102246
[SampleFDO] New hierarchical discriminator for Flow Sensitive SampleFDO
This is mainly for ProfileData part of change. It will load
FS Profile when such profile is detected. For an extbinary format profile,
create_llvm_prof tool will add a flag to profile summary section.
For other format profiles, the users need to use an internal option
(-profile-isfs) to tell the compiler that the profile uses FS discriminators.

This patch also simplified the bit API used by FS discriminators.

Differential Revision: https://reviews.llvm.org/D103041
2021-06-02 10:32:52 -07:00
Qunyan Mangus cbde248736 Add getDemandedBits for uses.
Add getDemandedBits method for uses so we can query demanded bits for each use.  This can help getting better use information. For example, for the code below
define i32 @test_use(i32 %a) {
  %1 = and i32 %a, -256
  %2 = or i32 %1, 1
  %3 = trunc i32 %2 to i8 (didn't optimize this to 1 for illustration purpose)
  ... some use of %3
  ret %2
}
if we look at the demanded bit of %2 (which is all 32 bits because of the return), we would conclude that %a is used regardless of how its return is used. However, if we look at each use separately, we will see that the demanded bit of %2 in trunc only uses the lower 8 bits of %a which is redefined, therefore %a's usage depends on how the function return is used.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D97074
2021-06-02 10:07:40 -04:00
Sander de Smalen d41cb6bb26 [LV] Build and cost VPlans for scalable VFs.
This patch uses the calculated maximum scalable VFs to build VPlans,
cost them and select a suitable scalable VF.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D98722
2021-06-02 14:47:47 +01:00
Sean Fertile 81f7607f7c [PowerPC][AIX} FIx AIX bootstrap build.
A recent patch:
https://reviews.llvm.org/rGe0921655b1ff8d4ba7c14be59252fe05b705920e
changed clangs AIX bitfield handling to use 4-byte bitfield containers,
matching XLs behavior. This change triggers static assert failures when
bootstrapping. Change the macro we check to enable bitfield packing on
AIX to `__clang__` which is defined by both xlclang and clang.

Differential Revision: https://reviews.llvm.org/D103474
2021-06-02 09:31:11 -04:00
Daniil Fukalov 0195e594fe [TTI] NFC: Change getIntImmCodeSizeCost to return InstructionCost.
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D102915
2021-06-02 16:04:11 +03:00
Bjorn Pettersson 536e02a23c [CodeGen] Refactor libcall lookups for RTLIB::POWI_*
Use RuntimeLibcalls to get a common way to pick correct RTLIB::POWI_*
libcall for a given value type.

This includes a small refactoring of ExpandFPLibCall and
ExpandArgFPLibCall in SelectionDAGLegalize to share a bit of code,
plus adding an ExpandFPLibCall version that can be called directly
when expanding FPOWI/STRICT_FPOWI to ensure that we actually use
the same RTLIB::Libcall when expanding the libcall as we used when
checking the legality of such a call by doing a getLibcallName check.

Differential Revision: https://reviews.llvm.org/D103050
2021-06-02 11:40:34 +02:00
Bjorn Pettersson 9c54ee4378 [SimplifyLibCalls] Take size of int into consideration when emitting ldexp/ldexpf
When rewriting
  powf(2.0, itofp(x)) -> ldexpf(1.0, x)
  exp2(sitofp(x)) -> ldexp(1.0, sext(x))
  exp2(uitofp(x)) -> ldexp(1.0, zext(x))

the wrong type was used for the second argument in the ldexp/ldexpf
libc call, for target architectures with 16 bit "int" type.
The transform incorrectly used a bitcasted function pointer with
a 32-bit argument when emitting the ldexp/ldexpf call for such
targets.

The fault is solved by using the correct function prototype
in the call, by asking TargetLibraryInfo about the size of "int".
TargetLibraryInfo by default derives the size of the int type by
assuming that it is 16 bits for 16-bit architectures, and
32 bits otherwise. If this isn't true for a target it should be
possible to override that default in the TargetLibraryInfo
initializer.

Differential Revision: https://reviews.llvm.org/D99438
2021-06-02 11:40:34 +02:00
Tomasz Miąsko a67a234ec7 [Demangle][Rust] Parse binders
Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102729
2021-06-02 10:36:45 +02:00
Arthur Eubanks 8961293851 [OpaquePtr] Create API to make a copy of a PointerType with some address space
Some existing places use getPointerElementType() to create a copy of a
pointer type with some new address space.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D103429
2021-06-01 16:52:32 -07:00
Daniel Sanders 9372662050 fixup: Missing operator in [globalisel][legalizer] Separate the deprecated LegalizerInfo from the current one
My local compiler was fine with it but the bots complain about ambiguous types.
2021-06-01 13:58:03 -07:00
Daniel Sanders aaac268285 [globalisel][legalizer] Separate the deprecated LegalizerInfo from the current one
It's still in use in a few places so we can't delete it yet but there's not
many at this point.

Differential Revision: https://reviews.llvm.org/D103352
2021-06-01 13:23:48 -07:00
Andrew Kelley 936ca1e21a WindowsSupport.h: do not depend on private config header
WindowsSupport.h is a public header, however if it gets included, will cause a compile error indicating that llvm/Config/config.h cannot be found, because config.h is a private header. However there is no actual dependency on the private things in this header, so it can be changed to the public config header.

Reviewed By: amccarth

Differential Revision: https://reviews.llvm.org/D103370
2021-06-01 23:05:03 +03:00
Jessica Paquette e7f501b5e7 [GlobalISel][AArch64] Combine and (lshr x, cst), mask -> ubfx x, cst, width
Also add a target hook which allows us to get around custom legalization on
AArch64.

Differential Revision: https://reviews.llvm.org/D99283
2021-06-01 10:56:17 -07:00
Guozhi Wei 1b748faf2b [X86FixupLEAs] Transform the sequence LEA/SUB to SUB/SUB
This patch transforms the sequence

    lea (reg1, reg2), reg3
    sub reg3, reg4

to two sub instructions

    sub reg1, reg4
    sub reg2, reg4

Similar optimization can also be applied to LEA/ADD sequence.
The modifications to TwoAddressInstructionPass is to ensure the operands of ADD
instruction has expected order (the dest register of LEA should be src register of ADD).

Differential Revision: https://reviews.llvm.org/D101970
2021-06-01 10:31:30 -07:00
Eli Friedman fd229caa01 [polly] Fix SCEVLoopAddRecRewriter to avoid invalid AddRecs.
When we're remapping an AddRec, the AddRec constructed by a partial
rewrite might not make sense.  This triggers an assertion complaining
it's not loop-invariant.

Instead of constructing the partially rewritten AddRec, just skip
straight to calling evaluateAtIteration.

Testcase was automatically reduced using llvm-reduce, so it's a little
messy, but hopefully makes sense.

Differential Revision: https://reviews.llvm.org/D102959
2021-06-01 09:51:05 -07:00
Nikita Popov fd7e309e02 [ADT] Move DenseMapInfo for APInt into APInt.h (PR50527)
As suggested in https://bugs.llvm.org/show_bug.cgi?id=50527, this
moves the DenseMapInfo for APInt and APSInt into the respective
headers, removing the need to include APInt.h and APSInt.h from
DenseMapInfo.h.

We could probably do the same from StringRef and ArrayRef as well.

Differential Revision: https://reviews.llvm.org/D103422
2021-06-01 18:31:41 +02:00
Daniil Seredkin 13140120dc [InstCombine] Relax constraints of uses for exp(X) * exp(Y) -> exp(X + Y)
InstCombine didn't perform the transformations when fmul's operands were
the same instruction because it required to have one use for each of them
which is false in the case. This patch fixes this + adds tests for them
and introduces a new function isOnlyUserOfAnyOperand to check these cases
in a single place.

This patch is a result of discussion in D102574.

Differential Revision: https://reviews.llvm.org/D102698
2021-06-01 08:33:23 -04:00
Andy Wingo 82f92e35c6 [WebAssembly][CodeGen] IR support for WebAssembly local variables
This patch adds TargetStackID::WasmLocal.  This stack holds locations of
values that are only addressable by name -- not via a pointer to memory.
For the WebAssembly target, these objects are lowered to WebAssembly
local variables, which are managed by the WebAssembly run-time and are
not addressable by linear memory.

For the WebAssembly target IR indicates that an AllocaInst should be put
on TargetStackID::WasmLocal by putting it in the non-integral address
space WASM_ADDRESS_SPACE_WASM_VAR, with value 1.  SROA will mostly lift
these allocations to SSA locals, but any alloca that reaches instruction
selection (usually in non-optimized builds) will be assigned the new
TargetStackID there.  Loads and stores to those values are transformed
to new WebAssemblyISD::LOCAL_GET / WebAssemblyISD::LOCAL_SET nodes,
which then lower to the type-specific LOCAL_GET_I32 etc instructions via
tablegen patterns.

Differential Revision: https://reviews.llvm.org/D101140
2021-06-01 11:31:39 +02:00
Arthur Eubanks 372237487e [OpaquePtr] Remove some uses of PointerType::getElementType() 2021-05-31 16:11:25 -07:00
Florian Hahn aa00b1d763
[LV] Try to sink users recursively for first-order recurrences.
Update isFirstOrderRecurrence to  explore all uses of a recurrence phi
and check if we can sink them. If there are multiple users to sink, they
are all mapped to the previous instruction.

Fixes PR44286 (and another PR or two).

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D84951
2021-05-31 19:55:33 +01:00
Andrea Di Biagio 9853d0db1e [MCA][NFCI] Minor changes to InstrBuilder and Instruction.
This is based on the assumption that most simulated instructions don't define
more than one or two registers. This is true for example on x86, where
most instruction definitions don't declare more than one register write.

The default code region size has been increased from 8 to 16. This is based on
the assumption that, for small microbenchmarks, the typical code snippet size is
often less than 16 instructions.

mca::Instruction now uses bitfields to pack flags.
No functional change intended.
2021-05-31 17:05:13 +01:00
Daniil Fukalov e853d3b274 [NFC] MemoryDependenceAnalysis cleanup.
1. Removed redundant includes,
2. Removed never defined and used `releaseMemory()`.
3. Fixed member functions names first letter case.
4. Renamed duplicate (in nested struct `NonLocalPointerInfo`) name
   `NonLocalDeps` to `NonLocalDepsMap`.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D102358
2021-05-31 18:07:55 +03:00
Roman Lebedev f7c95c3322
[NFC] ScalarEvolution: apply SSO to the ExprValueMap value
ExprValueMap is a map from SCEV * to a set-vector of (Value *, ConstantInt *) pair,
and while the map itself will likely be big-ish (have many keys),
it is a reasonable assumption that each key will refer to a small-ish
number of pairs.

In particular looking at n=512 case from
https://bugs.llvm.org/show_bug.cgi?id=50384,
the small-size of 4 appears to be the sweet spot,
it results in the least allocations while minimizing memory footprint.
```
$ for i in $(ls heaptrack.opt.*.gz); do echo $i; heaptrack_print $i | tail -n 6; echo ""; done
heaptrack.opt.0-orig.gz
total runtime: 14.32s.
calls to allocation functions: 8222442 (574192/s)
temporary memory allocations: 2419000 (168924/s)
peak heap memory consumption: 190.98MB
peak RSS (including heaptrack overhead): 239.65MB
total memory leaked: 67.58KB

heaptrack.opt.1-n1.gz
total runtime: 13.72s.
calls to allocation functions: 7184188 (523705/s)
temporary memory allocations: 2419017 (176338/s)
peak heap memory consumption: 191.38MB
peak RSS (including heaptrack overhead): 239.64MB
total memory leaked: 67.58KB

heaptrack.opt.2-n2.gz
total runtime: 12.24s.
calls to allocation functions: 6146827 (502355/s)
temporary memory allocations: 2418997 (197695/s)
peak heap memory consumption: 163.31MB
peak RSS (including heaptrack overhead): 211.01MB
total memory leaked: 67.58KB

heaptrack.opt.3-n4.gz
total runtime: 12.28s.
calls to allocation functions: 6068532 (494260/s)
temporary memory allocations: 2418985 (197017/s)
peak heap memory consumption: 155.43MB
peak RSS (including heaptrack overhead): 201.77MB
total memory leaked: 67.58KB

heaptrack.opt.4-n8.gz
total runtime: 12.06s.
calls to allocation functions: 6068042 (503321/s)
temporary memory allocations: 2418992 (200646/s)
peak heap memory consumption: 166.03MB
peak RSS (including heaptrack overhead): 213.55MB
total memory leaked: 67.58KB

heaptrack.opt.5-n16.gz
total runtime: 12.14s.
calls to allocation functions: 6067993 (499958/s)
temporary memory allocations: 2418999 (199307/s)
peak heap memory consumption: 187.24MB
peak RSS (including heaptrack overhead): 233.69MB
total memory leaked: 67.58KB
```

While that test may be an edge worst-case scenario,
https://llvm-compile-time-tracker.com/compare.php?from=dee85d47d9f15fc268f7b18f279dac2774836615&to=98a57e31b1947d5bcdf4a5605ac2ab32b4bd5f63&stat=instructions
agrees that this also results in improvements in the usual situations.
2021-05-31 15:34:03 +03:00
Andy Wingo bc1ad6e3c4 Revert "[WebAssembly][CodeGen] IR support for WebAssembly local variables"
This reverts commit bf35f4af51.  There was
an error in a shared-library build.
2021-05-31 10:55:15 +02:00
Andy Wingo bf35f4af51 [WebAssembly][CodeGen] IR support for WebAssembly local variables
This patch adds TargetStackID::WasmLocal.  This stack holds locations of
values that are only addressable by name -- not via a pointer to memory.
For the WebAssembly target, these objects are lowered to WebAssembly
local variables, which are managed by the WebAssembly run-time and are
not addressable by linear memory.

For the WebAssembly target IR indicates that an AllocaInst should be put
on TargetStackID::WasmLocal by putting it in the non-integral address
space WASM_ADDRESS_SPACE_WASM_VAR, with value 1.  SROA will mostly lift
these allocations to SSA locals, but any alloca that reaches instruction
selection (usually in non-optimized builds) will be assigned the new
TargetStackID there.  Loads and stores to those values are transformed
to new WebAssemblyISD::LOCAL_GET / WebAssemblyISD::LOCAL_SET nodes,
which then lower to the type-specific LOCAL_GET_I32 etc instructions via
tablegen patterns.

Differential Revision: https://reviews.llvm.org/D101140
2021-05-31 10:40:38 +02:00
Sanjay Patel 7bb8bfa062 [InstCombine] fix miscompile from vector select substitution
This is similar to the fix in c590a9880d ( PR49832 ), but
we missed handling the pattern for select of bools (no compare
inst).

We can't substitute a vector value because the equality condition
replacement that we are attempting requires that the condition
is true/false for the entire value. Vector select can be partly
true/false.

I added an assert for vector types, so we shouldn't hit this again.
Fixed formatting while auditing the callers.

https://llvm.org/PR50500
2021-05-30 07:11:58 -04:00
Arthur Eubanks 3a6f12f915 Revert "[NFC] Use ArgListEntry indirect types more in ISel lowering"
This reverts commit bc7d15c61d.

Dependent change is to be reverted.
2021-05-29 22:40:33 -07:00
Fangrui Song 38dbdde792 [Internalize] Simplify comdat renaming with noduplicates after D103043
I realized that we can use `comdat noduplicates` which is available on ELF.
Add a special case for wasm which doesn't support the feature.
2021-05-28 16:58:38 -07:00
Eli Friedman 0b3b0a727a [AArch64][RISCV] Make sure isel correctly honors failure orderings.
If a cmpxchg specifies acquire or seq_cst on failure, make sure we
generate code consistent with that ordering even if the success ordering
is not acquire/seq_cst.

At one point, it was ambiguous whether this sort of construct was valid,
but the C++ standad and LLVM now accept arbitrary combinations of
success/failure orderings.

This doesn't address the corresponding issue in AtomicExpand. (This was
reported as https://bugs.llvm.org/show_bug.cgi?id=33332 .)

Fixes https://bugs.llvm.org/show_bug.cgi?id=50512.

Differential Revision: https://reviews.llvm.org/D103284
2021-05-28 12:47:40 -07:00
Craig Topper 2830d924b0 [VP] Make getMaskParamPos/getVectorLengthParamPos return unsigned. Lowercase function names.
Parameter positions seem like they should be unsigned.

While there, make function names lowercase per coding standards.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D103224
2021-05-28 11:28:47 -07:00
eopXD fa488ea864 [LoopNest][LoopFlatten] Change LoopFlattenPass to LoopNest pass
This patch changes LoopFlattenPass from FunctionPass to LoopNestPass.

Utilize LoopNest and let function 'Flatten' generate information from it.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D102904
2021-05-28 15:43:12 +00:00
Andy Wingo ca5f07f8c4 Revert "[WebAssembly][CodeGen] IR support for WebAssembly local variables"
This reverts commit 00ecf18979, as it
broke the AMDGPU build.  Will reland later with a fix.
2021-05-28 12:42:12 +02:00
Andy Wingo 00ecf18979 [WebAssembly][CodeGen] IR support for WebAssembly local variables
This patch adds TargetStackID::WasmLocal.  This stack holds locations of
values that are only addressable by name -- not via a pointer to memory.
For the WebAssembly target, these objects are lowered to WebAssembly
local variables, which are managed by the WebAssembly run-time and are
not addressable by linear memory.

For the WebAssembly target IR indicates that an AllocaInst should be put
on TargetStackID::WasmLocal by putting it in the non-integral address
space WASM_ADDRESS_SPACE_WASM_VAR, with value 1.  SROA will mostly lift
these allocations to SSA locals, but any alloca that reaches instruction
selection (usually in non-optimized builds) will be assigned the new
TargetStackID there.  Loads and stores to those values are transformed
to new WebAssemblyISD::LOCAL_GET / WebAssemblyISD::LOCAL_SET nodes,
which then lower to the type-specific LOCAL_GET_I32 etc instructions via
tablegen patterns.

Differential Revision: https://reviews.llvm.org/D101140
2021-05-28 11:07:41 +02:00
eopXD e96d6f4821 Revert "[LoopNest][LoopFlatten] Change LoopFlattenPass to LoopNest pass"
This reverts commit 7952ddb21f.

Differential Revision: https://reviews.llvm.org/D103302
2021-05-28 07:58:06 +00:00
eopXD 7e06cf8f1b Revert "[LoopNest][LoopFlatten] Change LoopFlattenPass to LoopNest pass"
This reverts commit ffc4d3e068.
2021-05-28 07:48:04 +00:00
eopXD ffc4d3e068 [LoopNest][LoopFlatten] Change LoopFlattenPass to LoopNest pass
This patch changes LoopFlattenPass from FunctionPass to LoopNestPass.

Utilize LoopNest and let function 'Flatten' generate information from it.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D102904
2021-05-28 07:25:53 +00:00
eopXD 7952ddb21f [LoopNest][LoopFlatten] Change LoopFlattenPass to LoopNest pass
This patch changes LoopFlattenPass from FunctionPass to LoopNestPass.

Utilize LoopNest and let function 'Flatten' generate information from it.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D102904
2021-05-28 07:11:26 +00:00
Jordan Rupprecht e41aaea262 [NFC][libObject] clang-format Archive{.h,.cpp}
In preparation for D100651
2021-05-27 16:48:40 -07:00
Andrea Di Biagio 57646d38d5 [MCA] Minor changes to the InOrderIssueStage. NFC
The constructor of InOrderIssueStage no longer takes as input a reference to the
target scheduling model. The stage can always query the subtarget to obtain a
reference to the scheduling model.
The ResourceManager is no longer stored internally as a unique_ptr.
Moved a couple of method definitions to the .cpp file.
2021-05-28 00:33:59 +01:00
Andrea Di Biagio 50770d8de5 [MCA] Refactor the InOrderIssueStage stage. NFCI
Moved the logic that checks for RAW hazards from the InOrderIssueStage to the
RegisterFile.

Changed how the InOrderIssueStage keeps track of backend stalls. Stall events
are now generated from method notifyStallEvent().

No functional change intended.
2021-05-27 22:28:04 +01:00
Quinn Pham 62b5df7fe2 [PowerPC] Added multiple PowerPC builtins
This is the first in a series of patches to provide builtins for
compatibility with the XL compiler. Most of the builtins already had
intrinsics and only needed to be implemented in the front end.
Intrinsics were created for the three iospace builtins, eieio, and icbt.
Pseudo instructions were created for eieio and iospace_eieio to
ensure that nops were inserted before the eieio instruction.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D102443
2021-05-27 16:23:03 -05:00
Adrian Prantl f3869a5c32 Support stripping indirectly referenced DILocations from !llvm.loop metadata
in stripDebugInfo().  This patch fixes an oversight in
https://reviews.llvm.org/D96181 and also takes into account loop
metadata pointing to other MDNodes that point into the debug info.

rdar://78487175

Differential Revision: https://reviews.llvm.org/D103220
2021-05-27 13:23:33 -07:00
Saleem Abdulrasool 4cc5a97101 MC: mark `dump` with `LLVM_DUMP_METHOD`
Mark the `ELFRelocationEntry::dump` method as `LLVM_DUMP_METHOD` to
annotate it properly as used to prevent the function being dead stripped
away.  This allows use of `dump` in the debugger.  This is purely to
improve the developer experience.
2021-05-27 10:47:39 -07:00
maekawatoshiki 2165360003 [LoopUnrollAndJam] Change LoopUnrollAndJamPass to LoopNest pass
This patch changes LoopUnrollAndJamPass from FunctionPass to LoopNest pass.
The next patch will utilize LoopNest to effectively handle loop nests.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D99149
2021-05-28 01:17:23 +09:00
Fraser Cormack 5a80dc4988 [VP][SelectionDAG] Add a target-configurable EVL operand type
This patch adds a way for the target to configure the type it uses for
the explicit vector length operands of VP SDNodes. The type must be a
legal integer type (there is still no target-independent legalization of
this operand) and must currently be at least as big as i32, the type
used by the IR intrinsics. An implicit zero-extension takes place on
targets which choose a larger type. All VP nodes should be created with
this type used for the EVL operand.

This allows 64-bit RISC-V to avoid custom legalization of all VP nodes,
keeping them in their target-independent form for that bit longer.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D103027
2021-05-27 15:27:36 +01:00
Mats Petersson 9091ecdae0 [OpenMP]Add support for workshare loop modifier in lowering
When lowering the dynamic, guided, auto and runtime types of scheduling,
there is an optional monotonic or non-monotonic modifier. This patch
adds support in the OMP IR Builder to pass this down to the runtime
functions.

Also implements tests for the variants.

Differential Revision: https://reviews.llvm.org/D102008
2021-05-27 15:33:05 +01:00
Simon Giesecke 5f2d4b23b4 Add --quiet option to llvm-gsymutil to suppress output of warnings.
Differential Revision: https://reviews.llvm.org/D102829
2021-05-27 12:36:34 +00:00
Mats Petersson 86627be233 Revert "[OpenMP]Add support for workshare loop modifier in lowering"
This reverts commit ea4c5fb04c.
2021-05-27 13:09:47 +01:00
Mats Petersson ea4c5fb04c [OpenMP]Add support for workshare loop modifier in lowering
When lowering the dynamic, guided, auto and runtime types of scheduling,
there is an optional monotonic or non-monotonic modifier. This patch
adds support in the OMP IR Builder to pass this down to the runtime
functions.

Also implements tests for the variants.

Differential Revision: https://reviews.llvm.org/D102008
2021-05-27 12:28:27 +01:00
Amara Emerson 9f39ba13b5 [GlobalISel] Implement splitting of G_SHUFFLE_VECTOR.
Thhis is a port from the DAG legalization. We're still missing some of the
canonicalizations of shuffles but it's a start.

Differential Revision: https://reviews.llvm.org/D102828
2021-05-27 00:28:38 -07:00
Hasyimi Bahrudin 8d25762720 Fix non-global-value-max-name-size not considered by LLParser
`non-global-value-max-name-size` is used by `Value` to cap the length of local value name. However, this flag is not considered by `LLParser`, which leads to unexpected `use of undefined value error`. The fix is to move the responsibility of capping the length to `ValueSymbolTable`.

The test is the one provided by [[ https://bugs.llvm.org/show_bug.cgi?id=45899 | Mikael in the bug report ]].

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D102707
2021-05-27 04:20:03 +00:00
Yevgeny Rouban 4d26f41f76 [RS4GC] Introduce intrinsics to get base ptr and offset
There can be a need for some optimizations to get (base, offset)
for any GC pointer. The base can be calculated by generating
needed instructions as it is done by the
RewriteStatepointsForGC::findBasePointer() function. The offset
can be calculated in the same way. Though to not expose the base
calculation and to make the offset calculation as simple as
ptrtoint(derived_ptr) - ptrtoint(base_ptr), which is illegal
outside RS4GC, this patch introduces 2 intrinsics:

 @llvm.experimental.gc.get.pointer.base(%derived_ptr)
 @llvm.experimental.gc.get.pointer.offset(%derived_ptr)

These intrinsics are inlined by RS4GC along with generation of
statepoint sequences.

With these new intrinsics the GC parseable lowering for atomic
memcpy intrinsics (6ec2c5e402)
could be implemented as a separate pass.

Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D100445
2021-05-27 09:14:14 +07:00
Jessica Paquette 324af79dbc [GlobalISel] Don't emit lost debug location remarks when legalizing tail calls
There were a bunch of lost debug location remarks that show up when legalizing
tail calls on AArch64.

This would happen because we drop the return in the block where we emit the
tail call. So, we end up dropping the debug location, which makes the
LostDebugLocObserver report a missing debug location.

Although it's *true* that we lose these debug locations, this isn't
a particularly useful remark. We expect to drop these debug locations when
emitting tail calls. Suppressing remarks in this case is preferable, since the
amount of noise could hide actual debug location related bugs.

To do this, I just plumbed the LostDebugLocObserver through the relevant
LegalizerHelper functions. This is the only case I can think of where we need
the LostDebugLocObserver in the LegalizerHelper. So, rather than storing it
in the LegalizerHelper proper and mucking around with the constructors, I
figured it'd be cleanest to take the simplest path for now.

This clears up ~20 noisy lost debug location remarks on CTMark in AArch64 at
-Os.

Differential Revision: https://reviews.llvm.org/D103128
2021-05-26 17:16:11 -07:00
Jacob Hegna 1494fa6943 Update documentation for InlineModel features.
Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D103193
2021-05-26 12:52:28 -07:00
Jeremy Morse 8496fc2ec8 [DebugInstrRef][1/3] Track PHI values through register allocation
This patch introduces "DBG_PHI" instructions, a marker of where a PHI
instruction used to be, before PHI elimination. Under the instruction
referencing model, we want to know where every value in the function is
defined -- and a PHI, even if implicit, is such a place.

Just like instruction numbers, we can use this to identify a value to be
used as a variable value, but we don't need to know what instruction
defines that value, for example:

bb1:
   DBG_PHI $rax, 1
   [... more insts ... ]
bb2:
   DBG_INSTR_REF 1, 0, !1234, !DIExpression()

This specifies that on entry to bb1, whatever value is in $rax is known
as value number one -- and the later DBG_INSTR_REF marks the position
where variable !1234 should take on value number one.

PHI locations are stored in MachineFunction for the duration of the
regalloc phase in the DebugPHIPositions map. The map is populated by
PHIElimination, and then flushed back into the instruction stream by
virtregrewriter. A small amount of maintenence is needed in
LiveDebugVariables to account for registers being split, but only for
individual positions, not for entire ranges of blocks.

Differential Revision: https://reviews.llvm.org/D86812
2021-05-26 20:24:00 +01:00
Philip Reames ff08c3468f [SCEV] Compute trip multiple for multiple exit loops
This patch implements getSmallConstantTripMultiple(L) correctly for multiple exit loops. The previous implementation was both imprecise, and violated the specified behavior of the method. This was fine in practice, because it turns out the function was both dead in real code, and not tested for the multiple exit case.

Differential Revision: https://reviews.llvm.org/D103189
2021-05-26 11:52:25 -07:00
Heejin Ahn 5dd86aadf0 [WebAssembly] Add TargetInstrInfo::getCalleeOperand
DwarfDebug unconditionally assumes for all call instructions the 0th
operand is the callee operand, which seems to be true for other targets,
but not for WebAssembly. This adds `TargetInstrInfo::getCallOperand`
method whose default implementation returns `getOperand(0)` and makes
WebAssembly overrides it to use its own utility method to get the callee
operand.

This also fixes an existing bug in `WebAssembly::getCalleeOp`, which was
uncovered by this CL.

Reviewed By: dschuff, djtodoro

Differential Revision: https://reviews.llvm.org/D102978
2021-05-26 11:43:59 -07:00
Philip Reames 9306bb638f [SCEV] Generalize getSmallConstantTripCount(L) for multiple exit loops
This came up in review for another patch, see https://reviews.llvm.org/D102982#2782407 for full context.

I've reviewed the callers to make sure they can handle multiple exit loops w/non-zero returns.  There's two cases in target cost models where results might change (Hexagon and PowerPC), but the results looked legal and reasonable.  If a target maintainer wishes to back out the effect of the costing change, they should explicitly check for multiple exit loops and handle them as desired.

Differential Revision: https://reviews.llvm.org/D103182
2021-05-26 11:18:25 -07:00
Fangrui Song 73a1179535 [llvm-mc] Add -M to replace -riscv-no-aliases and -riscv-arch-reg-names
In objdump, many targets support `-M no-aliases`.  Instead of having a
`-*-no-aliases` for each target when LLVM adds the support, it makes more sense
to introduce objdump style `-M`.

-riscv-arch-reg-names is removed. -riscv-no-aliases has too many uses and thus is retained for now.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D103004
2021-05-26 10:43:32 -07:00
Philip Reames 921d3f7af0 [SCEV] Add a utility for converting from "exit count" to "trip count"
(Mostly as a logical place to put a comment since this is a reoccuring confusion.)
2021-05-26 10:41:49 -07:00
Philip Reames fb14577d0c [SCEV] Extract out a helper for computing trip multiples 2021-05-26 10:15:03 -07:00
Philip Reames 9cc2181ec3 [unroll] Use value domain for symbolic execution based cost model
The current full unroll cost model does a symbolic evaluation of the loop up to a fixed limit. That symbolic evaluation currently simplifies to constants, but we can generalize to arbitrary Values using the InstructionSimplify infrastructure at very low cost.

By itself, this enables some simplifications, but it's mainly useful when combined with the branch simplification over in D102928.

Differential Revision: https://reviews.llvm.org/D102934
2021-05-26 08:41:25 -07:00
Jonas Paulsson d058262b14 [SystemZ] Support i128 inline asm operands.
Support virtual, physical and tied i128 register operands in inline assembly.

i128 is on SystemZ not really supported and is not a legal type and generally
such a value will be split into two i64 parts. There are however some
instructions that require a pair of two GPR64 registers contained in the GR128
bit reg class, which is untyped.

For inline assmebly operands, it proved to be very cumbersome to first follow
the general behavior of splitting an i128 operand into two parts and then
later rebuild the INLINEASM MI to have one GR128 register. Instead, some
minor common code changes were made to SelectionDAGBUilder to only create one
GR128 register part to begin with. In particular:

- getNumRegisters() now has an optional parameter "RegisterVT" which is
  passed by AddInlineAsmOperands() and GetRegistersForValue().

- The bitcasting in GetRegistersForValue is not performed if RegVT is
  Untyped.

- The RC for a tied use in AddInlineAsmOperands() is now computed either from
  the tied def (virtual register), or by getMinimalPhysRegClass() (physical
  register).

- InstrEmitter.cpp:EmitCopyFromReg() has been fixed so that the register
  class (DstRC) can also be computed for an illegal type.

In the SystemZ backend getNumRegisters(), splitValueIntoRegisterParts() and
joinRegisterPartsIntoValue() have been implemented to handle i128 operands.

Differential Revision: https://reviews.llvm.org/D100788

Review: Ulrich Weigand
2021-05-26 10:08:32 -05:00