Commit Graph

383542 Commits

Author SHA1 Message Date
Joe Nash 538bda0b80 [AMDGPU] Refactor DPPCombine
NFC. Extract IsShrinkable into a helper function, and
make Subtarget a member variable.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D99099

Change-Id: If4bc97a88a9ae4eb1df47e717345d46a6ed515bf
2021-03-23 11:53:53 -04:00
Craig Topper 839a46d88f [RISCV] Use selectImm for RV32. NFC
Previously we used selectImm for RV64 and isel patterns for
RV32. This should be NFC, but will allow RV32 and RV64 to share
improvements in the future. For example, it might be useful to
use BSETI from Zbs to make single bit constants.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D98877
2021-03-23 08:57:15 -07:00
Jay Foad fc7e3e7dd9 [AMDGPU] Set SchedRW on real instructions
Coyp SchedRW from pseudos to real instructions so that llvm-mca has
access to it. This is NFC for normal compiler codegen, which schedules
pseudos not real instructions.

Add an llvm-mca test for some high latency double-precision instructions
as a smoke test.

Differential Revision: https://reviews.llvm.org/D99187
2021-03-23 15:38:11 +00:00
Stefan Gränitz d9069dd9b5 [lli] Workaround missing architecture support in LazyCallThroughManager for non-lazy mode
Next attempt to prevent PowerPC/s390x/etc. failures when landing D98931.
2021-03-23 16:37:15 +01:00
Sanjay Patel 9d45daf465 [PhaseOrdering] add AVX attribute to make test less fragile; NFC
This doesn't change anything currently, but as discussed in
D98981 and D98152, some tests may fail to vectorize because
the cost model becomes more accurate as we switch over to
using min/max intrinsics.
2021-03-23 11:34:33 -04:00
Florian Hahn 7fb6d9f958
[LV] Add 'fast' flag to test to make sure it will be vectorized.
This makes the test more robust with respect to when LV checks if the
floating point instructions in a loop can be vectorized.
2021-03-23 15:32:23 +00:00
Roman Lebedev b5822026dd
[SimplifyCFG] 'Fold branch to common dest': don't overestimate the cost
`FoldBranchToCommonDest()` has a certain budget (`-bonus-inst-threshold=`)
for bonus instruction duplication. And currently it calculates the cost
as-if it will actually duplicate into each predecessor.

But ignoring the budget, it won't always duplicate into each predecessor,
there are some correctness and profitability checks.
So when calculating the cost, we should first check into which blocks
will we *actually* duplicate, and only then use that block count
to do budgeting.
2021-03-23 18:30:26 +03:00
Roman Lebedev a866f72eb2
[NFC][SimplifyCFG] 'Fold branch to common dest': add test for cost overestimation
We should not count the cost of duplication into predecessors into which
we won't ultimately duplicate.
2021-03-23 18:30:26 +03:00
Andrzej Warzynski af8056889a [flang][cmake] Improve how CLANG_DIR is handled
* Added a sanity check with `Clang_FOUND` to verify that find_package
succeeded
* Made sure that find_package won't use any of CMake's standard paths
to guarantee that only the path provided with CLANG_DIR is considered
(implemented through NO_DEFAULT_PATH)
* Made the call to get_filename_component more explicit (so that it is
clear what the base directory is)
* Updated comments to clarify what CLANG_DIR means

Differential Revision: https://reviews.llvm.org/D99088
2021-03-23 15:14:51 +00:00
Frederik Gossen 94ef248d7b Revert "[MLIR] Canonicalize `shape.assuming` op to yield only inner values"
This reverts commit 5f8acd4fd2.
2021-03-23 16:05:55 +01:00
Andrea Di Biagio f5bdc88e4d [MCA] Improved handling of negative read-advance cycles.
Before this patch, register writes were always invalidated by the
RegisterFile at instruction commit stage. So,
the RegisterFile was often losing the knowledge about the `execute
cycle` of writes already committed. While this was not problematic
for non-delayed reads, this was sometimes leading to inaccurate read
latency computations in the presence of negative read-advance cycles.

This patch fixes the issue by changing how the RegisterFile component
internally keeps track of the `execute cycle` information of each
write. On every instruction executed, the RegisterFile gets notified
by the RetireStage, so that it can internally record the execute
cycle of each executed write.
The `execute cycle` information is stored within WriteRef itself, and
it is not invalidated when the write is committed.
2021-03-23 14:47:23 +00:00
Roman Lebedev 514bc01ca3
[SimplifyCFG] FoldBranchToCommonDest(): properly handle same-block external uses (PR49510/PR49689)
We clone bonus instructions to the end of the predecessor block,
and then use `SSAUpdater::RewriteUseAfterInsertions()`.
But that only deals with the cases where the use-to-be-rewritten
are either in different block from the def, or come after the def.

But in some loop cases, the external use may be in the beginning of
predecessor block, before the newly cloned bonus instruction.
`SSAUpdater::RewriteUseAfterInsertions()` does not deal with that.
Notably, the external use can't happen to be both in the same block
and *after* the newly-cloned instruction, because of the fold preconditions.

To properly handle these cases, when the use is in the same block,
we should instead use `SSAUpdater::RewriteUse()`.
TBN, they do the same thing for PHI users.

Fixes https://bugs.llvm.org/show_bug.cgi?id=49510
Likely Fixes https://bugs.llvm.org/show_bug.cgi?id=49689
2021-03-23 17:37:28 +03:00
Timm Bäder bc6b139392 [clang][parser] Don't prohibit attributes on objc @try/@throw
This line has a TODO comment, but the answer to it seems to be "no"
given that clang itself uses attributes on @try statements in its tests.

This ProhibitAttributes() statement is also dead code since
ProhibitAttributs() does not handle GNU attributes at the moment but
those are the only attributes valid in objc.

Differential Revision: https://reviews.llvm.org/D97371
2021-03-23 15:26:25 +01:00
Stefan Gränitz 0ef51db5a4 Revert "[Orc] Allow OrcGenericABI variant of LazyCallThroughManager"
This reverts commit 61974268269f96b672a50eac40a5a8eeb4acd6d3.
2021-03-23 15:23:33 +01:00
Fraser Cormack feff66a082 [RISCV] Further optimize BUILD_VECTORs with repeated elements
This patch builds upon the initial BUILD_VECTOR work introduced in
D98700. It further optimizes the lowering of BUILD_VECTOR by using
VSELECT operations to effectively insert repeated elements into the
vector with relatively few instructions. This allows us to optimize more
BUILD_VECTORs without significantly increasing the size of the generated
code.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98969
2021-03-23 14:14:48 +00:00
Sanjay Patel 1bf8f9e228 [SimplifyCFG] use profile metadata to refine merging branch conditions
2nd try (original: 27ae17a6b0) with fix/test for crash. We must make
sure that TTI is available before trying to use it because it is not
required (might be another bug).

Original commit message:

This is one step towards solving:
https://llvm.org/PR49336

In that example, we disregard the recommended usage of builtin_expect,
so an expensive (unpredictable) branch is folded into another branch
that is guarding it.
Here, we read the profile metadata to see if the 1st (predecessor)
condition is likely to cause execution to bypass the 2nd (successor)
condition before merging conditions by using logic ops.

Differential Revision: https://reviews.llvm.org/D98898
2021-03-23 10:19:37 -04:00
Nico Weber ed0558a09d [gn build] (manually) port d709dcc090 2021-03-23 10:13:14 -04:00
Jamie Schmeiser 64336d3421 Revert "A new option -print-on-crash that prints the IR as it was upon entering the last pass when there is a crash."
This reverts commit 9544a32287.
2021-03-23 10:09:27 -04:00
Nemanja Ivanovic 4146864735 [PowerPC][NFC] Use valid type for offset in altivec.h
We currently use signed long long instead of ptrdiff_t for offsets
in altivec.h. This has never really presented a problem because
all platforms where we use these are 64-bit. However, now that
we have 32-bit targets, we need to use a meaningful type.
2021-03-23 08:45:37 -05:00
Jamie Schmeiser 9544a32287 A new option -print-on-crash that prints the IR as it was upon entering the last pass when there is a crash.
Summary:
The IR is saved in its print form before each pass is started and a
signal handler is registered.  If the compilation crashes, the signal
handler will print the saved IR to dbgs().  This option
can be modified using -print-module-scope to get the IR for the complete
module.  Note that this option only works with the new pass manager.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: aeubanks (Arthur Eubanks) yrouban (Yevgeny Rouban)
Differential Revision: https://reviews.llvm.org/D86657
2021-03-23 09:29:17 -04:00
serge-sans-paille e19884cd74 Introduce a generic operator to apply complex operations to BitVector
This avoids temporary and memcpy call when computing large expressions.

It's basically some kind of poor man's expression template, but it seems easier
to maintain to have a single generic `apply` call instead of the whole
expression template machinery here.

Differential Revision: https://reviews.llvm.org/D98176
2021-03-23 14:23:26 +01:00
Yvan Roux 241032a205 [llvm-symbolizer][llvm-nm] Fix AArch64 and ARM mapping symbols handling.
Exclude AArch64 mapping symbols ($x and $d) for symtab symbolization as
it was done for ARM since D95916 tom bring bots back to green state.

This is implemented by setting SF_FormatSpecific such that
llvm-symbolizer will ignore them, and use this flag to re-implement
llvm-nm --special-syms option which make it work for both targets.

Differential Revision: https://reviews.llvm.org/D98803
2021-03-23 14:17:12 +01:00
Valentin Clement d709dcc090 [openacc][openmp] Reduce number of generated file and prefer inclusion of .inc
Follow up from D92955 and D83636. This patch makes the base cpp files
OMP.cpp and ACC.cpp normal files and they now include the XXX.inc file
generated by tablegen. This reduces the number of file generated by the
DirectiveEmitter backend and makes it closer to the proposal in D83636.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D93560
2021-03-23 09:16:53 -04:00
Matt Arsenault b24436ac96 GlobalISel: Lower funnel shifts 2021-03-23 09:11:17 -04:00
Stefan Gränitz 5949bd9125 [Orc] Allow OrcGenericABI variant of LazyCallThroughManager
Apply the way createLocalIndirectStubsManagerBuilder() deals with unsupported achritectures to createLocalLazyCallThroughManager(). The returned call-through manager is dysfunctional: It runs into an unreachable as soon as a lazy JIT attempts to use it. However, this results in broader platform support for lli in default (greedy) ORC mode where no lazy materialization is required.
2021-03-23 14:08:53 +01:00
Zakk Chen 0bc1959f51 [RISCV][NFC] Fix RVV intrinsic tests.
1. Skip the temporary file
2. Test cc1 with -S to verify codegen work well. Add '-target-feature
   +m' because the backend requires it to calculate the vscaled size/offset.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D99082
2021-03-23 06:06:05 -07:00
LLVM GN Syncbot 308d40fe66 [gn build] Port 274907c0a4 2021-03-23 13:01:57 +00:00
Raphael Isemann 274907c0a4 [ASTImporter] Split out Objective-C related unit tests
This moves the two tests we have for importing Objective-C nodes to their own
file. The motivation is that this means I can add more Objective-C tests without
making the compilation time of ASTImporterTest even longer. Also it seems nice
to separate the Apple-specific stuff from the ASTImporter test.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D99162
2021-03-23 13:58:45 +01:00
Sanjay Patel 3c8473ba53 [SLP] allow matching integer min/max intrinsics as reduction ops
As noted in D98152, we need to patch SLP to avoid regressions when
we start canonicalizing to integer min/max intrinsics.
Most of the real work to make this possible was in:
7202f47508

Differential Revision: https://reviews.llvm.org/D98981
2021-03-23 08:56:44 -04:00
Luke Drummond 520f70e94d [NFC] clang-format llvm/lib/Transforms/Utils/CloneFunction.cpp
Differential Revision: https://reviews.llvm.org/D98957
2021-03-23 12:53:28 +00:00
Luke Drummond ab44ec1b22 [NFC] Minor refactor
- Give unwieldy repeated expression a name
- Use a ranged `for` basic block iterator

Reviewed by: nikic, dexonsmith
Differential Revisision: https://reviews.llvm.org/D98957
2021-03-23 12:53:28 +00:00
Luke Drummond 0448ddd169 [NFCI] cleanup CloneFunctionInto
Hoist early return for decl-only clones to before DIFinder
calculation.
Also fix an out of date assert message after invariants changed in
22a52dfddc.

Reviewed by: nikic, dexonsmith
Differential Revisision: https://reviews.llvm.org/D98957
2021-03-23 12:53:27 +00:00
Benjamin Kramer 39e36fff3d [AArch64] Fix unused variable warning 2021-03-23 13:42:14 +01:00
Fraser Cormack 38cf50bc04 [LangRef] Fix typos in the vector-type memory layout section
Reviewed By: bjope

Differential Revision: https://reviews.llvm.org/D99163
2021-03-23 12:28:50 +00:00
Nashe Mncube 5d929794a8 [llvm-opt] Bug fix within combining FP vectors
A bug was found within InstCombineCasts where a function call
is only implemented to work with FixedVectors. This caused a
crash when a ScalableVector was passed to this function.
This commit introduces a regression test which recreates the
failure and a bug fix.

Differential Revision: https://reviews.llvm.org/D98351
2021-03-23 12:13:41 +00:00
Martin Storsjö 2f18e51d8b [lldb] Silence GCC warnings about format not being a string literal in LLDB_SCOPED_TIMER
Pass "%s" as the format string literal and LLVM_PRETTY_FUNCTION as
argument to it.

Differential Revision: https://reviews.llvm.org/D99120
2021-03-23 14:11:50 +02:00
Florian Hahn e43e8e9138
[AnnotationRemarks] Use subprogram location for summary remarks.
The summary remarks are generated on a per-function basis. Using the
first instruction's location is sub-optimal for 2 reasons:
  1. Sometimes the first instruction is missing !dbg
  2. The location of the first instruction may be mis-leading.

Instead, just use the location of the function directly.
2021-03-23 12:05:41 +00:00
Kadir Cetinkaya 8f80c66bd2
[clang] Fix a crash when CTAD fails
Differential Revision: https://reviews.llvm.org/D99145
2021-03-23 13:03:30 +01:00
David Green 003fab9e8d [ARM] Additional Upper bound unrolling test. NFC 2021-03-23 12:00:40 +00:00
Florian Hahn 4ed0a5506a
[AnnotationRemarks] Add test for annotation remarks with dbg locations.
The test illustrates that we not pick the debug location from the
function directly. This will be fixed in a follow-up patch.
2021-03-23 11:52:27 +00:00
Victor Campos f22b4c7122 [ARM] Handle debug instrs in ARM Low Overhead Loop pass
In function ConvertVPTBlocks(), it is assumed that every instruction
within a vector-predicated block is predicated. This is false for debug
instructions, used by LLVM.

Because of this, an assertion failure is reached when an input contains
debug instructions inside VPT blocks. In non-assert builds, an out of
bounds memory access took place.

The present patch properly covers the case of debug instructions.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D99075
2021-03-23 11:49:06 +00:00
OCHyams faf5f1cbba [dexter] Fix DexLimitSteps when breakpoint can't be set at requested location
Using a DexLimitSteps command forces dexter to use the ConditionalController
debugger controller. At each breakpoint the ConditionalController needs to
understand which one has been hit. Prior to this patch, upon hitting a
breakpoint, dexter used the current source location to look up which requested
breakpoint had been hit.

A breakpoint may not get set at the exact location that the user (dexter)
requests. For example, if the requested breakpoint location doesn't exist
in the line table then then debuggers will (usually, AFAICT) set the breakpoint
at the next available valid breakpoint location.

This meant that, occasionally in unoptimised programs and frequently in
optimised programs, the ConditionalController was failing to determine which
breakpoint had been hit.

This is the fix:

Change the DebuggerBase breakpoint interface to use opaque breakpoint ids
instead of using source location to identify breakpoints, and update the
ConditionalController to track breakpoints instead of locations.

These now return a breakpoint id:

    add_breakpoint(self, file_, line)
    _add_breakpoint(self, file_, line)
    add_conditional_breakpoint(self, file_, line, condition)
    _add_conditional_breakpoint(self, file_, line, condition)

Replace:

    delete_conditional_breakpoint(self, file_, line, condition)
    _delete_conditional_breakpoint(self, file_, line, condition)

with:

    delete_breakpoint(self, id)

Add:

    get_triggered_breakpoint_ids(self)

A breakpoint id is guaranteed to be unique for each requested breakpoint, even
for duplicate breakpoint requests. Identifying breakpoints like this, instead
of by location, removes the possibility of mixing up requested and bound
breakpoints.

This closely matches the LLDB debugger interface so little work was required in
LLDB.py, but some extra bookkeeping is required in VisualStudio.py to maintain
the new breakpoint id semantics. No implementation work has been done in
dbgeng.py as DexLimitSteps doesn't seem to support dbgeng at the moment.

Testing
Added:
dexter/feature_tests/commands/perfect/limit_steps/limit_steps_line_mismatch.cpp

There were no unexpected failures running the full debuginfo-tests suite.

The regression tests use dbgeng on windows by default, and as mentioned above
dbgeng isn't supported yet, so I have also manually tested (i.e. without lit)
that this specific test works as expected with clang and Visual Studio 2017 on
Windows.

Reviewed By: TWeaver

Differential Revision: https://reviews.llvm.org/D98699
2021-03-23 11:33:43 +00:00
Frederik Gossen 5f8acd4fd2 [MLIR] Canonicalize `shape.assuming` op to yield only inner values
Differential Revision: https://reviews.llvm.org/D99156
2021-03-23 12:34:50 +01:00
David Sherwood d70251163f [LoopVectorize][NFC] Refactor code to use IRBuilder::CreateStepVector
In places where we create a ConstantVector whose elements are a
linear sequence of the form <start, start + 1, start + 2, ...>
I've changed the code to make use of CreateStepVector, which creates
a vector with the sequence <0, 1, 2, ...>, and a vector addition
operation. This patch is a non-functional change, since the output
from the vectoriser remains unchanged for fixed length vectors and
there are existing asserts that still fire when attempting to use
scalable vectors for vectorising induction variables.

In a later patch we will enable support for scalable vectors
in InnerLoopVectorizer::getStepVector(), which relies upon the new
stepvector intrinsic in IRBuilder::CreateStepVector.

Differential Revision: https://reviews.llvm.org/D97861
2021-03-23 11:29:05 +00:00
Frederik Gossen f368b3a029 [MLIR][Shape] Canonicalize duplicate operands in `shape.cstr_broadcastable`
Differential Revision: https://reviews.llvm.org/D99159
2021-03-23 12:23:22 +01:00
Jay Foad d42f63beeb [AMDGPU] Use non-compressed exports in a test. NFC.
I don't think there's any need for this test to use compressed exports.
Using normal exports seems a bit more straightforwards and avoids a tiny
bit of bitcasting.

Differential Revision: https://reviews.llvm.org/D99167
2021-03-23 11:18:12 +00:00
Abhina Sreeskantharajan a234d03198 [NFC] Formatting changes
This patch addresses some formatting changes from the comments in https://reviews.llvm.org/D97785.

Reviewed By: anirudhp

Differential Revision: https://reviews.llvm.org/D99072
2021-03-23 07:17:54 -04:00
Stefan Gränitz 581adb4f1a Temporarily revert "[lli] Make -jit-kind=orc the default JIT engine"
This reverts commit eaee4f2696.
2021-03-23 12:01:30 +01:00
Nemanja Ivanovic 2f782a796a [PowerPC] Add more missing overloads to altivec.h
Add overloads that perform subtraction on v1i128 that take and
produce vector unsigned char to avoid needing to use __int128.
The overloads are suffixed with _u128 and are needed for targets
where __int128 isn't supported (AIX).
2021-03-23 05:52:36 -05:00
Frederik Gossen d78374b2d3 [MLIR] Add callback builder for `shape.assuming` op
Differential Revision: https://reviews.llvm.org/D99153
2021-03-23 11:46:01 +01:00