Commit Graph

140299 Commits

Author SHA1 Message Date
Serguei Katkov 75d0e0cd5f [IRCE] consolidate profitability check
Use BFI if it is available and BPI otherwise.
This is a promised follow-up after D89541.

Reviewers: ebrevnov, mkazantsev
Reviewed By: ebrevnov
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D89773
2020-10-22 11:26:45 +07:00
Xiang1 Zhang 7c3fea7721 [X86] Support customizing stack protector guard
Reviewed By: nickdesaulniers, MaskRay

Differential Revision: https://reviews.llvm.org/D88631
2020-10-22 10:08:14 +08:00
Craig Topper 9e884169a2 [FPEnv][X86][SystemZ] Use different algorithms for i64->double uint_to_fp under strictfp to avoid producing -0.0 when rounding toward negative infinity
Some of our conversion algorithms produce -0.0 when converting unsigned i64 to double when the rounding mode is round toward negative. This switches them to other algorithms that don't have this problem. Since it is undefined behavior to change rounding mode with the non-strict nodes, this patch only changes the behavior for strict nodes.

There are still problems with unsigned i32 conversions too which I'll try to fix in another patch.

Fixes part of PR47393

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D87115
2020-10-21 18:12:54 -07:00
Zequan Wu 2f29341114 Revert "Revert "SimplifyCFG: Clean up optforfuzzing implementation""
This reverts commit 716f7636e1.
2020-10-21 17:08:56 -07:00
Zequan Wu 716f7636e1 Revert "SimplifyCFG: Clean up optforfuzzing implementation"
See discussion: https://reviews.llvm.org/D89590
This reverts commit cdd006eec9.
2020-10-21 16:56:32 -07:00
Gaurav Jain 4634ad6c0b [NFC] Set return type of getStackPointerRegisterToSaveRestore to Register
Differential Revision: https://reviews.llvm.org/D89858
2020-10-21 16:19:38 -07:00
David Blaikie a67d164a82 Revert several changes related to llvm-symbolizer exiting non-zero on failure.
Seems users have enough different uses of the symbolizer where they
might have unknown binaries and offsets such that "best effort" behavior
is all that's expected of llvm-symbolizer - so even erroring on unknown
executables and out of bounds offsets might not be suitable.

This reverts commit 1de0199748.
This reverts commit a7b209a6d4.
This reverts commit 338dd138ea.
2020-10-21 15:21:44 -07:00
Quentin Colombet ee6abef532 [ValueTracking] Interpret GEPs as a series of adds multiplied by the related scaling factor
Prior to this patch, computeKnownBits would only try to deduce trailing zeros
bits for getelementptrs. This patch adds the logic to treat geps as a series
of add * scaling factor.

Thanks to this patch, using a gep or performing an address computation
directly "by hand" (ptrtoint followed by adds and mul followed by inttoptr)
offers the same computeKnownBits information.

Previously, the "by hand" approach would have given more information.

This is related to https://llvm.org/PR47241.

Differential Revision: https://reviews.llvm.org/D86364
2020-10-21 15:07:04 -07:00
Arthur Eubanks 8d9466a385 [BlockExtract][NewPM] Port -extract-blocks to NPM
Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D89015
2020-10-21 12:51:11 -07:00
Arthur Eubanks aa6c305344 [LowerMatrixIntrinsics][NewPM] Fix PreservedAnalyses result
PreservedCFGCheckerInstrumentation was saying that LowerMatrixIntrinsics
didn't properly preserve CFG even though it claimed to. The legacy pass
says it doesn't. Match the legacy pass's preserved analyses.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D89175
2020-10-21 12:42:16 -07:00
Artur Pilipenko e8cce5ad89 [RS4GC] NFC. Preparatory refactoring to make GC parseable memcpy
For GC parseable element atomic memcpy/memmove we'll need to
shuffle statepoint arguments. Make it possible by storing the
arguments as Value *, not Use *.
2020-10-21 12:38:20 -07:00
Arthur Eubanks 958abe0180 [NFC] Clean up always false variables
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D89023
2020-10-21 10:54:55 -07:00
Evgeny Leviant bf9edcb6fd [ARM][SchedModels] Convert IsLdrAm3RegOffPred to MCSchedPredicate
Differential revision: https://reviews.llvm.org/D89876
2020-10-21 20:49:10 +03:00
Stanislav Mekhanoshin 611959f004 [AMDGPU] Fixed v_swap_b32 match
1. Fixed liveness issue with implicit kills.
2. Fixed potential problem with an indirect mov.

Fixes: SWDEV-256848

Differential Revision: https://reviews.llvm.org/D89599
2020-10-21 10:14:24 -07:00
Joe Nash f6d7832f4c [AMDGPU] Refactor SOPC & SOPP .td for extension
We use the Real vs Pseudo instruction abstraction for other
types of instructions to facilitate changes in opcode
between gpu generations.
This patch introduces that abstraction to SOPC and SOPP.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D89738

Change-Id: I59d53c2c7058b49d05b60350f4062a9b542d3138
2020-10-21 12:35:52 -04:00
Matt Arsenault 1ed4caff1d AMDGPU: Lower the threshold reported for maximum stack size exceeded
Check the actual maximum supported stack size for a kernel.
2020-10-21 12:06:27 -04:00
Matt Arsenault 53c43431bc AMDGPU: Propagate amdgpu-flat-work-group-size attributes
Fixes being overly conservative with the register counts in called
functions. This should try to do a conservative range merge, but for
now just clone.

Also fix not being able to functionally run the pass standalone.
2020-10-21 12:06:24 -04:00
Paul C. Anagnostopoulos dfd6b69e01 [ARM] [TableGen] Clean up !if(!eq(boolean, 1) and related booleans
Differential Revision: https://reviews.llvm.org/D89822
2020-10-21 09:52:45 -04:00
Jeremy Morse 537f0fbe82 [DebugInfo] Follow up c521e44def with an API improvement
As mentioned post-commit in D85749, the 'substituteDebugValuesForInst'
method added in c521e44def would be better off with a limit on the
number of operands to substitute. This handles the common case of
"substitute the first operand between these two differing instructions",
or possibly up to N first operands.
2020-10-21 14:45:55 +01:00
Jonas Paulsson 1606755da0 [SystemZ] Mark unsaved argument R6 as live throughout function.
For historical reasons, the R6 register is a callee-saved argument
register. This means that if it is used to pass an argument to a function
that does not clobber it, it is live throughout the function.

This patch makes sure that in this special case any kill flags of it are
removed.

Review: Ulrich Weigand, Eli Friedman

Differential Revision: https://reviews.llvm.org/D89451
2020-10-21 14:38:59 +02:00
Kirill Bobyrev 96685faf6d
[llvm] Use early exits and get rid of if-return-else-return pattern; NFC
https://llvm.org/docs/CodingStandards.html#use-early-exits-and-continue-to-simplify-code

Reviewed By: kadircet

Differential Revision: https://reviews.llvm.org/D89857
2020-10-21 14:18:42 +02:00
Simon Pilgrim 7b4a828452 [InstCombine] foldOrOfICmps - use m_Specific instead of explicit comparisons. NFCI. 2020-10-21 11:53:45 +01:00
Simon Pilgrim 88523f6f4b [DAG] getNode(ISD::EXTRACT_SUBVECTOR) Drop unnecessary N2C null check - we assert that this isn't null and have already used the pointer. NFCI.
Fixes cppcheck + null dereference warning.
2020-10-21 11:53:44 +01:00
Nicholas Guy 9a2d2bedb7 Add "SkipDead" parameter to TargetInstrInfo::DefinesPredicate
Some instructions may be removable through processes such as IfConversion,
however DefinesPredicate can not be made aware of when this should be considered.
This parameter allows DefinesPredicate to distinguish these removable instructions
on a per-call basis, allowing for more fine-grained control from processes like
ifConversion.

Renames DefinesPredicate to ClobbersPredicate, to better reflect it's purpose

Differential Revision: https://reviews.llvm.org/D88494
2020-10-21 11:52:47 +01:00
Sven van Haastregt bfc961aeb2 [TargetLowering] Check boolean content when folding bit compare
Updates an optimization that relies on boolean contents being either 0
or 1 to properly check for this before triggering.

The following:
  (X & 8) != 0 --> (X & 8) >> 3
Produces unexpected results when a boolean 'true' value is represented
by negative one.

Patch by Erik Hogeman.

Differential Revision: https://reviews.llvm.org/D89390
2020-10-21 11:46:55 +01:00
Sebastian Neubauer 5290f50e44 [AMDGPU] Fix off by one in assert
D89217 did not subtract one when accessing SubRegFromChannelTable in one
place.

Differential Revision: https://reviews.llvm.org/D89804
2020-10-21 12:37:52 +02:00
Florian Hahn 88241ffb56 [Passes] Move ADCE before DSE & LICM.
The adjustment seems to have very little impact on optimizations.
The only binary change with -O3 MultiSource/SPEC2000/SPEC2006 on X86 is
in consumer-typeset and the size there actually decreases by -0.1%, with
not significant changes in the stats.

On its own, it is mildly positive in terms of compile-time, most likely
due to LICM & DSE having to process slightly less instructions. It
should also be unlikely that DSE/LICM make much new code dead.

http://llvm-compile-time-tracker.com/compare.php?from=df63eedef64d715ce1f31843f7de9c11fe1e597f&to=e3bdfcf94a9eeae6e006d010464f0c1b3550577d&stat=instructions

With DSE & MemorySSA, it gives some nice compile-time improvements, due
to the fact that DSE can re-use the PDT from ADCE, if it does not make
any changes:

http://llvm-compile-time-tracker.com/compare.php?from=15fdd6cd7c24c745df1bb419e72ff66fd138aa7e&to=481f494515fc89cb7caea8d862e40f2c910dc994&stat=instructions

Reviewed By: xbolva00

Differential Revision: https://reviews.llvm.org/D87322
2020-10-21 10:30:56 +01:00
Jay Foad f6a5699c6c [AMDGPU][TableGen] Make more use of !ne !not !and !or. NFC. 2020-10-21 09:56:43 +01:00
Craig Topper d4d0b41a82 [X86] Remove period from end of error message in assembler
Addresses post-commit feedback from D89837.
2020-10-21 00:43:23 -07:00
David Sherwood 5b17b323a6 [SVE][CodeGen] Replace use of TypeSize comparison operator in CreateStackTemporary
We were previously relying upon the TypeSize comparison operators to
obtain the maximum size of two types, however use of such operators is
being deprecated in favour of making the caller aware that it could
be dealing with scalable vector types. I have changed the code to assert
that the two types have the same scalable property and thus we can
simply take the maximum of the known minimum sizes instead.

Differential Revision: https://reviews.llvm.org/D88563
2020-10-21 08:31:36 +01:00
Martin Storsjö 4de215ff18 Revert "[InstCombine] Add or((icmp ult/ule (A + C1), C3), (icmp ult/ule (A + C2), C3)) uniform vector support"
Also revert "[InstCombine] foldOrOfICmps - use m_Specific instead of
explicit comparisons. NFCI." to make the primarily intended revert
work.

This reverts commits ce13549761 and
e372a5f86f.

This commit caused failed asserts e.g. like this:

$ cat repro.cpp
bool a(char b) {
  return b >= '0' && b <= '9' || (b | 32) >= 'a' && (b | 32) <= 'z';
$ clang++ -target x86_64-linux-gnu -c -O2 repro.cpp
clang++: ../include/llvm/ADT/APInt.h:1151: bool llvm::APInt::operator==(const
llvm::APInt&) const: Assertion `BitWidth == RHS.BitWidth && "Comparison
requires equal bit widths"' failed.
2020-10-21 09:47:18 +03:00
Max Kazantsev bed02fa8b0 Revert "[SCEV] Prove implications of different type via truncation"
This reverts commit 80852a4f2f.

Test is now broken because underlying required patch was also reverted SUDDENLY.
2020-10-21 13:03:46 +07:00
Max Kazantsev 80852a4f2f [SCEV] Prove implications of different type via truncation
When we need to prove implication of expressions of different type width,
the default strategy is to widen everything to wider type and prove in this
type. This does not interact well with AddRecs with negative steps and
unsigned predicates: such AddRec will likely not have a `nuw` flag, and its
`zext` to wider type will not be an AddRec. In contraty, `trunc` of an AddRec
in some cases can easily be proved to be an `AddRec` too.

This patch introduces an alternative way to handling implications of different
type widths. If we can prove that wider type values actually fit in the narrow type,
we truncate them and prove the implication in narrow type.

Differential Revision: https://reviews.llvm.org/D89548
Reviewed By: fhahn
2020-10-21 12:53:22 +07:00
Craig Topper 79a69f558f [X86] Error on using h-registers with REX prefix in the assembler instead of leaving it to a fatal error in the encoder.
Using a fatal error is bad for user experience.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D89837
2020-10-20 21:35:44 -07:00
Fangrui Song d9f91a3d14 Revert D89381 "[SCEV] Recommit "Use nw flag and symbolic iteration count to sharpen ranges of AddRecs", attempt 2"
This reverts commit a10a64e7e3.

It broke polly/test/ScopInfo/NonAffine/non-affine-loop-condition-dependent-access_3.ll
The difference suggests that this may be a serious issue.
2020-10-20 21:03:58 -07:00
Mircea Trofin 5e731625f3 [NFC][MC] Use [MC]Register in MachineVerifier
Differential Revision: https://reviews.llvm.org/D89815
2020-10-20 20:42:35 -07:00
Geoffrey Martin-Noble c17ae2916c Remove unnecessary header include which violates layering
This was introduced in https://reviews.llvm.org/D89774, but I don't
think it should be necessary.

Reviewed By: TaWeiTu, aeubanks

Differential Revision: https://reviews.llvm.org/D89843
2020-10-20 20:14:03 -07:00
Carl Ritson 324a15cead [AMDGPU][NFC] Fix missing size in comment 2020-10-21 11:38:21 +09:00
Cyndy Ishida acb33cba6d [llvm] Fix ODRViolations for VersionTuple YAML specializations NFC
It appears for Swift there was confusing errors when trying to parse APINotes, when libAPINotes and libInterfaceStub are linked, they both export symbol
`__ZN4llvm4yaml7yamlizeINS_12VersionTupleEEENSt3__19enable_ifIXsr16has_ScalarTraitsIT_EE5valueEvE4typeERNS0_2IOERS5_bRNS0_12EmptyContextE`, and discovered
same symbol defined within llvm-ifs.

This consolidates the boilerplate into YAMLTraits and defers the specific validation in reading the whole input.
fixes: rdar://problem/70450563

Reviewed By: phosek, dblaikie

Differential Revision: https://reviews.llvm.org/D89764
2020-10-20 18:29:15 -07:00
Hubert Tong 134ffa8138 NFC: Fix -Wsign-compare warnings on 32-bit builds
Comparing 32-bit `ptrdiff_t` against 32-bit `unsigned` results in
`-Wsign-compare` warnings for both GCC and Clang.

The warning for the cases in question appear to identify an issue
where the `ptrdiff_t` value would be mutated via conversion to an
unsigned type.

The warning is resolved by using the usual arithmetic conversions to
safely preserve the value of the `unsigned` operand while trying to
convert to a signed type. Host platforms where `unsigned` has the same
width as `unsigned long long` will need to make a different change, but
using an explicit cast has disadvantages that can be avoided for now.

Reviewed By: dantrushin

Differential Revision: https://reviews.llvm.org/D89612
2020-10-20 20:52:10 -04:00
Austin Kerbow ebdcef20ce [AMDGPU] Avoid inserting noops during scheduling
Passes that are run after the post-RA scheduler may insert instructions like
waitcnt which eliminate the need for certain noops. After this patch the
scheduler is still aware of possible latency from hazards but noops will
not be inserted until the dedicated hazard recognizer pass is run.

Depends on D89753.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D89754
2020-10-20 17:11:36 -07:00
Austin Kerbow 37d907899f [HazardRec] Allow inserting multiple wait-states simultaneously
If a target can encode multiple wait-states into a noop allow emitting such
instructions directly.

Reviewed By: rampitec, dmgreen

Differential Revision: https://reviews.llvm.org/D89753
2020-10-20 17:03:47 -07:00
Tony 1bc7bfffdb [AMDGPU] Optimize waitcnt insertion for flat memory operations
Change waitcnt insertion to check the memory operand tokens to see if
flat memory operations access VMEM in the same way it does to check if
accessing LDS. This avoids adding waitcnt for counters for address
spaces that are not accessed.

In addition, only generate the pessimistic waitcnt 0 if a flat memory
operation appears to access both VMEM and LDS.

This benefits flat memory operations that explicitly specify the
address space as GLOBAL or LOCAL.

Differential Revision: https://reviews.llvm.org/D89618
2020-10-20 22:55:12 +00:00
Craig Topper 1298252f80 [X86] Move 'int $3' -> 'int3' handling in the assembler to processInstruction.
Instead of handling before parsing, just fix it after parsing.
2020-10-20 15:22:00 -07:00
Craig Topper 702aae368a [X86] Move 's{hr,ar,hl} , <op>' to 'shift <op>' optimization in the assembler into processInstruction.
Instead of detecting the mnemonic and hacking the operands before
parsing. Just fix it up after parsing.
2020-10-20 15:20:46 -07:00
Kazu Hirata 96f372c1e7 [AsmWriter] Construct SlotTracker with the function
This patch teaches BasicBlock::print to construct an instance of
SlotTracker with the containing function.

Without this patch, we dump:

*** IR Dump After LoopInstSimplifyPass ***
; Preheader:
  br label %1

; Loop:
<badref>:                                         ; preds = %1, %0
  br label %1

Note "<badref>" above.  This happens because BasicBlock::print calls:

  SlotTracker SlotTable(this->getModule());

Note that this constructor does not add the contents of functions to
the slot table.  That is, basic blocks are left unnumbered.

This patch fixes the problem by switching to:

  SlotTracker SlotTable(this->getParent());

which does add the contents of the Module and the function,
this->getParent(), to the slot table.

Differential Revision: https://reviews.llvm.org/D89567
2020-10-20 15:01:40 -07:00
Paul C. Anagnostopoulos 332ff48dee [AMDGPU] [TableGen] Clean up !if(!eq(boolean, 1) and related booleans
Differential Revision: https://reviews.llvm.org/D89796
2020-10-20 16:23:15 -04:00
Shimin Cui 95bda510fb [ConstantFold] Fold the comparison of bitcasted global values
This is to simplify icmp instructions in the form like:

%cmp = icmp eq i32 (i8*, i8*)* bitcast (i32 (i32**, i32**)* @f32 to i32
%(i8*, i8*)), bitcast (i32 (i64**, i64**) @f64 to i32 (i8*, i8*)*)

Here @f32 and @f64 are two functions.

Differential Revision: https://reviews.llvm.org/D87850
2020-10-20 12:41:49 -07:00
David Stenberg 0c0fcea557 Handle value uses wrapped in metadata for the use-list order
When generating the use-list order, also consider value uses that are
operands which are wrapped in metadata; e.g. llvm.dbg.value operands.

This fixes PR36778. The test case is based on the reproducer from that
report.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D53758
2020-10-20 20:05:59 +02:00
Nicolai Hähnle 848a68a032 DomTree: Extract (mostly) read-only logic into type-erased base classes
Avoid having to instantiate and compile a subset of the dominator tree logic
separately for each node type. More importantly, this allows generic
algorithms to be built on top of dominator trees without writing them as
templates -- such algorithms can now use opaque CfgBlockRef and
CfgInterface instead.

A type-erased implementation of dominator trees could be written in
terms of CfgInterface as well, but doing so would change the current
trade-off: it would slightly reduce code size at the cost of a slight
runtime overhead.

This patch does not change the trade-off, as it only does type-erasure
where basic blocks can be treated in a fully opaque way, i.e. it only
moves methods that don't require iteration over CFG successors and
predecessors.

v5:
- rename generic_{begin,end,children} back without the generic_ prefix
  and refer explictly to base class methods in NewGVN, which wants to
  mutate the order of dominator tree node children directly

v6:
- style change: iDom -> idom; it's arguable whether this is really
  invalid, since it is actually standard camelCase, but clang-tidy
  complains about it so... *shrug*
- rename {to,from}Generic -> {wrap,unwrap}Ref

Change-Id: Ib860dc04cf8bb093d8ed00be7def40d662213672

Differential Revision: https://reviews.llvm.org/D83089
2020-10-20 19:53:07 +02:00