Commit Graph

130983 Commits

Author SHA1 Message Date
George Burgess IV f8c9ceb1ce [SimplifyLibCalls] Add __strlen_chk.
Bionic has had `__strlen_chk` for a while. Optimizing that into a
constant is quite profitable, when possible.

Differential Revision: https://reviews.llvm.org/D74079
2020-02-08 11:51:00 -08:00
Nikita Popov a148b9e990 [InstCombine] Fix infinite min/max canonicalization loop (PR44541)
While D72944 also fixes https://bugs.llvm.org/show_bug.cgi?id=44541,
it does so in a more roundabout manner and there might be other
loopholes to trigger the same issue. This is a more direct fix,
that prevents the transform if the min/max is based on a
non-canonical sub X, 0 instruction.

Differential Revision: https://reviews.llvm.org/D73849
2020-02-08 20:42:17 +01:00
Craig Topper eeb63944e4 [LegalizeTypes][ARM][AArch64][PowerPC][RISCV][X86] Use BUILD_PAIR to return expanded integer results from ReplaceNodeResults instead of just returning two results.
Remove code from LegalizeTypes that allowed this to work.

We were already using BUILD_PAIR for this in some places so this
standardizes on a single way to do this.
2020-02-08 09:52:31 -08:00
Simon Pilgrim 4aa7b9cc96 [X86] X86InstComments - add FMA4 comments
These typically match the FMA3 equivalents, although the multiply operands sometimes get flipped due to the FMA3 permute variants.
2020-02-08 17:02:00 +00:00
Simon Pilgrim 10417ad2e4 [X86] Standardize BROADCAST enum names (PR31079)
Tweak EVEX implementation names so it matches the other variants by adding the 'r' prefix. Oddly some of the subvec broadcast ops already matched.
2020-02-08 16:55:00 +00:00
Nikita Popov 5b2b67be8e [InstCombine] Remove unnecessary worklist push; NFCI
This is no longer needed after d4627b90a0,
should have dropped it there...
2020-02-08 17:09:28 +01:00
Nikita Popov d4627b90a0 [InstCombine] Avoid modifying instructions in-place
As discussed on D73919, this replaces a few cases where we were
modifying multiple operands of instructions in-place with the
creation of a new instruction, which we generally prefer nowadays.

This tends to be more readable and less prone to worklist management
bugs.

Test changes are only superficial (instruction naming and order).
2020-02-08 17:05:56 +01:00
Nikita Popov 9d03b7d0d0 [InstCombine] Use swapValues(); NFC
Less code, and makes it more obvious that these operands do not
need to be added back to the worklist.
2020-02-08 16:57:28 +01:00
Nikita Popov 23db9724d0 [InstCombine] Fix infinite loop in min/max load/store bitcast combine (PR44835)
Fixes https://bugs.llvm.org/show_bug.cgi?id=44835. Skip the transform
if it wouldn't actually do anything (apart from removing and reinserting
the same instructions).

Note that the test case doesn't loop on current master anymore, only
on the LLVM 10 release branch. The issue is already mitigated on master
due to worklist order fixes, but we should fix the root cause there as well.

As a side note, we should probably assert in combineLoadToNewType()
that it does not combine to the same type. Not doing this here, because
this assertion would also be triggered in another place right now.

Differential Revision: https://reviews.llvm.org/D74278
2020-02-08 16:55:22 +01:00
Simon Pilgrim 0ed79e9b8f [X86] Standardize VPSLLDQ/VPSRLDQ enum names (PR31079)
Tweak EVEX implementation names so it matches the other variants
2020-02-08 14:54:44 +00:00
serge-sans-paille 658495e6ec Revert "Support -fstack-clash-protection for x86"
This reverts commit e229017732.

Failures:

http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/2604
http://lab.llvm.org:8011/builders/llvm-clang-win-x-aarch64/builds/4308
2020-02-08 14:26:22 +01:00
Victor Campos af2a384581 Revert "[ARM] Improve codegen of volatile load/store of i64"
This reverts commit 60e0120c91.
2020-02-08 13:18:45 +00:00
Igor Kudrin 1ea99a2ebc [DebugInfo] Allow reading an address table with a mismatched address.
This case does not look as an unrecoverable error.

Differential Revision: https://reviews.llvm.org/D74194
2020-02-08 20:00:03 +07:00
serge_sans_paille e229017732 Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This a recommit of 39f50da2a3 with better option
handling and more portable testing

Differential Revision: https://reviews.llvm.org/D68720
2020-02-08 13:31:52 +01:00
Benjamin Kramer ef83d46b6b Use heterogenous lookup for std;:map<std::string with a StringRef. NFCI. 2020-02-08 13:28:29 +01:00
Benjamin Kramer e4230a9f6c ArrayRef'ize spillCalleeSavedRegisters. NFCI. 2020-02-08 12:19:23 +01:00
Simon Pilgrim 7f5b3fa73c [X86][SSE] Add X86ISD::FRCP handling to isNegatibleForFree
Peek through X86ISD::FRCP nodes to see if there is a negatible input.
2020-02-08 10:56:27 +00:00
Simon Pilgrim 4229f12a22 [TargetLowering] Remove isDesirableToCombineBuildVectorToShuffleTruncate target hook. NFC.
This hasn't been used for years, its original implementation, D35700, had bugs that caused the reversion of most of the code, and since then x86 shuffle lowering/combining has handled most cases and can deal with the rest as well.
2020-02-08 08:55:51 +00:00
Craig Topper 2af1640f9a [LegalizeDAG][X86][AMDGPU] Use ANY_EXTEND instead of ZERO_EXTEND when promoting ISD::CTTZ/CTTZ_ZERO_UNDEF.
Summary:
For CTTZ we place a set bit just past where the non-promoted type
stopped so the extended bits won't be used for the count. For
CTTZ_ZERO_UNDEF we don't care what happens if no bits are set in
the original type and we end up counting into the extended bits.
So we can just use ANY_EXTEND for both cases.

This matches what is done in type legalization for these operations.
We make no effort to force the upper bits to zero.

Differential Revision: https://reviews.llvm.org/D74111
2020-02-07 22:25:56 -08:00
Akira Hatanaka 4dcc029edb [ObjC][ARC] Keep track of phis that have been discovered to avoid an
infinite loop

This fixes a bug introduced in 6770fbb314.

rdar://problem/59137105
2020-02-07 20:33:11 -08:00
Sam Clegg caeb6cfbc2 [WebAssembly] Fix signature of __powitf2 libcall
Add tests for @llvm.powi.f64/f128.

See: https://llvm.org/docs/LangRef.html#llvm-powi-intrinsic

Differential Revision: https://reviews.llvm.org/D74274
2020-02-07 20:30:47 -08:00
Heejin Ahn 5b5cbfe135 [WebAssembly] Add debug info to insts in Emscripten SjLj
Summary:
This makes sure all newly create instructions in Emscripten SjLj has
appropriate debug info attached. Fixes
https://github.com/emscripten-core/emscripten/issues/9797.

Reviewers: kripken

Subscribers: dschuff, aprantl, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74269
2020-02-07 19:08:39 -08:00
Akira Hatanaka 6770fbb314 [ObjC][ARC] Delete ARC runtime calls that take inert phi values
This improves on the following patch, which removed ARC runtime calls
taking inert global variables:

https://reviews.llvm.org/D62433

rdar://problem/59137105
2020-02-07 16:31:36 -08:00
David Blaikie ba9cae58bb IR Linking: Support merging Warning+Max module metadata flags
Summary:
Debug Info Version was changed to use "Max" instead of "Warning" per the
original design intent - but this maxes old/new IR unlinkable, since
mismatched merge styles are a linking failure.

It seems possible/maybe reasonable to actually support the combination
of these two flags: Warn, but then use the maximum value rather than the
first value/earlier module's value.

Reviewers: tejohnson

Differential Revision: https://reviews.llvm.org/D74257
2020-02-07 16:29:58 -08:00
Amara Emerson 35c63d66aa [GlobalISel][CallLowering] Look through bitcasts from constant function pointers.
Calls to ObjC's objc_msgSend function are done by bitcasting the function global
to the required function type signature. This patch looks through this bitcast
so that we can do a direct call with bl on arm64 instead of using an indirect blr.

Differential Revision: https://reviews.llvm.org/D74241
2020-02-07 15:32:54 -08:00
Craig Topper bb717d3f46 [X86] Correct the implementation of the avx512 masked fmsubadd autoupgrade code to not leave the negate unconnected.
This was causing us to generate fmaddsub instead of fmsubadd if
rounding control is not 4.
2020-02-07 15:27:05 -08:00
Guillaume Chatelet d65bbf81f8 [clang] Add support for __builtin_memcpy_inline
Summary: This is a follow up on D61634 and the last step to implement http://lists.llvm.org/pipermail/llvm-dev/2019-April/131973.html

Reviewers: efriedma, courbet, tejohnson

Subscribers: hiraditya, cfe-commits, llvm-commits, jdoerfert, t.p.northover

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73543
2020-02-07 23:55:26 +01:00
Huihui Zhang 6556c615f3 Reland "[AMDGPU] Fix data race on RegisterBank initialization." 2020-02-07 14:18:48 -08:00
Huihui Zhang ae39105466 Reland "[ARM] Fix data race on RegisterBank initialization."
Update lambda function
static auto InitializeRegisterBankOnce = [this](const auto &TRI) {
with
static auto InitializeRegisterBankOnce = [&]() {

Capture reference instead of passing argument, as there are buildbot
compiling errors related when passing argument.
2020-02-07 14:01:06 -08:00
Huihui Zhang 2491fd0e6f Reland "[AArch64] Fix data race on RegisterBank initialization."
Update lambda function
static auto InitializeRegisterBankOnce = [this](const auto &TRI) {
with
static auto InitializeRegisterBankOnce = [&]() {

Capture reference instead of passing argument, as there are buildbot
compiling errors related when passing argument.
2020-02-07 13:13:55 -08:00
Nemanja Ivanovic 26bf877ec5 [PowerPC] Fix spilling of vector registers in PEI of EH aware functions
On little endian targets prior to Power9, we spill vector registers using a
swapping store (i.e. stdxvd2x saves the vector with the two doublewords in
big endian order regardless of endianness). This is generally not a problem
since we restore them using the corresponding swapping load (lxvd2x). However
if the restore is done by the unwinder, the vector register contains data in
the incorrect order.

This patch fixes that by using Altivec loads/stores for vector saves and
restores in PEI (which keep the order correct) under those specific conditions:
- EH aware function
- Subtarget requires swaps for VSX memops (Little Endian prior to Power9)

Differential revision: https://reviews.llvm.org/D73692
2020-02-07 14:41:52 -06:00
Nico Weber b03c3d8c62 Revert "Support -fstack-clash-protection for x86"
This reverts commit 4a1a0690ad.
Breaks tests on mac and win, see https://reviews.llvm.org/D68720
2020-02-07 14:49:38 -05:00
Changpeng Fang 884acbb9e1 AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare
Summary:
  The accuracy limit to use rcp is adjusted to 1.0 ulp from 2.5 ulp.
Also, afn instead of arcp is used to allow inaccurate rcp to be used.

Reviewers:
  arsenm

Differential Revision: https://reviews.llvm.org/D73588
2020-02-07 11:46:23 -08:00
Fangrui Song 6520976064 [dsymutil] Delete unneeded parameter Triple from DWARFLinker
Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D74173
2020-02-07 11:33:27 -08:00
Jessica Paquette 609a489e05 [AArch64][GlobalISel] Reland SLT/SGT TBNZ optimization
The issue in the previous commits was that we swap the LHS and RHS while
looking for the constant. In SLT/SGT, the constant must be on the RHS, or the
optimization is invalid.

Move the swapping logic after the check for the SLT/SGT case and update tests.

Original commits:

d78cefb160
a373841407
2020-02-07 11:15:25 -08:00
Changpeng Fang 6370c7c13e AMDGPU: Limit the search in finding the instruction pattern for v_swap generation.
Summary:
  Current implementation of matchSwap in SIShrinkInstructions searches the entire
use_nodbg_operands set to find the possible pattern to generate v_swap instruction.
This approach will lead to a O(N^3) in compile time for SIShrinkInstructions.

But in reality, the matching pattern only exists within nearby instructions in the
same basic block. This work limits the search to a maximum of 16 instructions, and has
a linear compile time comsumption.

Reviewers:
  rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D74180
2020-02-07 11:06:33 -08:00
serge_sans_paille 4a1a0690ad Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This a recommit of 39f50da2a3 with correct option
flags set.

Differential Revision: https://reviews.llvm.org/D68720
2020-02-07 19:54:39 +01:00
Sean Fertile 88073d40c7 [PowerPC] Create a FixedStack object for CR save in linkage area.
hasReservedSpillSlot returns a dummy frame index of '0' on PPC64 for the
non-volatile condition registers, which leads to the CalleSavedInfo
either referencing an unrelated stack object, or an invalid object if
there are no stack objects. The latter case causes the mir-printer to
crash due to assertions that checks if the frame index referenced by a
CalleeSavedInfo is valid.

To fix the problem create an immutable FixedStack object at the correct offset
in the linkage area of the previous stack frame (ie SP + positive offset).

Differential Revision: https://reviews.llvm.org/D73709
2020-02-07 13:33:44 -05:00
Craig Topper 278578744a [X86] Handle SETB_C32r/SETB_C64r in flag copy lowering the same way we handle SBB
Previously we took the restored flag in a GPR, extended it 32 or 64 bits. Then used as an input to a sub from 0. This requires creating a zero extend and creating a 0.

This patch changes this to just use an ADD with 255 to restore the carry flag and keep the SETB_C32r/SETB_C64r. Exactly like we handle SBB which is what SETB becomes.

Differential Revision: https://reviews.llvm.org/D74152
2020-02-07 10:31:19 -08:00
Vedant Kumar 0d0ef315cb [MachineInstr] Add isCandidateForCallSiteEntry predicate
Add the isCandidateForCallSiteEntry predicate to MachineInstr to
determine whether a DWARF call site entry should be created for an
instruction.

For now, it's enough to have any call instruction that doesn't belong to
a blacklisted set of opcodes. For these opcodes, a call site entry isn't
meaningful.

Differential Revision: https://reviews.llvm.org/D74159
2020-02-07 10:10:41 -08:00
Petar Avramovic 7df5fc9e03 [GlobalISel] Add buildMerge with SrcOp initializer list
Allows more flexible use of buildMerge in places where
use operands are available as SrcOp since it does not
require explicit conversion to Register.
Simplify code with new buildMerge.

Differential Revision: https://reviews.llvm.org/D74223
2020-02-07 18:43:45 +01:00
Sanjay Patel de6f7eb47e [x86] don't create an unused constant vector
Noticed while scanning through debug spew. Creating unused
nodes is inefficient and makes following the debug output harder.
2020-02-07 12:05:02 -05:00
Simon Pilgrim c96001035d [X86] isNegatibleForFree - allow pre-legalized FMA negation
As long as the FMA operation is legal (which we can proxy for the FMA3/FMA4 variants as well), we don't have to wait for the LegalOperations stage.
2020-02-07 17:04:17 +00:00
Amara Emerson 28d22c2c9c [GlobalISel][IRTranslator] Add special case support for ~memory inline asm clobber.
This is a one off special case, since actually implementing full inline asm
support will be much more involved. This lets us compile a lot more code as a
common simple case.

Differential Revision: https://reviews.llvm.org/D74201
2020-02-07 08:55:23 -08:00
Jinsong Ji 01edae1271 [AsmPrinter] Print FP constant in hexadecimal form instead
Printing floating point number in decimal is inconvenient for humans.
Verbose asm output will print out floating point values in comments, it
helps.

But in lots of cases, users still need additional work to covert the
decimal back to hex or binary to check the bit patterns,
especially when there are small precision difference.

Hexadecimal form is one of the supported form in LLVM IR, and easier for
debugging.

This patch try to print all FP constant in hex form instead.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D73566
2020-02-07 16:00:55 +00:00
Matt Arsenault 2f885cbe90 AMDGPU/GlobalISel: Fix move s.buffer.load to VALU
We were executing this in a waterfall loop as a placeholder, but this
should really be converted to a MUBUF load. Also execute in a
waterfall loop if the resource isn't an SGPR. This is a case where the
DAG handling was wrong because doing the right thing was too hard.

Currently, this will mishandle 96-bit loads. There's currently no way
to track the original memory size with an MMO, so these loads will be
widened andd the resulting memory size will be 128-bits.
2020-02-07 07:19:01 -08:00
Simon Tatham 5c6b1a6dfd [TableGen] Fix spurious type error in bit assignment.
Summary:
The following example gives the error message "expected value of type
'bits<32>', got 'bit'" on the assignment.

    class Instruction { bits<32> encoding; }
    def foo: Instruction { let encoding{10} = !eq(0, 1); }

But there's nothing wrong with this code: 'bit' is a perfectly good
type for the RHS of an assignment to a //single bit// of an
instruction encoding.

The problem is that `ParseBodyItem` is accidentally type-checking the
RHS against the full type of the `encoding` field, without adjusting
it in the case where we're only assigning to a subset of the bits. The
fix is trivial.

Reviewers: nhaehnle, hfinkel

Reviewed By: hfinkel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74220
2020-02-07 15:11:42 +00:00
Matt Arsenault 3b198518ad GlobalISel: Fix narrowing of G_CTPOP
The result type is separate from the source type. Tests will be
included in a future AMDGPU patch which uses this from
RegBankSelect/applyMappingImpl.
2020-02-07 06:58:00 -08:00
Matt Arsenault 8de2dad9e0 GlobalISel: Fix lowering of G_CTLZ/G_CTTZ
The type passed to lower was invalid, so I'm not sure how this was
even working before. The source and destination type also do not have
to match, so make sure to use the right ones.
2020-02-07 06:54:12 -08:00
Momchil Velikov a2531081b3 [AArch64] Predictably disassemble system registers with the same encoding
The registers TRCEXTINSELR and TRCEXTINSELR0 are distinct registers,
defined by separate extension specifications (ETM and ETE,
respectively), yet they use the same encoding in MSR/MRS.

When performing a system register lookup by encoding, we would
essentially return a random one, depending on the number, relative
position in the TableGen file, whether the TableGen records for system
registers are named or not, and, if they are named, depending on
record (not register!) name as well.

This patch works around the issue by explictly checking for the
TRCEXTINSELR/TRCEXTINSELR0 encoding and always returning TRCEXTINSELR.

Differential Revision: https://reviews.llvm.org/D74074
2020-02-07 12:19:57 +00:00