Commit Graph

136081 Commits

Author SHA1 Message Date
Vitaly Buka 7b27c09f63 [StackSafety,NFC] Don't test terminators
The code does not track terminators and does not expose them through the interface.
The state there is just the state of the last instruction or entry.
So this information is just redundant and doesn't need to be tested.
2020-06-19 02:32:17 -07:00
Florian Hahn f9d8e33c32 [SCCP] Turn sext into zext for non-negative ranges.
This patch updates SCCP/IPSCCP to use the computed range info to turn
sexts into zexts, if the value is known to be non-negative. We already
do a similar transform in CorrelatedValuePropagation, but it seems like
we can catch a lot of additional cases by doing it in SCCP/IPSCCP as
well.

The transform is limited to ranges that are known to not include undef.

Currently constant ranges from conditions are treated as potentially
containing undef, due to PR46144. Once we flip this, the transform will
be more effective in practice.
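
The equivalence this transform relies on can be checked with a minimal standalone C++ sketch. This is not SCCP code: it models `sext`/`zext` on 8-bit values widened to 32 bits, and the helper names `sext8to32`, `zext8to32` and `sextMatchesZextOnRange` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Models `sext i8 %x to i32`: replicate the sign bit into the upper 24 bits.
inline uint32_t sext8to32(uint8_t X) {
  return static_cast<uint32_t>(static_cast<int32_t>(static_cast<int8_t>(X)));
}

// Models `zext i8 %x to i32`: fill the upper 24 bits with zeros.
inline uint32_t zext8to32(uint8_t X) { return static_cast<uint32_t>(X); }

// Returns true if sext and zext agree for every value in the inclusive
// signed range [Lo, Hi]. This holds exactly when Lo is non-negative.
inline bool sextMatchesZextOnRange(int8_t Lo, int8_t Hi) {
  for (int V = Lo; V <= Hi; ++V) {
    uint8_t Bits = static_cast<uint8_t>(V);
    if (sext8to32(Bits) != zext8to32(Bits))
      return false;
  }
  return true;
}
```

For the range [0, 127] the two extensions agree on every value; as soon as the range contains a negative value such as -1 they differ, which is why the transform needs the range to be known non-negative.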

Reviewers: efriedma, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D81756
2020-06-19 10:17:55 +01:00
Jay Foad 7cdf4326a8 [LiveIntervals] Fix early-clobber handling in handleMoveUp
Without this fix, handleMoveUp can create an invalid live range like
this:

[98904e,98908r:0)[98908e,227504r:1)

where the two segments overlap, but only because we have lost the "e"
(early-clobber) on the end point of the first segment.
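
To see why those two segments overlap, here is a minimal standalone C++ sketch (not LiveIntervals code). It assumes that, within one instruction number, the early-clobber slot orders before the normal register slot; the `Slot`, `Index` and `Segment` types and the `overlaps` helper are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <tuple>

// Assumed slot order within a single instruction number.
enum Slot { Block = 0, EarlyClobber = 1, Register = 2, Dead = 3 };

struct Index {
  unsigned Instr; // instruction number, e.g. 98908
  Slot S;         // slot within that instruction, e.g. 'e' or 'r'
};

inline bool operator<(Index A, Index B) {
  return std::tie(A.Instr, A.S) < std::tie(B.Instr, B.S);
}

// Half-open segment [Start, End).
struct Segment {
  Index Start, End;
};

// Two half-open segments overlap when each starts before the other ends.
inline bool overlaps(Segment A, Segment B) {
  return A.Start < B.End && B.Start < A.End;
}
```

With the first segment ending at 98908r, the second segment's start at 98908e lies inside it. If the first segment's end point keeps its "e" (98908e), the two segments merely touch and do not overlap.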

Differential Revision: https://reviews.llvm.org/D82110
2020-06-19 10:17:04 +01:00
Tyker b7338fb1a6 [AssumeBundles] add canonicalisation to the assume builder
Summary:
This significantly reduces the number of assumes generated without affecting too much
the information that is preserved. This significantly improves the compile-time cost
of enable-knowledge-retention.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, asbirlea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79650
2020-06-19 10:32:26 +02:00
David Sherwood 7edc7f6edb [CodeGen] Fix SimplifyDemandedBits for scalable vectors
For now I have changed SimplifyDemandedBits and its various callers
to assume we know nothing for scalable vectors and to ignore the
demanded bits completely. I have also done something similar for
SimplifyDemandedVectorElts. These changes fix up lots of warnings
due to calls to EVT::getVectorNumElements() for types with scalable
vectors. These functions are all used for optimisations, rather than
functional requirements. In future we can revisit this code if
there is a need to improve code quality for SVE.

Differential Revision: https://reviews.llvm.org/D80537
2020-06-19 07:59:35 +01:00
David Sherwood 9e811b0d93 [CodeGen] Fix ComputeNumSignBits for scalable vectors
When trying to calculate the number of sign bits for scalable vectors
we should just bail out for now and pretend we know nothing.

Differential Revision: https://reviews.llvm.org/D81093
2020-06-19 07:58:42 +01:00
Kristof Beyls d938ec4509 [AArch64] Avoid incompatibility between SLSBLR mitigation and BTI codegen.
A "BTI c" instruction only allows jumping/calling to it using a BLR* instruction.
However, the SLSBLR mitigation changes a BLR to a BR to implement the
function call. Therefore, a "BTI c" check that passed before could
trigger after the BLR->BR change done by the SLSBLR mitigation.
However, if the register used in BR is X16 or X17, this trigger will not
fire (see ArmARM for further details).

Therefore, this patch simply changes the function stubs for the SLSBLR
mitigation from
__llvm_slsblr_thunk_x<N>:
    br x<N>
    SpeculationBarrier
to
__llvm_slsblr_thunk_x<N>:
    mov x16, x<N>
    br  x16
    SpeculationBarrier

Differential Revision: https://reviews.llvm.org/D81405
2020-06-19 06:21:54 +01:00
Ronak Chauhan 5bd33de9c8 [MC] Pass the symbol rather than its name to onSymbolStart()
Summary: This allows targets to also consider the symbol's type and/or address if needed.

Reviewers: scott.linder, jhenderson, MaskRay, aardappel

Reviewed By: scott.linder, MaskRay

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, aheejin, rupprecht, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82090
2020-06-19 09:30:12 +05:30
Francesco Petrogalli d32c134648 [llvm][SVE] Reg + reg addressing mode for LD1RO.
Reviewers: efriedma, sdesmalen

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80741
2020-06-19 03:56:10 +00:00
Nemanja Ivanovic 1fed131660 [PowerPC] Canonicalize shuffles to match more single-instruction masks on LE
We currently miss a number of opportunities to emit single-instruction
VMRG[LH][BHW] instructions for shuffles on little endian subtargets. Although
this in itself is not a huge performance opportunity since loading the permute
vector for a VPERM can always be pulled out of loops, producing such merge
instructions is useful to downstream optimizations.
Since VPERM is essentially opaque to all subsequent optimizations, we want to
avoid it as much as possible. Other permute instructions have semantics that can
be reasoned about much more easily in later optimizations.

This patch does the following:
- Canonicalize shuffles so that the first element comes from the first vector
  (since that's what most of the mask matching functions want)
- Switch the elements that come from splat vectors so that they match the
  corresponding elements from the other vector (to allow for merges)
- Adds debugging messages for when a shuffle is matched to a VPERM so that
  anyone interested in improving this further can get the info for their code

Differential revision: https://reviews.llvm.org/D77448
2020-06-18 21:54:22 -05:00
Carl Ritson 8f3b2c8aa3 AMDGPU/GlobalISel: Remove selection of MAD/MAC when not available
Add code to respect mad-mac-f32-insts target feature.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D81990
2020-06-19 10:30:19 +09:00
Vitaly Buka fcd67665a8 [StackSafety] Add "Must Live" logic
Summary:
Extend StackLifetime with an option to calculate liveness
where alloca is only considered alive on basic block entry
if all non-dead predecessors had it alive at terminators.

Depends on D82043.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82124
2020-06-18 16:53:37 -07:00
Nathan James 8b0df1c1a9 [NFC] Refactor Registry loops to range for
2020-06-19 00:40:10 +01:00
Vitaly Buka f672791e08 [StackSafety] Add pass for StackLifetime testing
Summary: lifetime.ll is a copy of SafeStack/X86/coloring2.ll

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: hiraditya, mgrang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82043
2020-06-18 16:34:18 -07:00
Matt Arsenault bbd78519f9 ARC: Enforce function alignment at code emission time
Don't do this in the MachineFunctionInfo constructor. Also, ensure the
alignment rather than overwriting it outright. I vaguely remember
there was another place to enforce the target minimum alignment, but I
couldn't find it (it's there for instructions).
2020-06-18 17:40:49 -04:00
Matt Arsenault 95605b784b AMDGPU/GlobalISel: Implement computeKnownAlignForTargetInstr
We probably need to move where intrinsics are lowered to copies to
make this useful.
2020-06-18 17:28:00 -04:00
Matt Arsenault b13f6b0fe0 BypassSlowDivision: Fix dropping debug info
I don't know anything about debug info, but this seems like more work
should be necessary. This constructs a new IRBuilder and reconstructs
the original divides rather than moving the original.

One problem this has is if a div/rem pair are handled, both end up
with the same debugloc. I'm not sure how to fix this, since this uses
a cache when it sees the same input operands again, which will have
the first instance's location attached.
2020-06-18 17:27:19 -04:00
Amy Kwan c45c161130 [PowerPC][Power10] Implement Parallel Bits Deposit/Extract Builtins in LLVM/Clang
This patch implements builtins for the following prototypes:

vector unsigned long long vec_pdep(vector unsigned long long, vector unsigned long long);
vector unsigned long long vec_pext(vector unsigned long long, vector unsigned long long __b);
unsigned long long __builtin_pdepd (unsigned long long, unsigned long long);
unsigned long long __builtin_pextd (unsigned long long, unsigned long long);

Revision Depends on D80758

Differential Revision: https://reviews.llvm.org/D80935
2020-06-18 16:23:56 -05:00
Matt Arsenault 7f8b2e1b91 GlobalISel: Pass LegalizerHelper to custom legalize callbacks
This was passing in all the parameters needed to construct a
LegalizerHelper in the custom legalization, when it's simpler to just
pass in the existing helper.

This is slightly more annoying to use in the common case where you
don't need the legalizer helper, but we could add the common
parameters back in addition to the helper.

I didn't propagate this to all the internal target changes that this
logically implies, but did update a sample one for
legalizeMinNumMaxNum.

This is in preparation for moving AMDGPU load/store legalization
entirely into custom lowering. The current set of legalization actions
is really constraining and not really capable of expressing all the
actions needed to legalize loads/stores. In particular there's no way
to express when the memory access itself needs to change size vs. the
result type. There's also a lot of redundancy since the same
split/widen actions need to be applied in both vector and scalar
cases. All of the sub-cases logically belong as steps in the legalizer
helper, but it will be easier to consider everything at once in custom
lowering.
2020-06-18 17:17:38 -04:00
Christopher Tetreault 8d11ec66b6 [SVE] Remove calls to VectorType::getNumElements from Transforms/Utils
Reviewers: efriedma, c-rhodes, david-arm, Tyker, asbirlea

Reviewed By: david-arm

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82057
2020-06-18 13:39:14 -07:00
Alexandre Ganea 2ae0df5be7 [CodeView] Revert 8374bf4363 and 403f953792
This reverts:
8374bf4363 [CodeView] Fix generated command-line expansion in LF_BUILDINFO. Fix the 'pdb' entry which was previously a null reference, now an empty string.
403f953792 [CodeView] Add full repro to LF_BUILDINFO record

This is causing the lld/test/COFF/pdb-relative-source-lines.test to fail: http://lab.llvm.org:8011/builders/lld-x86_64-win/builds/1096/steps/test-check-all/logs/FAIL%3A%20lld%3A%3Apdb-relative-source-lines.test
And clang/test/CodeGen/debug-info-codeview-buildinfo.c fails as well: http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/33346/steps/ninja%20check%201/logs/FAIL%3A%20Clang%3A%3Adebug-info-codeview-buildinfo.c
2020-06-18 16:18:46 -04:00
Kirill Naumov 41d53194fb [BasicBlock] Added AnnotationWriter functionality to BasicBlock class
This functionality is very similar to Function compatibility with
AnnotationWriter. This change allows us to use AnnotationWriter with
BasicBlock through BB.print() method.

Reviewed-By: apilipenko
Differential Revision: https://reviews.llvm.org/D81321
2020-06-18 19:49:58 +00:00
Sanjay Patel 46a285ad9e [IRBuilder] add/use wrapper to create a generic compare based on predicate type; NFC
The predicate can always be used to distinguish between icmp and fcmp,
so we don't need to keep repeating this check in the callers.
2020-06-18 15:47:06 -04:00
Davide Italiano 8cdd2a158c [SimplifyCFG] Update debug location when folding branch to common destination
Sometimes a dead block gets folded and the debug information is still
retained. This manifests as jumpy stepping in lldb, see the bugzilla PR
for an end-to-end C testcase.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46008

Differential Revision:  https://reviews.llvm.org/D82062
2020-06-18 12:33:32 -07:00
Michael Liao 2defe55722 [TTI] Expose isNoopAddrSpaceCast in TTI.
Reviewers: arsenm

Subscribers: wdng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82025
2020-06-18 14:40:47 -04:00
serge-sans-paille 4dd332723d Fix return status of LoopDistribute
Move code that may update the IR after the precondition check, so that if the
precondition fails, the IR isn't modified.

Differential Revision: https://reviews.llvm.org/D81225
2020-06-18 20:13:18 +02:00
Matt Arsenault 779cba79ec AMDGPU: Remove mayLoad/mayStore from some side effecting intrinsics
These don't really modify any memory, and should not expect memory
operands.
2020-06-18 14:12:19 -04:00
Stanislav Mekhanoshin 6c7e1b16fa [AMDGPU] Added new encoding to getMCOpcodeGen
Nothing breaks yet, but all encodings shall be in the map.

Differential Revision: https://reviews.llvm.org/D81974
2020-06-18 10:11:33 -07:00
Arthur Eubanks 91ef930526 [GlobalOpt] Remove preallocated calls when possible
When possible (e.g. internal linkage), strip preallocated attribute off
parameters/arguments.
This requires removing the "preallocated" operand bundle from the call
site, replacing @llvm.call.preallocated.arg() with an alloca and a
bitcast to i8*, and removing the @llvm.call.preallocated.setup(). Since
@llvm.call.preallocated.arg() can be called multiple times with the same
arg index, we create an alloca per arg index.
We add a @llvm.stacksave() where the @llvm.call.preallocated.setup() was
and a @llvm.stackrestore() after the preallocated call to prevent the
stack from blowing up. This is valid because the argument would normally
not exist on the stack after the call before the transformation.

This does not currently handle all possible preallocated calls. We will
need to figure out where to put @llvm.stackrestore() in the cases where
there is no obvious place to put it, for example conditional
preallocated calls, invokes.

This sort of transformation may need to be moved to somewhere more
accessible to accommodate similar transformations (like inlining) in the
future.

Reviewers: efriedma, hans

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80951
2020-06-18 09:56:13 -07:00
Alexandros Lamprineas ecdf48f15b [ARM] Basic bfloat support
This patch adds basic support for BFloat in the Arm backend.
For now the code generation relies on fullfp16 being present.

Briefly:
* adds the bfloat scalar and vector types in the necessary register classes,
* adjusts the calling convention to cope with bfloat argument passing and return,
* adds codegen patterns for moves, loads and stores.

It's tested mostly by the intrinsic patches that depend on it (load/store, convert/copy).

The following people contributed to this patch:

 * Alexandros Lamprineas
 * Ties Stuij

Differential Revision: https://reviews.llvm.org/D81373
2020-06-18 17:26:24 +01:00
Simon Pilgrim 2474421398 [TargetLowering] SimplifyMultipleUseDemandedBits - drop already extended ISD::SIGN_EXTEND_INREG nodes.
If the source of the SIGN_EXTEND_INREG node is already sign extended, use the source directly.
2020-06-18 16:41:08 +01:00
Matt Arsenault 6f09bb7da2 AMDGPU: Don't pass MachineFunction if only the IR Function is used
2020-06-18 11:06:46 -04:00
Ayke van Laethem b4c91462e8 [AVR] Fix miscompilation of zext + add
Code like the following:

    define i32 @foo(i32 %a, i1 zeroext %b) addrspace(1) {
    entry:
      %conv = zext i1 %b to i32
      %add = add nsw i32 %conv, %a
      ret i32 %add
    }

Would compile to the following (incorrect) code:

    foo:
        mov     r18, r20
        clr     r19
        add     r22, r18
        adc     r23, r19
        sbci    r24, 0
        sbci    r25, 0
        ret

Those sbci instructions are clearly wrong, they should have been adc
instructions.

This commit improves codegen to use adc instead:

    foo:
        mov     r18, r20
        clr     r19
        ldi     r20, 0
        ldi     r21, 0
        add     r22, r18
        adc     r23, r19
        adc     r24, r20
        adc     r25, r21
        ret

This code is not optimal (it could be just 5 instructions instead of the
current 9) but at least it doesn't miscompile.
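
The difference between the two instruction sequences can be reproduced with a small standalone C++ model. This is only a sketch: it models just the carry flag of 8-bit `add`, `adc` and `sbci`, and the names `Flags`, `add8`, `adc8`, `sbci8` and `addZext` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

struct Flags {
  bool Carry = false; // only the carry/borrow flag is modeled
};

// add: A + B, sets carry on overflow out of 8 bits.
inline uint8_t add8(uint8_t A, uint8_t B, Flags &F) {
  unsigned Sum = unsigned(A) + unsigned(B);
  F.Carry = Sum > 0xFF;
  return uint8_t(Sum);
}

// adc: A + B + carry.
inline uint8_t adc8(uint8_t A, uint8_t B, Flags &F) {
  unsigned Sum = unsigned(A) + unsigned(B) + (F.Carry ? 1u : 0u);
  F.Carry = Sum > 0xFF;
  return uint8_t(Sum);
}

// sbci: A - Imm - carry, sets carry when a borrow occurs.
inline uint8_t sbci8(uint8_t A, uint8_t Imm, Flags &F) {
  unsigned Sub = unsigned(Imm) + (F.Carry ? 1u : 0u);
  F.Carry = Sub > unsigned(A);
  return uint8_t(unsigned(A) - Sub);
}

// Computes A + zext(B) byte by byte. With UseSbci set, the two upper bytes
// use `sbci ..., 0` as in the miscompiled sequence; otherwise they use adc.
inline uint32_t addZext(uint32_t A, bool B, bool UseSbci) {
  Flags F;
  uint8_t R0 = add8(uint8_t(A), B ? 1 : 0, F);
  uint8_t R1 = adc8(uint8_t(A >> 8), 0, F);
  uint8_t R2 = UseSbci ? sbci8(uint8_t(A >> 16), 0, F)
                       : adc8(uint8_t(A >> 16), 0, F);
  uint8_t R3 = UseSbci ? sbci8(uint8_t(A >> 24), 0, F)
                       : adc8(uint8_t(A >> 24), 0, F);
  return uint32_t(R0) | (uint32_t(R1) << 8) | (uint32_t(R2) << 16) |
         (uint32_t(R3) << 24);
}
```

In this model both variants agree as long as no carry reaches the third byte. Once it does, for example with `A = 0x0000FFFF` and `B = true`, the `sbci` variant subtracts the carry instead of adding it.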

Differential Revision: https://reviews.llvm.org/D78439
2020-06-18 16:51:37 +02:00
Matt Arsenault 243303f8d7 Lanai: Remove unused method
This was depending on the MachineFunction at MachineFunctionInfo
construction, which will soon be disallowed.
2020-06-18 10:48:14 -04:00
Simon Pilgrim fe0a85faf4 [X86][SSE] Fold MOVMSK(PCMPEQ(X,0)) == -1 -> PTESTZ(X,X)
Allow combineSetCCMOVMSK to handle 'allof' X == 0 patterns to be replaced with PTESTZ

This is a preliminary patch before properly handling PR35129
2020-06-18 15:38:32 +01:00
Alexandre Ganea 8374bf4363 [CodeView] Fix generated command-line expansion in LF_BUILDINFO. Fix the 'pdb' entry which was previously a null reference, now an empty string.
Previously, the DIA SDK didn't like the empty reference in the 'pdb' entry.
2020-06-18 10:07:30 -04:00
Kamlesh Kumar 7622ea5835 [RISCV64] Emit correct lib call for fp(float/double) to ui/si
Since i32 is not legal on riscv64, it is always promoted to i64 before
the lib call is emitted, so for conversions such as float/double to int and
float/double to unsigned int the wrong lib call was emitted. This commit
fixes it using custom lowering.

Differential Revision: https://reviews.llvm.org/D80526
2020-06-18 19:34:16 +05:30
Igor Kudrin 6853cc7221 [MC] Rename a misnamed function. NFC.
The patch renames MakeStartMinusEndExpr() to makeEndMinusStartExpr() to
better reflect an expression it creates and fix a naming style issue.

Differential Revision: https://reviews.llvm.org/D82079
2020-06-18 20:18:19 +07:00
Alexandre Ganea 403f953792 [CodeView] Add full repro to LF_BUILDINFO record
This patch adds some missing information to the LF_BUILDINFO which allows for rebuilding an .OBJ without any external dependency but the .OBJ itself (other than the compiler executable).

Some tools need this information to reproduce a build without any knowledge of the build system. The LF_BUILDINFO therefore stores a full path to the compiler, the PWD (which is the CWD at program startup), a relative or absolute path to the TU, and the full CC1 command line. The command line needs to be freestanding (not depend on any environment variable). In the same way, MSVC doesn't store the provided command-line, but an expanded version (somehow their equivalent of CC1) which is also freestanding.

For more information see PR36198 and D43002.

Differential Revision: https://reviews.llvm.org/D80833
2020-06-18 09:17:15 -04:00
Alexandre Ganea 24eff42ba4 [CodeView] Add TypeCollection::replaceType to replace type records post-merging
The API is not called in this patch. This is to simplify/support https://reviews.llvm.org/D80833
2020-06-18 09:17:14 -04:00
Alexandre Ganea a45409d885 [Clang] Move clang::Job::printArg to llvm::sys::printArg. NFCI.
This patch is to support/simplify https://reviews.llvm.org/D80833
2020-06-18 09:17:13 -04:00
Florian Hahn 1669fddc9f [Matrix] Use alignment info when lowering loads/stores.
This patch updates LowerMatrixIntrinsics to preserve the alignment
specified at the original load/stores and the align attribute for the
pointer argument of the column.major.load/store intrinsics.

We can always use the specified alignment for the load of the first
column. For subsequent columns, the alignment may need to be reduced.

For ConstantInt strides, compute the offset for the start of the column in
bytes and use commonAlignment to get the largest valid alignment.

For non-ConstantInt strides, we need to take the common alignment of the
initial alignment and the element size in bytes.
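
The alignment rules above can be sketched in standalone C++. This is an illustration rather than the LowerMatrixIntrinsics code: `commonAlign` is a hypothetical stand-in for the commonAlignment helper named above, alignments are assumed to be powers of two, and `columnAlign` and `columnAlignUnknownStride` are made-up names.

```cpp
#include <cassert>
#include <cstdint>

// Largest power of two that divides both A and Offset.
// A must be a power of two; an Offset of 0 yields A itself.
inline uint64_t commonAlign(uint64_t A, uint64_t Offset) {
  uint64_t Bits = A | Offset;
  return Bits & (~Bits + 1); // isolate the lowest set bit
}

// Constant stride: alignment usable for column ColIdx, given the stride in
// elements and the element size in bytes.
inline uint64_t columnAlign(uint64_t InitialAlign, uint64_t ColIdx,
                            uint64_t StrideElts, uint64_t EltBytes) {
  return commonAlign(InitialAlign, ColIdx * StrideElts * EltBytes);
}

// Non-constant stride: fall back to the common alignment of the initial
// alignment and the element size.
inline uint64_t columnAlignUnknownStride(uint64_t InitialAlign,
                                         uint64_t EltBytes) {
  return commonAlign(InitialAlign, EltBytes);
}
```

For example, with a 16-byte initial alignment, 4-byte elements and a stride of 3 elements, column 0 keeps 16-byte alignment, column 1 (byte offset 12) gets 4 and column 2 (byte offset 24) gets 8.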

Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, rjmccall

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D81960
2020-06-18 13:19:31 +01:00
Lucas Prates 92ad6d57c2 [ARM] Moving CMSE handling of half arguments and return to the backend
Summary:
As half-precision floating point arguments and returns were previously
coerced to either float or int32 by clang's codegen, the CMSE handling
of those was also performed in clang's side by zeroing the unused MSBs
of the coerced values.

This patch moves this handling to the backend's calling convention
lowering, making sure the high bits of the registers used by
half-precision arguments and returns are zeroed.

Reviewers: chill, rjmccall, ostannard

Reviewed By: ostannard

Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D81428
2020-06-18 13:16:29 +01:00
Lucas Prates a255931c40 [ARM] Supporting lowering of half-precision FP arguments and returns in AArch32's backend
Summary:
Half-precision floating point arguments and returns are currently
promoted to either float or int32 in clang's CodeGen and there's
no existing support for the lowering of `half` arguments and returns
from IR in AArch32's backend.

Such frontend coercions, implemented as coercion through memory
in clang, can cause a series of issues in argument lowering, such as
arguments being stored on the wrong bits on big-endian architectures
and missing overflow detection in the return of certain
functions.

This patch introduces the handling of half-precision arguments and returns in
the backend using the actual "half" type on the IR. Using the "half"
type the backend is able to properly enforce the AAPCS' directions for
those arguments, making sure they are stored on the proper bits of the
registers and performing the necessary floating point conversions.

Reviewers: rjmccall, olista01, asl, efriedma, ostannard, SjoerdMeijer

Reviewed By: ostannard

Subscribers: stuij, hiraditya, dmgreen, llvm-commits, chill, dnsampaio, danielkiss, kristof.beyls, cfe-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D75169
2020-06-18 13:15:13 +01:00
Paul Walker 4612f39120 [SVE] Add flag to specify SVE register size, using this to calculate legal vector types.
Adds aarch64-sve-vector-bits-{min,max} to allow the size of SVE
data registers (in bits) to be specified. This allows the code
generator to make assumptions it normally couldn't. As a starting
point this information is used to mark fixed length vector types
that can fit within the specified size as legal.

Reviewers: rengolin, efriedma

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80384
2020-06-18 12:11:16 +00:00
Sameer Sahasrabuddhe 7aad220795 [DA] conservatively mark the join of every divergent branch
For a loop, a join block is a block that is reachable along multiple
disjoint paths from the exiting block of a loop. If the exit condition
of the loop is divergent, then such join blocks must also be marked
divergent. This currently fails in some cases because not all join
blocks are identified correctly.

The workaround is to conservatively mark every join block of any
branch (not necessarily the exiting block of a loop) as divergent.

https://bugs.llvm.org/show_bug.cgi?id=46372

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D81806
2020-06-18 17:39:20 +05:30
Florian Hahn d88acd8f7d [Matrix] Preserve volatile when lowering loads/stores.
Currently the matrix lowering turns volatile loads/stores into
non-volatile ones. This patch updates the lowering to preserve the
volatile bit.

Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, nicolasvasilache

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D81498
2020-06-18 12:14:19 +01:00
Jeremy Morse 3626eba11f [NFC][LiveDebugValues] Document how LiveDebugValues operates
We're missing a plain English explanation of how this pass is supposed
to operate -- add one to the file comment.

Differential Revision: https://reviews.llvm.org/D80929
2020-06-18 10:54:09 +01:00
Ayke van Laethem 15bf42d503 [AVR] Implement disassembly of 32-bit instructions
This needed two fixes:

  * 32-bit instructions were read in the wrong order. The machine code
    swaps the two 16-bit instruction words, which wasn't undone when
    decoding instructions.
  * Jump and call instructions don't encode the lowest address bit,
    which is always zero. Therefore, the address needed to be shifted by
    one to fix that.
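
Both fixes can be sketched in standalone C++. This is not the AVR disassembler code: it assumes each 16-bit word is stored little-endian and that the first word in memory forms the upper half of the 32-bit encoding, and the helper names `readWord`, `read32BitInsn` and `wordFieldToByteAddress` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Read one little-endian 16-bit instruction word.
inline uint16_t readWord(const uint8_t *Bytes) {
  return static_cast<uint16_t>(Bytes[0] | (Bytes[1] << 8));
}

// Assemble a 32-bit instruction so that the first word in memory ends up in
// the upper half (assumed layout), rather than reading four bytes as one
// little-endian 32-bit value, which would place the two words the other way
// round.
inline uint32_t read32BitInsn(const uint8_t *Bytes) {
  uint32_t First = readWord(Bytes);
  uint32_t Second = readWord(Bytes + 2);
  return (First << 16) | Second;
}

// Jump/call targets omit the lowest address bit, so shift the decoded field
// left by one to recover the byte address.
inline uint32_t wordFieldToByteAddress(uint32_t Field) { return Field << 1; }
```

As a usage example, the byte sequence `0c 94 34 00` assembles to `0x940c0034` under these assumptions, and an address field of `0x34` corresponds to byte address `0x68`.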

Differential Revision: https://reviews.llvm.org/D81961
2020-06-18 11:26:58 +02:00
David Sherwood 7e30ef77f6 [CodeGen] Fix warnings in getVectorTypeBreakdown
Added a NextPowerOf2() routine to TypeSize and rewrote the code
in getVectorTypeBreakdown to avoid warnings being generated.

Differential Revision: https://reviews.llvm.org/D81578
2020-06-18 09:54:16 +01:00
Florian Hahn 6d18c2067e [Matrix] Update load/store intrinsics.
This patch adjusts the load/store matrix intrinsics, formerly known as
llvm.matrix.columnwise.load/store, to improve the naming and allow
passing of extra information (volatile).

The patch performs the following changes:
 * Rename columnwise.load/store to column.major.load/store. This is more
   expressive and also more in line with the naming in Clang.
 * Changes the stride arguments from i32 to i64. The stride can be
   larger than i32 and this makes things more uniform with the way
   things are handled in Clang.
 * A new boolean argument is added to indicate whether the load/store
   is volatile. The lowering respects that when emitting vector
   load/store instructions.
 * MatrixBuilder is updated to require both Alignment and IsVolatile
   arguments, which are passed through to the generated intrinsic. The
   alignment is set using the `align` attribute.

The changes are grouped together in a single patch, to have a single
commit that breaks the compatibility. We probably should be fine with
updating the intrinsics, as we did not yet officially support them in
the last stable release. If there are any concerns, we can add
auto-upgrade rules for the columnwise intrinsics though.

Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, nicolasvasilache, rjmccall, ftynse

Reviewed By: anemet, nicolasvasilache

Differential Revision: https://reviews.llvm.org/D81472
2020-06-18 09:44:52 +01:00
David Sherwood 65912a9768 [CodeGen] Fix warnings in foldCONCAT_VECTORS
Instead of asserting that the number of elements is the same, we should be
comparing the element counts. In addition, when looking at
concats of extract_subvectors it's fine to use getVectorMinNumElements()
for scalable vectors.

I discovered these warnings when compiling the structured loads tests in
this file:

  test/CodeGen/AArch64/sve-intrinsics-loads.ll

Differential Revision: https://reviews.llvm.org/D81936
2020-06-18 09:29:37 +01:00
serge-sans-paille f9c7e3136e Correctly report modified status for HWAddressSanitizer
Differential Revision: https://reviews.llvm.org/D81238
2020-06-18 10:27:44 +02:00
David Green 158e734af1 [ARM] Adjust AND/OR combines to not call isConstantSplat on i1 vectors. NFC.
This rearranges PerformANDCombine and PerformORCombine to try and make
sure we don't call isConstantSplat on any i1 vectors. As pointed out in
D81860 it may not be very well defined in those cases.
2020-06-18 08:25:44 +01:00
Kristof Beyls 832cfc7672 [IndirectThunks] Make generated MF structure as expected by all instruction selectors.
This also enables running the AArch64 SLSHardening pass with GlobalISel,
so add a test for that.

Differential Revision: https://reviews.llvm.org/D81403
2020-06-18 06:44:53 +01:00
Kristof Beyls 3f0cc96a96 [AArch64] SLSHardening: compute correct thunk name for X29.
The enum values for AArch64 registers are not all consecutive.
Therefore, the computation
  "__llvm_slsblr_thunk_x" + utostr(Reg - AArch64::X0)
is not always correct. utostr(Reg - AArch64::X0) will not generate the
expected string for the registers that do not have consecutive values in
the enum.
This happened to work for most registers, but does not for AArch64::FP
(i.e. register X29).
This can get triggered when the X29 is not used as a frame pointer.
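
The failure mode can be illustrated with a minimal standalone C++ sketch. The `Reg` enum and its values are made up for illustration (they are not the real AArch64 enum values), and `naiveThunkName` and `thunkName` are hypothetical helpers.

```cpp
#include <cassert>
#include <string>

// Hypothetical register enum: FP (i.e. X29) does not sit 29 entries after X0.
enum Reg : unsigned { FP = 3, X0 = 10, X1 = 11, X2 = 12 };

// Arithmetic on enum values: only correct while the values are consecutive.
inline std::string naiveThunkName(Reg R) {
  return "__llvm_slsblr_thunk_x" +
         std::to_string(static_cast<unsigned>(R) - static_cast<unsigned>(X0));
}

// Explicit mapping: does not depend on how the enum values are laid out.
inline std::string thunkName(Reg R) {
  switch (R) {
  case X0: return "__llvm_slsblr_thunk_x0";
  case X1: return "__llvm_slsblr_thunk_x1";
  case X2: return "__llvm_slsblr_thunk_x2";
  case FP: return "__llvm_slsblr_thunk_x29";
  }
  return "";
}
```

In this sketch the arithmetic version happens to produce the right name for X1 but not for FP, while the explicit mapping yields `__llvm_slsblr_thunk_x29`.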

Differential Revision: https://reviews.llvm.org/D81997
2020-06-18 06:36:49 +01:00
Xing GUO d261a1c0e0 [DWARFYAML][debug_abbrev] Make the abbreviation code optional.
This patch helps make the `Code` optional in abbreviations table.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D81826
2020-06-18 13:02:54 +08:00
Mehdi Amini 77b79d79c0 Remove "unused" member ModuleSlice from `struct OpenMPOpt`
This fixes a warning from clang:

 warning: private field 'ModuleSlice' is not used [-Wunused-private-field]
  SmallPtrSetImpl<Function *> &ModuleSlice;
                               ^

Differential Revision: https://reviews.llvm.org/D82027
2020-06-18 03:02:26 +00:00
Kang Zhang 58e19d465a [PowerPC] Don't convert Loop to CTR Loop for fp128 BinaryOperator
Summary:
On PPC, a BinaryOperator on fp128 becomes a libcall, and we shouldn't
convert a loop to a CTR loop if the loop contains a libcall.

But currently, the PPCTTIImpl::mightUseCTR() function only handles
BinaryOperator for ppc_fp128 and does not handle fp128.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D81353
2020-06-18 02:54:19 +00:00
Xing GUO 1f391afbf4 [ObjectYAML][ELF] Add support for emitting the .debug_abbrev section.
This patch enables yaml2elf to emit the .debug_abbrev section.

The generated .debug_abbrev is verified using `llvm-dwarfdump`.

Known issues that will be addressed later:
- Current implementation doesn't support generating multiple abbreviation tables in one .debug_abbrev section.

Reviewed By: jhenderson, grimar

Differential Revision: https://reviews.llvm.org/D81820
2020-06-18 10:50:38 +08:00
Esme-Yi ad6024e29f [PowerPC] Custom lower rotl v1i128 to vector_shuffle.
Summary: A bug is reported in bugzilla-45628, where the swap_with_shift case can’t be matched to a single HW instruction xxswapd as expected.
In fact the case matches the idiom of rotate. We have MatchRotate to handle an ‘or’ of two operands and generate a rot[lr] if the case matches the idiom of rotate. However, PPC doesn’t support ROTL v1i128, so we custom lower ROTL v1i128 to a vector_shuffle. The vector_shuffle will be matched to a single HW instruction during instruction selection.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D81076
2020-06-18 01:32:23 +00:00
Sam Clegg 7ee758d691 [WebAssembly] MC: Fix for data aliases with offsets (getelementptr)
For some reason we hadn't seen such cases in the wild, which makes
me think that clang and rustc don't generate these.  In the bug that
reproduces it, this only occurs with LTO, so my guess is that some LTO pass
is creating this alias + gep.

See: https://github.com/emscripten-core/emscripten/issues/8731

Differential Revision: https://reviews.llvm.org/D79462
2020-06-17 16:25:50 -07:00
Matt Arsenault 5f5f566b26 AMDGPU: Don't use 16-bit FP inline constants in integer operands
It seems to be a hardware defect that the half inline constants do not
work as expected for the 16-bit integer operations (the inverse does
work correctly). Experimentation seems to show these are really
reading the 32-bit inline constants, which can be observed by writing
inline asm using op_sel to see what's in the high half of the
constant. Theoretically we could fold the high halves of the 32-bit
constants using op_sel.

The *_asm_all.s MC tests are broken, and I don't know where the script
to autogenerate these is. I started manually fixing it, but there are
just too many cases to fix. This also does break the
assembler/disassembler support for these values, and I'm not sure what
to do about it. These are still valid encodings, so it seems like you
should be able to use them in some way. If you wrote assembly using
them, you could have really meant it (perhaps to read the high bits
with op_sel?). The disassembler will print the invalid literal
constant which will fail to re-assemble. The behavior is also
different depending on the use context. Consider this example, which
was previously accepted and encoded using the inline constant:

  v_mad_i16 v5, v1, -4.0, v3
  ; encoding: [0x05,0x00,0xec,0xd1,0x01,0xef,0x0d,0x04]

In contexts where an inline immediate is required (such as on gfx8/9),
this will now be rejected. For gfx10, this will produce the literal
encoding and change the printed format:
  v_mad_i16 v5, v1, 0xc400, v3
  ; encoding: [0x05,0x00,0x5e,0xd7,0x01,0xff,0x0d,0x04,0x00,0xc4,0x00,0x00]
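
The `0xc400` in the gfx10 form appears to be the IEEE half-precision bit pattern of -4.0, and the last four encoding bytes are that value stored as a 32-bit little-endian literal. A small Python sketch to check this follows; the helper names are illustrative and not part of any AMDGPU tool.

```python
import struct


def half_bits(value):
    """Return the IEEE 754 binary16 bit pattern of value as an integer."""
    (bits,) = struct.unpack("<H", struct.pack("<e", value))
    return bits


def literal_bytes_32(bits16):
    """Little-endian bytes of a 32-bit literal whose low half is bits16."""
    return list(struct.pack("<I", bits16))
```

Calling `half_bits(-4.0)` and passing the result to `literal_bytes_32` reproduces the trailing `0x00,0xc4,0x00,0x00` bytes shown above.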

This is just another variation of the issue that we don't perfectly
handle round trip assembly/disassembly due to not tracking how
immediates were encoded. This doesn't matter much in practice, since
compilers don't emit the suboptimal encoding. I doubt any users are
relying on this behavior (although I did make use of the old behavior
to figure out what was wrong).

Fixes bug 46302.
2020-06-17 19:14:10 -04:00
Yonghong Song 89648eb16d [BPF] fix a bug for BTF pointee type pruning
In BTF, pointee type pruning is used to avoid cluttering the program's
BTF with unused types. For example,
   struct task_struct {
      ...
      struct mm_struct *mm;
      ...
   };
If a BPF program does not access members of "struct mm_struct",
there is no need to bring types for "struct mm_struct" into BTF.

This patch fixes a bug where incorrect pruning happened.
The test case looks like this:
    struct t;
    typedef struct t _t;
    struct s1 { _t *c; };
    int test1(struct s1 *arg) { ... }

    struct t { int a; int b; };
    struct s2 { _t c; };
    int test2(struct s2 *arg) { ... }

After processing test1(), the BPF backend generates, among others, BTF types for
    "struct s1", "_t" and a placeholder for "struct t".
Note that "struct t" is not really generated. If a direct access
to a "struct t" member happens later, the "struct t" BTF type is generated
properly.

While processing test2(), when it reaches member type "_t c", the
BPF backend sees that type "_t" is already generated and returns.
As a result, the "struct t" BTF type is never generated, which
eventually causes an incorrect type definition for "struct s2".

To fix the issue, during DebugInfo type traversal, even if a
typedef/const/volatile/restrict derived type has been recorded in BTF,
if it is not a type pruning candidate, type traversal of its base type continues.
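
The behaviour described above can be modelled with a toy type graph. The Python sketch below is not the BPF backend's data model: the `TYPES` table, the `generate` function and the `keep_walking_typedefs` switch are assumptions made for illustration, and only the typedef case of the fix is modelled.

```python
# Toy type graph: name -> (kind, referenced type names).
TYPES = {
    "int": ("int", []),
    "struct t": ("struct", ["int", "int"]),
    "_t": ("typedef", ["struct t"]),
    "_t *": ("pointer", ["_t"]),
    "struct s1": ("struct", ["_t *"]),
    "struct s2": ("struct", ["_t"]),
}


def generate(roots, keep_walking_typedefs):
    """Return the structs that end up emitted with their members."""
    recorded = set()  # every type with some entry, possibly a placeholder
    complete = set()  # structs emitted with their members

    def visit(name, via_pointer):
        kind, refs = TYPES[name]
        if kind == "struct":
            if via_pointer:
                recorded.add(name)  # pointee pruned: placeholder only
                return
            if name in complete:
                return
            recorded.add(name)
            complete.add(name)
        elif name in recorded:
            # Already recorded. The fix keeps walking the base type of a
            # typedef that is not reached through a pointer.
            if not (keep_walking_typedefs and kind == "typedef"
                    and not via_pointer):
                return
        else:
            recorded.add(name)
        for ref in refs:
            visit(ref, via_pointer or kind == "pointer")

    for root in roots:
        visit(root, False)
    return complete
```

With `keep_walking_typedefs=False` the model never completes "struct t" when "struct s1" is visited before "struct s2"; with `True` it does.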

Differential Revision: https://reviews.llvm.org/D82041
2020-06-17 15:13:46 -07:00
Eric Christopher a8dad30388 Revert "Remove unused class variable ModuleSlice." as it was
used in debug only code.

This reverts commit 07a1749081.
2020-06-17 14:45:17 -07:00
Eric Christopher 07a1749081 Remove unused class variable ModuleSlice. 2020-06-17 14:33:29 -07:00
Christopher Tetreault 8819202dfd [SVE] Eliminate bad VectorType::getNumElements() calls from ConstantFold
Summary:
Assume all usages of this function are explicitly fixed-width operations
and cast to FixedVectorType

Reviewers: efriedma, sdesmalen, c-rhodes, majnemer, dblaikie

Reviewed By: sdesmalen

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80262
2020-06-17 14:19:56 -07:00
Christopher Tetreault 4b776a98f1 [SVE] Fix invalid usages of getNumElements in ShuffleVectorInstruction
Summary:
Fix invalid usages of getNumElements identified by test case
LLVM.Transforms/InstCombine::vscale_extractelement.ll.

changesLength: Since the length of the llvm::SmallVector shufflemask
is related to the minimum number of elements in a scalable vector, it is
fine to just get the Min field of the ElementCount

isIdentityWithExtract: Since it is not possible to express the mask
needed for this pattern for scalable vectors, we can just bail before
calling getNumElements()

Reviewers: efriedma, sdesmalen, fpetrogalli, gchatelet, yrouban, craig.topper

Reviewed By: sdesmalen

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81969
2020-06-17 13:45:34 -07:00
Roman Lebedev 84b4f5a6a6 [InstCombine] Negator: while there, add detection for cycles during negation
I don't have any testcases showing it happening,
and i haven't succeeded in creating one,
but i'm also not positive it can't ever happen,
and i recall having something that looked like
that in the very beginning of Negator creation.

But since we now already have a negation cache,
we can now detect such cases practically for free.

Let's do so instead of "relying" on stack overflow :D
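
A result cache can double as a cycle detector if an entry is marked "in progress" while its value is being computed. The Python sketch below shows that general technique on a toy expression graph; the node encoding, `make_negator` and the choice to raise on a cycle are illustrative assumptions, not how InstCombine's Negator is written.

```python
def make_negator(nodes):
    """nodes maps a name to ("const", int), ("sub", a, b) or ("phi", [names])."""
    cache = {}              # name -> negated form, or None if not negatable
    in_progress = object()  # sentinel stored while a node is being negated

    def negate(name):
        if name in cache:
            if cache[name] is in_progress:
                raise RuntimeError(f"cycle while negating {name}")
            return cache[name]  # reuse the earlier result
        cache[name] = in_progress
        kind, *args = nodes[name]
        if kind == "const":
            result = ("const", -args[0])
        elif kind == "sub":
            result = ("sub", args[1], args[0])  # -(a - b) == b - a
        elif kind == "phi":
            result = ("phi", [negate(v) for v in args[0]])
        else:
            result = None
        cache[name] = result
        return result

    return negate
```

Because repeated requests for the same node return the cached object, a phi with the same incoming value listed twice gets the same negated value twice.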
2020-06-17 22:47:20 +03:00
Roman Lebedev e3d8cb1e1d [InstCombine] Negator: cache negation results (PR46362)
It is possible that we can try to negate the same value multiple times.
For example, PHI nodes may happen to have multiple incoming values
(all of which must be the same value) for the same incoming basic block.
It may happen that we try to negate such a PHI node, and succeed,
and that might result in having now-different incoming values.

To avoid that, and in general to reduce the amount of duplicated
work we might be doing, let's introduce a cache where
we'll track results of negating each value.

The added test was previously failing -verify after -instcombine.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46362
2020-06-17 22:47:20 +03:00
Roman Lebedev c4166f3d84 [NFC][InstCombine] Negator: add thin negate() wrapper before visit() 2020-06-17 22:47:20 +03:00
Roman Lebedev 2b85147337 [NFC][InstCombine] Negator: do not include unneeded "llvm/IR/DerivedTypes.h" header 2020-06-17 22:47:19 +03:00
Thomas Lively 49754dcf22 [WebAssembly] Fix bug in FixBrTables and use branch analysis utils
Summary:
This commit fixes a bug in the FixBrTables pass in which an
unconditional branch from the switch header block to the jump table
block was not removed before the blocks were combined. The result was
an invalid CFG in the MachineFunction. This commit also switches from
using bespoke branch analysis and deletion code to using the standard
utilities for the same.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81909
2020-06-17 12:34:45 -07:00
Nick Desaulniers e7816f263b [InlineSpiller] add assert about spills post terminators
Summary:
This invariant is being violated in the test case
https://reviews.llvm.org/D77849, related to the use of the relatively
new ability for callbr to have return values, and MachineBasicBlocks
with INLINEASM_BR terminators to emit live out register defs.

As noted in the comment, this triggers invariant violations in
MachineVerifier via `llc -verify-machineinstrs` or
`llc -verify-regalloc`, since only MachineInstrs that are terminators
are allowed to follow the first terminator.

https://reviews.llvm.org/D75098 may rework this very assertion if we're
spilling via a (proposed) TCOPY MachineInstr.

Reviewers: void, efriedma, arsenm

Reviewed By: efriedma

Subscribers: qcolombet, wdng, hiraditya, llvm-commits, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78166
2020-06-17 11:51:58 -07:00
Nick Desaulniers 88c965ba14 BreakCriticalEdges for callbr indirect dests
Summary:
llvm::SplitEdge was failing an assertion that the BasicBlock only had
one successor (for BasicBlocks terminated by CallBrInst, we typically
have multiple successors).  It was surprising that the earlier call to
SplitCriticalEdge did not handle the critical edge (there was an early
return).  Removing that triggered another assertion relating to creating
a BlockAddress for a BasicBlock that did not (yet) have a parent, which
is a simple order of operations issue in llvm::SplitCriticalEdge (a
freshly constructed BasicBlock must be inserted into a Function's basic
block list to have a parent).

Thanks to @nathanchance for the report.
Fixes: https://github.com/ClangBuiltLinux/linux/issues/1018

Reviewers: craig.topper, jyknight, void, fhahn, efriedma

Reviewed By: efriedma

Subscribers: eli.friedman, rnk, efriedma, fhahn, hiraditya, llvm-commits, nathanchance, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81607
2020-06-17 11:45:06 -07:00
Davide Italiano 1cbaf847ab [CGP] Reset the debug location when promoting zext(s).
When the zext gets promoted, it used to retain the original location,
which pessimizes the debugging experience causing an unexpected
jump in stepping at -Og.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46120 (which also
contains a full C repro).

Differential Revision:  https://reviews.llvm.org/D81437
2020-06-17 11:13:13 -07:00
Ian Levesque 7c7c8e0da4 [xray] Option to omit the function index
Summary:
Add a flag to omit the xray_fn_idx to cut size overhead and relocations
roughly in half at the cost of reduced performance for single function
patching.  Minor additions to compiler-rt support per-function patching
without the index.

Reviewers: dberris, MaskRay, johnislarry

Subscribers: hiraditya, arphaman, cfe-commits, #sanitizers, llvm-commits

Tags: #clang, #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D81995
2020-06-17 13:49:01 -04:00
Alexandre Ganea acb30f6856 [X86] For 32-bit targets, emit two-byte NOP when possible
In order to support hot-patching, we need to make sure the first emitted instruction in a function is a two-byte+ op. This is already the case on x86_64, which seems to always emit two-byte+ ops. However on 32-bit targets this wasn't the case.

PATCHABLE_OP now lowers to an XCHG AX, AX (66 90) like MSVC does. However, when targeting pentium3 (/arch:SSE) or i386 (/arch:IA32) targets, we generate MOV EDI,EDI (8B FF) like MSVC does. This is for compatibility with older tools that rely on this two-byte pattern.

Differential Revision: https://reviews.llvm.org/D81301
2020-06-17 13:44:38 -04:00
Alexandre Ganea ad879b31f0 [X86] Change signature of EmitNops. NFC.
This is to support https://reviews.llvm.org/D81301.
2020-06-17 13:44:37 -04:00
Fangrui Song c8b082a3ab [llvm-cov gcov] Support clang<11 fake 4.2 format
Test cases are restored from a3bed4bd37
2020-06-17 10:17:15 -07:00
Michał Górny 5c621900a6 [llvm] [CommandLine] Do not suggest really hidden opts in nearest lookup
Skip 'really hidden' options when looking up the nearest option
after an invalid option is passed.  Since these options aren't even
documented in --help-hidden, it seems inconsistent to suggest them
to users.
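
The idea can be sketched as a nearest-match search that filters out really hidden options before ranking by edit distance. The Python below is only an illustration: the `options` mapping, the visibility strings and `nearest_option` are hypothetical and do not mirror the CommandLine library's API.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def nearest_option(typed, options):
    """options maps name -> 'visible', 'hidden' or 'really_hidden'."""
    candidates = [n for n, vis in options.items() if vis != "really_hidden"]
    if not candidates:
        return None
    return min(candidates, key=lambda n: edit_distance(typed, n))
```

Filtering before ranking means a really hidden option can never be suggested, even when it is the closest match.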

This fixes clang-tools-extra test failures due to unexpected suggestions
when linking the tools to LLVM dylib (that provides more options than
the subset of LLVM libraries linked directly).

Differential Revision: https://reviews.llvm.org/D82001
2020-06-17 19:00:26 +02:00
Scott Linder 691ff4682f [AMDGPU] Skip CFIInstructions in SIInsertWaitcnts
Summary:
CFI emitted during PEI at the beginning of the prologue needs to apply
to any inserted waitcnts on function entry.

Reviewers: arsenm, t-tye, RamNalamothu

Reviewed By: arsenm

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D76881
2020-06-17 12:41:03 -04:00
vnalamot 2e28009981 [NFC] Move getAll{S,V}GPR{32,128} methods to SIFrameLowering
Summary:
Future patch needs some of these in multiple places.

The definitions of these can't be in the header and be eligible for
inlining without making the full declaration of GCNSubtarget visible.
I'm not sure what the right trade-off is, but I opted to not bloat
SIRegisterInfo.h

Reviewers: arsenm, cdevadas

Reviewed By: arsenm

Subscribers: RamNalamothu, qcolombet, jvesely, wdng, nhaehnle, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79878
2020-06-17 12:08:09 -04:00
sstefan1 7cfd267c51 [OpenMPOPT][NFC] Introducing OMPInformationCache.
Summary:
Introduction of OpenMP-specific information cache based on Attributor's `InformationCache`. This should make it easier to share information between them.

Reviewers: jdoerfert, JonChesterfield, hamax97, jhuber6, uenoku

Subscribers: yaxunl, hiraditya, guansong, uenoku, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81798
2020-06-17 16:56:45 +02:00
Jay Foad def2e4c47f [AMDGPU] Simplify GCNPassConfig::addOptimizedRegAlloc. NFC. 2020-06-17 15:56:15 +01:00
Simon Pilgrim a5f1f9c9b8 ScalarEvolution.h - reduce LoopInfo.h include to forward declarations. NFC.
Move ScalarEvolution::forgetLoopDispositions implementation to ScalarEvolution.cpp to remove the dependency.

Add implicit header dependency to source files where necessary.
2020-06-17 15:48:23 +01:00
Sjoerd Meijer d1522513d4 [ARM] Reimplement MVE Tail-Predication pass using @llvm.get.active.lane.mask
To set up a tail-predicated loop, we need to calculate the number of
elements processed by the loop. We can now use intrinsic
@llvm.get.active.lane.mask() to do this, which is emitted by the vectoriser in
D79100. This intrinsic generates a predicate for the masked loads/stores, and
consumes the Backedge Taken Count (BTC) as its second argument. We can now use
that to reconstruct the loop tripcount, instead of the IR pattern match
approach we were using before.

Many thanks to Eli Friedman and Sam Parker for all their help with this work.

This also adds overflow checks for the different, new expressions that we
create: the loop tripcount, and the sub expression that calculates the
remaining elements to be processed. For the latter, SCEV is not able to
calculate precise enough bounds, so we work around that at the moment; the
workaround is not entirely correct yet, but it is conservative. The overflow
checks can be overruled with a force flag, which is thus potentially unsafe
(but not really, because the vectoriser is the only place where this intrinsic
is emitted at the moment). It's also good to mention that the tail-predication
pass is not yet enabled by default.  We will follow up to see if we can
implement these overflow checks better, either by a change in SCEV or by
revising the definition of llvm.get.active.lane.mask.
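
To make the overflow concern concrete, here is a rough Python sketch of the two quantities mentioned: a trip count reconstructed as BTC + 1, and the elements remaining after some vector iterations. The 32-bit width, the function names and the use of `None` to signal a failed check are assumptions for illustration; the exact expressions the pass checks are not given here.

```python
UINT32_MAX = 2**32 - 1


def trip_count_from_btc(btc):
    """Trip count = BTC + 1; None if the add would wrap in 32 bits."""
    if not 0 <= btc <= UINT32_MAX:
        raise ValueError("btc must fit in an unsigned 32-bit value")
    if btc == UINT32_MAX:
        return None
    return btc + 1


def remaining_elements(trip_count, vector_width, iteration):
    """Elements still to process at the start of the given vector iteration."""
    done = iteration * vector_width
    if done > trip_count:
        return None  # an unsigned subtraction would wrap around
    return trip_count - done
```

A caller would set up tail-predication only when both helpers return a value rather than `None`.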

Differential Revision: https://reviews.llvm.org/D79175
2020-06-17 15:17:42 +01:00
Kirill Naumov ea844c7520 Revert "[InlineCost] InlineCostAnnotationWriterPass introduced"
This reverts commit 37e06e8f5c.
2020-06-17 14:02:34 +00:00
Kirill Naumov dcf2a9f2ee Revert "[InlineCost] PrinterPass prints constants to which instructions are simplified"
This reverts commit 52b0db22f8.
2020-06-17 14:02:29 +00:00
Kirill Naumov 39a4505e34 Revert "[InlineCost] GetElementPtr with constant operands"
This reverts commit 34fba68d80.
2020-06-17 14:02:18 +00:00
Kirill Naumov 34fba68d80 [InlineCost] GetElementPtr with constant operands
If a GEP instruction contains only constants as its arguments,
then it should be recognized as a constant. For now, this patch
also adds a flag to turn off this simplification if it causes
any regressions ("disable-gep-const-evaluation"), which is off
by default. Once I have gathered the data needed on the effectiveness of
this simplification, the flag will be deleted.

Reviewers: apilipenko, davidxl, mtrofin

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D81026
2020-06-17 13:40:19 +00:00
Kirill Naumov 52b0db22f8 [InlineCost] PrinterPass prints constants to which instructions are simplified
This patch enables printing of constants to see which instructions were
constant-folded. Needed for tests and better visual analysis of the
inliner's work.

Reviewers: apilipenko, mtrofin, davidxl, fedor.sergeev

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D81024
2020-06-17 13:40:18 +00:00
Kirill Naumov 37e06e8f5c [InlineCost] InlineCostAnnotationWriterPass introduced
This class makes it possible to see the inliner's decisions for better
optimization verification and tests. To use it, pass the flag
-passes="print<inline-cost>".

Reviewers: apilipenko, mtrofin, davidxl, fedor.sergeev

Reviewed By: mtrofin

Differential revision: https://reviews.llvm.org/D81743
2020-06-17 13:40:17 +00:00
Benjamin Kramer df9a51dab3 Remove global std::strings. NFCI. 2020-06-17 14:29:42 +02:00
Sjoerd Meijer c1034d044a Follow up of rGe345d547a0d5, and attempt to pacify buildbot:
"error: 'get' is deprecated: The base class version of get with the scalable
argument defaulted to false is deprecated."

Changed VectorType::get() -> FixedVectorType::get().
2020-06-17 13:24:09 +01:00
Sjoerd Meijer e345d547a0 Recommit "[LV] Emit @llvm.get.active.lane.mask for tail-folded loops"
Fixed ARM regression test.

Please see the original commit message rG47650451738c for details.
2020-06-17 13:12:15 +01:00
David Green 076e08aa45 [LSR] Filter for postinc formulae
In more complicated loops we can easily hit the complexity limits of
loop strength reduction. If we do and filtering occurs, it's all too
easy to remove the wrong formulae for post-inc preferring accesses due
to it attempting to maximise register re-use. The patch adds an
alternative filtering step when the target is preferring postinc to pick
postinc formulae instead, hopefully lowering the complexity to below the
limit so that aggressive filtering is not needed.

There is also a change in here to stop considering existing addrecs as
free under postinc. We should already be modelling them as a reg so
don't want it to cause us to get the cost wrong. (I'm not sure that code
makes sense in general, but there are X86 tests specifically for it
where it seems to be helping so have left it around for the standard
non-post-inc case).

Differential Revision: https://reviews.llvm.org/D80273
2020-06-17 12:32:04 +01:00
Carl Ritson ac8a2f132b [AMDGPU] Fix failure in VCC spilling
Spills of VCC (SGPR64) will fail with new SGPR spill code,
because super register is not correctly resolved.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D81224
2020-06-17 20:11:15 +09:00
Benjamin Kramer 547b6da73c [CallPrinter] Remove static constructor.
No need to have std::string here. NFC.
2020-06-17 13:02:58 +02:00
Sam Parker 5bf0858c0b Return "[InstCombine] Simplify compare of Phi with constant inputs against a constant"
I originally reverted the patch because it was causing performance
issues, but now I think it's just enabling simplify-cfg to do
something that I don't want instead :)

Sorry for the noise.

This reverts commit 3e39760f8e.
2020-06-17 11:38:59 +01:00
Paul Walker 95db1e7fb9 [FileCheck] Implement * and / operators for ExpressionValue.
Subscribers: arichardson, hiraditya, thopre, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80915
2020-06-17 09:39:17 +00:00
Hans Wennborg 16ad6eeb94 [IR] Don't copy profile metadata in createCallMatchingInvoke()
The invoke instruction can have profile metadata with branch_weights,
which does not make sense for a call instruction and will be
rejected by the verifier.

Differential revision: https://reviews.llvm.org/D81996
2020-06-17 11:18:23 +02:00
serge-sans-paille 1cafd8a5d1 Fix LoopIdiomRecognize pass return status
Introduce a helper class to aggregate the cleanup in case of rollback.

Differential Revision: https://reviews.llvm.org/D81230
2020-06-17 11:12:03 +02:00
Sjoerd Meijer d4e183f686 Revert "[LV] Emit @llvm.get.active.mask for tail-folded loops"
This reverts commit 4765045173
while I investigate the build bot failures.
2020-06-17 10:09:54 +01:00
Max Kazantsev 4ac9a6902f [NFC] Add API for edge domination check in dom tree 2020-06-17 16:05:05 +07:00
Florian Hahn 773353be4e [SCCP] Move common code to simplify basic block to helper (NFC).
Reviewers: efriedma, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D81755
2020-06-17 10:03:43 +01:00
Sjoerd Meijer 4765045173 [LV] Emit @llvm.get.active.mask for tail-folded loops
This emits the new IR intrinsic @llvm.get.active.lane.mask for tail-folded
vectorised loops if the intrinsic is supported by the backend, which is checked
by querying the TargetTransform hook emitGetActiveLaneMask.

This intrinsic creates a mask representing active and inactive vector lanes,
which is used by the masked load/store instructions that are created for
tail-folded loops. The semantics of @llvm.get.active.lane.mask are described
here in LangRef:

https://llvm.org/docs/LangRef.html#llvm-get-active-lane-mask-intrinsics

This intrinsic is also used to provide a hint to the backend. That is, the
second argument of the intrinsic represents the back-edge taken count of the
loop. For MVE, for example, we use that to set up tail-predication, which is a
new form of predication in MVE for vector loops that implicitly predicates the
last vector loop iteration by implicitly setting active/inactive lanes, i.e.
the tail loop is predicated. In order to set up a tail-predicated vector loop,
we need to know the number of data elements processed by the vector loop, which
corresponds to the tripcount of the scalar loop, which we can now reconstruct
using @llvm.get.active.lane.mask.
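
As a rough model of the mask described above, the Python sketch below assumes lane i is active when base + i is less than or equal to the back-edge taken count passed as the second argument. That comparison is my reading of this commit message; LangRef is the authority on the actual semantics, and the function names are illustrative.

```python
def get_active_lane_mask(base, btc, lanes):
    """One boolean per lane: True while base + i has not passed btc."""
    return [base + i <= btc for i in range(lanes)]


def processed_elements(btc, lanes):
    """Count active lanes over all vector iterations of a tail-folded loop."""
    total = 0
    for base in range(0, btc + 1, lanes):
        total += sum(get_active_lane_mask(base, btc, lanes))
    return total
```

Under that assumption the total number of active lanes equals BTC + 1, the scalar trip count, which is the quantity the backend wants to recover.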

Differential Revision: https://reviews.llvm.org/D79100
2020-06-17 09:53:58 +01:00
Sjoerd Meijer 20835cff27 [TTI] Refactor emitGetActiveLaneMask
Refactor TTI hook emitGetActiveLaneMask and remove the unused arguments
as suggested in D79100.
2020-06-17 09:53:58 +01:00
Kirill Bobyrev 3847737fa4 [CallPrinter] Handle freq = 0 case
Improvement of the following revision:
bbc629ebd6

This might still be problematic if freq = 0, so it's better to check for
that.
2020-06-17 10:52:18 +02:00
Kirill Bobyrev bbc629ebd6 [CallPrinter] Fix maxFreq = 0 case
llvm::getHeatColor becomes a problem when maxFreq = 0 -> freq = 0 =>
log2(double(freq)) / log2(maxFreq) -> log2(0.) / log2(0.) which
results in illegal instruction on some architectures.
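
The failure mode is a ratio of two logarithms of zero. In C++ `log2(0.)` is negative infinity and dividing it by itself gives NaN, while Python's `math.log2(0)` raises instead. A guarded version of the ratio might look like the Python sketch below; the name `heat_ratio`, the clamping and the choice to return 0.0 for degenerate inputs are assumptions, not the actual `getHeatColor` code.

```python
import math


def heat_ratio(freq, max_freq):
    """log2(freq) / log2(max_freq) clamped to at most 1.0, guarding zeros."""
    if freq <= 0 or max_freq <= 1:
        # log2(0) is undefined and log2(1) == 0 would divide by zero.
        return 0.0
    return min(1.0, math.log2(freq) / math.log2(max_freq))
```

Checking both inputs covers the maxFreq = 0 case as well as a zero freq with a non-zero maximum.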

Problematic revision: https://reviews.llvm.org/D77172
2020-06-17 10:44:28 +02:00
Florian Hahn e4b58ea8c1 [MemDep] Also remove load instructions from NonLocalDesCache.
Currently load instructions are added to the cache for invariant pointer
group dependencies, but only pointer values are removed currently. That
leads to dangling AssertingVHs in the test case below, where we delete a
load from an invariant pointer group. We should also remove the entries
from the cache.

Fixes PR46054.

Reviewers: efriedma, hfinkel, asbirlea

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D81726
2020-06-17 09:36:53 +01:00
James Henderson b21794a91c [DebugInfo] Unify Cursor usage for all debug line opcodes
This is a natural extension of the previous changes to use the Cursor
class independently in the standard and extended opcode paths, and in
turn allows delaying error handling until the entire line has been
printed in verbose mode, removing interleaved output in some cases.

Reviewed by: MaskRay, JDevlieghere

Differential Revision: https://reviews.llvm.org/D81562
2020-06-17 09:19:24 +01:00
Vitaly Buka d812efb121 [SafeStack,NFC] Fix names after files move
Summary: Depends on D81831.

Reviewers: eugenis, pcc

Reviewed By: eugenis

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81832
2020-06-17 01:08:40 -07:00
Vitaly Buka 6754a0e2ed [SafeStack,NFC] Move SafeStackColoring code
Summary:
This code is going to be used in StackSafety.
This patch is file move with minimal changes. Identifiers
will be fixed in the followup patch.

Reviewers: eugenis, pcc

Reviewed By: eugenis

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81831
2020-06-17 01:07:47 -07:00
Jonas Paulsson d3f7448e3c [SystemZ] Bugfix in storeLoadCanUseBlockBinary().
Check that the MemoryVT of LoadA matches that of LoadB.

This fixes https://bugs.llvm.org/show_bug.cgi?id=46239.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D81671
2020-06-17 09:49:31 +02:00
Kang Zhang c2574dc9f7 [NFC][PowerPC] Remove unused intrinsic for old CTR loop pass
Summary:

In patch D62907 the PPC CTRLoops pass was replaced by the Generic
Hardware Loop pass, which introduced some new intrinsics for Generic
Hardware Loop.

The old intrinsics used in PPC CTRLoops, int_ppc_mtctr and
int_ppc_is_decremented_ctr_nonzero, have been replaced by
int_set_loop_iterations and loop_decrement.

This patch removes the two unused intrinsics above.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D81539
2020-06-17 07:06:46 +00:00
Serge Pavlov 2e613d2ded [Support] Get process statistics in ExecuteAndWait and Wait
The functions sys::ExecuteAndWait and sys::Wait now have an additional
argument, a pointer to a structure that is filled with process
execution statistics upon process termination. These are total and user
execution times and peak memory consumption. By default this argument is
nullptr, so existing users of these functions do not change behavior.

Differential Revision: https://reviews.llvm.org/D78901
2020-06-17 13:39:59 +07:00
Igor Kudrin ccbd7e8d46 [DebugInfo] Support parsing and dumping of DWARF64 macro units.
Differential Revision: https://reviews.llvm.org/D81844
2020-06-17 12:57:54 +07:00
Sameer Sahasrabuddhe d3963b3a5f [DA] propagate loop live-out values that get used in a branch
Values that are uniform within a loop but appear divergent to uses
outside the loop are "tainted" so that such uses are marked
divergent. But if such a use is a branch, then its divergence needs
to be propagated. The simplest way to do that is to put the branch
back in the main worklist so that it is processed appropriately.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D81822
2020-06-17 09:21:00 +05:30
Itay Bookstein df9d64ed9c [IR] Add missing GlobalAlias copying of ThreadLocalMode attribute
Summary:
Previously, GlobalAlias::copyAttributesFrom did not preserve ThreadLocalMode,
causing incorrect IR generation in IR linking flows. This patch pushes the code
responsible for copying this attribute from GlobalVariable::copyAttributesFrom
down to GlobalValue::copyAttributesFrom so that it is shared by GlobalAlias.
Fixes PR46297.

Reviewers: tejohnson, pcc, hans

Reviewed By: tejohnson, hans

Subscribers: hiraditya, ibookstein, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81605
2020-06-16 20:15:27 -07:00
Matt Arsenault 3b34f3fcca AMDGPU/GlobalISel: Fix obvious bug in ported 32-bit udiv/urem
This was hidden by the IR expansion in AMDGPUCodeGenPrepare, which I
forgot to turn off.
2020-06-16 22:46:35 -04:00
Xing GUO 9aaa32cfcb [ObjectYAML][DWARF] Let writeVariableSizedInteger() return Error.
This patch changes the return type of `writeVariableSizedInteger()` from `void` to `Error`.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D81915
2020-06-17 09:30:14 +08:00
Matt Arsenault c5c58fd6b5 AMDGPU: Remove intermediate DAG node for trig_preop intrinsic
We weren't doing anything with this, and keeping it would just add
more boilerplate for GlobalISel.
2020-06-16 21:06:25 -04:00
Christopher Tetreault 8e204f807b [SVE] Generalize size checks in Verifier to use getElementCount
Summary:
Attempts to call getNumElements on scalable vectors identified by test
LLVM.Other::scalable-vectors-core-ir.ll. Since these checks are all
attempting to find if two vectors are the same size, calling
getElementCount will only increase safety.
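
The point about comparing element counts rather than element numbers can be illustrated with a small value type. The Python sketch below is a toy stand-in: the `ElementCount` dataclass and its field names are assumptions for illustration and do not reproduce LLVM's class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementCount:
    min: int        # minimum number of elements
    scalable: bool  # True for a <vscale x N x T> vector


def same_vector_size(a, b):
    """Two vectors match only if both the minimum and scalability agree."""
    return a == b
```

Comparing only the minimum would treat a fixed 4-element vector and a scalable vector with a minimum of 4 elements as the same size, which is the mistake a full comparison avoids.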

Reviewers: efriedma, aprantl, reames, kmclaughlin, sdesmalen

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81895
2020-06-16 16:03:36 -07:00
Aaron Smith 7e01675ea5 [SelectionDAG] Add MVT::bf16 to getConstantFP()
Summary:
This was probably overlooked in recent bfloat patches.
Needed to handle bf16 constants in SelectionDAG.

  ConstantFP:bf16<APFloat(0)>

Reviewers: stuij

Reviewed By: stuij

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81779
2020-06-16 15:10:05 -07:00
Fangrui Song 7f7cb79b57 [llvm-cov gcov] Don't suppress .gcov output if .gcda is corrupted
If .gcda is corrupted, gcov continues to produce a .gcov and just
assumes execution counts are zeros. This is reasonable, because the
program can corrupt its .gcda output. The code path should be similar to
the code path without .gcda.
2020-06-16 14:55:38 -07:00
Daniel Sanders e35ba09961 [gicombiner] Allow generated combiners to store additional members
Summary:
Adds the ability to add members to a generated combiner via
a State base class. In the current AArch64PreLegalizerCombiner
this is used to make Helper available without having to
provide it to every call.

As part of this, split the command line processing into a
separate object so that it still only runs once even though
the generated combiner is constructed more frequently.

Depends on D81862

Reviewers: aditya_nandakumar, bogner, volkan, aemerson, paquette, arsenm

Reviewed By: arsenm

Subscribers: jvesely, wdng, nhaehnle, kristof.beyls, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81863
2020-06-16 14:47:04 -07:00
Kirill Naumov 369d00df60 [CallPrinter] Adding heat coloring to CallPrinter
This patch introduces heat coloring for the Call Printer, which is based
on the relative "hotness" of each function. The patch is part of a sequence of
three patches related to graph heat coloring.
Another added feature is a flag similar to "-cfg-dot-filename-prefix",
which allows writing the graph into a named .pdf

Reviewers: rcorcs, apilipenko, davidxl, sfertile, fedor.sergeev, eraman, bollu

Differential Revision: https://reviews.llvm.org/D77172
2020-06-16 21:15:29 +00:00
Fangrui Song def2156389 [gcov] Add -i --intermediate-format
Between gcov 4.9~8, `gcov -i $file` prints coverage information to
$file.gcov in an intermediate text format (single file, instead of
$source.gcov for each source file).

lcov newer than 2019-05-24 detects -i support and uses it to increase
processing speed.  gcov 9 (GCC r265587) removed --intermediate-format
and -i was changed to mean --json-format. However, we consider this
format still useful and support it. geninfo (part of lcov) supports this
format even if we announce that we are compatible with gcov 9.0.0
2020-06-16 14:14:28 -07:00
Fangrui Song 4cd7ba7eca [gcov] Refactor llvm-cov gcov and add SourceInfo 2020-06-16 14:14:26 -07:00
Christopher Tetreault 616d8d942b [SVE] Eliminate calls to default-false VectorType::get() from AArch64
Reviewers: efriedma, c-rhodes, david-arm, samparker, greened

Reviewed By: efriedma

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81518
2020-06-16 13:53:25 -07:00
Christopher Tetreault b265cad93e [NFC] Bail out for scalable vectors before calling getNumElements
Summary:
Move the bail out logic to before constructing the Result and Lane
vectors. This is both potentially faster, and avoids calling
getNumElements on a potentially scalable vector

Reviewers: efriedma, sunfish, chandlerc, c-rhodes, fpetrogalli

Reviewed By: fpetrogalli

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81619
2020-06-16 13:41:29 -07:00
Christopher Tetreault 747486991c [SVE] Fix bad FixedVectorType cast in simplifyDivRem
Summary:
simplifyDivRem attempts to walk a VectorType elementwise. Ensure that it
only does so for FixedVectorType

Reviewers: efriedma, spatel, lebedev.ri, david-arm, kmclaughlin

Reviewed By: spatel, david-arm

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81856
2020-06-16 13:17:05 -07:00
Christopher Tetreault ff628f5f5e [SVE] Eliminate calls to default-false VectorType::get() from Vectorize
Reviewers: efriedma, fhahn, spatel, sdesmalen, kmclaughlin

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81521
2020-06-16 12:50:13 -07:00
Matt Arsenault e4f19d1dda GlobalISel: Fix not failing on widening G_INSERT_VECTOR_ELT
This doesn't actually handle type idx 0, but was reporting Legalized
on it. No test changes because nothing was trying to use this.
2020-06-16 15:48:57 -04:00
Ahsan Saghir 37e72f47a4 [PowerPC] Add -m[no-]power10-vector clang and llvm option
Summary: This patch adds command line option for enabling power10-vector support.

Reviewers: hfinkel, nemanjai, lei, amyk, #powerpc

Reviewed By: lei, amyk, #powerpc

Subscribers: wuzish, kbarton, hiraditya, shchenz, cfe-commits, llvm-commits

Tags: #llvm, #clang, #powerpc

Differential Revision: https://reviews.llvm.org/D80758
2020-06-16 14:47:35 -05:00
Matt Arsenault 8a3340d25d GlobalISel: Use early return and reduce indentation 2020-06-16 14:47:08 -04:00
Stanislav Mekhanoshin 3f0c9c1634 Fix ubsan error in tblgen with signed left shift
UBSAN complains when tblgen performs SHL of a negative
value.

Differential Revision: https://reviews.llvm.org/D81952
2020-06-16 11:15:09 -07:00
Hiroshi Yamauchi 6bc2b042f4 [TLI] Add four C++17 delete variants.
Summary:
delete(void*, unsigned int, align_val_t)
delete(void*, unsigned long, align_val_t)
delete[](void*, unsigned int, align_val_t)
delete[](void*, unsigned long, align_val_t)

Differential Revision: https://reviews.llvm.org/D81853
2020-06-16 11:12:02 -07:00
Sanjay Patel ed67f5e7ab [VectorCombine] scalarize compares with insertelement operand(s)
Generalize scalarization (recently enhanced with D80885)
to allow compares as well as binops.
Similar to binops, we are avoiding scalarization of a loaded
value because that could avoid a register transfer in codegen.
This requires 1 extra predicate that I am aware of: we do not
want to scalarize the condition value of a vector select. That
might also invert a transform that we do in instcombine that
prefers a vector condition operand for a vector select.

I think this is the final step in solving PR37463:
https://bugs.llvm.org/show_bug.cgi?id=37463

Differential Revision: https://reviews.llvm.org/D81661
2020-06-16 13:48:10 -04:00
Jessica Paquette 7caa9caa80 [AArch64][GlobalISel] Avoid creating redundant ubfx when selecting G_ZEXT
When selecting 32 b -> 64 b G_ZEXTs, we don't have to always emit the extend.

If the instruction feeding into the G_ZEXT implicitly zero extends the high
half of the register, we can just emit a SUBREG_TO_REG instead.

Differential Revision: https://reviews.llvm.org/D81897
2020-06-16 09:50:47 -07:00
Fangrui Song 4799fb63b5 [GlobalISel] Delete unused variable after r353432 2020-06-16 08:32:09 -07:00
Leandro Vaz 56262a74c3 Fix debug line info when line markers are present inside macros.
Compiling assembly files when newlines are reduced to line markers within a `.macro` context will generate wrong information in the `.debug_line` section.
This patch fixes this issue by evaluating line markers within the macro scope but not when they are used and evaluated.

Reviewed By: probinson

Differential Revision: https://reviews.llvm.org/D80381
2020-06-16 16:13:11 +01:00
Luke Geeson 10b6567f49 [AArch64]: BFloat MatMul Intrinsics&CodeGen
This patch upstreams support for BFloat Matrix Multiplication Intrinsics
and Code Generation from __bf16 to AArch64. This includes IR intrinsics. Unittests are
provided as needed. AArch32 Intrinsics + CodeGen will come after this
patch.

This patch is part of a series implementing the Bfloat16 extension of the
Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type and its properties are specified in the Arm Architecture
Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

The following people contributed to this patch:

 - Luke Geeson
 - Momchil Velikov
 - Mikhail Maltsev
 - Luke Cheeseman

Reviewers: SjoerdMeijer, t.p.northover, sdesmalen, labrinea, miyuki,
stuij

Reviewed By: miyuki, stuij

Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits,
llvm-commits, miyuki, chill, pbarrio, stuij

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80752

Change-Id: I174f0fd0f600d04e3799b06a7da88973c6c0703f
2020-06-16 15:23:30 +01:00
Luke Geeson 508a4764c0 [AArch64]: BFloat Load/Store Intrinsics&CodeGen
This patch upstreams support for ld / st variants of BFloat intrinsics
from __bf16 to AArch64. This includes IR intrinsics. Unittests are
provided as needed.

This patch is part of a series implementing the Bfloat16 extension of the
Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type and its properties are specified in the Arm Architecture
Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

The following people contributed to this patch:

 - Luke Geeson
 - Momchil Velikov
 - Luke Cheeseman

Reviewers: fpetrogalli, SjoerdMeijer, sdesmalen, t.p.northover, stuij

Reviewed By: stuij

Subscribers: arsenm, pratlucas, simon_tatham, labrinea, kristof.beyls,
hiraditya, danielkiss, cfe-commits, llvm-commits, pbarrio, stuij

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80716

Change-Id: I22e1dca2a8a9ec25d1e4f4b200cb50ea493d2575
2020-06-16 15:23:30 +01:00
Georgii Rymar 66fb3c39cb [DebugInfo/DWARF] - Report .eh_frame sections of version != 1.
Specification (https://refspecs.linuxbase.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/ehframechpt.html#AEN1349)
says that the value of Version field for .eh_frame should be 1.

However, we currently accept other values and may then attempt to read
the section as a .debug_frame, which is wrong.

This patch adds a version check.

Differential revision: https://reviews.llvm.org/D81469
2020-06-16 15:46:26 +03:00
Tyker d7deef1206 Revert "[AssumeBundles] add cannonicalisation to the assume builder"
This reverts commit 90c50cad19.
2020-06-16 14:34:55 +02:00
Ayke van Laethem 5aa8014ca8 [AVR] Remove faulty stack pushing behavior
An instruction like this will need to allocate some stack space for the
last parameter:

  %x = call addrspace(1) i16 @bar(i64 undef, i64 undef, i16 undef, i16 0)

This worked fine when passing an actual value (in this case 0). However,
when passing undef, no value was pushed to the stack and therefore no
push instructions were created. This caused an unbalanced stack leading
to interesting results.

This commit fixes that by replacing the push logic with a regular stack
adjustment and stack-relative load/stores. This is less efficient but at
least it correctly compiles the code.

I can think of a few improvements in the future:

  * The stack should have been adjusted in the function prologue when
    there are no allocas in the function.
  * Many (if not most) stack adjustments can be replaced by
    pushing/popping the values directly. Exactly like the previous code
    attempted but didn't do correctly.
  * Small stack adjustments can be done more efficiently with a few
    push/pop instructions (pushing/popping bogus values), both for code
    size and for speed.

All in all, as long as there are no allocas in the function I think that
it is almost always more efficient to emit regular push/pop
instructions. This is however left for future optimizations.

Differential Revision: https://reviews.llvm.org/D78581
2020-06-16 13:53:32 +02:00
Ayke van Laethem 3ab1c97e35 [AVR] Fix stack size in functions with a frame pointer
This patch fixes a bug in stack save/restore code. Because the frame
pointer was saved/restored manually (not by marking it as clobbered) the
StackSize variable was not updated accordingly. Most code still worked,
but code that tried to load a parameter passed on the stack did not.

This commit fixes this by marking the frame pointer as a
callee-clobbered register. This will let it be saved without any effort
in prolog/epilog code and will make sure the correct address is
calculated for loading parameters that are passed on the stack.

This approach is used by most other targets (such as X86, AArch64 and
RISC-V).

Differential Revision: https://reviews.llvm.org/D78579
2020-06-16 13:53:32 +02:00
David Green f269bb7da0 [ARM] Fix crash trying to generate i1 immediates
These code patterns attempt to call isVMOVModifiedImm on a splat of i1
values, leading to an unreachable being hit. I've guarded the call on a
more specific set of sizes, as i1 vectors are legal under MVE.

Differential Revision: https://reviews.llvm.org/D81860
2020-06-16 12:27:24 +01:00
Simon Pilgrim 9d11822f09 Fix comment typo - Uexpected -> Unexpected. NFC. 2020-06-16 12:14:51 +01:00
Tyker 90c50cad19 [AssumeBundles] add cannonicalisation to the assume builder
Summary:
This significantly reduces the number of assumes generated without affecting
the preserved information too much. This improves the compile-time cost
of enable-knowledge-retention significantly.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, asbirlea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79650
2020-06-16 13:12:35 +02:00
Kristof Beyls 503a26d8e4 Silence GCC 7 warning
GCC 7 was reporting "enumeral and non-enumeral type in conditional expression"
as a warning.
The code casts an instruction opcode enum to unsigned implicitly, in
line with intentions; so this commit silences the warning by making the
cast to unsigned explicit.
2020-06-16 11:42:52 +01:00
sstefan1 e099c7b64a [NFC][OpenMPOpt] Provide function-specific foreachUse. 2020-06-16 12:33:15 +02:00
Alexandros Lamprineas f6189da938 [ARM][NFC] Explicitly specify the fp16 value type in codegen patterns.
We are planning to add the bf16 value type in the HPR register class
and this will make the codegen patterns ambiguous.

Differential Revision: https://reviews.llvm.org/D81505
2020-06-16 11:32:17 +01:00
Jay Foad 6fdd5a28b7 Revert "[IR] Clean up dead instructions after simplifying a conditional branch"
This reverts commit 69bdfb075b.

Reverting to investigate https://bugs.llvm.org/show_bug.cgi?id=46343
2020-06-16 10:32:15 +01:00
Simon Pilgrim 379c5b31f7 [X86][SSE] combineVectorSizedSetCCEquality - remove unused AVX2 MOVMSK path. NFCI.
If PTEST is not available, then we're guaranteed to be performing a 128-bit vector comparison using MOVMSK(PCMPEQB(v16i8)).
2020-06-16 10:07:41 +01:00
Igor Kudrin ffc5d98d2c [MC] Generate .debug_frame in the 64-bit DWARF format [7/7]
Note that .eh_frame sections are generated in the 32-bit format even
when debug sections are 64-bit, for compatibility reasons. They use
relative references between entries, so they hardly benefit from the
64-bit format.

Differential Revision: https://reviews.llvm.org/D81149
2020-06-16 15:50:14 +07:00
Igor Kudrin 1e081342d4 [MC] Fix DWARF forms for 64-bit DWARFv3 files [6/7]
DW_FORM_sec_offset was introduced in DWARFv4, so, for 64-bit DWARFv3,
DW_FORM_data8 should be used instead.

Differential Revision: https://reviews.llvm.org/D81148
2020-06-16 15:50:14 +07:00
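The form choice described in the commit above can be sketched as a small selection function. This is an illustrative sketch only, not the MC code: the function name `section_offset_form` is hypothetical, and the use of DW_FORM_data4 for 32-bit DWARF before version 4 is an assumption that the commit message does not state. The numeric form codes are the ones defined by the DWARF specification.

```python
# Form codes as defined by the DWARF specification.
DW_FORM_data4 = 0x06
DW_FORM_data8 = 0x07
DW_FORM_sec_offset = 0x17  # introduced in DWARF version 4


def section_offset_form(dwarf_version, is_dwarf64):
    """Pick a form for an attribute that holds a section offset.

    Hypothetical helper: DW_FORM_sec_offset is only available from
    DWARFv4 on, so older versions fall back to a fixed-size data form
    whose width matches the offset size of the DWARF format in use.
    """
    if dwarf_version >= 4:
        return DW_FORM_sec_offset
    # Assumption: 32-bit DWARF before v4 uses the 4-byte data form.
    return DW_FORM_data8 if is_dwarf64 else DW_FORM_data4


form_for_dwarf64_v3 = section_offset_form(3, is_dwarf64=True)
```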
Igor Kudrin ab7458fb04 [MC] Generate .debug_rnglists in the 64-bit DWARF format [5/7]
In addition, the patch fixes referencing the section within
a compilation unit.

Differential Revision: https://reviews.llvm.org/D81147
2020-06-16 15:50:13 +07:00
Igor Kudrin b5f8959bcd [MC] Generate .debug_aranges in the 64-bit DWARF format [4/7]
Differential Revision: https://reviews.llvm.org/D81146
2020-06-16 15:50:13 +07:00
Igor Kudrin 1dfcce5395 [MC] Generate a compilation unit in the 64-bit DWARF format [3/7]
The patch enables producing DWARF64 compilation units and fixes
generating references to .debug_abbrev and .debug_line sections.
A similar change for .debug_ranges/.debug_rnglists will be added
in a forthcoming patch.

Differential Revision: https://reviews.llvm.org/D81145
2020-06-16 15:50:13 +07:00
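For readers unfamiliar with what distinguishes a DWARF64 unit, the following sketch shows how the initial length field differs between the two formats according to the DWARF specification. It is not taken from the patch; `encode_initial_length` is a hypothetical name, and the little-endian default is an assumption.

```python
import struct

DWARF64_ESCAPE = 0xFFFFFFFF       # marks a 64-bit DWARF unit
DWARF32_RESERVED_MIN = 0xFFFFFFF0  # 0xfffffff0..0xffffffff are reserved


def encode_initial_length(length, is_dwarf64, little_endian=True):
    """Encode the initial length field of a unit header.

    32-bit DWARF: a single 4-byte length.
    64-bit DWARF: the 4-byte escape 0xffffffff followed by an 8-byte length.
    """
    order = "<" if little_endian else ">"
    if is_dwarf64:
        return struct.pack(order + "IQ", DWARF64_ESCAPE, length)
    if length >= DWARF32_RESERVED_MIN:
        raise ValueError("length does not fit the 32-bit DWARF format")
    return struct.pack(order + "I", length)


header32 = encode_initial_length(0x1234, is_dwarf64=False)
header64 = encode_initial_length(0x1234, is_dwarf64=True)
```

The 12-byte form is also why section offsets inside a DWARF64 unit are 8 bytes wide rather than 4.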
Igor Kudrin 64c049595b [MC] Generate .debug_line in the 64-bit DWARF format [2/7]
Differential Revision: https://reviews.llvm.org/D81144
2020-06-16 15:50:13 +07:00
Igor Kudrin a8ec9de406 [MC] Add --dwarf64 to generate DWARF64 debug info [1/7]
The patch adds an option `--dwarf64` to instruct a tool to generate
debug information in the 64-bit DWARF format. There is no real
implementation yet, only a few compatibility checks.

Differential Revision: https://reviews.llvm.org/D81143
2020-06-16 15:50:13 +07:00
Simon Pilgrim 057c9c7ee0 [X86][SSE] MatchVectorAllZeroTest - handle OR vector reductions
This patch extends MatchVectorAllZeroTest to handle OR vector reduction patterns where the result is compared against zero.

Fixes PR45378

Differential Revision: https://reviews.llvm.org/D81547
2020-06-16 09:42:34 +01:00
Simon Pilgrim 65c3fa849b [X86][SSE] combineVectorSizedSetCCEquality - move single Subtarget.hasAVX() use into condition. NFC.
We already have Subtarget.hasSSE2() and Subtarget.useAVX512Regs() in the condition - seems to be a legacy from when we had multiple uses.
2020-06-16 09:42:33 +01:00
Sam Parker 7158f285a8 [CostModel] Unify getCFInstrCost
Have TTI::getInstructionThroughput call getUserCost for Br, Ret and
PHI. This now means that everything in getInstructionThroughput is
handled by getUserCost.

Differential Revision: https://reviews.llvm.org/D79849
2020-06-16 08:40:54 +01:00
Fangrui Song a3b5f428c1 [AArch64] Print the immediate operand for SPACE pseudo instruction
Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D81814
2020-06-15 20:55:53 -07:00
Amara Emerson 1035a416a6 [AArch64][GlobalISel] Emit constant pool loads for 64 bit fp immediates.
Note: don't do this for integer 64 bit materialization to match SDAG.

Differential Revision: https://reviews.llvm.org/D81893
2020-06-15 20:53:09 -07:00
Qiu Chaofan e62912b190 [LLParser] Delete temp CallInst when error occurs
Only functions with a floating-point return type accept fast-math flags.
When adding such flags to a function returning an integer, we'll see a crash,
because there's still an undeleted value referencing the argument. This
patch manually removes the temporary instruction when an error occurs.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D78355
2020-06-16 11:41:25 +08:00
Xing GUO 8aaeaddec8 [ObjectYAML][DWARF] Implement the .debug_addr section.
This patch implements the .debug_addr section.

Reviewed By: jhenderson, grimar

Differential Revision: https://reviews.llvm.org/D81541
2020-06-16 10:53:10 +08:00
Mircea Trofin 296e47734e [llvm][NFC] Fix license on InlineFeaturesAnalysis.{h|cpp}
Summary: Also fixed the InlineAdvisor.cpp license.

Reviewers: rriddle

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81896
2020-06-15 19:34:33 -07:00
Craig Topper 255d5dbae1 [X86] Add support for inline assembly 'x' constraint for i128.
Limiting to x86-64 since that's when __int128 is legal in clang.

Differential Revision: https://reviews.llvm.org/D81817
2020-06-15 19:34:02 -07:00
Gui Andrade b0ffa8befe [MSAN] Pass Origin by parameter to __msan_warning functions
Summary:
Normally, the Origin is passed over TLS, which seems like it introduces unnecessary overhead. It's in the (extremely) cold path though, so the only overhead is in code size.

But with eager-checks, calls to __msan_warning functions are extremely common, so this becomes a useful optimization.

This can save ~5% code size.

Reviewers: eugenis, vitalybuka

Reviewed By: eugenis, vitalybuka

Subscribers: hiraditya, #sanitizers, llvm-commits

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D81700
2020-06-15 17:49:18 -07:00
Stanislav Mekhanoshin 576fa5a50c [AMDGPU] make ubsan happy with unsigned left shift
Fixes UBSAN error after rG9ee272f13d88f090817235ef4f91e56bb2a153d6
A trivial signed/unsigned shift.
2020-06-15 17:21:10 -07:00
Amy Huang f8170d8715 [NativeSession] Implement findLineNumbersByAddress in NativeSession,
which takes an address and a length and returns all lines within that
address range.
2020-06-15 17:05:39 -07:00
Jessica Paquette 5a4c3f6b06 [GlobalISel] Look through extends etc in CombinerHelper::matchConstantOp
It's possible to end up with a zext or something in the way of a G_CONSTANT,
even pre-legalization. This can happen with memsets.

e.g.

https://godbolt.org/z/Bjc8cw

To make sure we can catch these cases, use `getConstantVRegValWithLookThrough`
instead of `mi_match`.

Differential Revision: https://reviews.llvm.org/D81875
2020-06-15 16:34:25 -07:00
Stanislav Mekhanoshin 9ee272f13d [AMDGPU] Add gfx1030 target
Differential Revision: https://reviews.llvm.org/D81886
2020-06-15 16:18:05 -07:00
Amara Emerson fc905ae003 [GlobalISel] Don't emit multiply by magic constant for zero memset values. 2020-06-15 14:42:14 -07:00
Mircea Trofin e2cc854015 [llvm][NFC] Move content of ML subdirectory into Analysis
The initial intent was to organize ML stuff in its own directory, but
it turns out that conflicts with llvm component layering policies: it
is not a component, because subsequent changes want to rely on other
analyses, which would create a cycle; and we don't have a reliable,
cross-platform mechanism to compile files in a subdirectory, and fit in
the existing LLVM build structure.

This change moves the files into Analysis, and subsequent changes will
leverage conditional compilation for those that have optional
dependencies.
2020-06-15 14:35:33 -07:00
Nick Desaulniers 2d8e105db6 [PPCAsmPrinter] support 'L' output template for memory operands
Summary:
L is meant to support the second word used by 32b calling conventions for 64b arguments.

This is required for building 32b PowerPC Linux kernels after upstream
commit 334710b1496a ("powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'")

Thanks for the report from @nathanchance, and reference to GCC's
implementation from @segher.

Fixes: pr/46186
Fixes: https://github.com/ClangBuiltLinux/linux/issues/1044

Reviewers: echristo, hfinkel, MaskRay

Reviewed By: MaskRay

Subscribers: MaskRay, wuzish, nemanjai, hiraditya, kbarton, steven.zhang, llvm-commits, segher, nathanchance, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81767
2020-06-15 14:31:44 -07:00
Davide Italiano c2dccf9d5e [CodeGenPrepare] Reset the debug location when promoting trunc(s)
The promotion machinery in CGP moves instructions retaining
debug locations. When the transformation is local, this is mostly
correct, but when instructions are moved cross-BBs, this is not
always true and causes jumpiness in line tables. This is the first
of a series of commits. sext(s) and zext(s) need to be treated
similarly.

Differential Revision:  https://reviews.llvm.org/D81879
2020-06-15 14:25:43 -07:00
Nikita Popov 35651fdd45 [IR] Add AttributeBitSet wrapper (NFC)
This wraps the uint8_t[12] type used in two places, because I
plan to introduce a third use of the same pattern.
2020-06-15 21:28:25 +02:00
Jessica Paquette 3495b884de [AArch64][GlobalISel] Add G_EXT and select ext using it
Add selection support for ext via a new opcode, G_EXT and a post-legalizer
combine which matches it.

Add an `applyEXT` function, because the AArch64ext patterns require a register
for the immediate. So, we have to create a G_CONSTANT to get these without
writing new patterns or modifying the existing ones.

Tests are the same as arm64-ext.ll.

Also prevent ext from firing on the zip test. It has higher priority, so we
don't want it potentially getting in the way of mask tests.

Also fix up the shuffle-splat test, because ext is now selected there. The
test was incorrectly regbank selected before, which could cause a verifier
failure when you emit copies.

Differential Revision: https://reviews.llvm.org/D81436
2020-06-15 12:20:59 -07:00
Mircea Trofin 29e5722949 Revert "[llvm] Added support for stand-alone cmake object libraries."
This reverts commit 695c7d6313.

Breaks windows (e.g.
http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/16497)

Likely to cause problems with XCode.
2020-06-15 12:15:39 -07:00
Davide Italiano e51e82745e [Target/PPC] Fold inside an assertion.
Pointed out by dblaikie.
2020-06-15 12:08:57 -07:00
Mircea Trofin 695c7d6313 [llvm] Added support for stand-alone cmake object libraries.
Summary:
Currently, add_llvm_library would create an OBJECT library alongside
of a STATIC / SHARED library, but losing the link interface (its
elements would become dependencies instead). To support scenarios
where linking an object library also brings in its usage
requirements, this patch adds support for 'stand-alone' OBJECT
libraries - i.e. without an accompanying SHARED/STATIC library, and
maintaining the link interface defined by the user.

The support is via a new option, OBJECT_ONLY, to avoid breaking changes
- since just specifying "OBJECT" would currently imply also STATIC or
SHARED, depending on BUILD_SHARED_LIBS.

This is useful for cases where, for example, we want to build a part
of a component separately. Using a STATIC target would incur the risk
that symbols not referenced in the consumer would be dropped (which may
be undesirable).

The current application is the ML part of Analysis. It should be part
of the Analysis component, so it may reference other analyses; and (in
upcoming changes) it has dependencies on optional libraries.

Reviewers: karies, davidxl

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81447
2020-06-15 12:01:43 -07:00
Matt Arsenault e07cf92377 AMDGPU/GlobalISel: Don't hardcode maximum register size
This is a somewhat artificial limit, so avoid repeating it in many places
in case it changes.
2020-06-15 15:01:19 -04:00
Matt Arsenault 1a7f115dce AMDGPU/GlobalISel: Extend load/store workaround to i128 vectors 2020-06-15 14:55:11 -04:00
Rahul Joshi 72d20b9604 [LLVM] Change isa<> to a variadic function template
Change isa<> to a variadic function template, so that it can be used to test against one of multiple types as follows:
   isa<Type0, Type1, Type2>(Val)

Differential Revision: https://reviews.llvm.org/D81045
2020-06-15 18:46:57 +00:00
Lang Hames 5682f192bd [RuntimeDyld] Add dependence on Core.
Commit 498dd745f5 introduced a dependence on Core. This patch updates
LLVMbuild.txt to reflect this.
2020-06-15 11:14:27 -07:00
Davide Italiano 5cb44196aa [Target/PPC] Silence an unused variable warning. NFC. 2020-06-15 11:05:01 -07:00
Craig Topper d72cb4ce21 Recommit "[X86] Separate imm from relocImm handling."
Fix the copy/paste mistake that caused it to fail previously
2020-06-15 10:59:43 -07:00
Florian Hahn 120c059292 [DSE,MSSA] Port partial store merging.
Port partial constant store merging logic to MemorySSA backed DSE. The
heavy lifting is done by the existing helper function. It is used in
a context where we have already ensured that the later instruction can
eliminate the earlier one, if it is a complete overwrite.
2020-06-15 18:41:46 +01:00
Lang Hames 498dd745f5 [ORC] Honor linker private global prefix on symbol names.
If a symbol name begins with the linker private global prefix (as
described by the DataLayout) then it should be treated as non-exported,
regardless of its LLVM IR visibility value.
2020-06-15 10:28:36 -07:00
Wouter van Oortmerssen 3b29376e3f [WebAssembly] Adding 64-bit version of R_WASM_MEMORY_ADDR_* relocs
This adds 4 new reloc types.

A lot of code that previously assumed any memory or offset values could be contained in a uint32_t (and often truncated results from functions returning 64-bit values) has been upgraded to uint64_t. This is not comprehensive: it is only the values that come in contact with the new relocation values and their dependents.

A new tablegen mapping was added to automatically upgrade loads/stores in the assembler, which otherwise has no way to select for these instructions (since they are identical other than for the offset immediate). It follows a similar technique to https://reviews.llvm.org/D53307

Differential Revision: https://reviews.llvm.org/D81704
2020-06-15 10:07:42 -07:00
Craig Topper ad1c46c3c0 [X86] Remove printanymem/printopaquemem from the InstPrinters. Just tell tablegen to printMemReference directly. NFC
Most of the wrappers exist to print the memory size in Intel syntax
and then call the printMemReference. But printanymem/printopaquemem
don't print anything extra in Intel syntax so just drop them.
2020-06-15 09:46:06 -07:00
Florian Hahn 71a91b9837 [DSE] Hoist partial store merging code into function (NFC).
Hoist the general logic into a new function, because it can be re-used
by the MemorySSA backed DSE as well.
2020-06-15 17:44:24 +01:00
Jessica Paquette 1ac8451a9b [GlobalISel] Simplify G_ADD when it has (0-X) on the LHS or RHS
This implements the following combines:

((0-A) + B) -> B-A
(A + (0-B)) -> A-B

Porting over the basic algebraic combines from the DAGCombiner. There are
several combines which fold adds away into subtracts. This is just the simplest
one.

I noticed that add combines are some of the most commonly hit across CTMark,
(via print statements when they fire), so I'm porting over some of the obvious
ones.

This gives some minor code size improvements on CTMark at -O3 on AArch64.

Differential Revision: https://reviews.llvm.org/D77453
2020-06-15 09:43:24 -07:00
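The two folds listed in the commit above are valid under two's-complement wraparound, which is what makes them safe for G_ADD. A minimal sketch, assuming 32-bit values, checks both identities on a handful of edge cases. The helper names (`wrap32`, `add_with_negated_lhs`, and so on) are hypothetical and are not part of the GlobalISel combiner.

```python
MASK32 = 0xFFFFFFFF  # assumption: model a 32-bit virtual register


def wrap32(x):
    """Reduce an integer to 32-bit two's-complement wraparound."""
    return x & MASK32


def add_with_negated_lhs(a, b):
    """Unfolded form: (0 - a) + b."""
    return wrap32(wrap32(0 - a) + b)


def add_with_negated_rhs(a, b):
    """Unfolded form: a + (0 - b)."""
    return wrap32(a + wrap32(0 - b))


def folded_sub(x, y):
    """Folded form: x - y."""
    return wrap32(x - y)


# Edge cases around zero, the sign bit, and the all-ones value.
samples = [0, 1, 7, 0x7FFFFFFF, 0x80000000, MASK32]
folds_hold = all(
    add_with_negated_lhs(a, b) == folded_sub(b, a)
    and add_with_negated_rhs(a, b) == folded_sub(a, b)
    for a in samples
    for b in samples
)
```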
Francesco Petrogalli 28a00ac9ba [llvm][SVE] IR intrinsics for quadword permutation instructions.
Summary:
Adding intrinsics and codegen patterns for:

* trn1 <Zd>.q, <Zm>.q, <Zn>.q
* trn2 <Zd>.q, <Zm>.q, <Zn>.q
* zip1 <Zd>.q, <Zm>.q, <Zn>.q
* zip2 <Zd>.q, <Zm>.q, <Zn>.q
* uzp1 <Zd>.q, <Zm>.q, <Zn>.q
* uzp2 <Zd>.q, <Zm>.q, <Zn>.q

These instructions are defined in Armv8.6-A.

Reviewers: sdesmalen, efriedma, kmclaughlin

Reviewed By: sdesmalen

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80850
2020-06-15 16:21:56 +00:00
Matt Arsenault 2ca552322c AMDGPU/GlobalISel: Fix 8-byte aligned, 96-bit scalar loads
These are legal since we can do a 96-bit load on some subtargets, but
this is only for vector loads. If we can't widen the load, it needs to
be broken down once known scalar. For 16-byte alignment, widen to a
128-bit load.
2020-06-15 11:33:16 -04:00
Wouter van Oortmerssen d9e0bbd17b [WebAssembly] Adding 64-bit versions of all load & store ops.
Context: https://github.com/WebAssembly/memory64/blob/master/proposals/memory64/Overview.md
This is just a first step, adding the new instruction variants while keeping the existing 32-bit functionality working.
Some of the basic load/store tests have new wasm64 versions that show that the basics of the target are working.
Further features need implementation, but these will be added in followups to keep things reviewable.

Differential Revision: https://reviews.llvm.org/D80769
2020-06-15 08:31:56 -07:00
Florian Hahn 8c61f13a0f [DSE,MSSA] Delete instructions after printing it.
Also enables a now-passing test case, that exposed a crash caused by the
wrong order.
2020-06-15 16:01:36 +01:00
Simon Pilgrim cb8a0ba829 [X86][SSE] Add LowerVectorAllZero helper for checking if all bits of a vector are zero.
Pull the lowering code out of LowerVectorAllZeroTest (and rename it MatchVectorAllZeroTest).

We should be able to reuse this in combineVectorSizedSetCCEquality as well.

Another cleanup to simplify D81547.
2020-06-15 15:54:38 +01:00
Stefan Pintilie 57c9dc0521 [PowerPC] Do not add the relocation addend to the instruction encoding
We should not be adding the relocation addend to the instruction encoding.
This patch removes that and sets those bits to zero.

Differential Revision: https://reviews.llvm.org/D81082
2020-06-15 09:51:34 -05:00
Dominik Montada 87e5742654 [NFC] Add braces to if-statement in MachineVerifier 2020-06-15 16:33:56 +02:00
Simon Pilgrim ae33cbc494 [X86][SSE] LowerVectorAllZeroTest - add support for >256-bit vectors
Reduce by splitting the vector until we reach the target size for PTEST/MOVMSK_PCMPEQ. There might be some cases where AVX512 can perform this with 512-bit vectors but so far I haven't encountered any such pattern that reaches LowerVectorAllZeroTest.

Prep work for D81547
2020-06-15 15:30:24 +01:00
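The splitting strategy in the commit above relies on the fact that a vector is all zero exactly when the OR of its two halves is all zero. A rough sketch of that reduction on plain lists of lane values follows. It assumes power-of-two lane counts, and the function names and the default of four lanes are illustrative choices, not the X86 lowering code.

```python
def or_reduce_to_width(lanes, target_lanes):
    """Halve the vector by OR-ing its low and high halves together
    until at most target_lanes lanes remain.

    Assumption: len(lanes) and target_lanes are powers of two.
    """
    while len(lanes) > target_lanes:
        half = len(lanes) // 2
        lanes = [lo | hi for lo, hi in zip(lanes[:half], lanes[half:])]
    return lanes


def all_zero(lanes, target_lanes=4):
    """Test for all-zero bits after reducing to the target width.

    The final comparison stands in for the PTEST / MOVMSK(PCMPEQ)
    step that the real lowering would emit on the reduced vector.
    """
    return all(x == 0 for x in or_reduce_to_width(lanes, target_lanes))


reduced = or_reduce_to_width(list(range(16)), 4)
```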
Hans Wennborg f47a776628 Revert "[X86] Separate imm from relocImm handling."
> relocImm was a complexPattern that handled both ConstantSDNode
> and X86Wrapper. But it was only applied selectively because using
> it would cause patterns to be not importable into FastISel or
> GlobalISel. So it only got applied to flag setting instructions,
> stores, RMW arithmetic instructions, and rotates.
>
> Most of the test changes are a result of making patterns available
> to GlobalISel or FastISel. The absolute-cmp.ll change is due to
> this fixing a pattern ordering issue to make an absolute symbol
> match to an 8-bit immediate before trying a 32-bit immediate.
>
> I tried to use PatFrags to reduce the repetition, but I was getting
> errors from TableGen.

This caused "Invalid EmitNode" assertions, see the llvm-commits thread for
discussion.
2020-06-15 16:14:59 +02:00
Simon Pilgrim 0b806549b5 [X86][SSE] LowerVectorAllZeroTest - remove unnecessary bitcasts
matchScalarReduction should return all its source vectors with the same type, so we can safely perform the OR reduction with the original type.

So we just need to bitcast for PTEST/PCMPEQB with the final reduced vector.
2020-06-15 15:13:13 +01:00
Kevin P. Neal 07f3351284 [strictfp] Replace dangling strictfp attrs with nobuiltin
In preparation for a patch that will enforce new rules for the usage of
the strictfp attribute, this patch introduces auto-upgrade behavior that
will replace the strictfp attribute on callsites with nobuiltin if the
enclosing function declaration doesn't also have the strictfp attribute.

This auto-upgrade isn't being performed on .ll files because that would
prevent us from writing a test for the forthcoming verifier behavior.

Differential Revision: https://reviews.llvm.org/D70096
2020-06-15 10:05:35 -04:00
Yvan Roux 669066de65 [ARM][MachineOutliner] Add LR RegSave mode.
Outline chunks of code which need to save and restore the link register
when a spare register can be used for it.

Differential Revision: https://reviews.llvm.org/D80127
2020-06-15 15:22:08 +02:00
Daniel Kiss b8ae3fdfa5 [AArch64] Fix BTI instruction emission.
Summary:
SCTLR_EL1.BT[01] controls the PACI[AB]SP compatibility with BTYPE 11
(see [1]).
This bit will be set to zero, so PACI[AB]SP are equivalent to the BTI C
instruction only.

[1] https://developer.arm.com/docs/ddi0595/b/aarch64-system-registers/sctlr_el1

Reviewers: chill, tamas.petz, pbarrio, ostannard

Reviewed By: tamas.petz, ostannard

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81746
2020-06-15 15:04:36 +02:00
Matt Arsenault dae9554b2b AMDGPU/GlobalISel: Workaround some load/store type selection patterns
The logic is written for what loads/stores should be selectable. There
are a set of cases that should be selectable, but due to missing MVTs
and/or selection patterns, will fail to select. I think eventually
load/store select patterns should ignore the type and only look at the
value size, but until that happens, bitcast these to equivalent i32
vectors.
2020-06-15 07:42:20 -04:00
Matt Arsenault 33e9086501 GlobalISel: Support lowering vector->vector G_BITCAST
Extract subvectors and cast to the result element type before
remerging.
2020-06-15 07:36:30 -04:00
James Henderson 1a78904752 [DebugInfo] Report errors for truncated debug line standard opcode
Standard opcodes usually have ULEB128 arguments, so it is generally not
possible to recover from such errors. This patch causes the parser to
stop parsing the table in such situations.

Also don't emit the operands or add data to the table if there is an
error reading these opcodes.

Reviewed by: JDevlieghere

Differential Revision: https://reviews.llvm.org/D81470
2020-06-15 11:50:12 +01:00
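To illustrate why a truncated operand is unrecoverable, here is a small ULEB128 reader that reports an error when the buffer ends before the terminating byte. It is a sketch of the general technique, not the DWARFDebugLine parser; `read_uleb128` and `read_operands` are hypothetical names.

```python
def read_uleb128(data, offset):
    """Decode one ULEB128 value starting at data[offset].

    Returns (value, next_offset). Raises ValueError if the data ends
    before a byte with a clear continuation bit is seen.
    """
    value = 0
    shift = 0
    pos = offset
    while True:
        if pos >= len(data):
            raise ValueError(
                f"unexpected end of data at offset {pos:#x} while reading ULEB128"
            )
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:  # continuation bit clear: last byte
            return value, pos
        shift += 7


def read_operands(data, offset, count):
    """Read `count` ULEB128 operands; stop at the first error.

    Returns (operands, error). On error no partial operand is added,
    since the length of a truncated operand is unknown and the
    following bytes cannot be interpreted reliably.
    """
    operands = []
    for _ in range(count):
        try:
            value, offset = read_uleb128(data, offset)
        except ValueError as err:
            return operands, str(err)
        operands.append(value)
    return operands, None


decoded = read_uleb128(bytes([0xE5, 0x8E, 0x26]), 0)
```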
Georgii Rymar ec4e68e667 [yaml2obj] - Introduce the "NoHeaders" key for "SectionHeaderTable"
We have an issue currently. The following YAML piece just ignores the `Excluded` key.

```
SectionHeaderTable:
  Sections: []
  Excluded:
    - Name: .foo
```

Currently the meaning is: exclude the whole table.

The code checks that the `Sections` key is empty and doesn't catch/check
invalid/duplicated/missed `Excluded` entries.

Also there is no way to exclude all sections except the first null section,
because `Sections: []` currently just excludes the whole section header table.

To fix it, I suggest a change of the behavior.

1) A new `NoHeaders` key is added. It provides an explicit syntax to drop the whole table.
2) The meaning of the following is changed:

```
SectionHeaderTable:
  Sections: []
  Excluded:
    - Name: .foo

```
Assuming there are 2 sections in the object (a null section and `.foo`), with this patch it
means: exclude the `.foo` section, keep the null section. The null section is an implicit
section and I think it is reasonable for "Sections: []" to mean that it is implicitly added.
This is consistent with the global "Sections" tag that is used to describe sections.

3) `SectionHeaderTable->Sections` is now optional. No `Sections` is the same as
   `Sections: []` (I think it avoids a confusion).
4) Using `NoHeaders` together with `Sections`/`Excluded` is not allowed.
5) It is possible to use the `Excluded` key without the `Sections` key now (in this case
   `Excluded` must contain all sections).
6) `SectionHeaderTable:` or `SectionHeaderTable: []` is not allowed.
7) When the `SectionHeaderTable` key is present, we still require all sections to be
   present in `Sections` and `Excluded` lists. No changes here, we are still strict.

Differential revision: https://reviews.llvm.org/D81655
2020-06-15 12:43:16 +03:00
Kazushi (Jam) Marukawa e026f147f7 [VE] Support relocation information in MC layer
Summary:
Change VEAsmParser to support identification with relocation information
in assembler.  Change VEAsmBackend to support relocation information in
MC layer.  Change VEDisassembler and VEMCCodeEmitter to support binary
generation of branch target operands.  Add REFLONG fixup and variant kind
to support new R_VE_REFLONG ELF symbol.  And, add regression test in both
MC and CodeGen to check binary generation with relocation information.

Differential Revision: https://reviews.llvm.org/D81553
2020-06-15 11:24:53 +02:00
Dominik Montada c87bf29149 [MachineVerifier][GlobalISel] Check that branches have a MBB operand or are declared indirect. Add missing properties to G_BRJT, G_BRINDIRECT
Summary:
Teach MachineVerifier to check branches for MBB operands if they are not declared indirect.

Add `isBarrier`, `isIndirectBranch` to `G_BRINDIRECT` and `G_BRJT`.
Without these, `MachineInstr.isConditionalBranch()` was giving a
false-positive for those instructions.

Reviewers: aemerson, qcolombet, dsanders, arsenm

Reviewed By: dsanders

Subscribers: hiraditya, wdng, simoncook, s.egerton, arsenm, rovka, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81587
2020-06-15 11:17:09 +02:00
Sam Parker 2596da3174 [CostModel] getCFInstrCost in getUserCost.
Have BasicTTI call the base implementation so that both agree on the
default behaviour, with the default being a cost of '1'. This has
required an X86 specific implementation as it seems to be very
reliant on those instructions being free. Changes are also made to
AMDGPU so that their implementations distinguish between cost kinds,
so that the unrolling isn't affected. PowerPC also has its own
implementation to prevent changes to the reg-usage vectorizer test.

The cost model test changes now reflect that ret instructions are not
generally free.

Differential Revision: https://reviews.llvm.org/D79164
2020-06-15 09:28:46 +01:00
Sam Parker 321ebfd175 [NFCI][CostModel] Unify FNeg cost
Enable TTIImpl::getUserCost to handle FNeg so that
getInstructionThroughput can call that instead. This means we can
remove the code in the AMDGPU backend too.

Differential Revision: https://reviews.llvm.org/D81635
2020-06-15 08:33:04 +01:00
Nikita Popov 7cac7e0cfc [IR] Prefer hasFnAttribute() where possible (NFC)
When checking for an enum function attribute, use hasFnAttribute()
rather than hasAttribute() at FunctionIndex, because it is
significantly faster (and more concise to boot).
2020-06-15 09:30:35 +02:00
Sam Parker 51541c068a [CostModel] Unify ExtractElement cost.
Move the cost modelling, with the reduction pattern matching, from
getInstructionThroughput into generic TTIImpl::getUserCost. The
modelling in the AMDGPU backend can now be removed.

Differential Revision: https://reviews.llvm.org/D81643
2020-06-15 08:27:14 +01:00
Max Kazantsev 60da4369a1 [NFC] Bail early simplifying unconditional branches 2020-06-15 13:59:53 +07:00
Sam Parker 3e39760f8e Revert "Return "[InstCombine] Simplify compare of Phi with constant inputs against a constant""
This reverts commit 23291b9863.

This caused performance regressions.
2020-06-15 07:46:28 +01:00
Vitaly Buka ca2dcbd030 [SafeStack,NFC] Make StackColoring read-only
Move code which removes markers out of StackColoring.
2020-06-14 23:05:43 -07:00
Vitaly Buka c6426e2657 [SafeStack,NFC] Remove unneeded branch 2020-06-14 23:05:43 -07:00
Vitaly Buka 7282da1ea8 [SafeStack,NFC] Fix naming style 2020-06-14 23:05:42 -07:00
Vitaly Buka 2f5e535a84 [SafeStack,NFC] Cleanup LiveRange interface 2020-06-14 23:05:42 -07:00
Vitaly Buka adefa9ca2e [SafeStack,NFC] "const" cleanup 2020-06-14 23:05:42 -07:00
Vitaly Buka fb1e0f324f [SafeStack,NFC] Add BlockLifetimeInfo constructor 2020-06-14 23:05:42 -07:00
Vitaly Buka 645058036a [SafeStack,NFC] Use IntrinsicInst instead of Instruction 2020-06-14 23:05:41 -07:00
Vitaly Buka f8e411656e [SafeStack,NFC] Move ClColoring into SafeStack.cpp
This allows reusing the code in other components.
2020-06-14 23:05:41 -07:00
Vitaly Buka 05590a9cb8 [SafeStack,NFC] Move unconditional code into constructor
Prepare to move ClColoring from SafeStackCode to SafeStackLayout.
This will allow reusing the code in other components.
2020-06-14 23:05:41 -07:00
Chen Zheng bd7096b977 [PowerPC] fma chain break to expose more ILP
This patch tries to reassociate two patterns related to FMA to expose
more ILP on PowerPC.

// Pattern 1:
//   A =  FADD X,  Y          (Leaf)
//   B =  FMA  A,  M21,  M22  (Prev)
//   C =  FMA  B,  M31,  M32  (Root)
// -->
//   A =  FMA  X,  M21,  M22
//   B =  FMA  Y,  M31,  M32
//   C =  FADD A,  B

// Pattern 2:
//   A =  FMA  X,  M11,  M12  (Leaf)
//   B =  FMA  A,  M21,  M22  (Prev)
//   C =  FMA  B,  M31,  M32  (Root)
// -->
//   A =  FMUL M11,  M12
//   B =  FMA  X,  M21,  M22
//   D =  FMA  A,  M31,  M32
//   C =  FADD B,  D

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D80175
2020-06-15 00:00:04 -04:00
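The two patterns above can be modelled in plain C++ to see why the rewritten forms expose more parallelism. This is only a sketch: the helper `fmaAcc` and the function names are made up for this example, and it assumes the operand order used in the patterns, where `FMA Acc, M1, M2` computes `Acc + M1 * M2`. The two forms may round differently in general, so the check uses small integers that are exact in double precision.

```cpp
#include <cassert>
#include <cmath>

// FMA Acc, M1, M2 as written in the patterns: Acc + M1 * M2 (assumed order).
static double fmaAcc(double Acc, double M1, double M2) {
  return std::fma(M1, M2, Acc);
}

// Pattern 1, original: every operation waits on the previous result.
double pattern1Chained(double X, double Y, double M21, double M22,
                       double M31, double M32) {
  double A = X + Y;
  double B = fmaAcc(A, M21, M22);
  return fmaAcc(B, M31, M32);
}

// Pattern 1, reassociated: the two FMAs are independent of each other.
double pattern1Split(double X, double Y, double M21, double M22,
                     double M31, double M32) {
  double A = fmaAcc(X, M21, M22);
  double B = fmaAcc(Y, M31, M32);
  return A + B;
}

// Pattern 2, original: a chain of three dependent FMAs.
double pattern2Chained(double X, double M11, double M12, double M21,
                       double M22, double M31, double M32) {
  double A = fmaAcc(X, M11, M12);
  double B = fmaAcc(A, M21, M22);
  return fmaAcc(B, M31, M32);
}

// Pattern 2, reassociated: the FMUL and the first FMA can start together.
double pattern2Split(double X, double M11, double M12, double M21,
                     double M22, double M31, double M32) {
  double A = M11 * M12;
  double B = fmaAcc(X, M21, M22);
  double D = fmaAcc(A, M31, M32);
  return B + D;
}

// Small-integer inputs are exact, so both forms must agree here.
void checkFmaPatterns() {
  assert(pattern1Chained(1, 2, 3, 4, 5, 6) == 45.0);
  assert(pattern1Split(1, 2, 3, 4, 5, 6) == 45.0);
  assert(pattern2Chained(1, 2, 3, 4, 5, 6, 7) == 69.0);
  assert(pattern2Split(1, 2, 3, 4, 5, 6, 7) == 69.0);
}
```

In the chained forms the critical path is three dependent operations; in the split forms it is two, at the cost of one extra instruction in pattern 2.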
Kang Zhang 74abe50071 [PowerPC] Add some InstAlias for mtspr/mfspr instructions
Summary:

We have defined MTSPR/MFSPR and MTSPR8/MFSPR8, but we only defined
mtspr/mfspr InstAlias for some MTSPR/MFSPR.
This patch is to add the InstAlias definitions for MTSPR8/MFSPR8,
and add some new mtspr/mfspr InstAlias we may use.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D77531
2020-06-15 02:43:13 +00:00
Chen Zheng 163162a0a4 [PowerPC] fix a bug for rlwinm folding when with full mask.
Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D81006
2020-06-14 21:27:01 -04:00
Simon Pilgrim 3d8149c2a1 [X86][SSE] Fold BITOP(MOVMSK(X),MOVMSK(Y)) -> MOVMSK(BITOP(X,Y))
Reduce XMM->GPR traffic by performing bitops on the vectors, and using a single MOVMSK call.

This requires us to use vectors of the same size and element width, but we can mix fp/int type equivalents with suitable bitcasting.
2020-06-14 21:37:58 +01:00
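A scalar model shows why the fold is sound: MOVMSK only looks at the sign bit of each lane, and a bitwise op on the lanes acts on the sign bits independently of the other bits. The sketch below models a 4 x 32-bit vector with `std::array`; the type `V4` and the helper names are assumptions for this example and are not X86 intrinsics.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using V4 = std::array<uint32_t, 4>;

// Scalar model of MOVMSK on four 32-bit lanes: collect each lane's sign bit.
unsigned moveMask(const V4 &V) {
  unsigned Mask = 0;
  for (unsigned I = 0; I < 4; ++I)
    Mask |= (V[I] >> 31) << I;
  return Mask;
}

// Lane-wise AND and OR, standing in for the vector bitops.
V4 laneAnd(const V4 &X, const V4 &Y) {
  V4 R{};
  for (unsigned I = 0; I < 4; ++I)
    R[I] = X[I] & Y[I];
  return R;
}

V4 laneOr(const V4 &X, const V4 &Y) {
  V4 R{};
  for (unsigned I = 0; I < 4; ++I)
    R[I] = X[I] | Y[I];
  return R;
}

// Two MOVMSKs plus a scalar bitop give the same mask as one vector bitop
// followed by a single MOVMSK.
void checkMoveMaskFold() {
  const V4 X = {0x80000000u, 0x7fffffffu, 0xffffffffu, 0x00000000u};
  const V4 Y = {0x80000001u, 0x80000000u, 0x00000000u, 0x80000000u};
  assert((moveMask(X) & moveMask(Y)) == moveMask(laneAnd(X, Y)));
  assert((moveMask(X) | moveMask(Y)) == moveMask(laneOr(X, Y)));
}
```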
Nikita Popov 5184857c62 [IR] Remove unused IndexAttrPair typedef (NFC)
This was part of an older attributes implementation.
2020-06-14 22:27:17 +02:00
Florian Hahn 6176f04436 [LAA] Do not set CanDoRT to false for AS that do not need RT checks.
Alternative approach to D80570.

canCheckPtrAtRT already contains checks to figure out for which alias
sets runtime checks are needed. But it currently sets CanDoRT to false
for alias sets for which we cannot do RT checks but also do not need
any.

If we know that we do not need RT checks based on the number of
reads/writes in the alias set, we can skip processing the AS.

This patch also adds an assertion to ensure that DepCands does not
contain more than one write from the alias set.

Reviewers: Ayal, anemet, hfinkel, dmgreen

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D80622
2020-06-14 20:55:59 +01:00
Whitney Tsang 5225cd43e8 [LoopUnroll] Allow loops with multiple exiting blocks where loop latch
is not necessarily one of them.

Summary: Currently LoopUnrollPass already allows loops with multiple
exiting blocks, but only when the loop latch is one of the
exiting blocks.
When the loop latch is not an exiting block, then only a single exiting
block is supported.
When possible, the single loop latch or the single exiting block
terminator is optimized to an unconditional branch in the unrolled loop.

This patch allows loops with multiple exiting blocks even if the loop
latch is not one of them. However, the optimization of exiting block
terminator to unconditional branch is not done when there exists more
than one exiting block.
Reviewer: dmgreen, Meinersbur, etiotto, fhahn, efriedma, bmahjour
Reviewed By: efriedma
Subscribers: hiraditya, zzheng, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D81053
2020-06-14 18:44:18 +00:00
Matt Arsenault 804397dde6 AMDGPU: Do not bundle inline asm
Fixes bug 46285
2020-06-14 13:24:50 -04:00
Matt Arsenault fb51d508ee AMDGPU/GlobalISel: Select general case for G_PTRMASK 2020-06-14 13:12:29 -04:00
Matt Arsenault 46579471fd AMDGPU: Fix spill/restore of 192-bit registers
I tried to use an IR inline asm test, but that doesn't work since the
inline asm handling asserts without an MVT to use.
2020-06-14 13:12:01 -04:00
Qiu Chaofan 13edcd696e [PowerPC] Support constrained rounding operations
This patch adds handling of constrained FP intrinsics about round,
truncate and extend for PowerPC target, with necessary tests.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D64193
2020-06-14 23:43:31 +08:00
Qiu Chaofan 7315d221a2 [PowerPC] Exploit vnmsubfp instruction
On PowerPC, we have the vnmsubfp Altivec instruction for the fnmsub operation on
the v4f32 type. The default pattern for this instruction never works since we
don't have a legal fneg for v4f32 when VSX is disabled.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D80617
2020-06-14 23:19:17 +08:00
Qiu Chaofan f8ef7c99a0 [DAGCombiner] Require ninf for division estimation
Current implementation of division estimation isn't correct for some
cases like 1.0/0.0 (result is nan, not expected inf).

And this change exposes a potential infinite loop: we use
isConstOrConstSplatFP in combineRepeatedFPDivisors to look up if the
divisor is some constant. But it doesn't work after legalization on some
platforms. This patch restricts the method to act before LegalDAG.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D80542
2020-06-14 22:58:22 +08:00
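The 1.0/0.0 case can be reproduced with a single Newton-Raphson refinement step. This is a sketch under two assumptions: that the estimate is refined with the usual step E' = E * (2 - D * E), and that the initial hardware estimate for a zero divisor is +inf. The exact sequence a given target emits may differ, and `refineReciprocal` is a name made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// One Newton-Raphson refinement step for the reciprocal 1/D, starting from
// the estimate E: E' = E * (2 - D * E).
double refineReciprocal(double D, double E) { return E * (2.0 - D * E); }

void checkReciprocalRefinement() {
  const double Inf = std::numeric_limits<double>::infinity();
  // A good estimate is a fixed point: 0.25 * (2 - 4 * 0.25) == 0.25.
  assert(refineReciprocal(4.0, 0.25) == 0.25);
  // With D == 0 and an estimate of +inf, D * E is 0 * inf == NaN, and the
  // NaN propagates, so the refined result is NaN instead of the expected inf.
  assert(std::isnan(refineReciprocal(0.0, Inf)));
}
```

This is why the estimate is only a valid replacement when infinities are known not to occur, which is what the ninf flag expresses.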
Sanjay Patel 098e48a6a1 [PassManager] restore early-cse to vector cleanup
As noted in D80236 - the early-cse pass was included here before:
D75145 / rG71a316883d50
But it got moved outside of the "extra" option there, then it
got dropped while adjusting -vector-combine:
rG6438ea45e053
rG57bb4787d72f

So this is restoring the behavior and adding a test to prevent
accidental changes again. I don't see an equivalent option for
the new pass manager.
2020-06-14 10:04:53 -04:00
Nikita Popov 862db369f8 [LVI] Fix class indentation (NFC)
This class uses a mix of different indentation levels, normalize it.
2020-06-14 15:42:27 +02:00
Nikita Popov 83e7230e5a [LVI] Cache lookup of experimental.guard intrinsic (NFC)
When LVI is performing assume intersections, it also checks for
llvm.experimental.guard intrinsics. To avoid unnecessary block
scans, it first checks whether this intrinsic is declared in the
module at all. I've noticed that we end up spending quite a lot
of time looking up that function again and again...

Avoid this by only looking it up once when LazyValueInfo is
constructed. This of course assumes that we don't introduce new
guard intrinsics (which is the case for all existing uses of LVI --
and even if it weren't, it would not introduce miscompiles, just
potentially lose optimization power.)

Differential Revision: https://reviews.llvm.org/D81796
2020-06-14 15:32:30 +02:00
Sanjay Patel b5fb26951a [InstCombine] reassociate FP diff of sums into sum of diffs
(a[0] + a[1] + a[2] + a[3]) - (b[0] + b[1] + b[2] + b[3]) -->
(a[0] - b[0]) + (a[1] - b[1]) + (a[2] - b[2]) + (a[3] - b[3])

This should be the last step in solving PR43953:
https://bugs.llvm.org/show_bug.cgi?id=43953

We started emitting reduction intrinsics with:
D80867/ rGe50059f6b6b3
So it's a relatively easy pattern match now to re-order those ops.
Also, I have not seen any complaints for the switch to intrinsics
yet, so I'll propose to remove the "experimental" tag from the
intrinsics soon.

Differential Revision: https://reviews.llvm.org/D81491
2020-06-14 09:09:03 -04:00
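Written out in scalar C++, the two shapes of the expression look as follows. This is only an illustration with made-up function names. The two forms can round differently for general inputs, so a reassociation of this kind is normally only legal under relaxed floating-point semantics; the check below therefore uses values where every intermediate result is exact.

```cpp
#include <cassert>

// Original shape: reduce each array first, then subtract the two sums.
float diffOfSums(const float (&A)[4], const float (&B)[4]) {
  return (A[0] + A[1] + A[2] + A[3]) - (B[0] + B[1] + B[2] + B[3]);
}

// Reassociated shape: subtract element-wise, then do a single reduction.
float sumOfDiffs(const float (&A)[4], const float (&B)[4]) {
  return (A[0] - B[0]) + (A[1] - B[1]) + (A[2] - B[2]) + (A[3] - B[3]);
}

// All intermediates are exactly representable, so both shapes agree.
void checkDiffOfSums() {
  const float A[4] = {1.0f, 2.0f, 3.0f, 4.0f};
  const float B[4] = {0.5f, 0.5f, 0.5f, 0.5f};
  assert(diffOfSums(A, B) == 8.0f);
  assert(sumOfDiffs(A, B) == 8.0f);
}
```

The second shape needs one vector subtract and one reduction instead of two reductions and a scalar subtract.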
Sanjay Patel aeb5044801 [InstCombine] allow undef elements when comparing vector constants for min/max bailout
This is a hacky, but low-risk fix to avoid the infinite loop in PR46271:
https://bugs.llvm.org/show_bug.cgi?id=46271

As discussed there, the problem is that FoldOpIntoSelect() can get into a conflict
with a transform that wants to pull a 'not' op through min/max via
SimplifyDemandedVectorElts(). We need to relax our matching of min/max to include
undefined elements in vector constants to avoid that. Alternatively, we could
improve or cripple the demanded elements analysis, but that could create even
more problems.

The likely better, safer alternative will be to create min/max intrinsics, so
we can remove all of the hacks related to min/max matching in instcombine.

Differential Revision: https://reviews.llvm.org/D81698
2020-06-14 09:02:47 -04:00
Simon Pilgrim e0cff30c17 [X86][SSE] LowerVectorAllZeroTest - add support for pre-SSE41 targets
Even without PTEST, we can still efficiently perform an OR reduction as PMOVMSKB(PCMPEQB(X,0)) == 0, avoiding xmm->gpr extractions.
2020-06-14 13:41:56 +01:00
Xing GUO ff9c1ae213 [ObjectYAML][DWARF] Let the target address size be inferred from FileHeader.
This patch adds a new field `bool Is64bit` in `DWARFYAML::Data` to indicate the address size of target. It's helpful for inferring the `AddrSize` in some DWARF sections.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D81709
2020-06-14 12:42:20 +08:00
Craig Topper bfd12c76eb [X86] Add mayLoad flag to FARCALL*m/FARJMP memory instructions. Add 'm' to the end of FARJMP64/FARCALL64 instruction names.
We never codegen them so this doesn't matter in practice. But
sometimes someone comes along and tries to use these flags
for something else. Like the Load Value Injection inline assembly
handling.
2020-06-13 15:40:51 -07:00
Craig Topper 0cbe713c69 [X86] Automatically harden inline assembly RET instructions against Load Value Injection (LVI)
Previously, the X86AsmParser would issue a warning whenever a ret instruction is encountered. This patch changes the behavior to automatically transform each ret instruction in an inline assembly stream into:

shlq $0, (%rsp)
lfence
ret

which is secure, according to https://software.intel.com/security-software-guidance/insights/deep-dive-load-value-injection#specialinstructions.

Patch by Scott Constable with some minor changes by Craig Topper.
2020-06-13 15:16:05 -07:00
Craig Topper cb5072d187 [X86] Teach combineBitcastvxi1 to prefer movmsk on avx512 in more cases
If the input to the bitcast is a sign bit test, it makes sense to
directly use vpmovmskb or vmovmskps/pd. This removes the need to
copy the sign bits to a k-register and then to a GPR.

Fixes PR46200.

Differential Revision: https://reviews.llvm.org/D81327
2020-06-13 14:50:13 -07:00
Craig Topper 6b4b660174 [X86] Move -x86-use-vzeroupper command line flag into runOnMachineFunction for the pass itself rather than the pass pipeline construction
This pass has no dependencies on other passes so conditionally
including it in the pipeline doesn't do much. Just move it into the
pass itself to keep it isolated.
2020-06-13 14:42:41 -07:00
Roman Lebedev e987ee6318 [NFCI][AggressiveInstCombiner] Add `STATISTIC()`s for transforms 2020-06-13 23:53:16 +03:00
Florian Hahn 97e7147e34 [DSE,MSSA] Fix location order in isOverwrite call.
isOverwrite expects the later location as first argument and the earlier
result later. The adjusted call is intended to check whether CC
overwrites DefLoc.
2020-06-13 20:39:00 +01:00
Craig Topper 93264a2e4f [X86] Enable the EVEX->VEX compression pass at -O0.
A lot of what EVEX->VEX does is equivalent to what the
prioritization in the assembly parser does. When an AVX mnemonic
is used without any EVEX features or XMM16-31, the parser will
pick the VEX encoding.

Since codegen doesn't go through the parser, we should also
use VEX instructions when we can so that the code coming out of
integrated assembler matches what you'd get from outputting an
assembly listing and parsing it.

The pass early outs if AVX isn't enabled and uses TSFlags to
check for EVEX instructions before doing the more costly table
lookups. Hopefully that's enough to keep this from impacting
-O0 compile times.
2020-06-13 12:29:04 -07:00
Craig Topper 8885a7640b [X86] Separate imm from relocImm handling.
relocImm was a complexPattern that handled both ConstantSDNode
and X86Wrapper. But it was only applied selectively because using
it would cause patterns to be not importable into FastISel or
GlobalISel. So it only got applied to flag setting instructions,
stores, RMW arithmetic instructions, and rotates.

Most of the test changes are a result of making patterns available
to GlobalISel or FastISel. The absolute-cmp.ll change is due to
this fixing a pattern ordering issue to make an absolute symbol
match to an 8-bit immediate before trying a 32-bit immediate.

I tried to use PatFrags to reduce the repetition, but I was getting
errors from TableGen.
2020-06-13 11:29:28 -07:00
Amanieu d'Antras 6973125cb7 Fix FastISel dropping srcloc metadata from InlineAsm
Summary:
Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=46060

I've also added the Extra_IsConvergent flag which was missing from FastISel.

Reviewers: echristo

Reviewed By: echristo

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80759
2020-06-13 16:52:37 +01:00
Xing GUO 0431e4bcb2 Recommit "[DWARFYAML][debug_line] Replace `InitialLength` with `Format` and `Length`."
This recommits fcc0c186e9
2020-06-13 23:39:11 +08:00
Xing GUO 325f7607b0 Revert "[DWARFYAML][debug_line] Replace `InitialLength` with `Format` and `Length`."
This reverts commit fcc0c186e9.
2020-06-13 17:57:02 +08:00
Xing GUO fcc0c186e9 [DWARFYAML][debug_line] Replace `InitialLength` with `Format` and `Length`. 2020-06-13 17:47:06 +08:00
Nikita Popov f87b785abe Reapply [LVI] Restructure caching to fix non-determinism
This was reverted due to a reported memory usage increase. However,
a test case was never provided, and I wasn't able to reproduce it
myself.

Relative to the original patch, I have moved the block cache
structure behind a unique_ptr, to avoid storing a huge structure
inside a DenseMap.

---

Variant on D70103 to fix https://bugs.llvm.org/show_bug.cgi?id=43909.
The caching is switched to always use a BB to cache entry map, which
then contains per-value caches. A separate set contains value handles
with a deletion callback. This allows us to properly invalidate
overdefined values.

A possible alternative would be to always cache by value first and
have per-BB maps/sets in the each cache entry. In that case we could
use a ValueMap and would avoid the separate value handle set. I went
with the BB indexing at the top level to make it easier to integrate
D69914, but possibly that's not the right choice.

Differential Revision: https://reviews.llvm.org/D70376
2020-06-13 11:31:40 +02:00
Craig Topper 2831f7852f [X86] Remove brand_id check from getHostCPUName.
Brand index was a feature of some Pentium III and Pentium 4 CPUs.
It provided an index into a software lookup table to provide a
brand name for the CPU. This is separate from the family/model.

It's unclear to me why this index being non-zero was used to
block checking family/model. I think the effect of this is that
-march=native was not working correctly on the CPUs that have a
non-zero brand index. They are all about 20 years old so this
probably hasn't affected many users.
2020-06-12 20:38:30 -07:00
Mehdi Amini 339e49e2ca Fix GCC5 build by renaming variable used in 'auto' deduction (NFC)
GCC5 errors out with:

llvm/lib/Analysis/StackSafetyAnalysis.cpp:935:21: error: use of 'KV' before deduction of 'auto'
     for (auto &KV : KV.second.Params) {
                     ^
2020-06-13 03:08:56 +00:00
Craig Topper a27d0dcf65 [X86] Combine the three feature variables in getHostCPUName into an array and pass it around as an array reference.
This makes the setting and clearing of bits simpler.
2020-06-12 18:30:41 -07:00
Vitaly Buka c1e47b47f8 [StackSafety] Run ThinLTO
Summary:
ThinLTO linking runs dataflow processing on collected
function parameters. Then StackSafetyGlobalInfoWrapperPass
in the ThinLTO backend will run as usual, looking up external
symbols in the summary if needed.

Depends on D80985.

Reviewers: eugenis, pcc

Reviewed By: eugenis

Subscribers: inglorion, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D81242
2020-06-12 18:11:29 -07:00
Vitaly Buka e6ce0dc5de [StackSafety,NFC] Extract addOverflowNever 2020-06-12 17:42:32 -07:00
Eric Christopher b422fe7d62 Temporarily revert "[MemCpyOptimizer] Simplify API of processStore and processMem* functions"
as it seems to be causing some internal crashes in AA after
email with the author.

This reverts commit f79e6a8847.
2020-06-12 14:01:27 -07:00
Roman Lebedev 17f7654152 [NFCI][MachineCopyPropagation] invalidateRegister(): use SmallSet<8> instead of DenseSet.
This decreases the time consumed by the pass [during RawSpeed unity build]
by 25% (0.0586 s -> 0.04388 s).

While that isn't really impressive overall, that wasn't the goal here.
The memory results here are noticeable.
The baseline results are:
```
total runtime: 55.65s.
calls to allocation functions: 19754254 (354960/s)
temporary memory allocations: 4951609 (88974/s)
peak heap memory consumption: 239.13MB
peak RSS (including heaptrack overhead): 463.79MB
total memory leaked: 198.01MB
```
While with this patch the results are:
```
total runtime: 55.37s.
calls to allocation functions: 19068237 (344403/s)   # -3.47 %
temporary memory allocations: 4261772 (76974/s)      # -13.93 % (!!!)
peak heap memory consumption: 239.13MB
peak RSS (including heaptrack overhead): 463.73MB
total memory leaked: 198.01MB
```

So we get rid of *a lot* of temporary allocations.

Using `SmallSet<8>` makes sense to me because at least here
for x86 BdVer2, the size of that set is *never* more than 3,
over all of llvm test-suite + RawSpeed.

The story might be different on other targets,
not sure if it will ever justify whole DenseSet,
but if it does SmallDenseSet might be a compromise.
2020-06-12 23:10:54 +03:00
Roman Lebedev 7aeb41b3c8 [NFCI] VectorCombine: add statistic for bitcast(shuf()) -> shuf(bitcast()) xform 2020-06-12 23:10:53 +03:00
Roman Lebedev 55eb714a0e [NFC] OpenMPOpt: add a statistic for num of parallel regions deleted 2020-06-12 23:10:53 +03:00
Ronak Chauhan 480a16d5c8 [MC] Changes to help improve target specific symbol disassembly
Summary:
This commit slightly modifies the MCDisassembler, and llvm-objdump to
allow targets to also decode entire symbols.

WebAssembly uses the onSymbolStart hook to decode preludes.
WebAssembly partially disassembles the symbol in its target specific
way; and then falls back to the normal flow of llvm-objdump.

AMDGPU needs it to decode kernel descriptors entirely, and move to the
next symbol.

This commit is to split the above task into 2.
- Changes to llvm-objdump and MC-layer without breaking WebAssembly code
  [ this commit ]
- AMDGPU's implementation of onSymbolStart that decodes kernel
  descriptors. [ https://reviews.llvm.org/D80713 ]

Reviewers: scott.linder, t-tye, sunfish, arsenm, jhenderson, MaskRay, aardappel

Reviewed By: scott.linder, jhenderson, aardappel

Subscribers: bcain, dschuff, wdng, tpr, sbc100, jgravelle-google, hiraditya, aheejin, MaskRay, rupprecht, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80512
2020-06-12 15:51:37 -04:00
David Blaikie 5146fc15fc llvm-dwarfdump: Include unit count in DWP index header dumping
And add comma separators (to be consistent with recent
changes/improvements to the dumping of other section headers) while I'm
here.
2020-06-12 12:40:02 -07:00
Michael Liao ec02635d10 [amdgpu] Skip OR combining on 64-bit integer before legalizing ops.
Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81710
2020-06-12 15:22:38 -04:00
Amara Emerson 1cbebd95de [AArch64][GlobalISel] Legalize vector G_PTR_ADD and enable selection.
Differential Revision: https://reviews.llvm.org/D81419
2020-06-12 11:25:17 -07:00
David Green 46529978bf [ARM] Always use reduction intrinsics under MVE
Similar to a recent change to the X86 backend, this changes things so
that we always produce reduction intrinsics for all reduction types,
not just the legal ones. This gives a better chance in the backend to
custom lower them to something more suitable for MVE. Especially for
something like fadd the in-order reduction produced during DAG lowering
is already better than the shuffles produced in the midend, and we can
do even better with a bit of custom lowering.

Differential Revision: https://reviews.llvm.org/D81398
2020-06-12 19:21:17 +01:00
Daniel Grumberg 4bf1124eda [TableGen] Make behavior of getValueAsListOfStrings consistent with getValueAsString 2020-06-12 19:16:48 +01:00
Michael Liao e7b920e6fe [DAGCombine] Generalize the case (add (or x, c1), c2) -> (add x, (c1 + c2))
Reviewers: arsenm

Subscribers: sdardis, wdng, hiraditya, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, ecnelises, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81708
2020-06-12 13:53:08 -04:00
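The commit has no description, so the following is only a sketch of the arithmetic behind a fold of this shape. It assumes the precondition that `x` and `c1` share no set bits, in which case `or` behaves like `add` and the two constants can be merged. The function names are hypothetical and the exact conditions checked by the patch are not shown here.

```cpp
#include <cassert>
#include <cstdint>

// Original shape: (add (or x, c1), c2).
uint32_t addOfOr(uint32_t X, uint32_t C1, uint32_t C2) {
  // Assumed precondition: with no common bits, X | C1 == X + C1.
  assert((X & C1) == 0 && "sketch assumes X and C1 share no set bits");
  return (X | C1) + C2;
}

// Folded shape: (add x, (c1 + c2)), where c1 + c2 is a single constant.
uint32_t foldedAdd(uint32_t X, uint32_t C1, uint32_t C2) {
  return X + (C1 + C2);
}
```

For example, with `X = 0xf0`, `C1 = 0x0f` and `C2 = 0x11`, both shapes compute `0x110`.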
Jessica Paquette d3a56f062b [AArch64][GlobalISel] Allow G_DUP for elements smaller than 32 B.
We select all of these via patterns now, so there's no reason to disallow this.

Update select-dup.mir to show that we correctly select the smaller types.

Differential Revision: https://reviews.llvm.org/D81322
2020-06-12 09:40:34 -07:00
Jessica Paquette 305862a5a6 [AArch64][GlobalISel] Set hasSideEffects = 0 on custom shuffle opcodes
This was making it so that the instructions weren't eliminated in
select-rev.mir and select-trn.mir despite not being used.

Update the tests accordingly.

Differential Revision: https://reviews.llvm.org/D81492
2020-06-12 09:39:46 -07:00
Huihui Zhang bf7961fade [NFC] Silence compiler warning [-Wmissing-braces].
llvm/lib/Target/AArch64/AArch64SLSHardening.cpp:146:5: warning: suggest braces around initialization of subobject [-Wmissing-braces]
    "__llvm_slsblr_thunk_x0",  "__llvm_slsblr_thunk_x1",
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    {
llvm/lib/Target/AArch64/AArch64SLSHardening.cpp:168:5: warning: suggest braces around initialization of subobject [-Wmissing-braces]
    AArch64::X0,  AArch64::X1,  AArch64::X2,  AArch64::X3,  AArch64::X4,
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    {
2020-06-12 08:55:03 -07:00
Matt Arsenault 350ee7fb3f GlobalISel: Fix not erasing old instruction in sitofp/uitofp lowering 2020-06-12 10:33:23 -04:00
Masoud Ataei 2d038370bb DAGCombiner optimization for pow(x,0.75) and pow(x,0.25) on double and single precision even in case massv function is asked
Here, I am proposing to add a special case for the MASSV powf4/powd2 functions (the SIMD counterparts of the powf/pow functions in the MASSV library) in the MASSV pass, so that later optimizations in the DAGCombiner, such as the conversion of pow(x,0.75) and pow(x,0.25) for double and single precision to a sequence of sqrt's, still apply in the vector float case. My reason for doing this is that the optimized sequence of sqrt's for pow(x,0.75) and pow(x,0.25) is faster than powf4/powd2 on P8 and P9.

In case MASSV functions are called, if the exponent of pow is 0.75 or 0.25 we will get the sequence of sqrt's; otherwise we will get the appropriate MASSV function.

Reviewed By: steven.zhang

Tags: #LLVM #PowerPC

Differential Revision: https://reviews.llvm.org/D80744
2020-06-12 10:02:16 -04:00
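The sqrt sequences referred to above follow from x^0.25 = sqrt(sqrt(x)) and x^0.75 = sqrt(x) * sqrt(sqrt(x)). A scalar sketch with made-up function names, restricted to non-negative finite inputs (edge cases such as -0.0 and infinities can differ from pow, which is presumably why the fold is gated on relaxed FP flags):

```cpp
#include <cassert>
#include <cmath>

// x^0.25 as two square roots.
double pow025ViaSqrt(double X) {
  assert(X >= 0.0 && "sketch only covers non-negative inputs");
  return std::sqrt(std::sqrt(X));
}

// x^0.75 as sqrt(x) * sqrt(sqrt(x)); the inner sqrt(x) is shared.
double pow075ViaSqrt(double X) {
  assert(X >= 0.0 && "sketch only covers non-negative inputs");
  double Root = std::sqrt(X);
  return Root * std::sqrt(Root);
}
```

For `X = 16.0` these return `2.0` and `8.0`; for other inputs the result can differ from `std::pow` in the last few bits.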
Simon Pilgrim 5509e2cc2e [DAG] foldAddSubOfSignBit - add support for non-uniform vector constants 2020-06-12 14:58:15 +01:00
Marco Elver 8af7fa07aa [ASan][NFC] Refactor redzone size calculation
Refactor redzone size calculation. This will simplify changing the
redzone size calculation in future.

Note that AddressSanitizer.cpp violates the latest LLVM style guide in
various ways due to capitalized function names. Only code related to the
change here was changed to adhere to the style guide.

No functional change intended.

Reviewed By: andreyknvl

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81367
2020-06-12 15:33:00 +02:00
Xing GUO 613c4a87ba [ObjectYAML][DWARF] Add one helper function `writeInitialLength()`. NFC. 2020-06-12 21:10:58 +08:00
Simon Pilgrim 8d30945ab9 [X86][SSE] combineX86ShuffleChain - combine INSERT_VECTOR_ELT patterns to INSERTPS
Noticed while trying to cleanup D66004 - if a shuffle operand came from a scalar, we're better off using INSERTPS vs UNPCKLPS as this is more likely to load fold later on. It also matches our existing BUILD_VECTOR lowering.

We can extend this to other PINSRB/D/Q/W cases in the future as the need arises.
2020-06-12 11:59:01 +01:00
Florian Hahn 4495a6b141 [BreakCritEdges] Add option to opt-out of preserving loop-simplify.
This patch adds a new option to CriticalEdgeSplittingOptions to control
whether loop-simplify form must be preserved. It is then used by GVN to
indicate that loop-simplify form does not have to be preserved.

This fixes a crash exposed by 189efe295b.

If the critical edge we are splitting goes from a block inside a loop to
a block outside the loop, splitting the edge will create a new exit
block. As a result, the new block will branch to the original exit
block, which will add a non-loop predecessor, breaking loop-simplify
form. To preserve loop-simplify form, the predecessor blocks of the
original exit are split, but that does not work for blocks with
indirectbr terminators. If preserving loop-simplify form is requested,
bail out before making any changes.

Reviewers: reames, hfinkel, davide, efriedma

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D81582
2020-06-12 11:47:13 +01:00
Florian Hahn 3a846d4d92 [VPlan] Reject loops without computable backedge taken counts
getOrCreateTripCount is used to generate code for the outer loop, but it
requires a computable backedge taken count. Check that in the VPlan
native path.

Reviewers: Ayal, gilr, rengolin, sguggill

Reviewed By: sguggill

Differential Revision: https://reviews.llvm.org/D81088
2020-06-12 10:31:18 +01:00
Sebastian Neubauer 29a6ad94fd [AMDGPU] Add G16 support to image instructions
Add G16 feature for GFX10 and support A16 and G16 in GlobalISel.

Differential Revision: https://reviews.llvm.org/D76836
2020-06-12 11:26:31 +02:00
Georgii Rymar d95f8e7aef [yaml2obj][MachO] - Fix PubName/PubType handling.
`PubName` and `PubType` are optional fields since D80722.

They are defined as:
  Optional<PubSection> PubNames;
  Optional<PubSection> PubTypes;

And initialized in the following way:
  IO.mapOptional("debug_pubnames", DWARF.PubNames);
  IO.mapOptional("debug_pubtypes", DWARF.PubTypes);

But the problem is that because of the issue in `YAMLTraits.cpp`,
when there are no `debug_pubnames`/`debug_pubtypes` keys in a YAML description,
they are not initialized to `Optional::None` as the code expects, but they
are initialized to default `PubSection()` instances.

Because of this, the `if` condition in the following code is always true:

if (Obj.DWARF.PubNames)
  Err = DWARFYAML::emitPubSection(OS, *Obj.DWARF.PubNames,
                                  Obj.IsLittleEndian);

This means `emitPubSection` is always called and it writes a few values.

This patch fixes the issue. I've reduced `sizeofcmds` by size of data
previously written because of this bug.
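
The pitfall described above can be reproduced in isolation. This is a minimal sketch, assuming C++17 `std::optional` in place of `llvm::Optional`; `PubSection` here is a hypothetical stand-in with illustrative fields, and `shouldEmit`, `absentKey` and `buggyAbsentKey` are made-up names, not yaml2obj functions:

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for DWARFYAML::PubSection; the fields are illustrative.
struct PubSection {
  uint32_t Length = 0;
  uint16_t Version = 0;
};

// Mirrors the shape of `if (Obj.DWARF.PubNames)`: true whenever the optional
// holds a value, even a default-constructed one.
inline bool shouldEmit(const std::optional<PubSection> &Section) {
  return Section.has_value();
}

// What the emitting code expects for a missing YAML key: an empty optional.
inline std::optional<PubSection> absentKey() { return std::nullopt; }

// What the bug effectively produced: an engaged, default-constructed value.
inline std::optional<PubSection> buggyAbsentKey() { return PubSection(); }
```

With the empty optional the emit branch is skipped; with the default-constructed value it is taken, which is how the extra bytes ended up in the output.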

Differential revision: https://reviews.llvm.org/D81686
2020-06-12 12:03:51 +03:00
Chen Zheng 9b6e86a1a5 [PowerPC] refactor convertToImmediateForm - NFC
This is an NFC patch to make convertToImmediateForm a light wrapper
for converting xform and imm form instructions on PowerPC.

Reviewed By: Steven.zhang

Differential Revision: https://reviews.llvm.org/D80907
2020-06-12 03:57:54 -04:00
EgorBo 012909dcaf [InstCombine] "X - (X / C) * C == 0" to "X & C-1 == 0"
Summary:
"X % C == 0" is optimized to "X & C-1 == 0" (where C is a power-of-two)
However, "X % Y" can also be represented as "X - (X / Y) * Y" so if I rewrite the initial expression:
"X - (X / C) * C == 0" it's not currently optimized to "X & C-1 == 0", see godbolt: https://godbolt.org/z/KzuXUj

This is my first contribution to LLVM so I hope I didn't mess things up
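
The equivalence behind this fold can be checked outside the compiler. The sketch below assumes unsigned 32-bit values and a power-of-two C; the helper names are hypothetical and are not InstCombine APIs:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if C is a non-zero power of two.
constexpr bool isPowerOfTwo(uint32_t C) { return C != 0 && (C & (C - 1)) == 0; }

// The long-hand remainder test from the summary: X - (X / C) * C == 0.
constexpr bool remIsZeroLongForm(uint32_t X, uint32_t C) {
  return X - (X / C) * C == 0;
}

// The masked form the fold produces: X & (C - 1) == 0.
constexpr bool remIsZeroMasked(uint32_t X, uint32_t C) {
  return (X & (C - 1)) == 0;
}

// Compares both forms for every X in [0, Limit).
inline bool remainderFormsAgree(uint32_t C, uint32_t Limit) {
  assert(isPowerOfTwo(C) && "the fold only applies to power-of-two divisors");
  for (uint32_t X = 0; X < Limit; ++X)
    if (remIsZeroLongForm(X, C) != remIsZeroMasked(X, C))
      return false;
  return true;
}
```

Note the explicit parentheses around `C - 1`: in C++ `==` binds tighter than `&`, so the shorthand in the title has to be parenthesized when written as real code.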

Reviewers: lebedev.ri, spatel

Reviewed By: lebedev.ri

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79369
2020-06-12 10:20:06 +03:00
Jonas Devlieghere 425c6f079b [llvm/Object] Reimplement basic_symbol_iterator in TapiFile
Use indices into the Symbols vector instead of casting the objects in
the vector and dereferencing std::vector::end().
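
The index-based approach can be illustrated with a small stand-alone container. This is only a sketch: `SymbolList` and `collectNames` are hypothetical and do not reproduce the TapiFile class or its iterator types:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of a symbol container; not the real TapiFile.
class SymbolList {
public:
  explicit SymbolList(std::vector<std::string> Names)
      : Symbols(std::move(Names)) {}

  // Iterator state is a plain index, so the end position is just size()
  // and is never dereferenced.
  std::size_t beginIndex() const { return 0; }
  std::size_t endIndex() const { return Symbols.size(); }

  const std::string &nameAt(std::size_t Index) const {
    assert(Index < Symbols.size() && "index out of range");
    return Symbols[Index];
  }

private:
  std::vector<std::string> Symbols;
};

// Walks the list by index and copies out every name.
inline std::vector<std::string> collectNames(const SymbolList &List) {
  std::vector<std::string> Out;
  for (std::size_t I = List.beginIndex(), E = List.endIndex(); I != E; ++I)
    Out.push_back(List.nameAt(I));
  return Out;
}
```

Because the end position is only ever compared against, an empty list needs no special handling.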

This change is NFC modulo the Windows failure reported by
llvm-clang-x86_64-expensive-checks-win.

Differential revision: https://reviews.llvm.org/D81717
2020-06-12 00:03:32 -07:00
Kristof Beyls c35ed40f4f [AArch64] Extend AArch64SLSHardeningPass to harden BLR instructions.
To make sure that no barrier gets placed on the architectural execution
path, each
  BLR x<N>
instruction gets transformed to a
  BL __llvm_slsblr_thunk_x<N>
instruction, with __llvm_slsblr_thunk_x<N> a thunk that contains
__llvm_slsblr_thunk_x<N>:
  BR x<N>
  <speculation barrier>

Therefore, the BLR instruction gets split into 2; one BL and one BR.
This transformation results in not inserting a speculation barrier on
the architectural execution path.

The mitigation is off by default and can be enabled by the
harden-sls-blr subtarget feature.

As a linker is allowed to clobber X16 and X17 on function calls, the
above code transformation would not be correct in case a linker does so
when N=16 or N=17. Therefore, when the mitigation is enabled, generation
of BLR x16 or BLR x17 is avoided.
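
The rewrite rule above can be modelled on instruction text. The following is a sketch only: the helper names are hypothetical, the 0..30 register range is an assumption of this example, and the real pass operates on machine instructions rather than strings:

```cpp
#include <cassert>
#include <string>

// Builds the thunk symbol name used in the description above.
// The 0..30 range check is an assumption of this sketch.
inline std::string slsBlrThunkName(unsigned RegNo) {
  assert(RegNo <= 30 && "expected a register number in x0..x30");
  return "__llvm_slsblr_thunk_x" + std::to_string(RegNo);
}

// X16 and X17 may be clobbered by a linker on function calls, so they are
// not usable as the BLR operand while the mitigation is enabled.
inline bool mayBeUsedByHardenedBlr(unsigned RegNo) {
  return RegNo != 16 && RegNo != 17;
}

// Textual model of the transformation: "BLR x<N>" becomes "BL <thunk>".
inline std::string hardenBlr(unsigned RegNo) {
  assert(mayBeUsedByHardenedBlr(RegNo) && "x16/x17 must be avoided earlier");
  return "BL " + slsBlrThunkName(RegNo);
}
```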

As BLRA* indirect calls are not produced by LLVM currently, this does
not aim to implement support for those.

Differential Revision:  https://reviews.llvm.org/D81402
2020-06-12 07:34:33 +01:00
Yevgeny Rouban 707836ed4e [JumpThreading] Handle zero !prof branch_weights
Avoid division by zero in updatePredecessorProfileMetadata().
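
A guard of this kind might look as follows. This is a hedged sketch, not the code in updatePredecessorProfileMetadata(): the type and function names are hypothetical, and the even-split fallback for all-zero weights is an assumption of this example:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical rational probability; not LLVM's BranchProbability.
struct EdgeProbability {
  uint64_t Numerator;
  uint64_t Denominator;
};

// Probability of the taken edge computed from two branch weights.
// If both weights are zero the sum is zero and dividing by it would be
// undefined, so this sketch falls back to an even split (an assumption).
inline EdgeProbability takenProbability(uint32_t TrueWeight,
                                        uint32_t FalseWeight) {
  const uint64_t Total = uint64_t(TrueWeight) + uint64_t(FalseWeight);
  if (Total == 0)
    return {1, 2};
  return {TrueWeight, Total};
}

// Scales a count by a probability; the denominator is never zero here.
inline uint64_t scaleByProbability(uint64_t Count, EdgeProbability P) {
  assert(P.Denominator != 0 && "probability must have a non-zero denominator");
  return Count * P.Numerator / P.Denominator;
}
```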

Reviewers: yamauchi
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81499
2020-06-12 11:55:15 +07:00
Craig Topper 0ce9bf6eed [X86] Add a helper lambda to getIntelProcessorTypeAndSubtype to select feature bits from the correct 32-bit feature variable.
We have three 32 bit variables containing feature bits. But our
enum is a flat 96 bit space. So we need to pick which of the
variables to use based on the bit value. We used to do this
manually by mentioning the correct variable and subtracting an
offset from the enum. But this is error prone.
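
A helper lambda of the kind described could be sketched like this. The variable names `Features`, `Features2` and `Features3` and the wrapper function `hasFeature` are assumptions made for illustration; they are not taken from the patch:

```cpp
#include <cassert>
#include <cstdint>

// Tests bit `Bit` of a flat 96-bit feature space that is stored in three
// 32-bit words. The word is chosen from the bit index, so callers never
// name a specific word or subtract an offset by hand.
inline bool hasFeature(uint32_t Features, uint32_t Features2,
                       uint32_t Features3, unsigned Bit) {
  assert(Bit < 96 && "feature index outside the 96-bit space");
  auto testFeature = [&](unsigned F) {
    if (F < 32)
      return (Features & (1u << F)) != 0;
    if (F < 64)
      return (Features2 & (1u << (F - 32))) != 0;
    return (Features3 & (1u << (F - 64))) != 0;
  };
  return testFeature(Bit);
}
```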
2020-06-11 21:14:46 -07:00
Vitaly Buka 999307323a [StackSafety] Fix byval handling
We don't need to process parameters that are marked as
byval, as we are not going to pass the allocas of interest
without copying.

If we pass a value into a byval argument, we just handle that
as a Load of the corresponding type and stop that branch of analysis.
2020-06-11 20:58:36 -07:00
Yonghong Song 4db1878158 [BPF] fix incorrect type in BPFISelDAGToDAG readonly load optimization
In the BPF Instruction Selection DAGToDAG transformation phase,
the BPF backend has an optimization that turns a load from a readonly data
section into a direct load of the values. This phase was implemented
before libbpf had readonly section support and before alu32
was supported.

However, this phase may generate an incorrect type when alu32 is
enabled. The following is an example:
  -bash-4.4$ cat ~/tmp2/t.c
  struct t {
    unsigned char a;
    unsigned char b;
    unsigned char c;
  };
  extern void foo(void *);
  int test() {
    struct t v = {
      .b = 2,
    };
    foo(&v);
    return 0;
  }

The compiler will turn local variable "v" into a readonly section.
During the instruction selection phase, the compiler generates two
loads from the readonly section, a 2-byte load and a 1-byte load. For example, the 2-byte load is:
  t8: i32,ch = load<(dereferenceable load 2 from `i8* getelementptr inbounds
       (%struct.t, %struct.t* @__const.test.v, i64 0, i32 0)`, align 1),
       anyext from i16> t3, GlobalAddress:i64<%struct.t* @__const.test.v> 0, undef:i64
  t9: ch = store<(store 2 into %ir.v1.sub1), trunc to i16> t3, t8,
    FrameIndex:i64<0>, undef:i64

The BPF backend changed t8 to i64 = Constant<2>, and the eventual generated machine IR is:
  t10: i64 = MOV_ri TargetConstant:i64<2>
  t40: i32 = SLL_ri_32 t10, TargetConstant:i32<8>
  t41: i32 = OR_ri_32 t40, TargetConstant:i64<0>
  t9: ch = STH32<Mem:(store 2 into %ir.v1.sub1)> t41, TargetFrameIndex:i64<0>,
      TargetConstant:i64<0>, t3

Note that t10 in the above is not correct. The type should be i32 and the instruction
should be MOV_ri_32. The reason for the incorrect insn selection is that BPF insn selection
generated an i64 constant instead of an i32 constant as specified in the original
load instruction. Such an incorrect insn sequence eventually caused the following
fatal error when a COPY insn tries to copy a 64bit register to a 32bit subregister.
  Impossible reg-to-reg copy
  UNREACHABLE executed at ../lib/Target/BPF/BPFInstrInfo.cpp:42!

This patch fixed the issue by using the load result type instead of always i64
when doing readonly load optimization.
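
The idea of the fix can be sketched without SelectionDAG. In the sketch below every name is hypothetical, little-endian byte order is assumed, and `ResultType` only models the distinction between an i32 and an i64 result:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of the value type a load produces.
enum class ResultType { I32 = 32, I64 = 64 };

struct FoldedConstant {
  uint64_t Value;
  ResultType Type;
};

// Folds a little-endian load of SizeInBytes bytes from read-only Data into
// a constant. The constant carries the load's own result type instead of a
// hard-coded I64, which is the point of the fix described above.
inline FoldedConstant foldReadonlyLoad(const uint8_t *Data,
                                       std::size_t SizeInBytes,
                                       ResultType LoadResultType) {
  assert(SizeInBytes >= 1 && SizeInBytes <= 8 && "unsupported load size");
  uint64_t Value = 0;
  for (std::size_t I = 0; I < SizeInBytes; ++I)
    Value |= uint64_t(Data[I]) << (8 * I);
  return {Value, LoadResultType};
}
```

For the struct in the example (`.b = 2`, other fields zero), the 2-byte load at offset 0 folds to 2 shifted left by 8, matching the SLL and OR pair in the machine IR shown earlier.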

Differential Revision: https://reviews.llvm.org/D81630
2020-06-11 19:31:06 -07:00
Cyndy Ishida 28fefcc83c [llvm][llvm-nm] add TextAPI/MachO support
Summary:
This completes the glue needed to support reading tbd files from nm.
This includes filtering by slice with `--arch` and a new
option specifically for tbd files, `--add-inlinedinfo`, which will show
the reexported libraries that are appended in the tbd file.

Reviewers: ributzka, steven_wu, JDevlieghere, jhenderson

Reviewed By: JDevlieghere

Subscribers: hiraditya, MaskRay, dexonsmith, rupprecht, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81614
2020-06-11 18:54:16 -07:00
Alina Sbirlea 519b019a0a Verify MemorySSA after all updates.
Verify after completing all updates.
Resolves PR46275.
2020-06-11 18:48:41 -07:00
Eric Christopher 3ff8f61930 Tidy up unsigned -> Register fixups. 2020-06-11 16:50:58 -07:00
Eric Christopher cb21b16822 Add a diagnostic string to an assert. 2020-06-11 16:34:55 -07:00
Matt Arsenault 7d913becfc AMDGPU/GlobalISel: Fix select of private <2 x s16> load 2020-06-11 19:25:25 -04:00
Sanjay Patel 039ff29ef6 [VectorCombine] remove unused parameters; NFC 2020-06-11 19:15:03 -04:00
Vitaly Buka a10fc165f5 [StackSafety,NFC] Fix use of CallBase API
Code does not need to iterate arguments and can get ArgNo from
CallBase::getArgOperandNo.
2020-06-11 16:11:30 -07:00
Matt Arsenault 27f8bd94cb AMDGPU/GlobalISel: Fix select of <8 x s64> scalar load 2020-06-11 19:09:43 -04:00
Matt Arsenault 2247072b65 AMDGPU/GlobalISel: Set insert point when emitting control flow pseudos
This was implicitly assuming the branch instruction was the next after
the pseudo. It's possible for another non-terminator instruction to be
inserted between the intrinsic and the branch, so adjust the insertion
point. Fixes a non-terminator after terminator verifier error (which
without the verifier, manifested itself as an infinite loop in
analyzeBranch much later on).
2020-06-11 18:53:26 -04:00
Kirill Naumov 1022b5eb5b [InlineCost] Preparatory patch for creation of Printer pass.
- Renaming the printer class, flag
- Refactoring
- Changing some tests

This patch is a preparatory stage for introducing a new printing pass and new
functionality to the existing Annotation Writer. I plan to extend
this functionality to make the tool more useful when looking at the inlining
process.
2020-06-11 22:29:03 +00:00
Fangrui Song 030897523d [Support] Don't tie errs() to outs() by default
This reverts part of D81156.

Accessing errs() concurrently was safe before and racy after D81156.
(`errs() << 'a'` is always racy)

Accessing outs() and errs() concurrently was safe before and racy after D81156.

Don't tie errs() to outs() by default to fix the fallout.
llvm-dwarfdump is single-threaded and opting in to the tie behavior is safe.
2020-06-11 15:19:56 -07:00
Stanislav Mekhanoshin a98d618f6e Fixed assertion in SROA if block has no successors
BasicBlock::isLegalToHoistInto() asserts if block does not
have successors. The case is degenerate but the assertion still
needs to be avoided.

https://bugs.llvm.org/show_bug.cgi?id=46280

Differential Revision: https://reviews.llvm.org/D81674
2020-06-11 15:15:19 -07:00
Craig Topper c525168190 [X86] Remove unnecessary #if around call to isCpuIdSupported in getHostCPUName.
The exact same #if is already inside isCpuIdSupported and causes
it to return true. The definition of isCpuIdSupported isn't
conditional so we should be able just rely on its body doing
the right thing.
2020-06-11 15:13:28 -07:00
Thomas Lively c5d012341e [WebAssembly] Make BR_TABLE non-duplicable
Summary:
After their range checks were removed in 7f50c15be5, br_tables
started being duplicated into their predecessors by tail
folding. Unfortunately, when the br_tables were in loops this
transformation introduced bad irreducible control flow which was later
expanded into even more br_tables. This commit abuses the
`isNotDuplicable` property to prevent this irreducible control flow
from being introduced. This change saves a few dozen bytes of code
size and has a negligible effect on performance for most of the large
Emscripten benchmarks, but can improve performance significantly on
microbenchmarks of switches in loops.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81628
2020-06-11 15:11:45 -07:00
Reid Kleckner 1c03389c29 Re-land "Migrate the rest of COFFObjectFile to Error"
This reverts commit 101fbc0138.

Remove leftover debugging attribute.

Update LLDB as well, which was missed before.
2020-06-11 14:46:16 -07:00
Craig Topper 8fa3e8fa14 [X86] Force VIA PadLock crypto instructions to emit a 0xF3 prefix when they encode to match what GNU as does.
The spec for these says they need 0xf3 but also mentions REP
before the mnemonic. But I don't think it's fair to users to make
them write REP first. And gas doesn't make them. objdump seems to
disassemble with or without the prefix and just prints any 0xf3
as REP.
2020-06-11 12:59:21 -07:00
Craig Topper 269d843720 [X86] Replace TB with PS on instructions that are documented in the SDM with 'NP'
'NP' means that the instruction is not recognized with a 66, F2 or F3
prefix. It will either #UD or decode to a different instruction.

All of the cases here should fall into the #UD variety since
we should be detecting the collision with other instructions when
we build the disassembler tables.
2020-06-11 12:20:29 -07:00
diggerlin c6be3ea524 [NFC] clean up the AsmPrinter::emitLinkage for AIX part
SUMMARY:

Since we deal with AIX emitLinkage in PPCAIXAsmPrinter::emitLinkage() as of the patch https://reviews.llvm.org/D75866, it does not go to AsmPrinter::emitLinkage() any more, so we clean up some AIX-related code in AsmPrinter::emitLinkage().

Reviewers:  Jason liu

Differential Revision: https://reviews.llvm.org/D81613
2020-06-11 13:33:51 -04:00
Petar Avramovic bd3d951b8b AMDGPU/GlobalISel: Fix lower for f64->f16 G_FPTRUNC
Put AND before ADD in LegalizerHelper::lowerFPTRUNC_F64_TO_F16
in order to match the algorithm from AMDGPUTargetLowering::LowerFP_TO_FP16.

Differential Revision: https://reviews.llvm.org/D81666
2020-06-11 18:19:27 +02:00
Mircea Trofin e82eff7a03 [llvm][NFC] Factor some common data in InlineAdvice
Summary:
Other derivations will all want to emit optimization remarks and, as
part of that, use debug info.

Additionally, drive-by const-ing.

Reviewers: davidxl, dblaikie

Subscribers: aprantl, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81507
2020-06-11 08:01:00 -07:00
Simon Pilgrim 7706c7af74 [X86] Fold vXi1 OR(KSHIFTL(X,NumElts/2),Y) -> KUNPCK
Convert shift+or bool vector patterns into CONCAT_VECTORS if we know this will be lowered to KUNPCK (which requires 16+ vector elements).
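
The pattern can be modelled on plain integers. This sketch treats a 16-element bool vector as a `uint16_t` and assumes the upper half of Y is known to be zero; the function names are hypothetical and the code does not model the actual DAG combine:

```cpp
#include <cassert>
#include <cstdint>

// OR(KSHIFTL(X, NumElts/2), Y) on a 16-element mask held in a uint16_t.
inline uint16_t shiftOrForm(uint16_t X, uint16_t Y) {
  return uint16_t(uint16_t(X << 8) | Y);
}

// Concatenation of two 8-element halves: Lo fills bits 0..7, Hi bits 8..15.
inline uint16_t concatForm(uint8_t Lo, uint8_t Hi) {
  return uint16_t((uint16_t(Hi) << 8) | Lo);
}

// Checks that both forms produce the same mask for the given inputs.
inline bool maskFormsAgree(uint16_t X, uint16_t Y) {
  assert((Y & 0xFF00) == 0 && "sketch assumes the upper half of Y is zero");
  return shiftOrForm(X, Y) == concatForm(uint8_t(Y), uint8_t(X));
}
```

The upper half of X is shifted out in both forms, so only its low eight elements matter.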

Fixes PR32547
2020-06-11 15:47:20 +01:00
serge-sans-paille bff09876d7 Fix return status of DataFlowSanitizer pass
Take into account added functions, global values and attribute change.

Differential Revision: https://reviews.llvm.org/D81239
2020-06-11 16:05:17 +02:00
Jay Foad 69bdfb075b [IR] Clean up dead instructions after simplifying a conditional branch
Change BasicBlock::removePredecessor to optionally return a vector of
instructions which might be dead. Use this in ConstantFoldTerminator to
delete them if they are dead.

Reapply with a bug fix: don't drop the "!KeepOneInputPHIs" argument when
removePredecessor calls PHINode::removeIncomingValue.

Differential Revision: https://reviews.llvm.org/D80206
2020-06-11 14:53:01 +01:00
Sam Parker 3d5f7c8531 [IR] Remove assert from ShuffleVectorInst
Which triggers on valid, but not useful, IR such as an undef mask.

Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=46276

Differential Revision: https://reviews.llvm.org/D81634
2020-06-11 14:52:17 +01:00
Jay Foad f45c65aa41 Revert "[IR] Clean up dead instructions after simplifying a conditional branch"
This reverts commit 4494e45316.

It caused problems for sanitizer buildbots.
2020-06-11 14:22:16 +01:00
Jay Foad 4494e45316 [IR] Clean up dead instructions after simplifying a conditional branch
Change BasicBlock::removePredecessor to optionally return a vector of
instructions which might be dead. Use this in ConstantFoldTerminator to
delete them if they are dead.

Differential Revision: https://reviews.llvm.org/D80206
2020-06-11 13:28:10 +01:00
Jay Foad f79e6a8847 [MemCpyOptimizer] Simplify API of processStore and processMem* functions
Previously these functions either returned a "changed" flag or a "repeat
instruction" flag, and could also modify an iterator to control which
instruction would be processed next.

Simplify this by always returning a "changed" flag, and handling all of
the "repeat instruction" functionality by modifying the iterator.

No functional change intended except in this case:
// If the source and destination of the memcpy are the same, then zap it.
... where the previous code failed to process the instruction after the
zapped memcpy.
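
The convention of always returning a "changed" flag while steering iteration through the iterator can be sketched on a `std::list`. Everything here is hypothetical and simplified: `Copy`, `processCopy` and `runOnCopies` are illustrative names, and the real pass works on LLVM instructions with its own iterator conventions:

```cpp
#include <cassert>
#include <list>

// Hypothetical stand-in for a memcpy: copies from slot Src to slot Dst.
struct Copy {
  int Src;
  int Dst;
};

// Returns true if the list was changed. In both cases the iterator is left
// on the next element that still needs processing, so nothing is skipped.
inline bool processCopy(std::list<Copy> &Copies,
                        std::list<Copy>::iterator &It) {
  if (It->Src == It->Dst) {
    // Source and destination are the same: zap the copy.
    It = Copies.erase(It);
    return true;
  }
  ++It;
  return false;
}

// Driver loop: only accumulates the "changed" flag.
inline bool runOnCopies(std::list<Copy> &Copies) {
  bool Changed = false;
  for (auto It = Copies.begin(); It != Copies.end();)
    Changed |= processCopy(Copies, It);
  return Changed;
}
```

With two adjacent self-copies, the second one is still visited after the first is erased, which is the case the commit message says the previous code missed.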

Differential Revision: https://reviews.llvm.org/D81540
2020-06-11 12:48:09 +01:00
Pavel Labath 9ed452f370 [llvm/DWARFDebugLine] Remove spurious full stop from warning messages
Other warning messages don't have a trailing full stop.
2020-06-11 13:14:21 +02:00
Pavel Labath fccaa89e23 [llvm/DWARFDebugLine] Fix a typo in one warning message 2020-06-11 13:04:52 +02:00
Chris Jackson 4707bc2177 [DebugInfo] Refactor SalvageDebugInfo and SalvageDebugInfoForDbgValues
- Simplify the salvaging interface and the algorithm in InstCombine

Reviewers: vsk, aprantl, Orlando, jmorse, TWeaver

Reviewed by: Orlando

Differential Revision: https://reviews.llvm.org/D79863
2020-06-11 11:13:46 +01:00
Georgii Rymar 818ab3d654 [yaml2obj] - Allocate the file space for SHT_NOBITS sections in some cases.
This teaches yaml2obj to allocate file space for a no-bits section
when there is a non-nobits section in the same segment that follows it.

This was discussed in the D78005 thread and matches the behavior of the GNU linkers and LLD.

Differential revision: https://reviews.llvm.org/D80629
2020-06-11 12:54:53 +03:00
Simon Pilgrim 5cca9828ff [X86][AVX512] Avoid bitcasts between scalar and vXi1 bool vectors
AVX512 mask types are often bitcast to scalar integers for various ops before being bitcast back to be used as a predicate. In many cases we can avoid these KMASK<->GPR transfers and perform equivalent operations on the mask unit.

If the destination mask type is legal, and we can confirm that the scalar op originally came from a mask/vector/float/double type then we should try to avoid the scalar entirely.

This avoids some codegen issues noticed while working on PTEST/MOVMSK improvements.

Partially fixes PR32547 - we don't create a KUNPCK yet, but OR(X,KSHIFTL(Y)) can be handled in a separate patch.

Differential Revision: https://reviews.llvm.org/D81548
2020-06-11 10:22:55 +01:00
Dominik Montada f24e2e9eeb [GlobalISel] fix crash in IRTranslator, MachineIRBuilder when translating @llvm.dbg.value intrinsic and using -debug
Summary:
Fix crash when using -debug caused by the GlobalISel observer trying to print
an incomplete DBG_VALUE instruction. This was caused by the MachineIRBuilder
using buildInstr, which immediately inserts the instruction causing print,
instead of using BuildMI to first build up the instruction and using
insertInstr when finished.

Add RUN-line to existing debug-insts.ll test with -debug flag set to make sure
no crash is happening.

Also fixed a missing %s in the 2nd RUN-line of the same test.

Reviewers: t.p.northover, aditya_nandakumar, aemerson, dsanders, arsenm

Reviewed By: arsenm

Subscribers: wdng, arsenm, rovka, hiraditya, volkan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76934
2020-06-11 10:47:49 +02:00
Kristof Beyls 994748770c [NFC] Refactor ThunkInserter to make it available for all targets.
By moving target-independent code from
llvm/lib/Target/X86/X86IndirectThunks.cpp
to
llvm/include/llvm/CodeGen/IndirectThunks.h

Differential Revision: https://reviews.llvm.org/D81401
2020-06-11 08:38:44 +01:00
Craig Topper 08b275f62e [X86] Remove unnecessary In64BitMode predicate from TEST64ri32. NFC
This appears to have been added when In64BitMode was added to a
bunch of instructions that don't have register operands. When an
instruction uses a register the parser will prevent a 64-bit
register from being parsed on a 32-bit target. But with only
memory and immediate operands this doesn't happen.

TEST64ri32 does have a register operand so the issue the predicate
was supposed to fix doesn't apply.
2020-06-11 00:33:55 -07:00
David Sherwood bd97342a0c [CodeGen] Let computeKnownBits do something sensible for scalable vectors
Until we have a real need for computing known bits for scalable
vectors I have simply changed the code to bail out for now and
pretend we know nothing. I've also fixed up some simple callers of
computeKnownBits too.

Differential Revision: https://reviews.llvm.org/D80437
2020-06-11 08:17:11 +01:00
Kristof Beyls 0ee176edc8 [AArch64] Introduce AArch64SLSHardeningPass, implementing hardening of RET and BR instructions.
Some processors may speculatively execute the instructions immediately
following RET (returns) and BR (indirect jumps), even though
control flow should change unconditionally at these instructions.
To avoid a potential mis-speculatively executed gadget after these
instructions leaking secrets through side channels, this pass places a
speculation barrier immediately after every RET and BR instruction.

Since these barriers are never on the correct, architectural execution
path, performance overhead of this is expected to be low.

On targets that implement the Armv8.0-SB Speculation Barrier extension,
a single SB instruction is emitted that acts as a speculation barrier.
On other targets, a DSB SYS followed by an ISB is emitted to act as a
speculation barrier.

These speculation barriers are implemented as pseudo instructions to
prevent later passes from analyzing them and potentially removing them.

Even though currently LLVM does not produce BRAA/BRAB/BRAAZ/BRABZ
instructions, these are also mitigated by the pass and tested through a
MIR test.

The mitigation is off by default and can be enabled by the
harden-sls-retbr subtarget feature.

Differential Revision:  https://reviews.llvm.org/D81400
2020-06-11 07:51:17 +01:00
Yvan Roux 6b8628a1f0 [ARM][MachineOutliner] Add NoLRSave mode.
Outline chunks of code which don't need a save/restore mechanism of the
link register.

Differential Revision: https://reviews.llvm.org/D80125
2020-06-11 08:45:46 +02:00
Craig Topper 1385ab356a [X86] Use X86AS enum constants to replace hardcoded numbers in more places. NFC 2020-06-10 22:31:21 -07:00
Craig Topper ed34140e11 [X86] Move X86 stuff out of TargetParser.h and into the recently created X86TargetParser.h. NFC 2020-06-10 22:06:34 -07:00
Craig Topper ba8d182597 Revert "[X86] Move X86 stuff out of TargetParser.h and into the recently created X86TargetParser.h. NFC"
This reverts commit 874800b4f7.

Forgot to update the clang includes
2020-06-10 21:24:44 -07:00
Craig Topper 874800b4f7 [X86] Move X86 stuff out of TargetParser.h and into the recently created X86TargetParser.h. NFC 2020-06-10 21:18:32 -07:00
Vitaly Buka 5b1c70a48d [StackSafety] Pass summary into codegen
Summary:
The patch wraps ThinLTO index into immutable
pass which can be used by StackSafety analysis.

Reviewers: eugenis, pcc

Reviewed By: eugenis

Subscribers: hiraditya, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80985
2020-06-10 21:02:54 -07:00
LemonBoy 7dac008596 [SPARC] Lower fp16 ops to libcalls
The fp16 ops are legalized by extending/chopping them as needed.
The tests are shamelessly stolen from the RISC-V backend.

Recommit with fixed RUN lines for the test.

Differential Revision: https://reviews.llvm.org/D77569
2020-06-10 19:15:26 -07:00
Matt Arsenault 19b3b886b7 AMDGPU/GlobalISel: Fix porting error in 32-bit division
The baffling thing is this passed the OpenCL conformance test for
32-bit integer divisions, but only failed in the 32-bit path of
BypassSlowDivisions for the 64-bit tests.
2020-06-10 21:48:58 -04:00
Xing GUO 99c2335434 [DWARFYAML][debug_ranges] Make the "Offset" field optional.
Before this patch, we had to calculate the offset for the current range list entry. This patch makes the "Offset" field optional.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D81220
2020-06-11 08:36:44 +08:00
Xing GUO 502a2a80c2 [DWARFYAML] Add support for emitting DWARF64 .debug_aranges section.
The `debug_info_offset`(`CuOffset`) should be 64 bits wide rather than 32 bits wide in the DWARF64 .debug_aranges section. This patch fixes that.
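
The width difference can be sketched with a small byte emitter. The names below are hypothetical and are not the DWARFYAML API; little-endian output is assumed:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class DwarfFormat { DWARF32, DWARF64 };

// Appends the low Size bytes of Value in little-endian order.
inline void appendLE(std::vector<uint8_t> &Out, uint64_t Value,
                     unsigned Size) {
  for (unsigned I = 0; I < Size; ++I)
    Out.push_back(uint8_t(Value >> (8 * I)));
}

// Appends a section offset such as debug_info_offset: 4 bytes in DWARF32,
// 8 bytes in DWARF64.
inline void appendOffset(std::vector<uint8_t> &Out, uint64_t Offset,
                         DwarfFormat Format) {
  const bool Is64 = Format == DwarfFormat::DWARF64;
  assert((Is64 || Offset <= UINT32_MAX) && "offset too large for DWARF32");
  appendLE(Out, Offset, Is64 ? 8 : 4);
}
```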

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D81528
2020-06-11 08:35:17 +08:00