Commit Graph

128617 Commits

Author SHA1 Message Date
Simon Atanasyan 8ac68f9dc5 [mips] Put conditions when we need to expand memory operand into a separate function. NFC
`expandMemInst` expects instruction with 3 or 4 operands and the last
operand requires expanding. It's redundant to scan all operands in a
loop. We can check the last operands.
2019-11-20 16:07:16 +03:00
Simon Atanasyan 452d0b21e0 [mips] Make MipsAsmParser::isEvaluated static function. NFC 2019-11-20 16:07:16 +03:00
Dmitry Preobrazhensky 525f9c0be5 [AMDGPU][DPP] Corrected DPP combiner
Added a check to make sure that the selected dpp opcode is supported by target.

Reviewers: vpykhtin, arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D70402
2019-11-20 15:56:45 +03:00
Pavel Labath 089c0f5814 [DWARF] Add an api to get "interpreted" location lists
Summary:
This patch adds DWARFDie::getLocations, which returns the location
expressions for a given attribute (typically DW_AT_location). It handles
both "inline" locations and references to the external location list
sections (currently only of the DW_FORM_sec_offset type). It is
implemented on top of DWARFUnit::findLoclistFromOffset, which is also
added in this patch. I tried to make their signatures similar to the
equivalent range list functionality.

The actual location list interpretation logic is in
DWARFLocationTable::visitAbsoluteLocationList. This part is not
equivalent to the range list code, but this deviation is motivated by a
desire to reuse the same location list parsing code within lldb.

The functionality is tested via a c++ unit test of the DWARFDie API.

Reviewers: dblaikie, JDevlieghere, SouraVX

Subscribers: mgorny, hiraditya, cmtice, probinson, llvm-commits, aprantl

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70394
2019-11-20 13:25:18 +01:00
Djordje Todorovic 979592a6f7 [DebugInfo] Remove the DIFlagArgumentNotModified debug info flag
Due to changes in D68206, we remove the DIFlagArgumentNotModified
and its usage.

Differential Revision: https://reviews.llvm.org/D68207
2019-11-20 13:18:40 +01:00
Serge Pavlov ea8678d1c7 Move floating point related entities to namespace level
This is recommit of commit e6584b2b7b, which was reverted in
30e7ee3c4b together with af57dbf12e.
Original message is below.

Enumerations that describe rounding mode and exception behavior were
defined inside ConstrainedFPIntrinsic. It makes sense to use the same
definitions to represent the same properties in other cases, not only
in constrained intrinsics. It was however inconvenient as required to
include constrained intrinsics definitions even if they were not needed.
Also using long scope prefix reduced readability.

This change moves these definitioins to the namespace llvm::fp.
No functional changes.

Differential Revision: https://reviews.llvm.org/D69552
2019-11-20 19:05:46 +07:00
Sameer Sahasrabuddhe 52c5014da0 [AMDGPU] add support for hostcall buffer pointer as hidden kernel argument
Hostcall is a service that allows a kernel to submit requests to the
host using shared buffers, and block until a response is
received. This will eventually replace the shared buffer currently
used for printf, and repurposes the same hidden kernel argument. This
change introduces a new ValueKind in the HSA metadata to represent the
hostcall buffer.

Differential Revision: https://reviews.llvm.org/D70038
2019-11-20 15:53:55 +05:30
Martin Storsjö f67534afd6 [ExecutionEngine] Add a missing break to avoid warnings
This fixes buildbot errors since dc3ee33089.
2019-11-20 12:20:05 +02:00
Adam Kallai dc3ee33089 ExecutionEngine: add preliminary support for COFF ARM64
Differential Revision: https://reviews.llvm.org/D69434
2019-11-20 10:59:42 +02:00
Serge Pavlov 0c50c0b055 [FEnv] File with properties of constrained intrinsics
Summary
In several places we need to enumerate all constrained intrinsics or IR
nodes that should be represented by them. It is easy to miss some of
the cases. To make working with these intrinsics more convenient and
robust, this change introduces file containing definitions of all
constrained intrinsics and some of their properties. This file can be
included to generate constrained intrinsics processing code.

Reviewers: kpn, andrew.w.kaylor, cameron.mcinally, uweigand

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69887
2019-11-20 13:30:07 +07:00
Austin Kerbow f3225f2abe AMDGPU/GlobalISel: Legalize FDIV64
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70403
2019-11-19 21:02:27 -08:00
Reid Kleckner 606a2bd621 [musttail] Don't forward AL on Win64
AL is only used for varargs on SysV platforms. Don't forward it on
Windows. This allows control flow guard to set up an extra hidden
parameter in RAX, as described in PR44049.

This also has the effect of freeing up RAX for use in virtual member
pointer thunks, which may also be a nice little code size improvement on
Win64.

Fixes PR44049

Reviewers: ajpaverd, efriedma, hans

Differential Revision: https://reviews.llvm.org/D70413
2019-11-19 16:54:00 -08:00
Francis Visoiu Mistrih bffdee8ef3 [LTO][Legacy] Add API for passing LLVM options separately
In order to correctly pass options to LLVM, including options containing
spaces which are used as delimiters for multiple options in
lto_codegen_debug_options, add a new API:
lto_codegen_debug_options_array.

Unfortunately, tools/lto has no testing infrastructure yet, so there are
no tests associated with this patch.

Differential Revision: https://reviews.llvm.org/D70463
2019-11-19 16:30:37 -08:00
Craig Topper c4b41e8d1d [LegalizeDAG][X86] Enable STRICT_FP_TO_SINT/UINT to be promoted
Differential Revision: https://reviews.llvm.org/D70220
2019-11-19 16:14:37 -08:00
Craig Topper 85589f8077 [X86] Add custom type legalization and lowering for scalar STRICT_FP_TO_SINT/UINT
This is a first pass at Custom lowering for these operations. I also updated some of the vector code where it was obviously easy and straightforward. More work needed in follow up.

This enables these operations to be handled with X87 where special rounding control adjustments are needed to perform a truncate.

Still need to fix Promotion in the target independent code in LegalizeDAG.
llrint/llround split into separate test file because we can't make a strict libcall properly yet either and we need to do that when i64 isn't a legal type.

This does not include any isel support. So we still rely on the mutation in SelectionDAGIsel to remove the strict from this stuff later. Except for the X87 stuff which goes through custom nodes that already had chains.

Differential Revision: https://reviews.llvm.org/D70214
2019-11-19 16:05:22 -08:00
Pete Couperus 85435bdde0 [ARC] Add InitializePasses header to fix ARC build. 2019-11-19 15:52:01 -08:00
Philip Reames 28a91473e3 [GuardWidening] Remove WidenFrequentBranches transform
This code has never been enabled.  While it is tested, it's complicating some refactoring.  If we decide to re-implement this, doing it in SimplifyCFG would probably make more sense anyways.
2019-11-19 15:15:52 -08:00
Philip Reames 70c68a6b0e [NFC] Factor out utilities for manipulating widenable branches
With the widenable condition construct, we have the ability to reason about branches which can be 'widened' (i.e. made to fail more often).  We've got a couple o transforms which leverage this.  This patch just cleans up the API a bit.

This is prep work for generalizing our definition of a widenable branch slightly.  At the moment "br i1 (and A, wc()), ..." is considered widenable, but oddly, neither "br i1 (and wc(), B), ..." or "br i1 wc(), ..." is.  That clearly needs addressed, so first, let's centralize the code in one place.
2019-11-19 14:43:13 -08:00
Philip Reames f3eb5dee57 [LoopPred] Generalize profitability check to handle unswitch output
Unswitch (and other loop transforms) like to generate loop exit blocks with unconditional successors, and phi nodes (LCSSA, or simple multiple exiting blocks sharing an exit).  Generalize the "likely very rare exit" check slightly to handle this form.
2019-11-19 14:06:36 -08:00
Benjamin Kramer cd4811360e [ValueTracking] Add a basic version of isKnownNonInfinity and use it to detect more NoNaNs 2019-11-19 22:24:46 +01:00
diggerlin ea13683f3d The patch is the compiler error specific on the compile error on CMVC
SUMMARY:

CMVC has a compiler error on the
const uint64_t OffsetToRaw = is64Bit()
                                   ? toSection64(Sec)->FileOffsetToRawData
                                   : toSection32(Sec)->FileOffsetToRawData;

while  gcc  compiler do not have the problem.
I have to change the code to

  uint64_t OffsetToRaw;
  if (is64Bit())
    OffsetToRaw = toSection64(Sec)->FileOffsetToRawData;
  else
    OffsetToRaw = toSection32(Sec)->FileOffsetToRawData;

Reviewers: Sean Fertile
Subscribers: rupprecht, seiyai,hiraditya

Differential Revision: https://reviews.llvm.org/D70255
2019-11-19 15:17:56 -05:00
Vedant Kumar ba71ca3720 [DebugInfo] Describe size of spilled values in call site params
A call site parameter description of a memory operand needs to
unambiguously convey the size of the operand to prevent incorrect entry
value evaluation.

Thanks for David Stenberg for pointing this issue out!
2019-11-19 12:03:52 -08:00
Duncan P. N. Exon Smith 3279724905 llvm/ObjCARC: Eliminate inlined AutoreleaseRV calls
Pair up inlined AutoreleaseRV calls with their matching RetainRV or
ClaimRV.

- RetainRV cancels out AutoreleaseRV.  Delete both instructions.
- ClaimRV is a peephole for RetainRV+Release.  Delete AutoreleaseRV and
  replace ClaimRV with Release.

This avoids problems where more aggressive inlining triggers memory
regressions.

This patch is happy to skip over non-callable instructions and non-ARC
intrinsics looking for the pair.  It is likely sound to also skip over
opaque function calls, but that's harder to reason about, and it's not
relevant to the goal here: if there's an opaque function call splitting
up a pair, it's very unlikely that a handshake would have happened
dynamically without inlining.

Note that this patch also subsumes the previous logic that looked
backwards from ReleaseRV.

https://reviews.llvm.org/D70370
rdar://problem/46509586
2019-11-19 12:02:01 -08:00
Sanjay Patel 0a8e7ca402 [SLP] fix miscompile on min/max reductions with extra uses (PR43948) (2nd try)
The 1st attempt was reverted because it revealed an existing
bug where we could produce invalid IR (use of value before
definition). That should be fixed with:
rG39de82ecc9c2

The bug manifests as replacing a reduction operand with an undef
value.

The problem appears to be limited to cases where a min/max reduction
has extra uses of the compare operand to the select.

In the general case, we are tracking "ExternallyUsedValues" and
an "IgnoreList" of the reduction operations, but those may not apply
to the final compare+select in a min/max reduction.

For that, we use replaceAllUsesWith (RAUW) to ensure that the new
vectorized reduction values are transferred to all subsequent users.

Differential Revision: https://reviews.llvm.org/D70148
2019-11-19 14:57:35 -05:00
Evgenii Stepanov 2535fe5ad3 MTE: add more unchecked instructions.
Summary:
In particular, 1- and 2-byte loads and stores ignore the pointer tag
when using SP as the base register.

Reviewers: pcc, ostannard

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70341
2019-11-19 11:19:53 -08:00
David Green 882f23caea [ARM] MVE interleaving load and stores.
Now that we have the intrinsics, we can add VLD2/4 and VST2/4 lowering
for MVE. This works the same way as Neon, recognising the load/shuffles
combination and converting them into intrinsics in a pre-isel pass,
which just calls getMaxSupportedInterleaveFactor, lowerInterleavedLoad
and lowerInterleavedStore.

The main difference to Neon is that we do not have a VLD3 instruction.
Otherwise most of the code works very similarly, with just some minor
differences in the form of the intrinsics to work around. VLD3 is
disabled by making isLegalInterleavedAccessType return false for those
cases.

We may need some other future adjustments, such as VLD4 take up half the
available registers so should maybe cost more. This patch should get the
basics in though.

Differential Revision: https://reviews.llvm.org/D69392
2019-11-19 18:37:30 +00:00
diggerlin b91f798fde implement printing out raw section data of xcoff objectfile for llvm-objdump
SUMMARY:
implement printing out raw section data of xcoff objectfile for llvm-objdump
and option -D --disassemble-all option for llvm-objdump

Reviewers: Sean Fertile
Subscribers: rupprecht, seiyai,hiraditya

Differential Revision: https://reviews.llvm.org/D70255
2019-11-19 13:31:00 -05:00
Matt Arsenault 7fe9435dc8 Work on cleaning up denormal mode handling
Cleanup handling of the denormal-fp-math attribute. Consolidate places
checking the allowed names in one place.

This is in preparation for introducing FP type specific variants of
the denormal-fp-mode attribute. AMDGPU will switch to using this in
place of the current hacky use of subtarget features for the denormal
mode.

Introduce a new header for dealing with FP modes. The constrained
intrinsic classes define related enums that should also be moved into
this header for uses in other contexts.

The verifier could use a check to make sure the denorm-fp-mode
attribute is sane, but there currently isn't one.

Currently, DAGCombiner incorrectly asssumes non-IEEE behavior by
default in the one current user. Clang must be taught to start
emitting this attribute by default to avoid regressions when this is
switched to assume ieee behavior if the attribute isn't present.
2019-11-19 22:01:14 +05:30
jasonliu c9edaa828e [AIX][XCOFF] Write Function descriptors and TOC base to data section
This patch implements writing function descriptors and TOC base into
data section, and also add function descriptors(both csect and label)
and TOC base symbols to the symbol table.
2019-11-19 16:11:00 +00:00
Sanjay Patel 39de82ecc9 [SLP] fix insertion point for min/max reduction
As discussed in D70148 (and caused a revert of the original commit):
if we insert at the select, then we can produce invalid IR because
the replacement for the compare may have uses before the select.
2019-11-19 10:50:10 -05:00
Simon Tatham 254b4f2500 [ARM,MVE] Add intrinsics for scalar shifts.
This fills in the small family of MVE intrinsics that have nothing to
do with vectors: they implement bit-shift operations on 32- or 64-bit
values held in one or two general-purpose registers. Most of these
shift operations saturate if shifting left, and round to nearest if
shifting right, although LSLL and ASRL behave like ordinary shifts.

When these instructions take a variable shift count in a register,
they pay attention to its sign, so that (for example) LSLL or UQRSHLL
will shift left if given a positive number but right if given a
negative one. That makes even LSLL and ASRL different enough from
standard LLVM IR shift semantics that I couldn't see any better
alternative than to simply model the whole family as a set of
MVE-specific IR intrinsics.

(The //immediate// forms of LSLL and ASRL, on the other hand, do
behave exactly like a standard IR shift of a 64-bit value. In fact,
those forms don't have ACLE intrinsics defined at all, because you can
just write an ordinary C shift operation if you want one of those.)

The 64-bit shifts have to be instruction-selected in C++, because they
deliver two output values. But the 32-bit ones are simple enough that
I could write a DAG isel pattern directly into each Instruction
record.

Reviewers: ostannard, MarkMurrayARM, dmgreen

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D70319
2019-11-19 14:47:29 +00:00
Matt Arsenault db0ed3e429 AMDGPU: Refactor treatment of denormal mode
Start moving towards treating this as a property of the calling
convention, and not the subtarget. The default denormal mode should
not be part of the subtarget, and be moved into a separate function
attribute.

This patch is still NFC. The denormal mode remains as a subtarget
feature for now, but make the necessary changes to switch to using an
attribute.
2019-11-19 19:55:43 +05:30
Matt Arsenault ea23b6428b AMDGPU: Be explicit about denormal mode in MIR tests
Start checking the machine function in GlobalISel instead of the
target directly.

This temporarily breaks fcanonicalize selection in GlobalISel.
2019-11-19 19:55:43 +05:30
Matt Arsenault b696b9dba7 DAG: Add function context to isFMAFasterThanFMulAndFAdd
AMDGPU needs to know the FP mode for the function to answer this
correctly when this is removed from the subtarget.

AArch64 had to make this more complicated by using this from an IR
hook, so add an IR typed overload.
2019-11-19 19:25:26 +05:30
dfukalov 6fd11b14f6 [AMDGPU] Tune inlining parameters for AMDGPU target (part 2)
Summary:
Most of IR instructions got better code size estimations after commit 47a5c36b.
So default parameters values should be updated to improve inlining and
unrolling for the target.

Reviewers: rampitec, arsenm

Reviewed By: rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70391
2019-11-19 16:33:16 +03:00
evgeny ef5e3b85ee [ThinLTO] Simplify code. NFC 2019-11-19 15:51:25 +03:00
Simon Pilgrim bbf4af3109 [X86][SSE] Remove XFormVExtractWithShuffleIntoLoad to prevent legalization infinite loops (PR43971)
As detailed in PR43971/D70267, the use of XFormVExtractWithShuffleIntoLoad causes issues where we end up in infinite loops of extract(targetshuffle(vecload)) -> extract(shuffle(vecload)) -> extract(vecload) -> extract(targetshuffle(vecload)), there are just too many legalization checks at every stage that we can't guarantee that extract(shuffle(vecload)) -> scalarload can occur.

At the moment we see a number of minor regressions as we don't fold extract(shuffle(vecload)) -> scalarload before legal ops, these can be addressed in future patches and extension of X86ISelLowering's combineExtractWithShuffle.
2019-11-19 11:55:44 +00:00
Thomas Preud'homme a89ca4ae17 Fix PR44001: assert failure in getFunctionLocalOffsetAfterInsn
Summary:
Assert in getFunctionLocalOffsetAfterInsn() fails when processing a call
MachineInstr inside a bundle and compiling with debug info. This is
because labels are added by DwarfDebug::beginInstruction() which is
called for each top-level MI by EmitFunctionBody()'s for-loop iteration
but constructCallSiteEntryDIEs() which calls
getFunctionLocalOffsetAfterInsn() iterates over all MIs.

This commit modifies constructCallSiteEntryDIEs() to get the associated
bundle MI for call MIs inside a bundle and use that to when calling
getFunctionLocalOffsetAfterInsn() and getLabelAfterInsn(). It also skips
loop iterations for bundle MIs since the loop statements are concerned
with debug info for each physical instructions and bundles represent a
group of instructions. It also fix the comment about PCAddr since the
code is getting the return address and not the call address.

Reviewers: dstenb, vsk, aprantl, djtodoro, dblaikie, NikolaPrica

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70293
2019-11-19 11:23:11 +00:00
Simon Atanasyan 7deb8ce4c1 [mips] Joint MipsMemSimmXXXAsmOperand into the single template class. NFC 2019-11-19 13:58:28 +03:00
Evgeniy Brevnov 5f026b6d9e [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151
Summary:
Dependence anlysis has a mechanism to cache results. Thus for particular memory access the cache keep track of side effects in basic blocks. The problem is that for invariant loads dependepce analysis legally ignores many dependencies due to a special semantic rules for such loads. But later results calculated for invariant load retrived from the cache for general case acceses. As a result we have wrong dependence information causing GVN to do illegal transformation. Fixes, T42151.

Proposed solution is to disable caching of invariant loads. I think such loads a pretty rare and it doesn't make sense to extend caching mechanism for them.

Reviewers: reames, chandlerc, skatkov, morisset, jdoerfert

Reviewed By: reames

Subscribers: hiraditya, test, jdoerfert, lebedev.ri, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64405
2019-11-19 17:30:02 +07:00
evgeny 4ef9315c4b [ThinLTO] Make ValueInfo::operator bool() explicit
Differential revision: https://reviews.llvm.org/D70383
2019-11-19 12:46:09 +03:00
Pavel Labath 39285a0f02 Add streaming/equality operators to DWARFAddressRange/DWARFLocationExpression
The main motivation for this is being able to write simpler assertions
and get better error messages in unit tests.

Split off from D70394.
2019-11-19 10:34:30 +01:00
Pavel Labath c0fc29c468 Add operator<< for object::SectionedAddress
The main motivation for this is better failure messages in unit tests.

Split off from D70394.
2019-11-19 10:34:30 +01:00
Sam Parker d43913ae38 [ARM][MVE] Enable narrow vectors for tail pred
Remove the restriction, from the mve tail predication pass, that the
all masked vectors instructions need to be 128-bits. This allows us
to supported extending loads and truncating stores.

Differential Revision: https://reviews.llvm.org/D69946
2019-11-19 08:51:12 +00:00
Evgeniy Brevnov 4a64d710ae [NFC] Test commit. Please ignore.
As a test commit I fixed a misspelling in one of comments in SLP
vectorizer.
2019-11-19 15:41:57 +07:00
Sam Parker 8978c12b39 [ARM][MVE] Tail predication conversion
This patch modifies ARMLowOverheadLoops to convert a predicated
vector low-overhead loop into a tail-predicatd one. This is currently
a very basic conversion, with the following restrictions:
- Operates only on single block loops.
- The loop can only contain a single vctp instruction.
- No other instructions can write to the vpr.
- We only allow a subset of the mve instructions in the loop.

TODO: Pass the number of elements, not the number of iterations to
dlstp/wlstp.

Differential Revision: https://reviews.llvm.org/D69945
2019-11-19 08:22:18 +00:00
Craig Topper dc02eb1909 [SelectionDAG] Merge the two identical ExpandChainLibCall methods from LegalizeTypes and LegalizeDAG to one version in TaretLowering.
Reviewers: RKSimon, efriedma, spatel

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70354
2019-11-18 20:22:33 -08:00
Leonard Chan 66b6b92765 Revert "implement printing out raw section data of xcoff objectfile for llvm-objdump"
This reverts commit 8f8a9f3437.

Reverting since this patch seems to break a lot of llvm buildbots.
2019-11-18 20:05:57 -08:00
Andrew Browne 6a1b51282b Fix error message missed in commit dde589389f.
Patch by Andrew Browne <browneee@google.com>

Reviewers: tejohnson, evgeny777

Reviewed By: tejohnson

Subscribers: arphaman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70195
2019-11-18 16:04:09 -08:00
Teresa Johnson aeca47fa0f ThinLTO: Fix assembler to emit alwaysInline in the summary
Summary: The earlier commit (https://reviews.llvm.org/D70014) missed this one : If Always_Inline happens to be the only entry in FuncFlags, then the assembler will not print it in the summary.

Patch by Bharathi Seshadri <bseshadr@cisco.com>

Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70323
2019-11-18 15:02:13 -08:00