Commit Graph

39563 Commits

Author SHA1 Message Date
Evgeny Leviant 8973fae195 [WPD] Allow load/save bitcoded index when running opt -wholeprogramdevirt
Differential revision: https://reviews.llvm.org/D73094
2020-01-24 00:31:39 -08:00
Teresa Johnson 90e630a95e Revert "[LTO/WPD] Enable aggressive WPD under LTO option"
This reverts commit 59733525d3.

There is a windows sanitizer bot failure in one of the cfi tests
that I will need some time to figure out:
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/57155/steps/stage%201%20check/logs/stdio
2020-01-23 17:29:24 -08:00
Fangrui Song 22467e2595 Add function attribute "patchable-function-prefix" to support -fpatchable-function-entry=N,M where M>0
Similar to the function attribute `prefix` (prefix data),
"patchable-function-prefix" inserts data (M NOPs) before the function
entry label.

-fpatchable-function-entry=2,1 (1 NOP before entry, 1 NOP after entry)
will look like:

```
  .type	foo,@function
.Ltmp0:               # @foo
  nop
foo:
.Lfunc_begin0:
  # optional `bti c` (AArch64 Branch Target Identification) or
  # `endbr64` (Intel Indirect Branch Tracking)
  nop

  .section  __patchable_function_entries,"awo",@progbits,get,unique,0
  .p2align  3
  .quad .Ltmp0
```

-fpatchable-function-entry=N,0 + -mbranch-protection=bti/-fcf-protection=branch has two reasonable
placements (https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01185.html):

```
(a)         (b)

func:       func:
.Ltmp0:     bti c
  bti c     .Ltmp0:
  nop       nop
```

(a) needs no additional code. If the consensus is to go for (b), we will
need more code in AArch64BranchTargets.cpp / X86IndirectBranchTracking.cpp .

Differential Revision: https://reviews.llvm.org/D73070
2020-01-23 17:02:27 -08:00
Johannes Doerfert 5429c82db2 [Attributor][FIX] Avoid dangling pointers during code deletion
It can happen that we have instructions in the ToBeDeletedInsts set
which are deleted earlier already. To avoid dangling pointers we use
weak tracking handles.
2020-01-23 18:42:45 -06:00
Alina Sbirlea 1d09174290 [LoopStrengthReduce] Reuse utility method to clean dead instructions. [NFCI]
Create a utility wrapper for the RecursivelyDeleteTriviallyDeadInstructions utility
method, which sets to nullptr the instructions that are not trivially
dead. Use the new method in LoopStrengthReduce.
Alternative: add a bool to the same method; this option adds a marginal
amount of overhead to the other callers, and the method needs to be
updated to return a bool status when it removes/doesn't remove
instructions.
2020-01-23 16:27:32 -08:00
Teresa Johnson 59733525d3 [LTO/WPD] Enable aggressive WPD under LTO option
Summary:
Third part in series to support Safe Whole Program Devirtualization
Enablement, see RFC here:
http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html

This patch adds type test metadata under -fwhole-program-vtables,
even for classes without hidden visibility. It then changes WPD to skip
devirtualization for a virtual function call when any of the compatible
vtables has public vcall visibility.

Additionally, internal LLVM options as well as lld and gold-plugin
options are added which enable upgrading all public vcall visibility
to linkage unit (hidden) visibility during LTO. This enables the more
aggressive WPD to kick in based on LTO time knowledge of the visibility
guarantees.

Support was added to all flavors of LTO WPD (regular, hybrid and
index-only), and to both the new and old LTO APIs.

Unfortunately it was not simple to split the first and second parts of
this part of the change (the unconditional emission of type tests and
the upgrading of the vcall visibility) as I needed a way to upgrade the
public visibility on legacy WPD llvm assembly tests that don't include
linkage unit vcall visibility specifiers, to avoid a lot of test churn.

I also added a mechanism to LowerTypeTests that allows dropping type
test assume sequences we now aggressively insert when we invoke
distributed ThinLTO backends with null indexes, which is used in testing
mode, and which doesn't invoke the normal ThinLTO backend pipeline.

Depends on D71907 and D71911.

Reviewers: pcc, evgeny777, steven_wu, espindola

Subscribers: emaste, Prazek, inglorion, arichardson, hiraditya, MaskRay, dexonsmith, dang, davidxl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71913
2020-01-23 16:09:44 -08:00
Alina Sbirlea 9e66c4ec12 [Utils] Use WeakTrackingVH in vector used as scratch storage.
The utility method RecursivelyDeleteTriviallyDeadInstructions receives
as input a vector of Instructions, where all inputs are valid
instructions. This same vector is used as a scratch storage (per the
header comment) to recursively delete instructions. If an instruction is
added as an operand of multiple other instructions, it may be added twice,
then deleted once, then the second reference in the vector is invalid.
Switch to using a Vector<WeakTrackingVH>.
This change facilitates a clean-up in LoopStrengthReduction.
2020-01-23 16:04:57 -08:00
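The duplicate-entry hazard described in 9e66c4ec12 can be modelled outside LLVM. The following is a minimal sketch, assuming `std::weak_ptr` as a stand-in for `WeakTrackingVH`; the `Inst` type and the `deleteWorklist` helper are hypothetical and are not the LLVM utility itself.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for an instruction; not llvm::Instruction.
struct Inst { int Id; };

// Erases every live instruction named by the worklist from Function and
// returns how many were erased. An entry whose instruction is already gone
// locks to null and is skipped, so duplicate entries are harmless.
unsigned deleteWorklist(std::vector<std::shared_ptr<Inst>> &Function,
                        std::vector<std::weak_ptr<Inst>> &Worklist) {
  unsigned Deleted = 0;
  while (!Worklist.empty()) {
    std::shared_ptr<Inst> I = Worklist.back().lock();
    Worklist.pop_back();
    if (!I)
      continue; // Stale entry: the instruction was deleted earlier.
    auto It = std::find(Function.begin(), Function.end(), I);
    assert(It != Function.end() && "live instruction must belong to Function");
    Function.erase(It); // The object dies when I goes out of scope below.
    ++Deleted;
  }
  return Deleted;
}
```

With raw pointers in the worklist, the second copy of a duplicated entry would dangle after the first deletion; with a weak handle it simply reads as null.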
Matt Arsenault c77bbea9a6 GlobalISel: Add MIPatternMatch for G_ICMP/G_FCMP 2020-01-23 13:30:47 -08:00
Teresa Johnson 9c2eb220ed [ThinLTO] Summarize vcall_visibility metadata
Summary:
Second patch in series to support Safe Whole Program Devirtualization
Enablement, see RFC here:
http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html

Summarize vcall_visibility metadata in ThinLTO global variable summary.

Depends on D71907.

Reviewers: pcc, evgeny777, steven_wu

Subscribers: mehdi_amini, Prazek, inglorion, hiraditya, dexonsmith, arphaman, ostannard, llvm-commits, cfe-commits, davidxl

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71911
2020-01-23 13:19:56 -08:00
Reid Kleckner e5caa156b4 [PDB] Simplify API for making section map, NFC
Prevents API misuse described in PR44495
2020-01-23 12:15:21 -08:00
Teresa Johnson 458676db6e [WPD/VFE] Always emit vcall_visibility metadata for -fwhole-program-vtables
Summary:
First patch to support Safe Whole Program Devirtualization Enablement,
see RFC here: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html

Always emit !vcall_visibility metadata under -fwhole-program-vtables,
and not just for -fvirtual-function-elimination. The vcall visibility
metadata will (in a subsequent patch) be used to communicate to WPD
which vtables are safe to devirtualize, and we will optionally convert
the metadata to hidden visibility at link time. Subsequent follow on
patches will help enable this by adding vcall_visibility metadata to the
ThinLTO summaries, and always emit type test intrinsics under
-fwhole-program-vtables (and not just for vtables with hidden
visibility).

In order to do this safely with VFE, since for VFE all vtable loads must
be type checked loads which will no longer be the case, this patch adds
a new "Virtual Function Elim" module flag to communicate to GlobalDCE
whether to perform VFE using the vcall_visibility metadata.

One additional advantage of using the vcall_visibility metadata to drive
more WPD at LTO link time is that we can use the same mechanism to
enable more aggressive VFE at LTO link time as well. The link time
option proposed in the RFC will convert vcall_visibility metadata to
hidden (aka linkage unit visibility), which combined with
-fvirtual-function-elimination will allow it to be done more
aggressively at LTO link time under the same conditions.

Reviewers: pcc, ostannard, evgeny777, steven_wu

Subscribers: mehdi_amini, Prazek, hiraditya, dexonsmith, davidxl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71907
2020-01-23 11:36:01 -08:00
Alina Sbirlea a0f627d584 [IndVarSimplify] Fix for MemorySSA preserve. 2020-01-23 11:06:16 -08:00
Justin Bogner b81a337be7 [LoopUnroll] Avoid UB when converting from WeakVH to `Value *`
Calling `operator*` on a WeakVH with a null value yields a null
reference, which is UB. Avoid this by implicitly converting the WeakVH
to a `Value *` rather than dereferencing and then taking the address
for the type conversion.

Differential Revision: https://reviews.llvm.org/D73280
2020-01-23 10:36:39 -08:00
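The difference between the two conversions in b81a337be7 can be sketched with a small handle class. This is an illustration only: `Value` and `Handle` below are hypothetical stand-ins, not `llvm::Value` or `llvm::WeakVH`.

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::Value.
struct Value { int Id = 0; };

// Hypothetical stand-in for a value handle that may hold null.
class Handle {
  Value *V = nullptr;

public:
  Handle() = default;
  explicit Handle(Value *P) : V(P) {}

  // Implicit conversion: well defined even when the handle is null.
  operator Value *() const { return V; }

  // Dereference: only valid on a non-null handle.
  Value &operator*() const {
    assert(V && "dereferencing a null handle");
    return *V;
  }
};

// Safe conversion: uses operator Value *() and never forms a reference.
inline Value *toPointer(const Handle &H) { return H; }

// The pattern the commit avoids is `&*H`: on a null handle it forms a null
// reference before taking its address, which is undefined behaviour.
```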
Danilo Carvalho Grael 58ceb81d31 [SVE] Add SVE2 patterns for unpredicated multiply instructions
Summary:
Add patterns for SVE2 unpredicated multiply instructions:
- mul, smulh, umulh, pmul, sqdmulh, sqrdmulh

Reviewers: sdesmalen, huntergr, efriedma, c-rhodes, kmclaughlin, rengolin

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits, amehsan

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72799
2020-01-23 13:20:53 -05:00
Matt Arsenault 4faf71a143 GlobalISel: Use Register 2020-01-23 12:04:20 -05:00
Guillaume Chatelet 59f95222d4 [Alignment][NFC] Use Align with CreateAlignedStore
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, bollu

Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73274
2020-01-23 17:34:32 +01:00
Michael Liao 398175e5c7 Fix GCC warning/error '-fpermissive'. NFC. 2020-01-23 10:45:02 -05:00
Alexey Lapshin a8c5a461a8 [Dsymutil][Debuginfo][NFC] #4 Refactor dsymutil to separate DWARF optimizing part.
Summary:
The primary goal of this refactoring is to separate the DWARF optimizing part
so that it can be reused by the linker or by any other client.
There was a thread on llvm-dev discussing the necessity of such a refactoring:

http://lists.llvm.org/pipermail/llvm-dev/2019-September/135068.html.

This is the final part of a series of patches for dsymutil.
Previous patches: D71068, D71839, D72476. This patch:

1. Creates lib/DWARFLinker interface :

   void addObjectFile(DwarfLinkerObjFile &ObjFile);
   bool link();
   void setOptions;

2. Moves all linking logic from tools/dsymutil/DwarfLinkerForBinary
   into lib/DWARFLinker.
3. Renames RelocationManager into AddressesManager.
4. Moves remarks creation logic from separate parallel execution
   into the object file loading routine.

Testing: it passes "check-all" lit testing. MD5 checksum for clang .dSYM bundle
matches for the dsymutil with/without that patch.

Reviewers: JDevlieghere, friss, dblaikie, aprantl, jdoerfert

Reviewed By: JDevlieghere

Subscribers: merge_guards_bot, hiraditya, jfb, llvm-commits, probinson, thegameg

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D72915
2020-01-23 18:16:32 +03:00
Kazu Hirata 41784bed01 Revert "Resubmit: [JumpThreading] Thread jumps through two basic blocks"
This reverts commit 53b68e676f.

Our internal tests are showing breakage with this patch.
2020-01-23 06:34:03 -08:00
Sam Parker 0d1468db58 [NFC][RDA] Make the interface const
Make all the public query methods const.
2020-01-23 13:32:11 +00:00
Guillaume Chatelet 279fa8e006 [Alignment][NFC] Deprecate untyped CreateAlignedLoad
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73260
2020-01-23 13:34:32 +01:00
Kerry McLaughlin aa0f37e14a [AArch64][SVE] Add first-faulting load intrinsic
Summary:
Implements the llvm.aarch64.sve.ldff1 intrinsic and DAG
combine rules for first-faulting loads with sign & zero extends

Reviewers: sdesmalen, efriedma, andwar, dancgr, rengolin

Reviewed By: sdesmalen

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cameron.mcinally, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73025
2020-01-23 11:57:16 +00:00
Igor Kudrin 8306f55bfa [DWARF] Eliminate the DWARFDebugNames::Header::Padding field.
The padding field is reserved for DWARF and does not contain any useful
information. No need to read, store and report it.

Differential Revision: https://reviews.llvm.org/D73042
2020-01-23 15:11:58 +07:00
Igor Kudrin 99960de741 [DWARF] Get rid of DWARFDebugNames::HeaderPOD. NFC.
This structure was used to get the size of the fixed-size part of a Name
Index header for 32-bit DWARF. It is unsuitable for 64-bit DWARF because
the size of the unit length field is different.

Differential Revision: https://reviews.llvm.org/D73040
2020-01-23 15:11:58 +07:00
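The reason a single fixed-layout struct cannot describe both formats (99960de741) is that the initial length field differs: DWARF32 uses a 4-byte length, while DWARF64 uses a 4-byte 0xffffffff escape followed by an 8-byte length. A minimal sketch, assuming hypothetical helper names that only mirror the idea and are not the LLVM functions:

```cpp
#include <cstdint>

enum class DwarfFormat { DWARF32, DWARF64 };

// Size of the unit length field: 4 bytes in DWARF32; in DWARF64 a 4-byte
// 0xffffffff escape followed by an 8-byte length, 12 bytes in total.
constexpr std::uint8_t unitLengthFieldByteSize(DwarfFormat F) {
  return F == DwarfFormat::DWARF32 ? 4 : 12;
}

// Size of a section offset: 4 bytes in DWARF32, 8 bytes in DWARF64.
constexpr std::uint8_t offsetByteSize(DwarfFormat F) {
  return F == DwarfFormat::DWARF32 ? 4 : 8;
}

// Fixed header size, given the size of the fields that follow the unit
// length. FieldsAfterLength is supplied by the caller in this sketch.
constexpr std::uint64_t fixedHeaderSize(DwarfFormat F,
                                        std::uint64_t FieldsAfterLength) {
  return unitLengthFieldByteSize(F) + FieldsAfterLength;
}

static_assert(unitLengthFieldByteSize(DwarfFormat::DWARF64) -
                      unitLengthFieldByteSize(DwarfFormat::DWARF32) ==
                  8,
              "the two formats differ by 8 bytes in the length field");
```

Computing the size from the format, rather than from `sizeof` of a packed struct, keeps the same code path valid for both formats.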
Igor Kudrin 5a9ef6c15f [DWARF] Support 64-bit DWARF in .debug_pubnames and similar tables.
Differential Revision: https://reviews.llvm.org/D73103
2020-01-23 14:51:00 +07:00
Daniil Suchkov 6fc9e60149 NFC. Remove obsolete SimpleAnalysis infrastructure
Apparently the cache of AliasSetTrackers held by LICM was the only user of
SimpleAnalysis infrastructure. Now, given that we no longer have that
cache, this infrastructure is obsolete and, taking into account its
nature, we don't want any new solutions to be based on it.

Reviewers: asbirlea, fhahn, efriedma, reames

Reviewed-By: asbirlea

Differential Revision: https://reviews.llvm.org/D73085
2020-01-23 13:58:30 +07:00
Igor Kudrin 15ac727714 Fix build bot failures.
Unfortunately, not all compilers allow using llvm_unreachable
in a constexpr function.
2020-01-23 13:14:21 +07:00
Igor Kudrin 6332990721 [DWARF] Support DWARF64 in DWARFDebugArangeSet.
This allows parsing Address Range Tables in the 64-bit DWARF format.

Differential Revision: https://reviews.llvm.org/D71876
2020-01-23 12:41:05 +07:00
Igor Kudrin a0f367f792 [DWARF] Make dwarf::getDwarfOffsetByteSize() a free function. NFC.
This will help simplify code in upcoming patches and make some
expressions constexpr.

Differential Revision: https://reviews.llvm.org/D73039
2020-01-23 12:41:05 +07:00
Igor Kudrin d6f39cfed0 [DWARF] Make dwarf::getUnitLengthFieldByteSize() constexpr. NFC.
This will help make some expressions in upcoming patches constexpr.

Differential Revision: https://reviews.llvm.org/D73036
2020-01-23 12:41:05 +07:00
Igor Kudrin dcff3961c2 [DWARF] Return Error from DWARFDebugArangeSet::extract().
This helps to detect and report parsing errors better.
The patch follows the ideas of LLDB's patches D59370 and D59381.

It adds tests for valid and some invalid cases. More checks and
tests to come. Note that the patch fixes validation of the Length
field because the value does not include the field itself.

The existing users are updated to show the error messages.

Differential Revision: https://reviews.llvm.org/D71875
2020-01-23 12:41:05 +07:00
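The Length validation mentioned in dcff3961c2 hinges on the fact that the Length value does not count the length field itself. A minimal sketch of such a bounds check, assuming a hypothetical `lengthFits` helper (the checks actually performed by the patch are not reproduced here):

```cpp
#include <cassert>
#include <cstdint>

// Returns true if a set starting at Offset, whose header declares Length,
// fits inside a table of TableSize bytes. The set occupies the length field
// plus Length further bytes, because Length excludes the field itself.
bool lengthFits(std::uint64_t TableSize, std::uint64_t Offset,
                std::uint64_t Length, std::uint64_t LengthFieldSize) {
  assert((LengthFieldSize == 4 || LengthFieldSize == 12) &&
         "4 bytes for DWARF32, 12 bytes for DWARF64");
  if (Offset > TableSize)
    return false;
  const std::uint64_t Remaining = TableSize - Offset;
  if (Remaining < LengthFieldSize)
    return false; // Not even the length field fits.
  // Subtract instead of adding to avoid overflow on huge Length values.
  return Length <= Remaining - LengthFieldSize;
}
```

For a 48-byte table holding one DWARF32 set at offset 0, the largest valid Length is 44, not 48.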
James Clarke 3f5976c97d [RISCV] Fix evaluating %pcrel_lo against global and weak symbols
Summary:
Previously, we would erroneously turn %pcrel_lo(label), where label has
a %pcrel_hi against a weak symbol, into %pcrel_lo(label + offset), as
evaluatePCRelLo would believe the target independent logic was going to
fold it. Moreover, even if that were fixed, shouldForceRelocation lacks
an MCAsmLayout and thus cannot evaluate the %pcrel_hi fixup to a value
and check the symbol, so we would then erroneously constant-fold the
%pcrel_lo whilst leaving the %pcrel_hi intact. After D72197, this same
sequence also occurs for symbols with global binding, which is triggered
in real-world code.

Instead, as discussed in D71978, we introduce a new FKF_IsTarget flag to
avoid these kinds of issues. All the resolution logic happens in one
place, with no coordination required between RISCVAsmBackend and
RISCVMCExpr to ensure they implement the same logic twice. Although the
implementation of %pcrel_hi can be left as target independent, we make
it target dependent to ensure that they are handled identically to
%pcrel_lo, otherwise we risk one of them being constant folded but the
other being preserved. This also allows us to properly support fixup
pairs where the instructions are in different fragments.

Reviewers: asb, lenary, efriedma

Reviewed By: efriedma

Subscribers: arichardson, hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73211
2020-01-23 02:05:48 +00:00
Nikita Popov efba7ed05e [PatternMatch] Make m_c_ICmp swap the predicate (PR42801)
This addresses https://bugs.llvm.org/show_bug.cgi?id=42801.
The m_c_ICmp() matcher is changed to provide the swapped predicate
if the operands are swapped.

Existing uses of m_c_ICmp() fall in one of two categories: Working
on equality predicates only, where swapping is irrelevant.
Or performing a manual swap, in which case this patch removes it.

The only exception is the foldICmpWithLowBitMaskedVal() fold, which
does not swap the predicate, and instead reasons about whether
a swap occurred or not for each predicate. Getting the swapped
predicate allows us to merge the logic for pairs of predicates,
instead of duplicating it.

Differential Revision: https://reviews.llvm.org/D72976
2020-01-22 22:56:26 +01:00
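The behaviour described in efba7ed05e, where a commutative match reports the swapped predicate when the operands are matched in reverse order, can be sketched as follows. The `Pred`, `ICmp` and `matchCommutative` names are hypothetical and only model the idea; they are not the PatternMatch API.

```cpp
#include <cassert>

enum class Pred { EQ, NE, SLT, SGT, SLE, SGE };

// The predicate that holds for (B, A) exactly when P holds for (A, B).
inline Pred swapped(Pred P) {
  switch (P) {
  case Pred::SLT: return Pred::SGT;
  case Pred::SGT: return Pred::SLT;
  case Pred::SLE: return Pred::SGE;
  case Pred::SGE: return Pred::SLE;
  default:        return P; // EQ and NE are symmetric.
  }
}

// Hypothetical compare node; operands are identified by integer ids.
struct ICmp { Pred P; int LHS; int RHS; };

// Matches "icmp P, Want, Other" in either operand order. When the match
// needed the operands swapped, OutPred is the swapped predicate, so the
// caller can always reason as if Want were on the left.
bool matchCommutative(const ICmp &I, int Want, Pred &OutPred, int &OutOther) {
  assert(I.LHS != I.RHS && "sketch assumes distinct operands");
  if (I.LHS == Want) {
    OutPred = I.P;
    OutOther = I.RHS;
    return true;
  }
  if (I.RHS == Want) {
    OutPred = swapped(I.P);
    OutOther = I.LHS;
    return true;
  }
  return false;
}
```

Callers that only handle equality predicates are unaffected, since `swapped` leaves EQ and NE unchanged.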
Nikita Popov ed80c86c88 [PatternMatch] Add m_APInt/m_APFloat matchers accepting undef
The current m_APInt() and m_APFloat() matchers do not accept splats
that include undefs (unlike m_Zero() and other matchers for specific
values). We can't simply change the default behavior, as there are
existing transforms that would not be safe with undefs.

For this reason, I'm introducing new m_APIntAllowUndef() and
m_APFloatAllowUndef() matchers, that allow splats with undefs.
Additionally, m_APIntForbidUndef() and m_APFloatForbidUndef() are
added. These have the same behavior as the existing m_APInt() and
m_APFloat(), but serve as an explicit indication that undefs were
considered and found unsound for this transform. This helps
distinguish them from existing uses of m_APInt() where we do not
know whether undefs can or cannot be allowed without additional review.

Differential Revision: https://reviews.llvm.org/D72975
2020-01-22 22:49:32 +01:00
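The distinction drawn in ed80c86c88 between matchers that accept undef lanes and those that reject them can be modelled on a plain vector. This is a rough sketch under stated assumptions: `Lane` and `matchSplat` are hypothetical, and the real matchers operate on LLVM constants rather than on this structure.

```cpp
#include <vector>

// Hypothetical vector lane: either an undef lane or a constant value.
struct Lane {
  bool IsUndef;
  int Value;
};

// Returns true and sets Out if every defined lane holds the same constant.
// Undef lanes are skipped when AllowUndef is true and reject the match
// otherwise. A vector with no defined lane never matches.
bool matchSplat(const std::vector<Lane> &Lanes, bool AllowUndef, int &Out) {
  bool Seen = false;
  int Splat = 0;
  for (const Lane &L : Lanes) {
    if (L.IsUndef) {
      if (!AllowUndef)
        return false;
      continue;
    }
    if (Seen && L.Value != Splat)
      return false; // Two defined lanes disagree: not a splat.
    Splat = L.Value;
    Seen = true;
  }
  if (!Seen)
    return false;
  Out = Splat;
  return true;
}
```

A transform that is unsound when a lane is undef would pass `AllowUndef = false`, which corresponds to the explicit "forbid" variants in the commit message.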
Alina Sbirlea efb130fc93 [LoopDeletion] Teach LoopDeletion to preserve MemorySSA if available.
If MemorySSA analysis is available, LoopDeletion now preserves it.
2020-01-22 11:38:38 -08:00
Aaron Ballman 90cfbb8167 Add LLVM_VALUE_FUNCTION to Optional::map(); NFC
This is for future-proofing when compiling with MSVC once we drop support for 2017.
2020-01-22 14:21:08 -05:00
Aaron Ballman 1e4764e103 Add a comment about when we can remove this construct; NFC. 2020-01-22 13:17:38 -05:00
Nico Weber cd470717d1 Revert "[DA][TTI][AMDGPU] Add option to select GPUDA with TTI"
This reverts commit a90a6502ab.
Broke tests on Windows: http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/13808
2020-01-22 12:56:19 -05:00
Aaron Ballman dfe9f130e0 Revert "Unconditionally enable lvalue function designators; NFC"
This reverts commit 968561bcdc
2020-01-22 12:40:39 -05:00
David Tenty 45a4aaea7f [NFC][XCOFF] Refactor Csect creation into TargetLoweringObjectFile
Summary:
We create a number of standard types of control sections in multiple places for
things like the function descriptors, external references and the TOC anchor
among others, so it is possible for their properties to be defined
inconsistently in different places. This refactor moves their creation and
properties into functions in the TargetLoweringObjectFile class hierarchy, where
functions for retrieving various special types of sections typically seem
to reside.

Note: There is one case in PPCISelLowering which is specific to function entry
points which we don't address since we don't have access to the TLOF there.

Reviewers: DiggerLin, jasonliu, hubert.reinterpretcast

Reviewed By: jasonliu, hubert.reinterpretcast

Subscribers: wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72347
2020-01-22 12:09:11 -05:00
Aaron Ballman 968561bcdc Unconditionally enable lvalue function designators; NFC
We previously had to guard against older MSVC and GCC versions which had rvalue
references but not support for marking functions with ref qualifiers. However,
having bumped our minimum required version to MSVC 2017 and GCC 5.1 means we can
unconditionally enable this feature. Rather than keeping the macro around, this
replaces use of the macro with the actual ref qualifier.
2020-01-22 09:54:34 -05:00
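For readers unfamiliar with the feature that 968561bcdc enables unconditionally, the following sketch shows ref-qualified member functions in standard C++11. The `Box` class is hypothetical and is not taken from the LLVM sources.

```cpp
#include <string>
#include <utility>

// Hypothetical wrapper with one accessor per value category of *this.
class Box {
  std::string Data;

public:
  explicit Box(std::string S) : Data(std::move(S)) {}

  // Selected when called on an lvalue: hands out a reference, no copy.
  const std::string &get() const & { return Data; }

  // Selected when called on an rvalue: the object is about to go away,
  // so its contents can be moved out instead of copied.
  std::string get() && { return std::move(Data); }
};
```

Writing the `&` and `&&` qualifiers directly, instead of hiding them behind a compatibility macro, is only possible once every supported compiler implements them.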
Sander de Smalen 4cf16efe49 [AArch64][SVE] Add patterns for unpredicated load/store to frame-indices.
This patch also fixes up a number of cases in DAGCombine and
SelectionDAGBuilder where the size of a scalable vector is used in a
fixed-width context (thus triggering an assertion failure).

Reviewers: efriedma, c-rhodes, rovka, cameron.mcinally

Reviewed By: efriedma

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71215
2020-01-22 14:32:27 +00:00
Jay Foad e0f0d0e55c [MachineScheduler] Allow clustering mem ops with complex addresses
The generic BaseMemOpClusterMutation calls into TargetInstrInfo to
analyze the address of each load/store instruction, and again to decide
whether two instructions should be clustered. Previously this had to
represent each address as a single base operand plus a constant byte
offset. This patch extends it to support any number of base operands.

The old target hook getMemOperandWithOffset is now a convenience
function for callers that are only prepared to handle a single base
operand. It calls the new more general target hook
getMemOperandsWithOffset.

The only requirements for the base operands returned by
getMemOperandsWithOffset are:
- they can be sorted by MemOpInfo::Compare, such that clusterable ops
  get sorted next to each other, and
- shouldClusterMemOps knows what they mean.

One simple follow-on is to enable clustering of AMDGPU FLAT instructions
with both vaddr and saddr (base register + offset register). I've left
a FIXME in the code for this case.

Differential Revision: https://reviews.llvm.org/D71655
2020-01-22 14:28:24 +00:00
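The relationship between the two hooks described in e0f0d0e55c can be sketched as a wrapper that succeeds only when the general query reports exactly one base operand. The types and signatures below are simplified assumptions for illustration; they do not match the TargetInstrInfo declarations.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical memory instruction: base operands are plain integer ids.
struct MemInstr {
  std::vector<int> Bases;
  std::int64_t Offset;
  bool Analyzable;
};

// General hook: reports every base operand plus the constant byte offset.
bool getMemOperandsWithOffset(const MemInstr &MI, std::vector<int> &BaseOps,
                              std::int64_t &Offset) {
  if (!MI.Analyzable)
    return false;
  assert(!MI.Bases.empty() && "an analyzable access has at least one base");
  BaseOps = MI.Bases;
  Offset = MI.Offset;
  return true;
}

// Convenience wrapper for callers that can only handle a single base
// operand: it fails when the general hook reports several.
bool getMemOperandWithOffset(const MemInstr &MI, int &BaseOp,
                             std::int64_t &Offset) {
  std::vector<int> BaseOps;
  std::int64_t Off = 0;
  if (!getMemOperandsWithOffset(MI, BaseOps, Off) || BaseOps.size() != 1)
    return false;
  BaseOp = BaseOps.front();
  Offset = Off;
  return true;
}
```

Existing single-base callers keep working unchanged, while clustering code can call the general hook and compare whole operand lists.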
Matt Arsenault 64e9528201 AMDGPU: Fix missing immarg on llvm.amdgcn.interp.mov
The first operand maps to an immediate field, so this should be
immarg.
2020-01-22 09:01:34 -05:00
Kerry McLaughlin cdcc4f2a44 [AArch64][SVE] Add intrinsic for non-faulting loads
Summary:
This patch adds the llvm.aarch64.sve.ldnf1 intrinsic, plus
DAG combine rules for non-faulting loads and sign/zero extends

Reviewers: sdesmalen, efriedma, andwar, dancgr, mgudim, rengolin

Reviewed By: sdesmalen

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cameron.mcinally, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71698
2020-01-22 11:15:20 +00:00
Sander de Smalen 67d4c9924c Add support for (expressing) vscale.
In LLVM IR, vscale can be represented with an intrinsic. For some targets,
this is equivalent to the constexpr:

  getelementptr <vscale x 1 x i8>, <vscale x 1 x i8>* null, i32 1

This can be used to propagate the value in CodeGenPrepare.

In ISel we add a node that can be legalized to one or more
instructions to materialize the runtime vector length.

This patch also adds SVE CodeGen support for VSCALE, which maps this
node to RDVL instructions (for scaled multiples of 16 bytes) or CNT[HSD]
instructions (scaled multiples of 2, 4, or 8 bytes, respectively).

Reviewers: rengolin, cameron.mcinally, hfinkel, sebpop, SjoerdMeijer, efriedma, lattner

Reviewed by: efriedma

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68203
2020-01-22 10:09:27 +00:00
Guillaume Chatelet 0957233320 [Alignment][NFC] Use Align with CreateMaskedStore
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73106
2020-01-22 11:04:39 +01:00
Austin Kerbow a90a6502ab [DA][TTI][AMDGPU] Add option to select GPUDA with TTI
Summary: Enable the new divergence analysis by default for AMDGPU.

Reviewers: rampitec, nhaehnle, arsenm

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73049
2020-01-21 21:13:20 -08:00
Lang Hames ce2207abaf [ORC] Add support for emulated TLS to ORCv2.
This commit adds a ManglingOptions struct to IRMaterializationUnit, and replaces
IRCompileLayer::CompileFunction with a new IRCompileLayer::IRCompiler class. The
ManglingOptions struct defines the emulated-TLS state (via a bool member,
EmulatedTLS, which is true if emulated-TLS is enabled and false otherwise). The
IRCompileLayer::IRCompiler class wraps an IRCompiler (the same way that the
CompileFunction typedef used to), but adds a method to return the
IRCompileLayer::ManglingOptions that the compiler will use.

These changes allow us to correctly determine the symbols that will be produced
when a thread local global variable defined at the IR level is compiled with or
without emulated TLS. This is required for ORCv2, where MaterializationUnits
must declare their interface up-front.

Most ORCv2 clients should not require any changes. Clients writing custom IR
compilers will need to wrap their compiler in an IRCompileLayer::IRCompiler,
rather than an IRCompileLayer::CompileFunction, however this should be a
straightforward change (see modifications to CompileUtils.* in this patch for an
example).
2020-01-21 19:55:33 -08:00
Amara Emerson 67a8775322 [AArch64] Don't generate gpr CSEL instructions in early-ifcvt if regclasses aren't compatible.
In GlobalISel we may in some unfortunate circumstances generate PHIs with
operands that are on separate banks. If-conversion doesn't currently check for
that case and ends up generating a CSEL on AArch64 with incorrect register
operands.

Differential Revision: https://reviews.llvm.org/D72961
2020-01-21 16:51:31 -08:00
Quentin Colombet ff1f3cc1a1 [GISelKnownBits] Make the max depth a parameter of the analysis
Allow users of that analysis to define the cut off depth of the
analysis instead of hardcoding 6.

NFC as the default parameter is 6.
2020-01-21 11:35:31 -08:00
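The shape of the change in ff1f3cc1a1, a cutoff depth that becomes a constructor parameter whose default keeps the old behaviour, can be sketched as follows. `Node` and `DepthLimitedAnalysis` are hypothetical and do not reflect the GISelKnownBits interface.

```cpp
#include <cassert>

// Hypothetical value whose definition depends on at most one operand.
struct Node {
  const Node *Operand;
};

// Walks operand chains but gives up after MaxDepth steps. The default of 6
// reproduces the previously hardcoded cutoff, so existing users see no
// functional change.
class DepthLimitedAnalysis {
  unsigned MaxDepth;

public:
  explicit DepthLimitedAnalysis(unsigned Max = 6) : MaxDepth(Max) {
    assert(Max > 0 && "a zero cutoff would analyze nothing");
  }

  unsigned getMaxDepth() const { return MaxDepth; }

  // Returns how many nodes are visited before the chain ends or the
  // cutoff is reached.
  unsigned visit(const Node *N, unsigned Depth = 0) const {
    if (!N || Depth >= MaxDepth)
      return 0;
    return 1 + visit(N->Operand, Depth + 1);
  }
};
```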
Thomas Lively 3ef169e586 [WebAssembly][InstrEmitter] Foundation for multivalue call lowering
Summary:
WebAssembly is unique among upstream targets in that it does not at
any point use physical registers to store values. Instead, it uses
virtual registers to model positions in its value stack. This means
that some target-independent lowering activities that would use
physical registers need to use virtual registers instead for
WebAssembly and similar downstream targets. This CL generalizes the
existing `usesPhysRegsForPEI` lowering hook to
`usesPhysRegsForValues` in preparation for using it in more places.

One such place is in InstrEmitter for instructions that have variadic
defs. On register machines, it only makes sense for these defs to be
physical registers, but for WebAssembly they must be virtual registers
like any other values. This CL changes InstrEmitter to check the new
target lowering hook to determine whether variadic defs should be
physical or virtual registers.

These changes are necessary to support a generalized CALL instruction
for WebAssembly that is capable of returning an arbitrary number of
arguments. Fully implementing that instruction will require additional
changes that are described in comments here but left for a follow up
commit.

Reviewers: aheejin, dschuff, qcolombet

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71484
2020-01-21 11:13:46 -08:00
Fangrui Song 7a8b0b1595 [StackColoring] Remap PseudoSourceValue frame indices via MachineFunction::getPSVManager()
Reviewed By: dantrushin

Differential Revision: https://reviews.llvm.org/D73063
2020-01-21 09:46:27 -08:00
Krzysztof Parzyszek 305bf5b21d [Hexagon] Add support for Hexagon v67t microarchitecture (tiny core) 2020-01-21 11:35:10 -06:00
Krzysztof Parzyszek 020041d99b Update spelling of {analyze,insert,remove}Branch in strings and comments
These names have been changed from CamelCase to camelCase, but there were
many places (comments mostly) that still used the old names.

This change is NFC.
2020-01-21 10:15:38 -06:00
Guillaume Chatelet 139771f8b0 [Alignment][NFC] Use Align with CreateElementUnorderedAtomicMemMove
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73050
2020-01-21 14:16:50 +01:00
Guillaume Chatelet bc8a1ab26f [Alignment][NFC] Use Align with CreateMaskedLoad
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73087
2020-01-21 14:13:22 +01:00
Krzysztof Parzyszek c12a5917d2 [Hexagon] Add support for Hexagon/HVX v67 ISA 2020-01-20 16:16:49 -06:00
Guillaume Chatelet 46b9563cf6 [Alignment][NFC] Use Align with CreateElementUnorderedAtomicMemCpy
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, nicolasvasilache

Subscribers: hiraditya, jfb, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, csigg, arpith-jacob, mgester, lucyrfox, herhut, liufengdb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73041
2020-01-20 15:39:45 +01:00
Andrzej Warzynski 7e717b3990 [AArch64][SVE] Extend int_aarch64_sve_ld1_gather_imm
The ACLE distinguishes between the following addressing modes for gather
loads:
  * "scalar base, vector offset", and
  * "vector base, scalar offset".
For the "vector base, scalar offset" case, the
`int_aarch64_sve_ld1_gather_imm` intrinsic was added in 79f2422d.
Currently, that intrinsic assumes that the scalar offset is passed as an
immediate.  As a result, it does not cater for cases where scalar offset
is stored in a register.

In this patch `int_aarch64_sve_ld1_gather_imm` is extended so that all
cases are covered:
* `int_aarch64_sve_ld1_gather_imm` is renamed as
  `int_aarch64_sve_ld1_gather_scalar_offset`
* new DAG combine rules are added for GLD1_IMM for scenarios where the
  offset is a non-immediate scalar or an out-of-range immediate
* sve-intrinsics-gather-loads-vector-base.ll is renamed as
  sve-intrinsics-gather-loads-vector-base-imm-offset.ll
* sve-intrinsics-gather-loads-vector-base-scalar-offset.ll is added to test
  non-immediate offsets

Similar changes are made for scatter store intrinsics.

Reviewed By: sdesmalen, efriedma

Differential Revision: https://reviews.llvm.org/D71773
2020-01-20 12:19:18 +00:00
Evgeniy Brevnov af7e158872 [LV] Vectorizer should adjust trip count in profile information
Summary: A vectorized loop processes VFxUF elements in one iteration, so the total number of iterations decreases proportionally. In addition, the epilog loop cannot have more than VFxUF - 1 iterations. This patch updates the profile information accordingly.

Reviewers: hsaito, Ayal, fhahn, reames, silvas, dcaballe, SjoerdMeijer, mkuper, DaniilSuchkov

Reviewed By: Ayal, DaniilSuchkov

Subscribers: fedor.sergeev, hiraditya, rkruppe, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67905
2020-01-20 18:36:28 +07:00
Sjoerd Meijer 93175a5caa [IndVarSimplify][LoopUtils] rewriteLoopExitValues. NFCI
This moves `rewriteLoopExitValues()` from IndVarSimplify to LoopUtils, thus
making it a generic loop utility function.  This allows rewriting loop exit
values by just calling this function, without running the whole IndVarSimplify
pass.

We use this in D72714 to rematerialise the iteration count in exit blocks, so
that we can clean-up loop update expressions inside the hardware-loops later.

Differential Revision: https://reviews.llvm.org/D72602
2020-01-20 09:05:00 +00:00
David Green ff2e67a4f7 [ARM] MVE VLDn postinc
This adds post-inc variants of the VLD2/4 and VST2/4 instructions in
MVE. It uses the same mechanism/nodes as Neon, transforming the
intrinsic+add pair into an ARMISD::VLD2_UPD, which gets selected to a
post-inc instruction. The code to do that is mostly taken from the
existing Neon code, but simplified as fewer variants are needed.

It also fills in some getTgtMemIntrinsic for the arm.mve.vld2/4
intrinsics, which allow the nodes to have MMO's, calculated as the full
length to the memory being loaded/stored.

Differential Revision: https://reviews.llvm.org/D71194
2020-01-20 06:57:07 +00:00
Fangrui Song eaab1bf21e [StackColoring] Remap FixedStackPseudoSourceValue frame index referenced by MachineMemOperand
StackColoring::remapInstructions() remaps MachineOperand frame index (e.g. %stack.1 -> %stack.0)
but does not remap FixedStackPseudoSourceValue frame index (e.g. store 4 into %stack.1.ap2.i.i)
referenced by MachineMemOperand.

This can cause an assertion failure when LiveDebugValues references a dead stack object.

It is difficult to craft a test case. -g, va_copy and stack-coloring are required.
I can only reproduce it on ppc32.
2020-01-19 22:53:45 -08:00
Fangrui Song 8e8a75ad50 [TargetRegisterInfo] Default trackLivenessAfterRegAlloc() to true
Except AMDGPU/R600RegisterInfo (a bunch of MIR tests seem to have
problems), every target overrides it with true. PostMachineScheduler
requires livein information. Not providing it can cause assertion
failures in ScheduleDAGInstrs::addSchedBarrierDeps().
2020-01-19 14:20:37 -08:00
Lang Hames 84217ad661 [ORC] Add weak symbol support to defineMaterializing, fix for PR40074.
The MaterializationResponsibility::defineMaterializing method allows clients to
add new definitions that are in the process of being materialized to the JIT.
This patch adds support to defineMaterializing for symbols with weak linkage
where the new definitions may be rejected if another materializer concurrently
defines the same symbol. If a weak symbol is rejected it will not be added to
the MaterializationResponsibility's responsibility set. Clients can check for
membership in the responsibility set via the
MaterializationResponsibility::getSymbols() method before resolving any
such weak symbols.

This patch also adds code to RTDyldObjectLinkingLayer to tag COFF comdat symbols
introduced during codegen as weak, on the assumption that these are COFF comdat
constants. This fixes http://llvm.org/PR40074.
2020-01-19 10:46:07 -08:00
Fangrui Song a72d15e37c [XRay] Set hasSideEffects flag of PATCHABLE_FUNCTION_{ENTER,EXIT}
Otherwise they may be picked as the delay slot by mips-delay-slot-filler, if we move patchable-function before mips-delay-slot-filler.
2020-01-19 00:09:46 -08:00
Fangrui Song 9583a3f262 [AsmPrinter] Delete dead takeDeletedSymbsForFunction()
The code added in r98579 is dead now.
2020-01-18 17:08:00 -08:00
Reid Kleckner ff6be0ca25 Revert "[Support] Explicitly instantiate BumpPtrAllocatorImpl"
This reverts commit add9599050.

Buildbots don't seem to like it.
2020-01-18 09:33:00 -08:00
Reid Kleckner add9599050 [Support] Explicitly instantiate BumpPtrAllocatorImpl
Most clients only ever use the default BumpPtrAllocator.
2020-01-18 09:21:53 -08:00
Michael Liao 6d0d86a64d [DAG] Add helper for creating constant vector index with correct type. NFC. 2020-01-18 01:23:36 -05:00
David Blaikie 46ed93315f [IR] Remove some unnecessary cleanup in Module's dtor, and use a unique_ptr to simplify some
Follow on from D72812, based on Mehdi Amini's feedback.
2020-01-17 17:30:24 -08:00
Derek Schuff ff171acf84 [WebAssembly] Track frame registers through VReg and local allocation
This change has 2 components:

Target-independent: add a method getDwarfFrameBase to TargetFrameLowering. It
describes how the Dwarf frame base will be encoded.  That can be a register (the
default), the CFA (which replaces NVPTX-specific logic in DwarfCompileUnit), or
a DW_OP_WASM_location descriptor.

WebAssembly: Allow WebAssemblyFunctionInfo::getFrameRegister to return the
correct virtual register instead of FP32/SP32 after WebAssemblyReplacePhysRegs
has run.  Make WebAssemblyExplicitLocals store the local it allocates for the
frame register. Use this local information to implement getDwarfFrameBase

The result is that the DW_AT_frame_base attribute is correctly encoded for each
subprogram, and each param and local variable has a correct DW_AT_location that
uses DW_OP_fbreg to refer to the frame base.

This is a reland of rG3a05c3969c18 with fixes for the expensive-checks
and Windows builds

Differential Revision: https://reviews.llvm.org/D71681
2020-01-17 17:23:56 -08:00
Reid Kleckner 423e3db6a8 Remove unneeded FoldingSet.h include from Attributes.h
Avoids 637 extra FoldingSet.h and Allocator.h includes. FoldingSet.h
needs Allocator.h, which is relatively expensive.
2020-01-17 16:36:09 -08:00
Evgenii Stepanov d081962dea Merge memtag instructions with adjacent stack slots.
Summary:
Detect a run of memory tagging instructions for adjacent stack frame slots,
and replace them with a shorter instruction sequence
* replace STG + STG with ST2G
* replace STGloop + STGloop with STGloop

This code needs to run when stack slot offsets are already known, but before
FrameIndex operands in STG instructions are eliminated; that's the
reason for the new hook in PrologueEpilogue.

This change modifies STGloop and STZGloop pseudos to take the size as an
immediate integer operand, and adds _untied variants of those pseudos
that are allowed to take the base address as a FI operand. This is needed to
simplify recognizing an STGloop instruction as operating on a stack slot
post-regalloc.

This improves memtag code size by ~0.25%, and it looks like an additional ~0.1%
is possible by rearranging the stack frame such that consecutive STG
instructions reference adjacent slots (patch pending).

Reviewers: pcc, ostannard

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70286
2020-01-17 15:19:29 -08:00
Alina Sbirlea 9f6c6ee6b9 [MemDepAnalysis/VNCoercion] Move static method to its only use. [NFCI]
Static method MemoryDependenceResults::getLoadLoadClobberFullWidthSize
does not have or use any info specific to MemoryDependenceResults.
Move it to its only user: VNCoercion.
2020-01-17 15:18:42 -08:00
Petr Hosek d3db13af7e [profile] Support counter relocation at runtime
This is an alternative to the continuous mode that was implemented in
D68351. This mode relies on padding and the ability to mmap a file over
the existing mapping which is generally only available on POSIX systems
and isn't suitable for other platforms.

This change instead introduces the ability to relocate counters at
runtime using a level of indirection. On every counter access, we add a
bias to the counter address. This bias is stored in a symbol that's
provided by the profile runtime and is initially set to zero, meaning no
relocation. The runtime can mmap the profile into memory at an arbitrary
location, and set bias to the offset between the original and the new
counter location, at which point every subsequent counter access will be
to the new location, which allows updating profile directly akin to the
continuous mode.

The advantage of this implementation is that it doesn't require any special
OS support. The disadvantage is the extra overhead due to additional
instructions required for each counter access (overhead both in terms of
binary size and performance) plus duplication of counters (i.e. one copy
in the binary itself and another copy that's mmapped).
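
The bias indirection described above can be sketched roughly as follows. All names here (`CounterBias`, `OriginalCounters`, `incrementCounter`, `relocateCounters`) are hypothetical stand-ins rather than the profile runtime's symbols, and an ordinary array takes the place of an mmapped profile.

```cpp
#include <cstdint>

// Hypothetical stand-in for the runtime-provided bias symbol.
// Zero means "no relocation".
static uintptr_t CounterBias = 0;

// Counters as they are laid out in the binary itself.
static uint64_t OriginalCounters[4] = {0, 0, 0, 0};

// Every counter access adds the bias to the counter's original address.
inline void incrementCounter(unsigned Index) {
  uintptr_t Addr =
      reinterpret_cast<uintptr_t>(&OriginalCounters[Index]) + CounterBias;
  ++*reinterpret_cast<uint64_t *>(Addr);
}

// Stand-in for the runtime mapping the profile at a new location: the bias
// becomes the offset between the original and the new counter location.
// Unsigned arithmetic wraps, so a lower new address also works.
inline void relocateCounters(uint64_t *NewLocation) {
  CounterBias = reinterpret_cast<uintptr_t>(NewLocation) -
                reinterpret_cast<uintptr_t>(OriginalCounters);
}
```

After `relocateCounters` runs, later calls to `incrementCounter` update the new copy, while the original copy keeps whatever counts it had. That is the counter duplication mentioned as a disadvantage.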

Differential Revision: https://reviews.llvm.org/D69740
2020-01-17 15:02:23 -08:00
Adrian Prantl 7b30370e5b Move the sysroot attribute from DIModule to DICompileUnit
[this re-applies c0176916a4
 with the correct commit message and phabricator link]

This addresses point 1 of PR44213.
https://bugs.llvm.org/show_bug.cgi?id=44213

The DW_AT_LLVM_sysroot attribute is used for Clang module debug info,
to allow LLDB to import a Clang module from source. Currently it is
part of each DW_TAG_module, however, it is the same for all modules in
a compile unit. It is more efficient and less ambiguous to store it
once in the DW_TAG_compile_unit.

This should have no effect on DWARF consumers other than LLDB.

Differential Revision: https://reviews.llvm.org/D71732
2020-01-17 12:55:40 -08:00
Adrian Prantl c17aee67f1 Revert "Rename DW_AT_LLVM_isysroot to DW_AT_LLVM_sysroot"
This reverts commit 12e479475a.

I accidentally landed this patch with the wrong commit message ...
2020-01-17 12:52:36 -08:00
Alina Sbirlea 78d4096d03 [LazyCallGraph] Add invalidate method.
Summary: Add invalidate method in LazyCallGraph.

Reviewers: chandlerc, silvas

Subscribers: hiraditya, sanjoy.google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72817
2020-01-17 10:47:51 -08:00
Alina Sbirlea 630a8011e4 [CallGraph] Add invalidate method.
Summary: Add invalidate method in CallGraph.

Reviewers: Eugene.Zelenko, chandlerc

Subscribers: hiraditya, sanjoy.google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72816
2020-01-17 10:47:51 -08:00
Alina Sbirlea 62a50a95fc [BranchProbabilityInfo] Add invalidate method.
Summary: Add invalidate method for BranchProbabilityInfo.

Reviewers: Eugene.Zelenko, chandlerc

Subscribers: hiraditya, sanjoy.google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72815
2020-01-17 10:47:51 -08:00
Alina Sbirlea 5cc99d05f5 [GlobalsModRef] Add invalidate method
Summary: Add invalidate method to GlobalsAA.

Reviewers: tejohnson, chandlerc

Subscribers: hiraditya, sanjoy.google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72818
2020-01-17 10:33:54 -08:00
Adrian Prantl 12e479475a Rename DW_AT_LLVM_isysroot to DW_AT_LLVM_sysroot
This is a purely cosmetic change that is NFC in terms of the binary
output. It bugs me that I called the attribute DW_AT_LLVM_isysroot
since the "i" is an artifact of GCC command line option syntax
(-isysroot is in the category of -i options) and doesn't carry any
useful information otherwise.

This attribute only appears in Clang module debug info.

Differential Revision: https://reviews.llvm.org/D71722
2020-01-17 09:36:48 -08:00
Simon Pilgrim d1b32f328e Revert rGb6437b352db9 - "Fix gcc9 "moving a local object in a return statement prevents copy elision" Wpessimizing-move warnings."
Fix buildbots
2020-01-17 16:04:10 +00:00
Simon Pilgrim 88cdeaa531 Revert rGff3fe145fe48 "Fix gcc9 "moving a local object in a return statement prevents copy elision" Wpessimizing-move warning."
Fix buildbots
2020-01-17 16:03:21 +00:00
Simon Pilgrim ff3fe145fe Fix gcc9 "moving a local object in a return statement prevents copy elision" Wpessimizing-move warning. 2020-01-17 15:51:08 +00:00
Simon Pilgrim b6437b352d Fix gcc9 "moving a local object in a return statement prevents copy elision" Wpessimizing-move warnings. 2020-01-17 15:51:08 +00:00
Sam Parker 42350cd893 [ARM][MVE] Tail Predicate IsSafeToRemove
Introduce a method to walk through use-def chains to decide whether
it's possible to remove a given instruction and its users. These
instructions are then stored in a set until the end of the transform
when they're erased. This is now used to perform checks on the
iteration count (LoopDec chain), element count (VCTP chain) and the
possibly redundant iteration count.

As well as being able to remove chains of instructions, we now also
check that the sub feeding the vctp is producing the expected value.

Differential Revision: https://reviews.llvm.org/D71837
2020-01-17 13:19:14 +00:00
Cullen Rhodes 49edf9a509 [AArch64][SVE] Add break intrinsics
Summary:
Implements the following intrinsics:

    * @llvm.aarch64.sve.brka
    * @llvm.aarch64.sve.brka.z
    * @llvm.aarch64.sve.brkb
    * @llvm.aarch64.sve.brkb.z
    * @llvm.aarch64.sve.brkn.z
    * @llvm.aarch64.sve.brkpa.z
    * @llvm.aarch64.sve.brkpb.z

Reviewers: sdesmalen, efriedma, dancgr, mgudim, cameron.mcinally, rengolin

Reviewed By: sdesmalen

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72393
2020-01-17 11:47:08 +00:00
Kerry McLaughlin fe3bb8ec96 [AArch64][SVE] Add ImmArg property to intrinsics with immediates
Summary:
Several SVE intrinsics with immediate arguments (including those
added by D70253 & D70437) do not use the ImmArg property.
This patch adds ImmArg<Op> where required and changes
the appropriate patterns which match the immediates.

Reviewers: efriedma, sdesmalen, andwar, rengolin

Reviewed By: efriedma

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72612
2020-01-17 10:47:55 +00:00
Dmitri Gribenko 10b4aece52 Revert "Avoid creating an immutable map in the Automaton class."
This reverts commit 051d330314. It broke
buildbots, for example,
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/21908.
2020-01-17 10:20:36 +01:00
Craig Topper caee96031d [Transforms][RISCV] Remove a "using namespace llvm" from an include file. Fix a place that became dependent on it.
This include file was created in October and has a "using namespace llvm". This seems to get exposed to other include files and finally onto cpp files. While this is somewhat okay for llvm itself, it's bad for other projects that use llvm as a library and include a header file that picks this up. This was found by ISPC, which has some class names at global scope with the same names as LLVM.

It looks like RISCV accidentally became dependent on this. I fixed it by reordering some includes in the RISCV code, but maybe we want to change the TableGenEmitter to put "namespace llvm {" in the generated file instead? But we probably want to do the simplest thing first so we can merge it to 10.0.

Differential Revision: https://reviews.llvm.org/D72895
2020-01-16 20:50:41 -08:00
Marcello Maggioni 051d330314 Avoid creating an immutable map in the Automaton class.
Summary:
In the DFAPacketizer we copy the Transitions array
into a map in order to later access the transitions
based on a "Current State/Action" pair as a key.
This map lives in the Automaton object used by the DFAPacketizer.
It is never changed during the life of the object after
having been created during the creation of the Automaton
itself.

This map creation can make the creation of a DFAPacketizer
quite expensive if the target contains a considerable
amount of transition states.

Considering that TableGen already generates a
sorted list of transitions by State/Action pairs
we could just use that directly in our Automaton
and search entries with std::lower_bound instead of copying
it in a map and paying the execution time and memory cost.
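
A minimal sketch of that lookup, assuming a table already sorted by (state, action). The `Transition` layout, the `Table` contents, and `findTransition` are hypothetical and only illustrate searching the sorted array directly with std::lower_bound instead of copying it into a map.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <utility>

// Hypothetical transition record; the field names are illustrative.
struct Transition {
  uint64_t FromState;
  uint64_t Action;
  uint64_t ToState;
};

// Assumed to be sorted by (FromState, Action), like a generated table.
static const Transition Table[] = {
    {0, 1, 1}, {0, 2, 2}, {1, 1, 3}, {2, 4, 3},
};

// Binary-searches the sorted table; returns nullptr when no transition
// exists for the given state/action pair.
const Transition *findTransition(uint64_t State, uint64_t Action) {
  const std::pair<uint64_t, uint64_t> Key(State, Action);
  const Transition *I = std::lower_bound(
      std::begin(Table), std::end(Table), Key,
      [](const Transition &T, const std::pair<uint64_t, uint64_t> &K) {
        return std::make_pair(T.FromState, T.Action) < K;
      });
  if (I == std::end(Table) || I->FromState != State || I->Action != Action)
    return nullptr;
  return I;
}
```

The lookup costs O(log n) per query and needs no extra memory, at the price of relying on the table staying sorted.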

Reviewers: jmolloy, ThomasRaoux

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72682
2020-01-16 18:44:20 -08:00
Eric Christopher 40ac4221c3 Move static function to inline function - this fixes a conceivable
ODR violation and a clang-tidy warning about an unused function
in a number of translation units.
2020-01-16 16:12:46 -08:00
David Blaikie 65eb74e94b PointerLikeTypeTraits: Standardize NumLowBitsAvailable on static constexpr rather than anonymous enum
This is (more?) usable by GDB pretty printers and seems nicer to write.

There's one tricky caveat that in C++14 (LLVM's codebase today) the
static constexpr member declaration is not a definition - so odr use of
this constant requires an out of line definition, which won't be
provided (that'd make all these trait classes more annoying/expensive
to maintain). But the use of this constant in the library implementation
is/should always be in a non-odr context - only two unit tests needed to
be touched to cope with this/avoid odr using these constants.
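
The caveat can be illustrated with a small sketch. `PtrTraitsSketch`, `shiftAmount`, and `clampBits` are hypothetical names, not LLVM code. The point is only that reading the constant by value is not an odr-use, while binding it to a reference would be one in C++14.

```cpp
#include <algorithm>

// Hypothetical trait class in the style described above.
struct PtrTraitsSketch {
  // In C++14 this is a declaration with an initializer, not a definition.
  static constexpr int NumLowBitsAvailable = 2;
};

// Non-odr use: the constant is read as a value in a constant expression.
constexpr int shiftAmount() { return PtrTraitsSketch::NumLowBitsAvailable; }

// std::min takes its arguments by const reference, so passing the member
// directly would odr-use it and, in C++14, require an out-of-line
// definition. Copying it into a local first avoids that.
int clampBits(int Requested) {
  const int Available = PtrTraitsSketch::NumLowBitsAvailable;
  return std::min(Requested, Available);
}
```

From C++17 onward, static constexpr data members are implicitly inline, so the out-of-line definition is no longer needed.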

Based on/expanded from D72590 by Christian Sigg.
2020-01-16 15:30:50 -08:00
Derek Schuff 80906d9d16 Revert "[WebAssembly] Track frame registers through VReg and local allocation"
This reverts commit 3a05c3969c.
It breaks under expensive-checks and on Windows
2020-01-16 14:38:00 -08:00
Derek Schuff 3a05c3969c [WebAssembly] Track frame registers through VReg and local allocation
This change has 2 components:

Target-independent: add a method getDwarfFrameBase to TargetFrameLowering. It
describes how the Dwarf frame base will be encoded.  That can be a register (the
default), the CFA (which replaces NVPTX-specific logic in DwarfCompileUnit), or
a DW_OP_WASM_location descriptor.

WebAssembly: Allow WebAssemblyFunctionInfo::getFrameRegister to return the
correct virtual register instead of FP32/SP32 after WebAssemblyReplacePhysRegs
has run.  Make WebAssemblyExplicitLocals store the local it allocates for the
frame register. Use this local information to implement getDwarfFrameBase

The result is that the DW_AT_frame_base attribute is correctly encoded for each
subprogram, and each param and local variable has a correct DW_AT_location that
uses DW_OP_fbreg to refer to the frame base.

Differential Revision: https://reviews.llvm.org/D71681
2020-01-16 13:51:17 -08:00
Kazu Hirata 53b68e676f Resubmit: [JumpThreading] Thread jumps through two basic blocks
This reverts commit 2d258ed931.  This
revision fixes the Windows build and adds a testcase for it, namely
thread-two-bbs3.ll.  My original patch improperly copied EH pads on
Windows.  This patch disregards jump threading opportunities having to
do with EH pads.

[JumpThreading] Thread jumps through two basic blocks

Summary:
This patch teaches JumpThreading.cpp to thread through two basic
blocks like:

  bb3:
    %var = phi i32* [ null, %bb1 ], [ @a, %bb2 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb4:
    %cmp = icmp eq i32* %var, null
    br i1 %cmp, label bb5, label bb6

by duplicating basic blocks like bb3 above.  Once we duplicate bb3 as
bb3.dup and redirect edge bb2->bb3 to bb2->bb3.dup, we have:

  bb3:
    %var = phi i32* [ @a, %bb2 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb3.dup:
    %var = phi i32* [ null, %bb1 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb4:
    %cmp = icmp eq i32* %var, null
    br i1 %cmp, label bb5, label bb6

Then the existing code in JumpThreading.cpp can thread edge
bb3.dup->bb4 through bb4 and eventually create bb3.dup->bb5.

Reviewers: wmi

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70247
2020-01-16 12:33:37 -08:00
Matt Arsenault a66d2817ca GlobalISel: Don't ignore requested ext narrowing type
This was assuming the narrow target was the source type. Respect the
requested type when these don't match by using intermediate
merges. This avoids producing very wide, illegal shift expansions.
2020-01-16 14:29:37 -05:00
Matt Arsenault be31a7b7ee GlobalISel: Move extension scalar narrowing to separate function
Also rename a few things. Handling a different requested type will
require this to become much more complex.
2020-01-16 14:29:37 -05:00
Krzysztof Parzyszek 5f65065437 [Hexagon] Update autogenerated intrinsic information in LLVM 2020-01-16 13:11:18 -06:00
Matt Arsenault d0943537e1 GlobalISel: Apply target MMO flags to atomics
Unify MMO flag handling with SelectionDAG like with loads and stores.
2020-01-16 13:49:43 -05:00
Matt Arsenault 0d0fce42b0 GlobalISel: Preserve load/store metadata in IRTranslator
This was dropping the invariant metadata on dead argument loads, so
they weren't deleted.

Atomics still need to be fixed the same way. Also, apparently store
was never preserving dereferenceable, which should also be fixed.
2020-01-16 13:49:43 -05:00
Matt Arsenault 86d14ed766 TableGen: Remove dead code 2020-01-16 13:49:43 -05:00
Arkady Shlykov c87982b467 Revert "[Loop Peeling] Add possibility to enable peeling on loop nests."
This reverts commit 3f3017e because there's a failure on peel-loop-nests.ll
with LLVM_ENABLE_EXPENSIVE_CHECKS on.

Differential Revision: https://reviews.llvm.org/D70304
2020-01-16 10:33:38 -08:00
Fedor Sergeev 3478551bf3 [GVN] introduce GVNOptions to control GVN pass behavior
There are a few global (cl::opt) controls that enable optional
behavior in GVN. Introduce GVNOptions that provide corresponding
per-pass instance controls.

That will allow using GVN multiple times in a pipeline, each time
with different settings.

Reviewers: asbirlea, rnk, reames, skatkov, fhahn
Reviewed By: fhahn

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72732
2020-01-16 20:21:08 +03:00
Mircea Trofin 7acfda633f [llvm] Make new pass manager's OptimizationLevel a class
Summary:
The old pass manager separated speed optimization and size optimization
levels into two unsigned values. Coalescing both in an enum in the new
pass manager may lead to unintentional casts and comparisons.

In particular, taking a look at how the loop unroll passes were constructed
previously, the Os/Oz are now (==new pass manager) treated just like O3,
likely unintentionally.

This change disallows raw comparisons between optimization levels, to
avoid such unintended effects. As an effect, the O{s|z} behavior changes
for loop unrolling and loop unroll and jam, matching O2 rather than O3.

The change also parameterizes the threshold values used for loop
unrolling, primarily to aid testing.
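
The idea can be sketched as a small class that keeps the speed and size levels separate and offers no conversion to or from integers. `OptLevelSketch` and `useAggressiveUnrolling` are hypothetical names, and the level values chosen for Os and Oz are assumptions made for illustration.

```cpp
// Hypothetical sketch of an optimization level held as a class.
class OptLevelSketch {
  unsigned SpeedLevel;
  unsigned SizeLevel;

  // Private constructor: levels can only be obtained by name, so there is
  // no implicit cast from an integer or enum value.
  constexpr OptLevelSketch(unsigned Speed, unsigned Size)
      : SpeedLevel(Speed), SizeLevel(Size) {}

public:
  static constexpr OptLevelSketch O2() { return OptLevelSketch(2, 0); }
  static constexpr OptLevelSketch O3() { return OptLevelSketch(3, 0); }
  // Assumed encoding: size levels optimize for speed like O2.
  static constexpr OptLevelSketch Os() { return OptLevelSketch(2, 1); }
  static constexpr OptLevelSketch Oz() { return OptLevelSketch(2, 2); }

  constexpr unsigned getSpeedupLevel() const { return SpeedLevel; }
  constexpr unsigned getSizeLevel() const { return SizeLevel; }

  constexpr bool operator==(const OptLevelSketch &Other) const {
    return SpeedLevel == Other.SpeedLevel && SizeLevel == Other.SizeLevel;
  }
};

// Hypothetical client: a decision based on the speed level alone can no
// longer treat Os/Oz as if they were O3.
bool useAggressiveUnrolling(OptLevelSketch Level) {
  return Level.getSpeedupLevel() > 2;
}
```

With an enum, a comparison such as `Level > O2` would have silently included Os and Oz if they were declared after O3. Here a client has to say which dimension it is asking about.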

Reviewers: tejohnson, davidxl

Reviewed By: tejohnson

Subscribers: zzheng, ychen, mehdi_amini, hiraditya, steven_wu, dexonsmith, dang, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72547
2020-01-16 09:00:56 -08:00
Jay Foad 28bb43bdf8 [GlobalISel] Use more MachineIRBuilder helper methods
Reviewers: arsenm, nhaehnle

Subscribers: wdng, rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72833
2020-01-16 15:34:51 +00:00
Francesco Petrogalli 66c120f025 [VectorUtils] Rework the Vector Function Database (VFDatabase).
Summary:
This commit is a rework of the patch in
https://reviews.llvm.org/D67572.

The rework was requested to prevent out-of-tree performance regression
when vectorizing out-of-tree IR intrinsics. The vectorization of such
intrinsics is enquired via the static function `isTLIScalarize`. For
detail see the discussion in https://reviews.llvm.org/D67572.

Reviewers: uabelho, fhahn, sdesmalen

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72734
2020-01-16 15:08:26 +00:00
Florian Hahn 0b21d55262 [IR] Mark memset.* intrinsics as IntrWriteMem.
llvm.memset intrinsics only write memory, but are missing
IntrWriteMem, so doesNotReadMemory() returns false for them.

The test change is due to the test checking the fn attribute ids at the
call sites, which got bumped up due to a new combination with writeonly
appearing in the test file.

Reviewers: jdoerfert, reames, efriedma, nlopes, lebedev.ri

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D72789
2020-01-16 10:35:46 +00:00
Florian Hahn 23c113802e [LV] Allow assume calls in predicated blocks.
The assume intrinsic is intentionally marked as may reading/writing
memory, to avoid passes moving them around. When flattening the CFG
for predicated blocks, we have to drop the assume calls, as they
are control-flow dependent.

There are some cases where we can do better (when control flow is
preserved), but that is follow-up work.

Fixes PR43620.

Reviewers: hsaito, rengolin, dcaballe, Ayal

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D68814
2020-01-16 10:11:35 +00:00
Sameer Sahasrabuddhe ed181efa17 [HIP][AMDGPU] expand printf when compiling HIP to AMDGPU
Summary:
This change implements the expansion in two parts:
- Add a utility function emitAMDGPUPrintfCall() in LLVM.
- Invoke the above function from Clang CodeGen, when processing a HIP
  program for the AMDGPU target.

The printf expansion has undefined behaviour if the format string is
not a compile-time constant. As a sufficient condition, the HIP
ToolChain now emits -Werror=format-nonliteral.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D71365
2020-01-16 15:15:38 +05:30
Igor Kudrin afb22d7c33 [DebugInfo] Simplify the constructor of DWARFDebugAranges::Range. NFC.
This removes the default values of the arguments. The only caller,
DWARFDebugAranges::construct(), provides all three parameters.

Differential Revision: https://reviews.llvm.org/D72757
2020-01-16 13:08:30 +07:00
Matt Arsenault c378e52cb9 Set some fast math attributes in setFunctionAttributes
This will provide a more consistent view to codegen for these
attributes. The current system is somewhat awkward, and the fields in
TargetOptions are reset based on the command line flag if the
attribute isn't set. By forcing these attributes with the flag, there
can never be an inconsistency in the behavior if code directly
inspects the attribute on the function without considering the command
line flags.
2020-01-15 22:23:18 -05:00
Wei Mi 154cd6de51 [SampleFDO] Fix invalid branch profile generated by indirect call promotion.
Suppose an inline instance has hot total sample count but 0 entry count, and
it is an indirect call target. If the indirect call has no other call target
and inline instance associated with it and it is promoted, currently the
conditional branch generated by indirect call promotion will have invalid
branch profile which is !{!"branch_weights", i32 0, i32 0} -- because the
entry count of the promoted target is 0 and the total entry count of all
targets is also 0. This caused a SEGV in Control Height Reduction and may
cause problems in other passes.

Function entry count of an inline instance is computed by a heuristic --
using either the sample of the starting line or starting inner inline
instance. The patch changes the heuristic a little bit so that when total
sample count is larger than 0, the computed entry count will be at least 1.
Then the new branch profile will be !{!"branch_weights", i32 1, i32 0}.
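
The adjusted heuristic can be sketched as a clamp. `adjustedEntryCount` is a hypothetical helper, and `HeuristicEntryCount` stands for whatever the existing starting-line or inner-instance heuristic computes.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: keeps the heuristic's entry count, but never lets
// it be zero for an inline instance that has a nonzero total sample count.
uint64_t adjustedEntryCount(uint64_t HeuristicEntryCount,
                            uint64_t TotalSamples) {
  if (TotalSamples > 0)
    return std::max<uint64_t>(HeuristicEntryCount, 1);
  return HeuristicEntryCount;
}
```

An instance with 500 total samples and a heuristic entry count of 0 would therefore get an entry count of 1, which is what turns the all-zero branch weights into the {1, 0} pair mentioned above.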

Differential Revision: https://reviews.llvm.org/D72790
2020-01-15 18:36:06 -08:00
Matt Arsenault 77eb1b8f63 llc: Don't overwrite frame-pointer attribute
Continue making command line flags with matching attribute behavior
consistent.
2020-01-15 20:56:46 -05:00
Yuanfang Chen 6e24c6037f Revert "[Support] make report_fatal_error `abort` instead of `exit`"
This reverts commit 647c3f4e47.

Got bots failure from sanitizer-windows and maybe others.
2020-01-15 17:52:25 -08:00
Yuanfang Chen 647c3f4e47 [Support] make report_fatal_error `abort` instead of `exit`
Summary:
This patch could be treated as a rebase of D33960. It also fixes PR35547.
A fix for `llvm/test/Other/close-stderr.ll` is proposed in D68164. Seems
the consensus is that the test is passing by chance and I'm not
sure how important it is for us. So it is removed like in D33960 for now.
The rest of the test fixes are just adding `--crash` flag to `not` tool.

** The reason it fixes PR35547 is

`exit` does cleanup, including calling class destructors, whereas `abort`
does not do any cleanup. In a multithreaded environment such as ThinLTO or JIT,
threads may share state, most of which is ManagedStatic<>. If the faulting
thread tears down a class while another thread is using it, there is a chance
of memory corruption. This is bad because: 1. It will stop error reporting such
as the pretty stack printer; 2. The memory corruption is distracting and
nondeterministic in terms of error message and corruption type (depending on
the timing, it could be a double free, a heap use-after-free, etc.).

Reviewers: rnk, chandlerc, zturner, sepavloff, MaskRay, espindola

Reviewed By: rnk, MaskRay

Subscribers: wuzish, jholewinski, qcolombet, dschuff, jyknight, emaste, sdardis, nemanjai, jvesely, nhaehnle, sbc100, arichardson, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, lenary, s.egerton, pzheng, cfe-commits, MaskRay, filcab, davide, MatzeB, mehdi_amini, hiraditya, steven_wu, dexonsmith, rupprecht, seiya, llvm-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D67847
2020-01-15 17:05:13 -08:00
Matt Arsenault 67ec8744d7 llc: Change behavior of -mattr with existing attribute
Append this to the existing target-features attribute on the function.

Some flags ignore existing attributes, and some overwrite them. Move
towards consistently respecting existing attributes if present. Since
target features act as a state machine on their own, append to the
function attribute. The backend default added feature list, function
attributes, and -mattr will all be appended together, and the later
features can individually toggle the earlier settings.
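
A rough sketch of the "append, later entries win" behavior, using plain strings. `appendFeatures` and `isFeatureEnabled` are hypothetical helpers and not llc's implementation. Features are assumed to be written as a comma-separated list of `+name` / `-name` entries.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: appends the -mattr string to an existing
// target-features attribute instead of ignoring or overwriting it.
std::string appendFeatures(const std::string &Existing,
                           const std::string &MAttr) {
  if (Existing.empty())
    return MAttr;
  if (MAttr.empty())
    return Existing;
  return Existing + "," + MAttr;
}

// Hypothetical helper: scans left to right so that a later entry toggles
// an earlier setting of the same feature.
bool isFeatureEnabled(const std::string &Features, const std::string &Name) {
  bool Enabled = false;
  std::istringstream In(Features);
  std::string Item;
  while (std::getline(In, Item, ',')) {
    if (Item.size() < 2 || Item.substr(1) != Name)
      continue;
    Enabled = Item[0] == '+';
  }
  return Enabled;
}
```

For instance, appending "-avx" to "+sse2,+avx" yields "+sse2,+avx,-avx", in which sse2 stays enabled and avx ends up disabled.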
2020-01-15 19:46:01 -05:00
Matt Arsenault eef92f25cc AMDGPU: Remove custom node for exports
I'm mildly worried about potentially reordering exp/exp_done with
IntrWriteMem on the intrinsic.

Requires hacking out the illegal type on SI, so manually select that
case during lowering.
2020-01-15 18:33:15 -05:00
Brian Gesiak daab9227ff [IR] Module's NamedMD table needn't be 'void *'
Summary:
On July 21, 2010, `llvm::NamedMDNode` was refactored such that it would no
longer subclass `llvm::Value`:
https://github.com/llvm/llvm-project/commit/2637cc1a38d7336ea30caf

As part of this change, a map type from metadata names to their named
metadata, `llvm::MDSymbolTable`, was deleted. In its place, the type
of member `llvm::Module::NamedMDSymTab` was changed, from
`llvm::MDSymbolTable` to `void *`. The underlying memory allocations
for this pointer were changed to `new StringMap<NamedMDNode *>()`.

However, as far as I can tell, there's no need for obscuring the
underlying type being pointed to by the `void *`, and no need for
static casts from `void *` to `StringMap`. In fact, I don't think
there's a need for explicit calls to `new` and `delete` at all.

This commit changes `NamedMDSymTab` from a pointer to a reference, which
automatically couples its lifetime with the lifetime of its owning
`llvm::Module` instance, thus removing the explicit calls to `new` and
`delete` in the `llvm::Module` constructor and destructor. It also
changes the type from `void *` to a newly defined `NamedMDSymTabType`,
and removes the static casts.
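
The shape of the change can be sketched with standard containers. `ModuleSketch`, `NamedNodeSketch`, and `NamedTabType` are hypothetical, std::map stands in for StringMap, and holding the table directly as a typed member is an assumption about what the new declaration looks like.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a named metadata node.
struct NamedNodeSketch {
  std::string Name;
};

class ModuleSketch {
  // A named, concrete table type instead of an opaque `void *`.
  using NamedTabType = std::map<std::string, NamedNodeSketch>;

  // Held as a member, so its lifetime follows the owning object and no
  // explicit new/delete or static_cast is needed.
  NamedTabType NamedTab;

public:
  // Returns the node with this name, creating it if it does not exist.
  NamedNodeSketch &getOrInsertNamed(const std::string &Name) {
    return NamedTab.emplace(Name, NamedNodeSketch{Name}).first->second;
  }

  // Returns the node with this name, or nullptr if there is none.
  const NamedNodeSketch *getNamed(const std::string &Name) const {
    auto It = NamedTab.find(Name);
    return It == NamedTab.end() ? nullptr : &It->second;
  }
};
```

The constructor and destructor of `ModuleSketch` are implicit here, which mirrors the cleanup removed from the real constructor and destructor.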

Test Plan:
An ASAN-enabled build and run of `check-all` succeeds with this change
(aside from some tests that always fail for me in ASAN for some reason,
such as `check-clang` `SemaTemplate/stack-exhaustion.cpp`).

Reviewers: aprantl, dblaikie, chandlerc, pcc, echristo

Reviewed By: dblaikie

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72812
2020-01-15 18:27:25 -05:00
Fedor Sergeev 8a4d12ae5b [BasicBlock] add helper getPostdominatingDeoptimizeCall
It appears to be rather useful when analyzing Loops with multiple
deoptimizing exits, perhaps merged ones.
For now it is used in LoopPredication, will be adding more uses
in other loop passes.

Reviewers: asbirlea, fhahn, skatkov, spatel, reames
Reviewed By: reames

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72754
2020-01-16 01:15:57 +03:00
Mircea Trofin 5466597fee [NFC] Refactor InlineResult for readability
Summary:
InlineResult is used both in APIs assessing whether a call site is
inlinable (e.g. llvm::isInlineViable) as well as in the function
inlining utility (llvm::InlineFunction). It means slightly different
things (can/should inlining happen, vs did it happen), and the
implicit casting may introduce ambiguity (casting from 'false' in
InlineFunction will default to a message about high costs,
which is incorrect here).

The change renames the type to a more generic name, and disables
implicit constructors.
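
A minimal sketch of the idea of disabling implicit construction, assuming a
hypothetical `CheckResult` class (the names `CheckResult`, `success`,
`failure` and `describe` are illustrative and not taken from this patch).
With a private explicit constructor, a bare `false` can no longer be
converted silently into a result that carries a misleading default message:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a result type that carries a failure reason.
class CheckResult {
  const char *Reason;
  // Private and explicit: no implicit conversion from bool or const char *.
  explicit CheckResult(const char *R) : Reason(R) {}

public:
  static CheckResult success() { return CheckResult(nullptr); }
  static CheckResult failure(const char *R) {
    assert(R && "a failure needs a reason");
    return CheckResult(R);
  }
  bool isSuccess() const { return Reason == nullptr; }
  const char *getFailureReason() const { return Reason; }
};

// Callers must spell out which outcome they mean and why.
std::string describe(const CheckResult &R) {
  return R.isSuccess() ? "inlinable"
                       : std::string("not inlinable: ") + R.getFailureReason();
}
```

With this shape, `CheckResult R = false;` does not compile, so every failure
has to name its reason at the point where it is created.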

Reviewers: eraman, davidxl

Reviewed By: davidxl

Subscribers: kerbowa, arsenm, jvesely, nhaehnle, eraman, hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72744
2020-01-15 13:34:20 -08:00
Vedant Kumar a2cc80bc95 DebugInfo: Factor out logic to update locations in MD_loop metadata, NFC
Factor out the logic needed to update debug locations contained within
MD_loop metadata.

This refactor is preparation for a future change that also needs to
rewrite MD_loop metadata.

rdar://45507940
2020-01-15 13:02:36 -08:00
Vedant Kumar f0120556c7 [DWARF] Emit DW_AT_call_return_pc as an address
This reverts D53469, which changed llvm's DWARF emission to emit
DW_AT_call_return_pc as a function-local offset. Such an encoding is not
compatible with post-link block re-ordering tools and isn't standards-
compliant.

In addition to reverting back to the original DW_AT_call_return_pc
encoding, teach lldb how to fix up DW_AT_call_return_pc when the address
comes from an object file pointed-to by a debug map. While doing this I
noticed that lldb's support for tail calls that cross a DSO/object file
boundary wasn't covered, so I added tests for that. This latter case
exercises the newly added return PC fixup.

The dsymutil changes in this patch were originally included in D49887:
the associated test should be sufficient to test DW_AT_call_return_pc
encoding purely on the llvm side.

Differential Revision: https://reviews.llvm.org/D72489
2020-01-15 13:02:23 -08:00
Mark Murray da9d57d2c2 [ARM][MVE][Intrinsics] Add VMINAQ, VMINNMAQ, VMAXAQ, VMAXNMAQ intrinsics.
Summary: Add VMINAQ, VMINNMAQ, VMAXAQ, VMAXNMAQ intrinsics and unit tests.

Reviewers: simon_tatham, miyuki, dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72761
2020-01-15 17:20:15 +00:00
evgeny 10cadee5ce [ThinLTO] Always import constants
This patch imports constant variables even when they can't be internalized
(which results in promotion). This offers some extra constant folding
opportunities.

Differential revision: https://reviews.llvm.org/D70404
2020-01-15 19:29:01 +03:00
Arkady Shlykov 3f3017e162 [Loop Peeling] Add possibility to enable peeling on loop nests.
Summary:
Current peeling implementation bails out in case of loop nests.
The patch introduces a field in TargetTransformInfo structure that
certain targets can use to relax the constraints if it's
profitable (disabled by default).
An additional option is also added to enable peeling manually for
experimenting and testing purposes.

Reviewers: fhahn, lebedev.ri, xbolva00

Reviewed By: xbolva00

Subscribers: xbolva00, hiraditya, zzheng, llvm-commits

Differential Revision: https://reviews.llvm.org/D70304
2020-01-15 08:25:21 -08:00
Lang Hames e9e26c01cd [ORC] Simplify use of lazyReexports with LLJIT.
This patch makes the target triple available via the LLJIT interface, and moves
the IRTransformLayer from LLLazyJIT down into LLJIT. Together these changes make
it easier to use the lazyReexports utility with LLJIT, and to apply IR
transforms to code as it is compiled in LLJIT (rather than requiring transforms
to be applied manually before code is added). A code example is added in
llvm/examples/LLJITExamples/LLJITWithLazyReexports
2020-01-15 08:02:53 -08:00
Lang Hames d2fabd7006 [ORC] Update lazyReexports to support aliases with different symbol names.
A bug in the existing implementation meant that lazyReexports would not work if
the aliased name differed from the alias's name, i.e. all lazy reexports had to
be of the form (lib1, name) -> (lib2, name). This patch fixes the issue by
capturing the alias's name in the NotifyResolved callback. To simplify this
capture, and the LazyCallThroughManager code in general, the NotifyResolved
callback is updated to use llvm::unique_function rather than a custom class.

No test case yet: This can only be tested at runtime, and the only in-tree
client (lli) always uses aliases with matching names. I will add a new LLJIT
example shortly that will directly test the lazyReexports API and the
non-trivial alias use case.
2020-01-15 08:02:53 -08:00
Hubert Tong 63b428e386 DWARFDebugLine.cpp: Format unknown line number standard opcodes
Summary:
This patch implements `formatv()` formatting for `dwarf::LineNumberOps`
and makes use of it for the `llvm-dwarfdump --debug-line` dump.

Previously, unknown line number standard opcodes would lead to undefined
behaviour. The code would attempt to format the data pointer of an empty
`StringRef` (a null pointer) using `%s`. According to the description
for `format()`, use of that interface carries the "risk of `printf`".
Passing a null pointer in place of an array to a C library function
results in undefined behaviour.
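
A minimal sketch of the hazard and a guarded alternative, assuming a
hypothetical helper `formatOpcodeName` (not part of this patch). The fallback
spelling `DW_LNS_unknown_<hex>` is only an illustrative choice here. The
point is that a null name must never reach a `%s` conversion:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: Name is null when the opcode has no known spelling.
std::string formatOpcodeName(const char *Name, unsigned Opcode) {
  if (Name)
    return Name; // Known opcode: safe to use the string directly.
  // Unknown opcode: format the numeric value instead of passing a null
  // pointer to "%s", which would be undefined behaviour.
  char Buf[32];
  std::snprintf(Buf, sizeof(Buf), "DW_LNS_unknown_%x", Opcode);
  return Buf;
}
```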

Reviewers: jhenderson, daltenty, stevewan

Reviewed By: jhenderson

Subscribers: aprantl, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72369
2020-01-15 10:45:50 -05:00
Ulrich Weigand 870137d207 [FPEnv] Address post-commit review comment for D71467
Remove a bit of code duplication between CreateFCmp and CreateFCmpS
by creating a shared helper function.
2020-01-15 15:10:11 +01:00
Matt Arsenault 936483fb7d GlobalISel: Implement lower for G_BITCAST
Bitcast only really applies between scalars and vectors. Implement as
an unmerge and remerge. The test needs to tolerate failure since one
of the unmerges currently fails to legalize.
2020-01-15 08:58:58 -05:00
Georgii Rymar 7570d387c2 [yaml2obj/obj2yaml] - Add support for SHT_RELR sections.
Note: this is a reland with a trivial 2 lines fix in ELFState<ELFT>::writeSectionContent.
      It adds a check similar to ones we already have for other sections to fix the case revealed
      by bots, like http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/60744.

The encoded sequence of Elf*_Relr entries in a SHT_RELR section looks
like [ AAAAAAAA BBBBBBB1 BBBBBBB1 ... AAAAAAAA BBBBBBB1 ... ]
i.e. start with an address, followed by any number of bitmaps. The address
entry encodes 1 relocation. The subsequent bitmap entries encode up to 63(31)
relocations each, at subsequent offsets following the last address entry.
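
The encoding described above can be sketched as a small decoder. This is an
illustration under stated assumptions, not the yaml2obj/obj2yaml code:
`decodeRelr64` is a hypothetical name, only the 64-bit layout is handled
(63 relocations per bitmap word), and the result is just the list of
relocated addresses.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: expand 64-bit SHT_RELR entries into addresses.
std::vector<uint64_t> decodeRelr64(const std::vector<uint64_t> &Entries) {
  constexpr uint64_t WordSize = sizeof(uint64_t);
  std::vector<uint64_t> Offsets;
  uint64_t Base = 0; // Where the next bitmap entry starts describing words.
  for (uint64_t Entry : Entries) {
    if ((Entry & 1) == 0) {
      // Even entry: an address, which encodes exactly one relocation.
      Offsets.push_back(Entry);
      Base = Entry + WordSize;
      continue;
    }
    // Odd entry: a bitmap. Bit i (i >= 1) covers the word at
    // Base + (i - 1) * WordSize.
    uint64_t Offset = Base;
    for (uint64_t Bits = Entry >> 1; Bits != 0; Bits >>= 1, Offset += WordSize)
      if (Bits & 1)
        Offsets.push_back(Offset);
    // Each bitmap covers 63 words, whether or not their bits are set.
    Base += 63 * WordSize;
  }
  return Offsets;
}
```

For example, the entries `0x1000, 0xB` decode to the addresses `0x1000`,
`0x1008` and `0x1018`: the address entry contributes the first one, and bits
1 and 3 of the bitmap select the first and third words after it.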

More information is here:
https://github.com/llvm-mirror/llvm/blob/master/lib/Object/ELF.cpp#L272

This patch adds support for these sections.

Differential revision: https://reviews.llvm.org/D71872
2020-01-15 15:15:24 +03:00
Georgii Rymar ca6f616532 Revert "[yaml2obj/obj2yaml] - Add support for SHT_RELR sections."
This reverts commit 46d11e30ee.

It broke bots. E.g. http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/60744
2020-01-15 14:19:00 +03:00
Russell Gallop 884a65af5c [Support] Replace Windows __declspec(thread) with thread_local for LLVM_THREAD_LOCAL
Windows minimum host tools version is now VS2017, which supports C++11
thread_local so use this for LLVM_THREAD_LOCAL instead of
declspec(thread). According to [1], thread_local is implemented with
declspec(thread) so this should be NFC.

[1] https://docs.microsoft.com/en-us/cpp/cpp/thread?view=vs-2017

Differential Revision: https://reviews.llvm.org/D72399
2020-01-15 11:15:25 +00:00
Cullen Rhodes 93a4dede3a [AArch64][SVE] Add ptest intrinsics
Summary:
Implements the following intrinsics:

    * @llvm.aarch64.sve.ptest.any
    * @llvm.aarch64.sve.ptest.first
    * @llvm.aarch64.sve.ptest.last

Reviewers: sdesmalen, efriedma, dancgr, mgudim, cameron.mcinally, rengolin

Reviewed By: efriedma

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72398
2020-01-15 11:15:01 +00:00
Georgii Rymar 46d11e30ee [yaml2obj/obj2yaml] - Add support for SHT_RELR sections.
The encoded sequence of Elf*_Relr entries in a SHT_RELR section looks
like [ AAAAAAAA BBBBBBB1 BBBBBBB1 ... AAAAAAAA BBBBBBB1 ... ]
i.e. start with an address, followed by any number of bitmaps. The address
entry encodes 1 relocation. The subsequent bitmap entries encode up to 63(31)
relocations each, at subsequent offsets following the last address entry.

More information is here:
https://github.com/llvm-mirror/llvm/blob/master/lib/Object/ELF.cpp#L272

This patch adds support for these sections.

Differential revision: https://reviews.llvm.org/D71872
2020-01-15 13:54:08 +03:00
Igor Kudrin 2142e20f50 [DWARF] Fix DWARFDebugAranges to support 64-bit CU offsets.
DWARFContext, the only user of this class, can already handle such offsets.

Differential Revision: https://reviews.llvm.org/D71834
2020-01-15 17:19:08 +07:00
Hideto Ueno 188f9a348d [Attributor] AAValueConstantRange: Value range analysis using constant range
Summary:
This patch introduces `AAValueConstantRange`, which computes a possible range for an integer value at a specific program point.
One of the motivations is propagating existing `range` metadata. (I think we need to change the current restriction that `range` metadata cannot be attached to an Argument.)

The state is a tuple of `ConstantRange` and it is initialized to (known, assumed) = ([-∞, +∞], empty).

Currently, AAValueConstantRange is created in `getAssumedConstant` method when `AAValueSimplify` returns `nullptr`(worst state).

Supported
 - BinaryOperator(add, sub, ...)
 - CmpInst(icmp eq, ...)
 - !range metadata

`AAValueConstantRange` is not intended to extend to polyhedral range value analysis.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: phosek, davezarzycki, baziotis, hiraditya, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71620
2020-01-15 16:34:23 +09:00
David Green b891490ceb [Scheduler] Adjust interface of CreateTargetMIHazardRecognizer to use ScheduleDAGMI. NFC
All the callers of this function will be ScheduleDAGMI from the
MachineScheduler. This allows us to use the extra info available in
ScheduleDAGMI without resorting to awkward casts.
2020-01-15 07:21:44 +00:00
Tom Stellard 0dbcb36394 CMake: Make most target symbols hidden by default
Summary:
For builds with LLVM_BUILD_LLVM_DYLIB=ON and BUILD_SHARED_LIBS=OFF
this change makes all symbols in the target specific libraries hidden
by default.

A new macro called LLVM_EXTERNAL_VISIBILITY has been added to mark symbols in these
libraries public, which is mainly needed for the definitions of the
LLVMInitialize* functions.

This patch reduces the number of public symbols in libLLVM.so by about
25%.  This should improve load times for the dynamic library and also
make ABI checker tools, like abidiff, require less memory when analyzing
libLLVM.so.

One side-effect of this change is that for builds with
LLVM_BUILD_LLVM_DYLIB=ON and LLVM_LINK_LLVM_DYLIB=ON some unittests that
access symbols that are no longer public will need to be statically linked.

Before and after public symbol counts (using gcc 8.2.1, ld.bfd 2.31.1):
nm before/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
36221
nm after/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
26278

Reviewers: chandlerc, beanz, mgorny, rnk, hans

Reviewed By: rnk, hans

Subscribers: merge_guards_bot, luismarques, smeenai, ldionne, lenary, s.egerton, pzheng, sameer.abuasal, MaskRay, wuzish, echristo, Jim, hiraditya, michaelplatings, chapuni, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, kristina, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D54439
2020-01-14 19:46:52 -08:00
Fedor Sergeev fe37d9ecaa [GVN] fix comment/argument name to match actual implementation. NFC 2020-01-15 03:58:04 +07:00
Nikita Popov 410331869d [NewPM] Port MergeFunctions pass
This ports the MergeFunctions pass to the NewPM. This was rather
straightforward, as no analyses are used.

Additionally MergeFunctions needs to be conditionally enabled in
the PassBuilder, but I left that part out of this patch.

Differential Revision: https://reviews.llvm.org/D72537
2020-01-14 20:55:41 +01:00
Teresa Johnson 7dc4bbf8ab [ThinLTO] Handle variable with twice promoted name (Rust)
Summary:
Ensure that we can internalize values produced from two rounds of
promotion.

Note that this cannot currently happen via clang, but it can in other use
cases such as the Rust compiler, which does a first round of ThinLTO on
library code, producing bitcode, and a second round on the final binary.

In particular this can happen if a function is exported and promoted,
ending up with a ".llvm.${hash}" suffix, and then goes through a round
of optimization creating an internal switch table expansion variable
that is internal and contains the promoted name of the enclosing
function. This variable will be promoted in the second round of ThinLTO
if @foo is imported again, and therefore ends up with two
".llvm.${hash}" suffixes. Only the final one should be stripped when
consulting the index to locate the summary.
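
A minimal sketch of "strip only the final suffix", assuming a hypothetical
helper `stripLastPromotionSuffix` that works on `std::string` (the real code
and its name may differ). Searching from the end of the name keeps any
earlier `.llvm.<hash>` that belongs to the enclosing function's promoted
name:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: drop only the last ".llvm.<hash>" suffix, if any.
std::string stripLastPromotionSuffix(const std::string &Name) {
  const std::string Marker = ".llvm.";
  std::string::size_type Pos = Name.rfind(Marker); // Search from the end.
  if (Pos == std::string::npos)
    return Name; // Never promoted: nothing to strip.
  return Name.substr(0, Pos);
}
```

For a twice-promoted name such as `switch.table.foo.llvm.123.llvm.456`, this
returns `switch.table.foo.llvm.123`, which is the name produced by the first
round of promotion.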

Reviewers: wmi

Subscribers: mehdi_amini, inglorion, hiraditya, JDevlieghere, steven_wu, dexonsmith, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72711
2020-01-14 10:54:03 -08:00
Warren Ristow f7e9f4f4c5 SCC: Allow ReplaceNode to safely support insertion
If scc_iterator::ReplaceNode is inserting a new entry in the map,
rather than replacing an existing entry, the possibility of growing
the map could cause a failure.  This change safely implements the
insertion.

Reviewed By: probinson

Differential Revision: https://reviews.llvm.org/D72469
2020-01-14 10:30:24 -08:00
Dmitri Gribenko 2948ec5ca9 Removed PointerUnion3 and PointerUnion4 aliases in favor of the variadic template 2020-01-14 18:56:29 +01:00
Ulrich Weigand 6aca3e8dfa [FPEnv] Add some comments to IRBuilder.h
As requested via post-commit comment for D71467, this adds comments
documenting CreateFCmp vs. CreateFCmpS to the header file.
2020-01-14 14:21:17 +01:00
Sam Elliott dee6e39c75 [RISCV][NFC] Deduplicate Atomic Intrinsic Definitions
Summary:
This is a slight cleanup, to use multiclasses to avoid the duplication between
the different atomic intrinsic definitions. The produced intrinsics are
unchanged, they're just generated in a more succinct way.

Reviewers: asb, luismarques, jrtc27

Reviewed By: luismarques, jrtc27

Subscribers: Jim, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, jfb, PkmX, jocewei, psnobl, benna, s.egerton, pzheng, sameer.abuasal, apazos, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71777
2020-01-14 13:18:06 +00:00
Sam McCall 41b5201888 [Target] Fix uninitialized value in 10c11e4e2d 2020-01-14 11:28:13 +01:00
Eli Friedman e68e4cbcc5 [GlobalISel] Change representation of shuffle masks in MachineOperand.
We're planning to remove the shufflemask operand from ShuffleVectorInst
(D72467); fix GlobalISel so it doesn't depend on that Constant.

The change to prelegalizercombiner-shuffle-vector.mir happens because
the input contains a literal "-1" in the mask (so the parser/verifier
weren't really handling it properly). We now treat it as equivalent to
"undef" in all contexts.

Differential Revision: https://reviews.llvm.org/D72663
2020-01-13 16:55:41 -08:00
Alexey Lapshin f163755eb0 [Dsymutil][Debuginfo][NFC] #3 Refactor dsymutil to separate DWARF optimizing part.
Summary:
This is the next portion of patches for dsymutil.

Create DwarfEmitter interface to generate all debug info tables.
Put DwarfEmitter into the DwarfLinker library and make tools/dsymutil/DwarfStreamer
a child of DwarfEmitter.

It passes check-all testing. MD5 checksum for clang .dSYM bundle matches
for the dsymutil with/without that patch.

Reviewers: JDevlieghere, friss, dblaikie, aprantl

Reviewed By: JDevlieghere

Subscribers: merge_guards_bot, hiraditya, thegameg, probinson, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D72476
2020-01-13 23:33:25 +03:00
Teresa Johnson d0aad9f56e [LTO] Constify lto::Config reference passed to backends (NFC)
The lto::Config object saved on the global LTO object should not be
updated by any of the LTO backends. Otherwise we could run into
interference between threads utilizing it. Motivated by some proposed
changes that would have caused it to get modified in the ThinLTO
backends.
2020-01-13 12:26:17 -08:00
Daniel Sanders a0f4600f4f Rework be15dfa88f such that it works with GlobalISel which doesn't use EVT
Summary:
be15dfa88f broke GlobalISel's usage of getSetCCInverse() which currently
appears to be limited to our out-of-tree backend. GlobalISel doesn't use
EVT's and isn't able to derive them from the information it has as it
doesn't distinguish between integer and floating point types (that
distinction is made by operations rather than values). Bring back the
bool version of getSetCCInverse() in a way that doesn't break the intent
of be15dfa88f but also allows GlobalISel to continue using it.

Reviewers: spatel, bogner, arichardson

Reviewed By: arichardson

Subscribers: rovka, hiraditya, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72309
2020-01-13 12:19:37 -08:00
Matt Arsenault d7d88b9d8b GlobalISel: Fix assertion on wide G_ZEXT sources
It's possible to have a type that needs a mask greater than 64-bits.
2020-01-13 08:29:45 -05:00
James Henderson b6ffa2fe12 [DebugInfo][Support] Replace DWARFDataExtractor size function
This patch adds a new size function to the base DataExtractor class,
which removes the need for the DWARFDataExtractor size function.

It is unclear why DWARFDataExtractor's size function returned zero in
some circumstances (i.e. when it is constructed without a section, and
with a different data source instead), so that behaviour has changed.
The old behaviour could cause an assertion in the debug line parser, as
the size did not reflect the actual data available, and could be lower
than the current offset being parsed.

Reviewed by: dblaikie

Differential Revision: https://reviews.llvm.org/D72337
2020-01-13 10:53:00 +00:00
KAWASHIMA Takahiro 10c11e4e2d This option allows selecting the TLS size in the local exec TLS model,
which is the default TLS model for non-PIC objects. This allows large/
many thread local variables or a compact/fast code in an executable.

Specification is same as that of GCC. For example, the code model
option precedes the TLS size option.

TLS access models other than local-exec are not changed. It means
support of the large code model is only in the local exec TLS model.

Patch By KAWASHIMA Takahiro (kawashima-fj <t-kawashima@fujitsu.com>)
Reviewers: dmgreen, mstorsjo, t.p.northover, peter.smith, ostannard
Reviewed By: peter.smith
Committed by: peter.smith

Differential Revision: https://reviews.llvm.org/D71688
2020-01-13 10:16:53 +00:00
Sam Parker 9d3e78e704 [NFC] Update loop.decrement.reg intrinsic comment
Note that the intrinsic is now understood by SCEV and that other
optimisations can treat it as a sub.
2020-01-13 09:18:57 +00:00
Fangrui Song 6fdd6a7b3f [Disassembler] Delete the VStream parameter of MCDisassembler::getInstruction()
The argument is llvm::null() everywhere except llvm::errs() in
llvm-objdump in -DLLVM_ENABLE_ASSERTIONS=On builds. It is used by no
target but X86 in -DLLVM_ENABLE_ASSERTIONS=On builds.

If we ever need to add verbose logging to disassemblers, we can
record the log with a member function, instead of passing it around as an
argument.
2020-01-11 13:34:52 -08:00
Alexandre Ganea a1f16998f3 [Support] Optionally call signal handlers when a function wrapped by the CrashRecoveryContext fails
This patch allows for handling a failure inside a CrashRecoveryContext in the same way as the global exception/signal handler. A failure will have the same side effects, such as cleanup of temporary files, printing the call stack, calling relevant signal handlers, and finally returning an exception code. This is an optional feature, disabled by default.
This is a support patch for D69825.

Differential Revision: https://reviews.llvm.org/D70568
2020-01-11 15:27:07 -05:00
Craig Topper bb2553175a [TargetLowering][ARM][Mips][WebAssembly] Remove the ordered FP compare from RuntimeLibcalls.def and all associated usages
Summary:
This always just used the same libcall as unordered, but the comparison predicate was different. This change appears to have been made when targets were given the ability to override the predicates. Before that they were hardcoded into the type legalizer. At that time we never inverted predicates and we handled ugt/ult/uge/ule compares by emitting an unordered check ORed with an ogt/olt/oge/ole check. So only ordered needed an inverted predicate. Later ugt/ult/uge/ule were optimized to only call a single libcall and invert the compare.

This patch removes the ordered entries and just uses the inverting logic that is now present. This removes some odd things in both the Mips and WebAssembly code.

Reviewers: efriedma, ABataev, uweigand, cameron.mcinally, kpn

Reviewed By: efriedma

Subscribers: dschuff, sdardis, sbc100, arichardson, jgravelle-google, kristof.beyls, hiraditya, aheejin, sunfish, atanasyan, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72536
2020-01-10 19:30:08 -08:00
Vedant Kumar e05e219926 [LockFileManager] Make default waitForUnlock timeout a parameter, NFC
Patch by Xi Ge!
2020-01-10 15:24:32 -08:00
Vedant Kumar a9052b4dfc [AArch64] Add isAuthenticated predicate to MCInstDesc
Add a predicate to MCInstDesc that allows tools to determine whether an
instruction authenticates a pointer. This can be used by diagnostic
tools to hint at pointer authentication failures.

Differential Revision: https://reviews.llvm.org/D70329

rdar://55089604
2020-01-10 14:30:52 -08:00
Jonas Devlieghere 815a3f5433 [CMake] Fix modules build after DWARFLinker reorganization
Create a dedicated module for the DWARFLinker and make it depend on
intrinsics gen.
2020-01-10 11:06:38 -08:00
Fangrui Song 4d1e23e3b3 [AArch64] Add function attribute "patchable-function-entry" to add NOPs at function entry
The Linux kernel uses -fpatchable-function-entry to implement DYNAMIC_FTRACE_WITH_REGS
for arm64 and parisc. GCC 8 implemented
-fpatchable-function-entry, which can be seen as a generalized form of
-mnop-mcount. The N,M form (function entry points before the Mth NOP) is
currently only used by parisc.

This patch adds N,0 support to AArch64 codegen. N is represented as the
function attribute "patchable-function-entry". We will use a different
function attribute for M, if we decide to implement it.

The patch reuses the existing patchable-function pass, and
TargetOpcode::PATCHABLE_FUNCTION_ENTER which is currently used by XRay.

When the integrated assembler is used, __patchable_function_entries will
be created for each text section with the SHF_LINK_ORDER flag to prevent
--gc-sections (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93197) and
COMDAT (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93195) issues.

Retrospectively, __patchable_function_entries should use a PC-relative
relocation type to avoid the SHF_WRITE flag and dynamic relocations.

"patchable-function-entry"'s interaction with Branch Target
Identification is still unclear (see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424 for GCC discussions).

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D72215
2020-01-10 09:55:51 -08:00
Ulrich Weigand f0fd11df7d [FPEnv] Invert sense of MIFlag::FPExcept flag
In D71841 we inverted the sense of the SDNode-level flag to ensure all nodes
default to potentially raising FP exceptions unless otherwise specified --
i.e. if we forget to propagate the flag somewhere, the effect is now only
lost performance, not incorrect code.

However, the related flag at the MI level still defaults to nodes not raising
FP exceptions unless otherwise specified. To be fully on the (conservatively)
safe side, we should invert that flag as well.

This patch does so by replacing MIFlag::FPExcept with MIFlag::NoFPExcept.
(Note that this does also introduce an incompatible change in the MIR format.)
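
A minimal sketch of why the inverted sense is the conservative one, using a
hypothetical flag word (`FlagSketch`, `NoFPExceptSketch`, and both functions
are illustrative names, not LLVM's definitions). If a transformation forgets
to propagate flags, the instruction is treated as possibly raising an
exception, which costs performance but stays correct:

```cpp
#include <cassert>

// Hypothetical flag bit: set only when an instruction is known NOT to raise
// a floating-point exception.
enum FlagSketch : unsigned { NoFPExceptSketch = 1u << 0 };

// With the inverted sense, the absence of the flag is the safe default.
bool mayRaiseFPException(unsigned Flags) {
  return (Flags & NoFPExceptSketch) == 0;
}

// Models a transformation that forgets to copy flags to the new instruction.
unsigned dropAllFlags(unsigned /*OldFlags*/) { return 0; }
```

Here `mayRaiseFPException(dropAllFlags(NoFPExceptSketch))` is true, so a
lost flag can only block an optimization and never licenses an unsafe one.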

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D72466
2020-01-10 15:34:50 +01:00
Ulrich Weigand 76e9c2a987 [FPEnv] Generate constrained FP comparisons from clang
Update the IRBuilder to generate constrained FP comparisons in
CreateFCmp when IsFPConstrained is true, similar to the other
places in the IRBuilder.

Also, add a new CreateFCmpS to emit signaling FP comparisons,
and use it in clang where comparisons are supposed to be signaling
(currently, only when emitting code for the <, <=, >, >= operators).

Note that there is currently no way to add fast-math flags to a
constrained FP comparison, since this is implemented as an intrinsic
call that returns a boolean type, and FMF are only allowed for calls
returning a floating-point type. However, given the discussion around
https://bugs.llvm.org/show_bug.cgi?id=42179, it seems that FCmp itself
really shouldn't have any FMF either, so this is probably OK.

Reviewed by: craig.topper

Differential Revision: https://reviews.llvm.org/D71467
2020-01-10 14:33:10 +01:00
Peng Guo cfd8498401 [MIR] Fix cyclic dependency of MIR formatter
Summary:
Move MIR formatter pointer from TargetMachine to TargetInstrInfo to
avoid cyclic dependency between target & codegen.

Reviewers: dsanders, bkramer, arsenm

Subscribers: wdng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72485
2020-01-10 11:18:12 +01:00
Wei Mi 21a4710c67 [ThinLTO] Pass CodeGenOpts like UnrollLoops/VectorizeLoop/VectorizeSLP
down to pass builder in ltobackend.

Currently CodeGenOpts like UnrollLoops/VectorizeLoop/VectorizeSLP in clang
are not passed down to pass builder in ltobackend when new pass manager is
used. This is inconsistent with the behavior when new pass manager is used
and thinlto is not used. Such inconsistency causes slp vectorization pass
not being enabled in ltobackend for O3 + thinlto right now. This patch
fixes that.

Differential Revision: https://reviews.llvm.org/D72386
2020-01-09 21:13:11 -08:00
Matt Arsenault 10edb1d0d4 TableGen/GlobalISel: Fix pattern matching of immarg literals
For arguments that are not expected to be materialized with
G_CONSTANT, this was emitting predicates which could never match. It
was first adding a meaningless LLT check, which would always fail due
to the operand not being a register.

Infer the cases where a literal should check for an immediate operand,
instead of a register. This avoids needing to invent a special way of
representing timm literal values.

Also handle immediate arguments in GIM_CheckLiteralInt. The comments
stated it handled isImm() and isCImm(), but that wasn't really true.

This unblocks work on the selection of all of the complicated AMDGPU
intrinsics in future commits.
2020-01-09 17:37:52 -05:00
Matt Arsenault b4a647449f TableGen/GlobalISel: Add way for SDNodeXForm to work on timm
The current implementation assumes there is an instruction associated
with the transform, but this is not the case for
timm/TargetConstant/immarg values. These transforms should directly
operate on a specific MachineOperand in the source
instruction. TableGen would assert if you attempted to define an
equivalent GISDNodeXFormEquiv using timm when it failed to find the
instruction matcher.

Specially recognize SDNodeXForms on timm, and pass the operand index
to the render function.

Ideally this would be a separate render function type that looks like
void renderFoo(MachineInstrBuilder, const MachineOperand&), but this
proved to be somewhat mechanically painful. Add an optional operand
index which will only be passed if the transform should only look at
the one source operand.

Theoretically it would also be possible to only ever pass the
MachineOperand, and the existing renderers would check the parent. I
think that would be somewhat ugly for the standard usage which may
want to inspect other operands, and I also think MachineOperand should
eventually not carry a pointer to the parent instruction.

Use it in one sample pattern. This isn't a great example, since the
transform exists to satisfy DAG type constraints. This could also be
avoided by just changing the MachineInstr's arbitrary choice of
operand type from i16 to i32. Other patterns have nontrivial uses, but
this serves as the simplest example.

One flaw this still has is if you try to use an SDNodeXForm defined
for imm, but the source pattern uses timm, you still see the "Failed
to lookup instruction" assert. However, there is now a way to avoid
it.
2020-01-09 17:37:52 -05:00
Matt Arsenault 0ea3c7291f GlobalISel: Handle llvm.read_register
Compared to the attempt in bdcc6d3d26,
this uses intermediate generic instructions.
2020-01-09 17:37:52 -05:00
Matt Arsenault 255cc5a760 CodeGen: Use LLT instead of EVT in getRegisterByName
Only PPC seems to be using it, and only checks some simple cases and
doesn't distinguish between FP. Just switch to using LLT to simplify
use from GlobalISel.
2020-01-09 17:37:52 -05:00
Matt Arsenault 595ac8c46e GlobalISel: Move getLLTForMVT/getMVTForLLT
As an intermediate step, some TLI functions can be converted to using
LLT instead of MVT. Move this somewhere out of GlobalISel so DAG
functions can use these.
2020-01-09 16:32:51 -05:00
Matt Arsenault f937b43fdb TableGen/GlobalISel: Address fixme
Don't call computeAvailableFunctionFeatures for every instruction.
2020-01-09 16:29:44 -05:00
Eric Astor 1c545f6dbc [ms] [X86] Use "P" modifier on all branch-target operands in inline X86 assembly.
Summary:
Extend D71677 to apply to all branch-target operands, rather than special-casing call instructions.

Also add a regression test for llvm.org/PR44272, since this finishes fixing it.

Reviewers: thakis, rnk

Reviewed By: thakis

Subscribers: merge_guards_bot, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72417
2020-01-09 14:55:03 -05:00
Bruno Ricci ed6daa2e1d [Support][NFC] Add a comment about the semantics of MF_HUGE_HINT flag 2020-01-09 17:34:18 +00:00
Whitney Tsang d27a15fed7 [NFCI][LoopUnrollAndJam] Changing LoopUnrollAndJamPass to a function
pass.

Summary: This patch changes LoopUnrollAndJamPass to a function pass, and
keeps the loops traversal order same as defined in
FunctionToLoopPassAdaptor LoopPassManager.h.

The next patch will change the loop traversal to outer to inner order,
so more loops can be transformed.

Discussion in llvm-dev mailing list:
https://groups.google.com/forum/#!topic/llvm-dev/LF4rUjkVI2g
Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto
Reviewed By: dmgreen
Subscribers: hiraditya, zzheng, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D72230
2020-01-09 16:18:36 +00:00
Simon Tatham 9704ba652a [ARM,MVE] Add missing IntrNoMem flag on IR intrinsics.
A lot of the IR-level intrinsics we've been defining for MVE recently
accidentally had `props = []` instead of `props = [IntrNoMem]`, so
that optimization would have been overcautious about reordering them.

All the affected cases were due to instantiating the multiclasses
`MVEPredicated` and `MVEMXPredicated` without filling in the `props`
parameter, because I //thought// I remembered having set the defaults
in those multiclasses to `[IntrNoMem]`. In fact I hadn't done that.
Now I have.

(The IR intrinsics that //do// read and write memory are all
explicitly marked as `[IntrReadMem]` or `[IntrWriteMem]` already, so
they will override these defaults.)
2020-01-09 15:04:47 +00:00
Kazushi (Jam) Marukawa 00c6e98409 [VE] Target stub for NEC SX-Aurora
Summary:
This patch registers the 've' target: the NEC SX-Aurora TSUBASA Vector Engine.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D69103
2020-01-09 11:17:35 +01:00
Evgeniy Brevnov f0abe820ee [LoopUtils][NFC] Minor refactoring in getLoopEstimatedTripCount. 2020-01-09 16:49:15 +07:00
Ehud Katz 24b326cc61 [APFloat] Fix checked error assert failures
`APFloat::convertFromString` returns an `Expected` result, which must be
"checked" if the LLVM_ENABLE_ABI_BREAKING_CHECKS preprocessor flag is
set.
To mark an `Expected` result as "checked" we must consume the `Error`
within.
In many cases, we are only interested in knowing whether an error occurred,
without the need to examine the error info. This is easily achieved
with the `errorToBool()` API.
2020-01-09 09:42:32 +02:00
Daniel Sanders de3d0ee023 Revert "Revert "[MIR] Target specific MIR formating and parsing""
There was an unguarded dereference of MF in a function that permitted
nullptr. Fixed

This reverts commit 71d64f72f9.
2020-01-08 20:03:29 -08:00
Nico Weber 71d64f72f9 Revert "[MIR] Target specific MIR formating and parsing"
This reverts commit 3ef05d85be.
It broke check-llvm on many bots, see comments on D69836.
2020-01-08 22:50:49 -05:00
Peng Guo 3ef05d85be [MIR] Target specific MIR formating and parsing
Summary:
Added MIRFormatter for target-specific MIR formatting and parsing with
immediate and custom pseudo source values. A target machine can subclass
MIRFormatter and implement custom logic for printing and parsing
immediate and custom pseudo source values for better readability.

* A target-specific immediate mnemonic needs to start with "." followed by
  an identifier string. When the MIR parser sees such an immediate it will
  call the target-specific parsing function.

* A custom pseudo source value needs to start with custom followed by
  a double-quoted string. The MIR parser will pass the quoted string to the
  target-specific PSV parsing function.

* MIRFormatter has 2 helper functions to facilitate LLVM value printing
  and parsing for custom PSVs if they refer to LLVM values.

Patch by Peng Guo

Reviewers: dsanders, arsenm

Reviewed By: dsanders

Subscribers: wdng, jvesely, nhaehnle, hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69836
2020-01-08 18:48:02 -08:00
Daniel Sanders 5ab6fa7b70 Revert "[MIR] Target specific MIR formating and parsing"
Forgot to credit Peng in the commit message.

This reverts commit be841f89d0.
2020-01-08 18:48:02 -08:00
Peng Guo be841f89d0 [MIR] Target specific MIR formating and parsing
Summary:
Added MIRFormatter for target-specific MIR formatting and parsing with
immediate and custom pseudo source values. A target machine can subclass
MIRFormatter and implement custom logic for printing and parsing
immediate and custom pseudo source values for better readability.

* A target-specific immediate mnemonic needs to start with "." followed by
  an identifier string. When the MIR parser sees such an immediate it will
  call the target-specific parsing function.

* A custom pseudo source value needs to start with custom followed by
  a double-quoted string. The MIR parser will pass the quoted string to the
  target-specific PSV parsing function.

* MIRFormatter has 2 helper functions to facilitate LLVM value printing
  and parsing for custom PSVs if they refer to LLVM values.

Reviewers: dsanders, arsenm

Reviewed By: dsanders

Subscribers: wdng, jvesely, nhaehnle, hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69836
2020-01-08 18:34:21 -08:00
Johannes Doerfert a4088c75cc [Attributor][FIX] Carefully change invokes to calls (after manifest)
Before we manually inserted unreachable early but that could lead to
broken PHI nodes. Now we use the existing late modification
functionality.
2020-01-08 19:32:38 -06:00
Johannes Doerfert 1e46eb74be [Attributor][FIX] Avoid dangling value pointers during code modification
When we replace instructions with unreachable we delete instructions. We
now avoid dangling pointers to those deleted instructions in the
`ToBeChangedToUnreachableInsts` set. Other modification collections
might need to be updated in the future as well.
2020-01-08 19:32:37 -06:00
Justin Hibbits ff0311c4b3 [PowerPC]: Add powerpcspe target triple subarch component
Summary:
This allows '-target powerpcspe-unknown-linux-gnu' or
'powerpcspe-unknown-freebsd' to be used instead of
'-target powerpc-unknown-linux-gnu -mspe'.

Reviewed By: dim
Differential Revision: https://reviews.llvm.org/D72014
2020-01-08 19:10:53 -06:00
Jonas Paulsson 659efa21f1 Recommit "[MachineVerifier] Improve verification of live-in lists."
MachineVerifier::visitMachineFunctionAfter() is extended to check the
live-through case for live-in lists. This is only done for registers without
aliases that are neither allocatable nor reserved, such as the SystemZ::CC
register.

Previously, the MachineVerifier only caught the case of a live-in use without an
entry in the live-in list (as "using an undefined physical register").

A comment in LivePhysRegs.h has been added stating a guarantee that
addLiveOuts() can be trusted for a full register both before and after
register allocation.

Review: Quentin Colombet

Differential Revision: https://reviews.llvm.org/D68267
2020-01-08 16:58:54 -08:00
Evgenii Stepanov 58deb20dd2 Revert "Merge memtag instructions with adjacent stack slots."
*** Bad machine code: Tied use must be a register ***
- function:    stg_alloca17
- basic block: %bb.0 entry (0x20076710580)
- instruction: early-clobber %0:gpr64common, early-clobber %1:gpr64sp = STGloop 272, %stack.0.a :: (store 272 into %ir.a, align 16)
- operand 3:   %stack.0.a

http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/21481/steps/test-check-all/logs/stdio

This reverts commit b675a7628c.
2020-01-08 14:36:12 -08:00
Kazu Hirata 2d258ed931 Revert "[JumpThreading] Thread jumps through two basic blocks"
It looks like my patch breaks the sanitizer-windows build:

http://lab.llvm.org:8011/builders/sanitizer-windows/builds/56324

This reverts commit ead815924e.
2020-01-08 13:58:39 -08:00
Evgenii Stepanov b675a7628c Merge memtag instructions with adjacent stack slots.
Summary:
Detect a run of memory tagging instructions for adjacent stack frame slots,
and replace them with a shorter instruction sequence
* replace STG + STG with ST2G
* replace STGloop + STGloop with STGloop

This code needs to run when stack slot offsets are already known, but before
FrameIndex operands in STG instructions are eliminated; that's the
reason for the new hook in PrologueEpilogue.

This change modifies STGloop and STZGloop pseudos to take the size as an
immediate integer operand, and base address as a FI operand when
possible. This is needed to simplify recognizing an STGloop instruction
as operating on a stack slot post-regalloc.

This improves memtag code size by ~0.25%, and it looks like an additional ~0.1%
is possible by rearranging the stack frame such that consecutive STG
instructions reference adjacent slots (patch pending).

Reviewers: pcc, ostannard

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70286
2020-01-08 11:02:03 -08:00
Philip Reames 29ccb12e2c [BranchAlign] Compiler support for suppressing branch align
As discussed heavily in the original review (D70157), there's a need for the compiler to be able to selectively suppress padding (either nop or prefix) to respect assumptions about the meaning of labels and instructions in generated code.

Rather than wait for syntax to be finalized - which appears to be a very slow process - this patch focuses on the compiler use case and *only* worries about the integrated assembler. To my knowledge, this covers all cases mentioned to date for clang/JIT support.

For testing purposes, I wired it up so that if the integrated assembler was using autopadding for branch alignment (e.g. enabled at command line) then the textual assembly output would contain a comment for each location where padding was enabled or disabled. This seemed like the least painful choice overall.

Note that this patch effectively disables the jcc erratum mitigation for many constructs (statepoints, implicit null checks, xray, etc.), which is not ideal. It is at least *correct* and should allow us to enable the mitigation for the compiler. Once that's done, and a few other items are worked through, we probably want to come back to this and explore a bundling-based approach instead so that we can pad instructions while keeping labels in the right place.

Differential Revision: https://reviews.llvm.org/D72303
2020-01-08 10:03:30 -08:00
Kazu Hirata ead815924e [JumpThreading] Thread jumps through two basic blocks
Summary:
This patch teaches JumpThreading.cpp to thread through two basic
blocks like:

  bb3:
    %var = phi i32* [ null, %bb1 ], [ @a, %bb2 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb4:
    %cmp = icmp eq i32* %var, null
    br i1 %cmp, label %bb5, label %bb6

by duplicating basic blocks like bb3 above.  Once we duplicate bb3 as
bb3.dup and redirect edge bb1->bb3 to bb1->bb3.dup, we have:

  bb3:
    %var = phi i32* [ @a, %bb2 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb3.dup:
    %var = phi i32* [ null, %bb1 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb4:
    %cmp = icmp eq i32* %var, null
    br i1 %cmp, label %bb5, label %bb6

Then the existing code in JumpThreading.cpp can thread edge
bb3.dup->bb4 through bb4 and eventually create bb3.dup->bb5.

Reviewers: wmi

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70247
2020-01-08 06:57:36 -08:00
Simon Tatham dac7b23cc3 [ARM,MVE] Intrinsics for variable shift instructions.
This batch of intrinsics fills in all the shift instructions that take
a variable shift distance in a register, instead of an immediate. Some
of these instructions take a single shift distance in a scalar
register and apply it to all lanes; others take a vector of per-lane
distances.

These instructions are all basically one family, varying in whether
they saturate out-of-range values, and whether they round when bits
are shifted off the bottom. I've implemented them at the IR level by a
much smaller family of IR intrinsics, which take flag parameters to
indicate saturating and/or rounding (along with the usual one to
specify signed/unsigned integers).

An oddity is that all of them are //left// shift instructions – but if
you pass a negative shift count, they'll shift right. So the vector
shift distances are always vectors of //signed// integers, regardless
of whether you're considering the other input vector to be of signed
or unsigned. Also, even the simplest `vshlq` instruction in this
family (neither saturating nor rounding) has to be implemented as an
IR intrinsic, because the ordinary LLVM IR `shl` operation would
consider an out-of-range shift count to be undefined behavior.
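
The sign convention described above can be illustrated with a small model. This is only a sketch of a single unsigned, non-saturating, non-rounding lane in Python, not the definition of any MVE instruction; `vshl_lane` and `vshl_vector` are hypothetical names, and the result for shift counts at or beyond the lane width is simply what this code computes.

```python
from typing import List


def vshl_lane(value: int, shift: int, bits: int = 16) -> int:
    """Toy model of one unsigned lane: shift left, or right if shift < 0."""
    mask = (1 << bits) - 1
    value &= mask
    if shift >= 0:
        # Bits shifted past the top of the lane are discarded.
        return (value << shift) & mask
    # A negative count is a logical right shift by its magnitude.
    return value >> (-shift)


def vshl_vector(values: List[int], shifts: List[int], bits: int = 16) -> List[int]:
    """Apply a per-lane signed shift distance to each lane."""
    return [vshl_lane(v, s, bits) for v, s in zip(values, shifts)]
```

Note that the shift distances are signed even though the data lanes here are treated as unsigned, which mirrors the point made in the message.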

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72329
2020-01-08 14:42:24 +00:00
Simon Tatham 3100480925 [ARM,MVE] Intrinsics for partial-overwrite imm shifts.
This batch of intrinsics covers two sets of immediate shift
instructions, which have in common that they only overwrite part of
their output register and so they need an extra input giving its
previous value.

The VSLI and VSRI instructions shift each lane of the input vector
left or right just as if they were normal immediate VSHL/VSHR, but
then they only overwrite the output bits that correspond to actual
shifted bits of the input. So VSLI will leave the low n bits of each
output lane unchanged, and VSRI the same with the top n bits.
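
As a rough illustration of the partial overwrite, the following Python sketch models one lane of each operation. The function names and the 16-bit default lane width are assumptions made for the example; this is not the instruction pseudocode.

```python
def vsli_lane(dest: int, src: int, n: int, bits: int = 16) -> int:
    """Shift src left by n and insert; the low n bits of dest survive."""
    mask = (1 << bits) - 1
    keep_low = (1 << n) - 1
    return ((src << n) & mask) | (dest & keep_low)


def vsri_lane(dest: int, src: int, n: int, bits: int = 16) -> int:
    """Shift src right by n and insert; the top n bits of dest survive."""
    mask = (1 << bits) - 1
    keep_top = mask & ~(mask >> n)
    return ((src & mask) >> n) | (dest & keep_top)
```

For instance, with an all-ones destination lane, `vsli_lane(0xFFFF, 0x0001, 4)` keeps the low four ones and `vsri_lane(0xFFFF, 0x1000, 4)` keeps the top four.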

The V[Q][R]SHR[U]N family are all narrowing shifts: they take an input
vector of 2n-bit integers, shift each lane right by a constant, and
then narrow the shifted result to only n bits. So they only
overwrite half of the n-bit lanes in the output register, and the B/T
suffix indicates whether it's the bottom or top half of each 2n-bit
lane.
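
A minimal sketch of the bottom/top behaviour, assuming the plain truncating variant (no saturation, no rounding) and modelling the output register as a list of 2n-bit lanes. `vshrn` here is a hypothetical Python helper, not the IR intrinsic of the same name.

```python
from typing import List


def vshrn(dest: List[int], src: List[int], shift: int,
          n: int = 8, top: bool = False) -> List[int]:
    """Narrow each 2n-bit src lane to n bits and write one half of dest."""
    half = (1 << n) - 1
    out = []
    for d, s in zip(dest, src):
        narrowed = (s >> shift) & half  # truncate to n bits
        if top:
            # Overwrite the top half, keep the bottom half of dest.
            out.append((narrowed << n) | (d & half))
        else:
            # Overwrite the bottom half, keep the top half of dest.
            out.append((d & (half << n)) | narrowed)
    return out
```

The extra `dest` argument plays the role of the "previous value" input that these intrinsics need.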

I've implemented the whole of the latter family using a single IR
intrinsic `vshrn`, which takes a lot of i32 parameters indicating
which instruction it expands to (by specifying signedness of the input
and output types, whether it saturates and/or rounds, etc).

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72328
2020-01-08 14:42:24 +00:00
Bevin Hansson 8e2b44f7e0 [Intrinsic] Add fixed point division intrinsics.
Summary:
This patch adds intrinsics and ISelDAG nodes for
signed and unsigned fixed-point division:

  llvm.sdiv.fix.*
  llvm.udiv.fix.*

These intrinsics perform scaled division on two
integers or vectors of integers. They are required
for the implementation of the Embedded-C fixed-point
arithmetic in Clang.
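
To make "scaled division" concrete, here is a small Python sketch. It assumes the scale is the number of fractional bits and that results round toward zero; both are assumptions of this example, and `sdiv_fix` / `udiv_fix` are hypothetical helpers rather than the intrinsics themselves. Overflow and division by zero are not modelled.

```python
def udiv_fix(a: int, b: int, scale: int) -> int:
    """Unsigned fixed-point divide: pre-scale the dividend, then divide."""
    return (a << scale) // b


def sdiv_fix(a: int, b: int, scale: int) -> int:
    """Signed fixed-point divide, rounding toward zero (an assumption)."""
    num = a << scale
    quotient = abs(num) // abs(b)
    same_sign = (num < 0) == (b < 0)
    return quotient if same_sign else -quotient
```

With two fractional bits, 1.5 is stored as 6 and 2.0 as 8, so `sdiv_fix(6, 8, 2)` yields 3, which is 0.75 in the same format.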

Patch by: ebevhan

Reviewers: bjope, leonardchan, efriedma, craig.topper

Reviewed By: craig.topper

Subscribers: Ka-Ka, ilya, hiraditya, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70007
2020-01-08 15:17:46 +01:00
Qiu Chaofan b2c2fe7219 [NFC] Move InPQueue into arguments of releaseNode
This patch moves `InPQueue` into function arguments instead of template
arguments of `releaseNode`, which is a cleaner approach.

Differential Revision: https://reviews.llvm.org/D72125
2020-01-08 22:15:32 +08:00
Alexey Lapshin 1cf11a4c67 [Dsymutil][Debuginfo][NFC] Reland: Refactor dsymutil to separate DWARF optimizing part. #2.
Summary:
This patch relands D71271. The problem with D71271 is that it has a cyclic dependency:
CodeGen->AsmPrinter->DebugInfoDWARF->CodeGen. To avoid the cyclic dependency this patch
puts the implementation of DWARFOptimizer into a separate library: lib/DWARFLinker.

Thus the difference between this patch and D71271 is that DWARFOptimizer is renamed to
DWARFLinker and its files are put into lib/DWARFLinker.

Reviewers: JDevlieghere, friss, dblaikie, aprantl

Reviewed By: JDevlieghere

Subscribers: thegameg, merge_guards_bot, probinson, mgorny, hiraditya, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D71839
2020-01-08 14:15:31 +03:00
Tim Northover 903e5c3028 AArch64: add missing Apple CPU names and use them by default.
Apple's CPUs are called A7-A13 in official communication, occasionally with
weird suffixes which we probably don't need to care about. This adds each one
and describes its features. It also switches the default CPU to the canonical
name for Cyclone, but leaves legacy support in so that existing bitcode still
compiles.
2020-01-08 09:24:06 +00:00
Wang, Pengfei 9a621de1ec [X86] Adding fp128 support for strict fcmp
Summary: Adding fp128 support for strict fcmp

Reviewers: craig.topper, LiuChen3, andrew.w.kaylor, RKSimon, uweigand

Subscribers: hiraditya, llvm-commits, LuoYuanke

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71897
2020-01-08 12:59:31 +08:00
czhengsz 8b8ba44047 [SCEV] get more accurate range for AddExpr with wrap flag.
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D64869
2020-01-07 20:58:04 -05:00
Amara Emerson b6598bcf4b [AArch64][GlobalISel] Fold a chain of two G_PTR_ADDs of constant offsets.
E.g.
%addr1 = G_PTR_ADD %base, G_CONSTANT 20
%addr2 = G_PTR_ADD %addr1, G_CONSTANT 8
  -->
%addr2 = G_PTR_ADD %base, G_CONSTANT 28
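
The fold can be sketched outside LLVM as follows. This is a toy model in Python, not the GlobalISel combiner: the `defs` table and `fold_ptr_add_chain` are hypothetical, and any legality or profitability checks a real combine might perform are omitted.

```python
from typing import Dict, Tuple, Union

# Toy IR: register name -> (base register, offset). The offset is an int
# when it is a known constant, or a register name when it is not.
PtrAdd = Tuple[str, Union[int, str]]


def fold_ptr_add_chain(defs: Dict[str, PtrAdd], reg: str) -> PtrAdd:
    """Fold (ptr_add (ptr_add base, C1), C2) into (ptr_add base, C1 + C2)."""
    base, off = defs[reg]
    inner = defs.get(base)
    if isinstance(off, int) and inner is not None and isinstance(inner[1], int):
        return (inner[0], inner[1] + off)
    # Not a chain of two constant offsets: leave the operation unchanged.
    return (base, off)
```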

Differential Revision: https://reviews.llvm.org/D72351
2020-01-07 14:12:42 -08:00
Bill Wendling e886e762dd Revert "Allow output constraints on "asm goto""
This reverts commit 52366088a8.

I accidentally pushed this before supporting changes.
2020-01-07 13:44:08 -08:00
Bill Wendling 52366088a8 Allow output constraints on "asm goto"
Summary:
Remove the restrictions that prevent "asm goto" from returning non-void
values. The values returned by "asm goto" are only valid on the "fallthrough"
path.

Reviewers: jyknight, nickdesaulniers, hfinkel

Reviewed By: jyknight, nickdesaulniers

Subscribers: rsmith, hiraditya, llvm-commits, cfe-commits, craig.topper, rnk

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D69876
2020-01-07 13:40:26 -08:00
Fangrui Song 8edf759ca7 [PowerPC][Triple] Use elfv2 on freebsd>=13 and linux-musl
Summary:
Every powerpc64le platform uses elfv2.

For powerpc64, the environments "elfv1" and "elfv2" were added for
FreeBSD ELFv1->ELFv2 migration in D61950.  FreeBSD developers have
decided to use OS versions to select ABI, and no one is relying on the
environments.

Also use elfv2 on powerpc64-linux-musl.

Users can always use -mabi=elfv1 and -mabi=elfv2 to override the default
ABI.

Reviewed By: adalava

Differential Revision: https://reviews.llvm.org/D72352
2020-01-07 11:40:56 -08:00
Daniel Sanders 1d94fb2111 [gicombiner] Add GIMatchTree and use it for the code generation
Summary:
GIMatchTree's job is to build a decision tree by zipping all the
GIMatchDag's together.

Each DAG is added to the tree builder as a leaf and partitioners are used
to subdivide each node until there are no more partitioners to apply. At
this point, the code generator is responsible for testing any untested
predicates and following any unvisited traversals (there shouldn't be any
of the latter as the getVRegDef partitioner handles them all).

Note that the leaves don't always fit into partitions cleanly and the
partitions may overlap as a result. This is resolved by cloning the leaf
into every partition it belongs to. One example of this is a rule that can
match one of N opcodes. The leaf for this rule would end up in N partitions
when processed by the opcode partitioner. A similar example is the
getVRegDef partitioner where having rules (add $a, $b), and (add ($a, $b), $c)
will result in the former being in both the partition for successfully
following the vreg-def and the partition for failing to do so, as it
doesn't care which happens.
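
The leaf-cloning idea for the opcode partitioner can be sketched in a few lines of Python. This is an illustration only: the rule names, the opcode sets, and `partition_by_opcode` are hypothetical and do not reflect the TableGen backend's data structures.

```python
from collections import defaultdict
from typing import Dict, List, Set


def partition_by_opcode(rules: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Place each rule (leaf) in the partition of every opcode it can match."""
    partitions: Dict[str, List[str]] = defaultdict(list)
    for name, opcodes in rules.items():
        for opcode in sorted(opcodes):
            # A rule matching N opcodes is cloned into N partitions.
            partitions[opcode].append(name)
    return dict(partitions)
```

A rule that matches three opcodes therefore appears three times, once per partition, while single-opcode rules appear once.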

Depends on D69151

Fixed the issues with the windows bots which were caused by stdout/stderr
interleaving.

Reviewers: bogner, volkan

Reviewed By: volkan

Subscribers: lkail, mgorny, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69152
2020-01-07 11:12:53 -08:00
Matt Arsenault f26ed6e47c llc: Change behavior of -mcpu with existing attribute
Don't overwrite existing target-cpu attributes.

I've often found the replacement behavior annoying, and this is
inconsistent with how the fast math command line flags interact with
the function attributes.

Does not yet change target-features, since I think that should behave
as a concatenation.
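
The new behaviour amounts to "set only if absent". A minimal Python sketch, using a plain dictionary to stand in for a function's string attributes (the helper `apply_mcpu` and the CPU names in the usage are made up for illustration):

```python
from typing import Dict


def apply_mcpu(attrs: Dict[str, str], mcpu: str) -> Dict[str, str]:
    """Return attrs with target-cpu set from -mcpu only if not already present."""
    result = dict(attrs)
    if mcpu:
        # setdefault leaves an existing target-cpu attribute untouched.
        result.setdefault("target-cpu", mcpu)
    return result
```

So `apply_mcpu({"target-cpu": "cpu-a"}, "cpu-b")` keeps `cpu-a`, whereas a function without the attribute picks up `cpu-b`.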
2020-01-07 10:10:25 -05:00
Simon Pilgrim c758e46923 Fix Wdocumentation warnings. NFCI. 2020-01-07 10:55:38 +00:00
Ehud Katz 08de551f4f [APFloat] Fix fusedMultiplyAdd when `this` equals to `Addend`
Up until now, the arguments to `fusedMultiplyAdd` have been passed by
reference. We must save the `Addend` value at the beginning of the
function, before we modify `this`, as they may be the same reference.

To fix this, we now pass the `addend` parameter of `multiplySignificand`
by value (instead of by-ref), and have a default value of zero.
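
The aliasing hazard is easy to reproduce in any language with reference semantics. The Python sketch below uses a hypothetical `Num` class and plain arithmetic (it does not model fused rounding); it only shows why the addend has to be captured before the receiver is modified.

```python
class Num:
    """Toy mutable number used to demonstrate the aliasing bug."""

    def __init__(self, v: float) -> None:
        self.v = v

    def fma_buggy(self, mul: "Num", add: "Num") -> None:
        self.v = self.v * mul.v  # modifies self first...
        self.v = self.v + add.v  # ...so add.v is stale if add is self

    def fma_fixed(self, mul: "Num", add: "Num") -> None:
        addend = add.v           # take the addend by value up front
        self.v = self.v * mul.v + addend
```

With `x = Num(3)`, calling `x.fma_buggy(Num(2), x)` computes 6 + 6 instead of 6 + 3, while `fma_fixed` gives the intended result.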

Fix PR44051.

Differential Revision: https://reviews.llvm.org/D70422
2020-01-07 08:45:18 +02:00
Juneyoung Lee ff554a9179 Let PassBuilder Expose PassInstrumentationCallbacks
Summary:
This is an effort to allow external libraries to register their own pass instrumentation during their llvmGetPassPluginInfo() calls.

By exposing this through the added getPIC(), now a pass writer can do something like this:

```
extern "C" ::llvm::PassPluginLibraryInfo LLVM_ATTRIBUTE_WEAK
llvmGetPassPluginInfo() {
  return {
    ..,
    [](llvm::PassBuilder &PB) {
      PB.getPIC()->registerAfterPassCallback(move(f));
    }
  };
}
```

Reviewers: chandlerc, philip.pfaffe, fedor.sergeev

Reviewed By: fedor.sergeev

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71086
2020-01-07 14:10:37 +09:00
Fangrui Song aa708763d3 [MC] Add parameter `Address` to MCInstPrinter::printInst
printInst prints a branch/call instruction as `b offset` (there are many
variants on various targets) instead of `b address`.

It is a convention to use address instead of offset in most external
symbolizers/disassemblers. This difference makes `llvm-objdump -d`
output unsatisfactory.

Add `uint64_t Address` to printInst(), so that it can pass the argument to
printInstruction(). `raw_ostream &OS` is moved to the last to be
consistent with other print* methods.

The next step is to pass `Address` to printInstruction() (generated by
tablegen from the instruction set description). We can gradually migrate
targets to print addresses instead of offsets.

In any case, downstream projects which don't know `Address` can pass 0 as
the argument.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D72172
2020-01-06 20:42:22 -08:00
Fangrui Song 6904cd9486 Add Triple::isX86()
Reviewed By: craig.topper, skan

Differential Revision: https://reviews.llvm.org/D72247
2020-01-06 15:51:02 -08:00
Matt Arsenault f3de8ab5cc GlobalISel: Implement lower for G_INTRINSIC_ROUND
Mostly copied from AMDGPU lowering implementation, except used
G_SITOFP instead of directly creating a select on -1.0, 0.0.
2020-01-06 18:26:42 -05:00
Matt Arsenault ee6b8722ff GlobalISel: Fix unsupported legalize action
This would complain about invalid legalizer rules otherwise.

Mark some operations as unsupported for AMDGPU. This currently seems
to produce the same legalize error as when no rules are defined, but
eventually this should produce a proper user facing error.
2020-01-06 17:21:51 -05:00
Matt Arsenault 0b093f0212 GlobalISel: Start adding computeNumSignBits to GISelKnownBits 2020-01-06 17:21:51 -05:00
Matt Arsenault 5518a02a83 llc/MIR: Fix setFunctionAttributes for MIR functions
A random set of attributes are implemented by llc/opt forcing the
string attributes on the IR functions before processing anything. This
would not happen for MIR functions, which have not yet been created at
this point.

Use a callback in the MIR parser, purely to avoid dealing with the
ugliness that the command line flags are in a .inc file, and would
require allowing access to these flags from multiple places (either
from the MIR parser directly, or a new utility pass to implement these
flags). It would probably be better to cleanup the flag handling into
a separate library.

This is in preparation for treating more command line flags with a
corresponding function attribute in a more uniform way. The fast math
flags in particular have a messy system where the command line flag
sets the behavior from a function attribute if present, and otherwise
the command line flag. This means if any other pass tries to inspect
the function attributes directly, it will be inconsistent with the
intended behavior. This is also inconsistent with the current behavior
of -mcpu and -mattr, which overwrites any pre-existing function
attributes. I would like to move this to consistently have the command
line flags not overwrite any pre-existing attributes, and to always
ensure the command line flags are consistent with the function
attributes.
2020-01-06 17:21:51 -05:00
Simon Tatham 34817e04fe [ARM,MVE] Fix many signedness errors in MVE intrinsics.
Summary:
Running an end-to-end test last week I noticed that a lot of the ACLE
intrinsics that operate differently on vectors of signed and unsigned
integers were ending up generating the signed version of the
instruction unconditionally. This is because the IR intrinsics had no
way to distinguish signed from unsigned: the LLVM type system just
calls them both `v8i16` (or whatever), so you need either separate
intrinsics for signed and unsigned, or a flag parameter that tells
ISel which one to choose.

This patch fixes all the problems of that kind that I've noticed, by
adding an i32 flag parameter to many of the IR intrinsics which is set
to 1 for unsigned (matching the existing practice in cases where we
got it right), and conditioning all the isel patterns on that flag. So
the fundamental change is in `IntrinsicsARM.td`, changing the
low-level IR intrinsics API; there are knock-on changes in
`arm_mve.td` (adjusting code gen for the ACLE intrinsics to use the
modified API) and in `ARMInstrMVE.td` (adjusting isel to expect the
new unsigned flags). The rest of this patch is boringly updating tests.

Reviewers: dmgreen, miyuki, MarkMurrayARM

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72270
2020-01-06 16:33:16 +00:00
Matt Arsenault e4464bf3d4 AMDGPU/GlobalISel: Select scalar v2s16 G_BUILD_VECTOR 2020-01-06 11:19:33 -05:00
James Henderson d68904f957 [NFC] Fix trivial typos in comments
Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D72143

Patch by Kazuaki Ishizaki.
2020-01-06 10:50:26 +00:00
Shengchen Kan 5173bfcbc4 Add interface emitPrefix for MCCodeEmitter
Differential Revision: https://reviews.llvm.org/D72047
2020-01-06 17:51:14 +08:00
Ehud Katz c5fb73c5d1 [APFloat] Add recoverable string parsing errors to APFloat
Implementing the APFloat part in PR4745.

Differential Revision: https://reviews.llvm.org/D69770
2020-01-06 10:09:01 +02:00
Anton Afanasyev a792953330 [Metadata] Add TBAA struct metadata to `AAMDNode`
Summary:
Make `AAMDNodes`' `getAAMetadata()` and `setAAMetadata()` take `!tbaa.struct`
into account as well as `!tbaa`. This impacts llvm.org/pr42022.
This is a temporary fix needed so that the SROA pass keeps the `!tbaa.struct` tag.
The new field `TBAAStruct` should be deleted when the `!tbaa` tag replaces `!tbaa.struct`.
Merging two `!tbaa.struct` nodes into one conservatively yields `nullptr`
(giving `MayAlias`). This could be enhanced, but we rely on the said future
replacement instead.

Reviewers: RKSimon, spatel, vporpo

Subscribers: hiraditya, kosarev, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70924
2020-01-06 11:05:15 +03:00
Fangrui Song 2e46695003 [MC] Reorder members of MCFragment's subclasses to decrease padding
On a 64-bit platform:

sizeof(MCBoundaryAlignFragment): 64 -> 56
sizeof(MCOrgFragment): 72 -> 64
sizeof(MCFillFragment): 80 -> 72
sizeof(MCLEBFragment): 88 -> 80
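
The effect of member order on padding can be observed with Python's `ctypes`, which lays structures out using the platform's C alignment rules. The two structures below are hypothetical and are not the members of any MCFragment subclass.

```python
import ctypes


class Padded(ctypes.Structure):
    # A small field before a wide one forces padding before the wide field.
    _fields_ = [
        ("flag", ctypes.c_uint8),
        ("offset", ctypes.c_uint64),
        ("kind", ctypes.c_uint8),
    ]


class Reordered(ctypes.Structure):
    # Widest member first, so the small members share one padded tail.
    _fields_ = [
        ("offset", ctypes.c_uint64),
        ("flag", ctypes.c_uint8),
        ("kind", ctypes.c_uint8),
    ]


padded_size = ctypes.sizeof(Padded)
reordered_size = ctypes.sizeof(Reordered)
```

On a typical 64-bit platform `padded_size` is larger than `reordered_size` by one 8-byte slot, the same step size as the savings listed above; the exact numbers depend on the platform ABI.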
2020-01-05 20:22:16 -08:00
Fangrui Song 806a2b1f3d [MC] Reorder MCFragment members to decrease padding
sizeof(MCFragment) does not change, but some of its subclasses do, e.g.
on a 64-bit platform,
sizeof(MCEncodedFragment) decreases from 64 to 56,
sizeof(MCDataFragment) decreases from 224 to 216.
2020-01-05 19:09:40 -08:00
Fangrui Song 2c053109fa [MC] Delete MCFragment::isDummy. NFC
isa<...>, dyn_cast<...> and cast<...> are used by other fragments.
Don't make MCDummyFragment special.
2020-01-05 18:49:47 -08:00
Fangrui Song 5511861e6d [MC][ARM] Delete MCSection::HasData and move SHF_ARM_PURECODE logic to ARMELFObjectWriter::addTargetSectionFlags
This simplifies the generic interface and also makes SHF_ARM_PURECODE
more robust (fixes a TODO). Inspecting MCDataFragment contents covers
more cases than MCObjectStreamer::EmitBytes.
2020-01-05 14:20:34 -08:00
Fangrui Song 586acd8490 [MC] Delete MCSection::{rbegin,rend} 2020-01-05 12:51:15 -08:00
Fangrui Song 124b918bd3 [MC] Merge MCSymbol::getSectionPtr into getSection and simplify 2020-01-05 12:03:40 -08:00
Florian Hahn b8a3c34eee Revert "[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC)."
This reverts commit 51ef53f3bd, as it
breaks some bots.
2020-01-04 18:44:38 +00:00
Florian Hahn 51ef53f3bd [SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC).
SCEVExpander modifies the underlying function so it is more suitable in
Transforms/Utils, rather than Analysis. This allows using other
transform utils in SCEVExpander.

Reviewers: sanjoy.google, efriedma, reames

Reviewed By: sanjoy.google

Differential Revision: https://reviews.llvm.org/D71537
2020-01-04 18:29:35 +00:00
Matt Arsenault 1f950ced50 GlobalISel: Define G_READCYCLECOUNTER 2020-01-04 13:10:19 -05:00
Daniel Sanders 5d304d68dd Revert "[gicombiner] Add GIMatchTree and use it for the code generation"
All the windows bots are failing match-tree.td and there's no obvious cause that
I can see. It's not just the %p formatting problem. My best guess is that
there's an ordering issue too but I'll need further information to figure that
out. Revert while I'm investigating.

This reverts commit 64f1bb5cd2 and 77d4b5f5fe
2020-01-03 18:17:00 -08:00
Francis Visoiu Mistrih c8ab40ca0e [Remarks] Warn if a remark file is not found when processing static archives
Static archives contain object files which contain sections pointing to
external remark files.

When static archives are shipped without the remark files, dsymutil
shouldn't generate an error.

Instead, generate a warning to inform the user that remarks for that
library won't be available in the .dSYM.
2020-01-03 17:02:10 -08:00
Daniel Sanders 64f1bb5cd2 [gicombiner] Add GIMatchTree and use it for the code generation
Summary:
GIMatchTree's job is to build a decision tree by zipping all the
GIMatchDag's together.

Each DAG is added to the tree builder as a leaf and partitioners are used
to subdivide each node until there are no more partitioners to apply. At
this point, the code generator is responsible for testing any untested
predicates and following any unvisited traversals (there shouldn't be any
of the latter as the getVRegDef partitioner handles them all).

Note that the leaves don't always fit into partitions cleanly and the
partitions may overlap as a result. This is resolved by cloning the leaf
into every partition it belongs to. One example of this is a rule that can
match one of N opcodes. The leaf for this rule would end up in N partitions
when processed by the opcode partitioner. A similar example is the
getVRegDef partitioner where having rules (add $a, $b), and (add ($a, $b), $c)
will result in the former being in both the partition for successfully
following the vreg-def and the partition for failing to do so, as it
doesn't care which happens.

Depends on D69151

Reviewers: bogner, volkan

Reviewed By: volkan

Subscribers: lkail, mgorny, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69152
2020-01-03 16:23:23 -08:00
Matt Arsenault 21309eafde GlobalISel: Add type argument to getRegBankFromRegClass
AMDGPU can't unambiguously go back from the selected instruction
register class to the register bank without knowing if this was used
in a boolean context.
2020-01-03 16:25:10 -05:00
Stefan Gränitz c7191d3acd [NFC][ORC] Fix typos and whitespaces in comments 2020-01-03 21:54:03 +01:00
Jay Foad 07bc851b21 [TargetLowering] Remove comments referring to TLOF
These have been obsolete since about r221926, when
TargetLoweringObjectFile was completely moved from TargetLowering to
TargetMachine.
2020-01-03 13:35:03 +00:00
Hideto Ueno 5fc02dc0a7 Revert "[Attributor] AAValueConstantRange: Value range analysis using constant range"
This reverts commit e996303431.
2020-01-03 11:03:56 +09:00
Reid Kleckner 783db78835 [PDB] Print the most redundant type record indices with /summary
Summary:
I used this information to motivate splitting up the Intrinsic::ID enum
(5d986953c8) and adding a key method to
clang::Sema (586f65d31f) which saved a
fair amount of object file size.

Example output for clang.pdb:

  Top 10 types responsible for the most TPI input bytes:
         index     total bytes   count     size
        0x3890:      8,671,220 = 1,805 *  4,804
       0xE13BE:      5,634,720 =   252 * 22,360
       0x6874C:      5,181,600 =   408 * 12,700
        0x2A1F:      4,520,528 = 1,574 *  2,872
       0x64BFF:      4,024,020 =   469 *  8,580
        0x1123:      4,012,020 = 2,157 *  1,860
        0x6952:      3,753,792 =   912 *  4,116
        0xC16F:      3,630,888 =   633 *  5,736
        0x69DD:      3,601,160 =   985 *  3,656
        0x678D:      3,577,904 =   319 * 11,216

In this case, we can see that record 0x3890 is responsible for ~8MB of
total object file size for objects in clang.
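
The ranking in this table can be thought of as grouping records by type
index and sorting by count times size. A rough Python sketch of that
arithmetic follows; the `rank_types` helper and its input layout are
assumptions made for illustration and do not reflect how the linker
collects the data:

```python
def rank_types(records, top=10):
    """Rank type indices by total bytes.

    `records` is an iterable of (type_index, size_in_bytes) pairs, one
    pair per object file in which the record appears.
    """
    totals = {}
    for index, size in records:
        count, _ = totals.get(index, (0, size))
        totals[index] = (count + 1, size)
    rows = [(index, count * size, count, size)
            for index, (count, size) in totals.items()]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:top]


# The first two rows of the table above, expanded into one entry per use.
records = [(0x3890, 4804)] * 1805 + [(0xE13BE, 22360)] * 252
for index, total, count, size in rank_types(records):
    print(f"{index:#x}: {total:,} = {count:,} * {size:,}")
```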

The user can then use llvm-pdbutil to find out what the record is:

  $ llvm-pdbutil dump -types -type-index 0x3890
                       Types (TPI Stream)
  ============================================================
    Showing 1 records.
       0x3890 | LF_FIELDLIST [size = 4804]
                - LF_STMEMBER [name = `WORDTYPE_MAX`, type = 0x1001, attrs = public]
                - LF_MEMBER [name = `U`, Type = 0x37F0, offset = 0, attrs = private]
                - LF_MEMBER [name = `BitWidth`, Type = 0x0075 (unsigned), offset = 8, attrs = private]
                - LF_METHOD [name = `APInt`, # overloads = 8, overload list = 0x3805]
  ...

In this case, we can see that these are members of the APInt class,
which is emitted in 1805 object files.

The next largest type is ASTContext:

  $ llvm-pdbutil dump -types -type-index 0xE13BE bin/clang.pdb
      0xE13BE | LF_FIELDLIST [size = 22360]
                - LF_BCLASS
                  type = 0x653EA, offset = 0, attrs = public
                - LF_MEMBER [name = `Types`, Type = 0x653EB, offset = 8, attrs = private]
                - LF_MEMBER [name = `ExtQualNodes`, Type = 0x653EC, offset = 24, attrs = private]
                - LF_MEMBER [name = `ComplexTypes`, Type = 0x653ED, offset = 48, attrs = private]
                - LF_MEMBER [name = `PointerTypes`, Type = 0x653EE, offset = 72, attrs = private]
  ...

ASTContext only appears 252 times, but the list of members is long, and
must be repeated everywhere it is used.

This was the output before I split Intrinsic::ID:

  Top 10 types responsible for the most TPI input:
        0x686C:     69,823,920 = 1,070 * 65,256
        0x686D:     69,819,640 = 1,070 * 65,252
        0x686E:     69,819,640 = 1,070 * 65,252
        0x686B:     16,371,000 = 1,070 * 15,300
        ...

These records were all lists of intrinsic enums.

Reviewers: MaskRay, ruiu

Subscribers: mgrang, zturner, thakis, hans, akhuang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71437
2020-01-02 16:10:36 -08:00
Saleem Abdulrasool abb0075306 build: reduce CMake handling for zlib
Rather than handling zlib manually, use `find_package` from CMake
to find zlib properly. Use this to normalize the `LLVM_ENABLE_ZLIB`,
`HAVE_ZLIB`, `HAVE_ZLIB_H`. Furthermore, require zlib if `LLVM_ENABLE_ZLIB` is
set to `YES`, which requires the distributor to explicitly select whether
zlib is enabled or not. This simplifies the CMake handling and usage in
the rest of the tooling.

This restores 68a235d07f,
e6c7ed6d21.  The problem with the windows
bot is a need for clearing the cache.
2020-01-02 11:19:12 -08:00
James Henderson bd402fc3f3 [DebugInfo][NFC] Use function_ref consistently in debug line parsing
This patch fixes an inconsistency where we were using std::function in
some places and function_ref in others to pass around the error handling
callback.

Reviewed by: MaskRay

Differential Revision: https://reviews.llvm.org/D71762
2020-01-02 18:01:54 +00:00
Alina Sbirlea a0d496d5b0 [NewPassManager] Rename AM to OuterAM in the OuterAnalysisManagerProxy [NFCI].
Provides clarity and consistency with the InnerAnalysisManagerProxy.
2020-01-02 09:42:53 -08:00
James Henderson e406cca5f9 Revert "build: reduce CMake handling for zlib"
This reverts commit 68a235d07f.

This commit broke the clang-x64-windows-msvc build bot and a follow-up
commit did not fix it. Reverting to fix the bot.
2020-01-02 16:02:10 +00:00
Ulrich Weigand 63336795f0 [FPEnv] Default NoFPExcept SDNodeFlag to false
The NoFPExcept bit in SDNodeFlags currently defaults to true, unlike all
other such flags. This is a problem, because it implies that all code that
transforms SDNodes without copying flags can introduce a correctness bug,
not just a missed optimization.

This patch changes the default to false. This makes it necessary to move
setting the (No)FPExcept flag for constrained intrinsics from the
visitConstrainedIntrinsic routine to the generic visit routine at the
place where the other flags are set, or else the intersectFlagsWith
call would erase the NoFPExcept flag again.

In order to avoid making non-strict FP code worse, whenever
SelectionDAGISel::SelectCodeCommon matches on a set of original nodes
none of which can raise FP exceptions, it will preserve this property
on all result nodes generated, by setting the NoFPExcept flag on
those result nodes that would otherwise be considered as raising
an FP exception.

To check whether or not an SD node should be considered as raising
an FP exception, the following logic applies:

- For machine nodes, check the mayRaiseFPException property of
  the underlying MI instruction
- For regular nodes, check isStrictFPOpcode
- For target nodes, check a newly introduced isTargetStrictFPOpcode

The latter is implemented by reserving a range of target opcodes,
similarly to how memory opcodes are identified. (Note that there is a
bit of a quirk in identifying target nodes that are both memory nodes
and strict FP nodes. To simplify the logic, right now all target memory
nodes are automatically also considered strict FP nodes -- this could
be fixed by adding one more range.)
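
The three-way check and the reserved opcode range can be modelled roughly
as below. This is a Python sketch under stated assumptions: the
`NodeKind` enum, the function names and the numeric opcode boundaries are
invented for illustration, and only the overall shape follows the
description above:

```python
from enum import Enum


class NodeKind(Enum):
    MACHINE = 1  # already-selected machine instruction
    REGULAR = 2  # target-independent opcode
    TARGET = 3   # target-specific opcode


# Hypothetical layout: strict FP target opcodes start at one boundary and
# target memory opcodes start at a later one. The numbers are made up.
FIRST_TARGET_STRICTFP_OPCODE = 1000
FIRST_TARGET_MEMORY_OPCODE = 2000


def is_target_strict_fp_opcode(opcode):
    # Everything at or above the strict FP boundary counts, so target
    # memory nodes are also treated as strict FP nodes (the quirk above).
    return opcode >= FIRST_TARGET_STRICTFP_OPCODE


def may_raise_fp_exception(kind, opcode, mi_may_raise=False,
                           is_strict_fp=False):
    if kind is NodeKind.MACHINE:
        return mi_may_raise      # property of the underlying instruction
    if kind is NodeKind.REGULAR:
        return is_strict_fp      # result of the isStrictFPOpcode check
    return is_target_strict_fp_opcode(opcode)
```

Adding "one more range", as the note suggests, would correspond to an
upper bound on the strict FP range in `is_target_strict_fp_opcode`.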

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D71841
2020-01-02 16:59:45 +01:00
serge_sans_paille 24ab9b537e Generalize the pass registration mechanism used by Polly to any third-party tool
There are quite a lot of references to Polly in the LLVM CMake codebase. However,
the registration pattern used by Polly could be useful to other external
projects: thanks to that mechanism it would be possible to develop LLVM
extensions without touching the LLVM code base.

This patch has two effects:

1. Remove all code specific to Polly in the llvm/clang codebase, replacing it
   with a generic mechanism

2. Provide a generic mechanism to register compiler extensions.

A compiler extension is similar to a pass plugin, with the notable difference
that the compiler extension can be configured to be built dynamically (like
plugins) or statically (like regular passes).

As a result, people willing to add extra passes to clang/opt can do it using a
separate code repo, but still have their pass be linked in clang/opt as built-in
passes.

Differential Revision: https://reviews.llvm.org/D61446
2020-01-02 16:45:31 +01:00
Andrzej Warzynski 404da13e1e [AArch64][SVE] Gather loads: pass 32 bit unpacked offsets as nxv2i32
Summary:
Currently 32 bit unpacked offsets are passed as nxv2i64. However, as
pointed out in https://reviews.llvm.org/D71074, using nxv2i32 instead
would improve consistency with:
  * how other arguments are treated
  * how scatter stores are implemented
This patch makes sure that 32 bit unpacked offsets are passed as nxv2i32
instead of nxv2i64.

Reviewers: sdesmalen, efriedma

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71724
2020-01-02 13:01:28 +00:00
Brian Gesiak 2fcf7691df [Coroutines] Rename "legacy" passes (NFC)
A series of patches beginning with https://reviews.llvm.org/D71898
propose to add an implementation of the coroutine passes to the new pass
manager. As part of these changes, the coroutine passes that implement
the legacy pass manager interface are renamed, to `<PassName>Legacy`.
This mirrors similar changes that have been made to many other passes in
LLVM as they've been transitioned to support both old and new pass
managers.

This commit splits out the renaming portion of that patch and commits it
in advance as an NFC (no functional change intended) commit. It renames:

* `CoroEarly` => `CoroEarlyLegacy`
* `CoroSplit` => `CoroSplitLegacy`
* `CoroElide` => `CoroElideLegacy`
* `CoroCleanup` => `CoroCleanupLegacy`
2020-01-01 21:41:16 -05:00
Saleem Abdulrasool 68a235d07f build: reduce CMake handling for zlib
Rather than handling zlib manually, use `find_package` from CMake
to find zlib properly. Use this to normalize the `LLVM_ENABLE_ZLIB`,
`HAVE_ZLIB`, `HAVE_ZLIB_H`. Furthermore, require zlib if `LLVM_ENABLE_ZLIB` is
set to `YES`, which requires the distributor to explicitly select whether
zlib is enabled or not. This simplifies the CMake handling and usage in
the rest of the tooling.
2020-01-01 16:36:59 -08:00
Lorenzo Casalino f9f78cf6ac [MachineScheduler] improve reuse of 'releaseNode' method
The 'SchedBoundary::releaseNode' is merely invoked for releasing the Top/Bottom root nodes.
However, 'SchedBoundary::releasePending' uses the same logic to check if the Pending queue
has any releasable SUnit.
It is possible to slightly modify the body of the two, allowing re-use of the former ('releaseNode')
in the latter.
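
The idea of reusing one routine from the other can be sketched as follows.
This is a toy Python model, not the LLVM code: the `Boundary` class, the
`in_pending` parameter and the readiness test (a plain cycle comparison)
are simplifying assumptions:

```python
class Boundary:
    """Toy stand-in for a scheduling boundary."""

    def __init__(self, curr_cycle=0):
        self.curr_cycle = curr_cycle
        self.available = []
        self.pending = []  # list of (unit, ready_cycle) pairs

    def release_node(self, unit, ready_cycle, in_pending=False):
        """Move `unit` to `available` if it is ready, else keep it pending."""
        if ready_cycle <= self.curr_cycle:
            self.available.append(unit)
            if in_pending:
                self.pending.remove((unit, ready_cycle))
            return True
        if not in_pending:
            self.pending.append((unit, ready_cycle))
        return False

    def release_pending(self):
        # Reuse release_node for every pending unit. Iterate over a copy
        # because release_node may shrink `pending`.
        for unit, ready_cycle in list(self.pending):
            self.release_node(unit, ready_cycle, in_pending=True)


boundary = Boundary(curr_cycle=0)
boundary.release_node("root", 0)   # ready immediately
boundary.release_node("load", 2)   # not ready yet, goes to pending
boundary.curr_cycle = 2
boundary.release_pending()         # "load" is now released
```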

Patch by Lorenzo Casalino <lorenzo.casalino93@gmail.com>

Reviewers: MatzeB, fhahn, atrick

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D65506
2020-01-01 20:22:32 +00:00
Mark de Wever 8dc7b982b4 [NFC] Fixes -Wrange-loop-analysis warnings
This avoids new warnings now that D68912 adds -Wrange-loop-analysis to -Wall.

Differential Revision: https://reviews.llvm.org/D71857
2020-01-01 20:01:37 +01:00
Fangrui Song d2bb8c16e7 [MC][TargetMachine] Delete MCTargetOptions::MCPIECopyRelocations
clang/lib/CodeGen/CodeGenModule performs the -mpie-copy-relocations
check and sets dso_local on applicable global variables. We don't need
to duplicate the work in TargetMachine shouldAssumeDSOLocal.

Verified that -mpie-copy-relocations can still emit PC relative
relocations for external variable accesses.

clang -target x86_64 -fpie -mpie-copy-relocations -c => R_X86_64_PC32
clang -target aarch64 -fpie -mpie-copy-relocations -c => R_AARCH64_ADR_PREL_PG_HI21+R_AARCH64_LDST64_ABS_LO12_NC
2020-01-01 00:50:18 -08:00
Hideto Ueno e996303431 [Attributor] AAValueConstantRange: Value range analysis using constant range
This patch introduces `AAValueConstantRange`, which answers with a possible range for an integer value at a specific program point.
One of the motivations is propagating existing `range` metadata. (I think we need to change the situation that `range` metadata cannot be put to Argument).

The state is a tuple of `ConstantRange` and it is initialized to (known, assumed) = ([-∞, +∞], empty).
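
A rough Python model of such a (known, assumed) state is shown below. It
uses closed intervals instead of the wrapping, half-open `ConstantRange`,
and the class and method names are hypothetical; it is only meant to show
how the assumed range grows from empty while staying inside the known
range:

```python
FULL = (float("-inf"), float("inf"))
EMPTY = None  # no value has been observed yet


def union(a, b):
    if a is EMPTY:
        return b
    if b is EMPTY:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))


def intersect(a, b):
    if a is EMPTY or b is EMPTY:
        return EMPTY
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else EMPTY


class RangeState:
    def __init__(self):
        self.known = FULL     # pessimistic: anything is possible
        self.assumed = EMPTY  # optimistic: nothing seen so far

    def union_assumed(self, r):
        # Widen the optimistic range, but never beyond what is known.
        self.assumed = intersect(union(self.assumed, r), self.known)

    def intersect_known(self, r):
        # Tighten the known range, e.g. from `range` metadata.
        self.known = intersect(self.known, r)
        self.assumed = intersect(self.assumed, self.known)


state = RangeState()
state.intersect_known((0, 255))
state.union_assumed((0, 10))
state.union_assumed((20, 300))
```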

Currently, AAValueConstantRange is created when AAValueSimplify cannot
simplify the value.

Supported
 - BinaryOperator(add, sub, ...)
 - CmpInst(icmp eq, ...)
 - !range metadata

`AAValueConstantRange` is not intended to extend to polyhedral range value analysis.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D71620
2020-01-01 15:35:56 +09:00
Johannes Doerfert df3b56c905 [Attributor][Fix] Avoid leaking memory after D68765 2019-12-31 10:55:07 -06:00
Johannes Doerfert 751336340d [Attributor] Function signature rewrite infrastructure
As part of the Attributor manifest we want to change the signature of
functions. This patch introduces a fairly generic interface to do so.
As a first, very simple, use case, we remove unused arguments. A second
use case, pointer privatization, will be committed with this patch as
well.

A lot of the code and ideas are taken from argument promotion and we
run all argument promotion tests through this framework as well.

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D68765
2019-12-31 02:31:33 -06:00
Johannes Doerfert b1b441d22d [Attributor] Use abstract call sites to determine associated arguments
This is the second step after D67871 to make use of abstract call sites.
In this patch the argument we associate with an abstract call site
argument can be the one in the callback callee instead of the one in the
callback broker.

Caveat: We cannot allow no-alias arguments for problematic callbacks:
As described in [1], adding no-alias (or restrict) to arguments could
break synchronization as the synchronization effect, e.g., a barrier,
does not "alias" with the pointer anymore. This disables no-alias
annotation for potentially problematic arguments until we implement the
fix described in [1].

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D68008

[1] Compiler Optimizations for OpenMP, J. Doerfert and H. Finkel,
    International Workshop on OpenMP 2018,
    http://compilers.cs.uni-saarland.de/people/doerfert/par_opt18.pdf
2019-12-31 01:33:22 -06:00
Craig Topper 787e078f3e [TargetLowering][AMDGPU] Make scalarizeVectorLoad return a pair of SDValues instead of creating a MERGE_VALUES node. NFCI
This allows us to clean up some places that were peeking through
the MERGE_VALUES node after the call. By returning the SDValues
directly, we can clean that up.

Unfortunately, there are several call sites in AMDGPU that wanted
the MERGE_VALUES and now need to create their own.
2019-12-30 19:36:04 -08:00
Craig Topper 831898ff8a [SelectionDAG] Fix copy/paste mistake in comment. NFC
I think this was copied from scalarizeVectorLoad where that is
what happens.
2019-12-30 19:36:04 -08:00
Johannes Doerfert 10fedd94b4 [OpenMP] Use the OpenMPIRBuilder for `omp parallel`
This allows the OpenMPIRBuilder to be used for parallel regions. Code was
extracted from D61953 and adapted to work with the new version (D70109).

All but one feature should be supported. An update of this patch will
provide test coverage and privatization other than shared.

Reviewed By: fghanim

Differential Revision: https://reviews.llvm.org/D70290
2019-12-30 13:57:13 -06:00
Johannes Doerfert 000c6a5038 [OpenMP] Use the OpenMPIRBuilder for `omp cancel`
An `omp cancel parallel` needs to be emitted by the OpenMPIRBuilder if
the `parallel` was emitted by the OpenMPIRBuilder. This patch makes
this possible. The cancel logic is shared with the cancel barriers.
Testing is done via unit tests and the clang cancel_codegen.cpp file
once D70290 lands.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D71948
2019-12-30 13:57:13 -06:00
Eric Astor 4a7aa252a3 [X86][AsmParser] re-introduce 'offset' operator
Summary:
Amend MS offset operator implementation, to more closely fit with its MS counterpart:

    1. InlineAsm: evaluate non-local source entities to their (address) location
    2. Provide a means by which one may acquire the address of an assembly label via MS syntax, rather than yielding a memory reference (i.e. "offset asm_label" and "$asm_label" should be synonymous)
    3. address PR32530

Based on http://llvm.org/D37461

Fix broken test where the break appears unrelated.

- Set up appropriate memory-input rewrites for variable references.

- Intel-dialect assembly printing now correctly handles addresses by adding "offset".

- Pass offsets as immediate operands (using "r" constraint for offsets of locals).

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D71436
2019-12-30 14:35:26 -05:00
Fangrui Song 03b9f0a5e1 Ignore "no-frame-pointer-elim" and "no-frame-pointer-elim-non-leaf" in favor of "frame-pointer"
D56351 (included in LLVM 8.0.0) introduced "frame-pointer".  All tests
which use "no-frame-pointer-elim" or "no-frame-pointer-elim-non-leaf"
have been migrated to use "frame-pointer".

Implement UpgradeFramePointerAttributes to upgrade the two obsoleted
function attributes for bitcode. Their semantics are ignored.
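
One way such an upgrade could look, modelled in Python on a plain
dictionary of string attributes. The mapping onto the "frame-pointer"
values "all", "non-leaf" and "none" is an assumption about the upgrade
rule and is not spelled out in this message; the function name is
hypothetical as well:

```python
def upgrade_frame_pointer_attributes(attrs):
    """Return a copy of `attrs` with the two obsolete keys folded into
    "frame-pointer" (assumed mapping, for illustration only)."""
    upgraded = dict(attrs)
    frame_pointer = None
    if "no-frame-pointer-elim" in upgraded:
        value = upgraded.pop("no-frame-pointer-elim")
        frame_pointer = "all" if value == "true" else "none"
    if "no-frame-pointer-elim-non-leaf" in upgraded:
        upgraded.pop("no-frame-pointer-elim-non-leaf")
        # An explicit "all" from the first attribute takes priority.
        if frame_pointer != "all":
            frame_pointer = "non-leaf"
    if frame_pointer is not None:
        upgraded["frame-pointer"] = frame_pointer
    return upgraded


print(upgrade_frame_pointer_attributes({"no-frame-pointer-elim": "true"}))
```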

Differential Revision: https://reviews.llvm.org/D71863
2019-12-30 09:46:19 -08:00
Petar Avramovic 98f72a5107 [MIPS GlobalISel] Select bitreverse. Recommit
G_BITREVERSE is generated from llvm.bitreverse.<type> intrinsics,
clang generates these intrinsics from __builtin_bitreverse32 and
__builtin_bitreverse64.
Add lower and narrowscalar for G_BITREVERSE.
Lower G_BITREVERSE on MIPS32.
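
For readers unfamiliar with how a bit reversal can be lowered to simpler
operations, here is one common scheme sketched in Python: swap the bytes,
then swap nibbles, bit pairs and single bits with masks. Whether the patch
uses exactly this sequence is an assumption; the helper names are
hypothetical:

```python
MASK32 = 0xFFFFFFFF


def bswap32(x):
    # Byte swap via a little-endian/big-endian round trip.
    return int.from_bytes(x.to_bytes(4, "little"), "big")


def swap_groups(x, width, high_mask):
    # Swap adjacent groups of `width` bits; `high_mask` selects the upper
    # group of every pair.
    return (((x & high_mask) >> width) | ((x << width) & high_mask)) & MASK32


def bitreverse32(x):
    x = bswap32(x & MASK32)
    x = swap_groups(x, 4, 0xF0F0F0F0)  # nibbles within each byte
    x = swap_groups(x, 2, 0xCCCCCCCC)  # bit pairs within each nibble
    x = swap_groups(x, 1, 0xAAAAAAAA)  # bits within each pair
    return x


print(hex(bitreverse32(0x00000001)))
```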

Recommit notes:
Introduce temporary variables in order to make sure
instructions get inserted into MachineFunction in same order
regardless of compiler used to build llvm.

Differential Revision: https://reviews.llvm.org/D71363
2019-12-30 18:06:29 +01:00
Dmitri Gribenko 32cc14100e Revert "[MIPS GlobalISel] Select bitreverse"
This reverts commit dbc136e0fe.
It broke buildbots:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/21066
2019-12-30 14:29:47 +01:00
Petar Avramovic dbc136e0fe [MIPS GlobalISel] Select bitreverse
G_BITREVERSE is generated from llvm.bitreverse.<type> intrinsics,
clang generates these intrinsics from __builtin_bitreverse32 and
__builtin_bitreverse64.
Add lower and narrowscalar for G_BITREVERSE.
Lower G_BITREVERSE on MIPS32.

Differential Revision: https://reviews.llvm.org/D71363
2019-12-30 11:26:45 +01:00
Petar Avramovic 94a24e7a40 [MIPS GlobalISel] Select bswap
G_BSWAP is generated from llvm.bswap.<type> intrinsics, clang generates
these intrinsics from __builtin_bswap32 and __builtin_bswap64.
Add lower and narrowscalar for G_BSWAP.
Lower G_BSWAP on MIPS32, select G_BSWAP on MIPS32 revision 2 and later.
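
As a sketch of what "lower" and "narrowscalar" can mean for a byte swap,
the Python below expands a 32-bit swap into shifts and masks and builds a
64-bit swap from two 32-bit halves. The function names are hypothetical
and the exact instruction sequence chosen by the patch is not shown here:

```python
def bswap32_lowered(x):
    # Move each byte to its mirrored position using shifts and masks.
    return (((x & 0x000000FF) << 24) |
            ((x & 0x0000FF00) << 8) |
            ((x & 0x00FF0000) >> 8) |
            ((x & 0xFF000000) >> 24))


def bswap64_narrowed(x):
    # Split into 32-bit halves, swap each half, then exchange the halves.
    lo = x & 0xFFFFFFFF
    hi = (x >> 32) & 0xFFFFFFFF
    return (bswap32_lowered(lo) << 32) | bswap32_lowered(hi)


print(hex(bswap32_lowered(0x11223344)))
print(hex(bswap64_narrowed(0x1122334455667788)))
```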

Differential Revision: https://reviews.llvm.org/D71362
2019-12-30 11:13:22 +01:00
Hideto Ueno 34fe8d0451 [Attributor] Use `changeUseAfterManifest` in AAValueSimplify manifest
Summary: This patch makes `AAValueSimplify` use `changeUsesAfterManifest` in `manifest`. This will invoke simple folding after the manifest.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71972
2019-12-30 17:08:48 +09:00
Fangrui Song 5edb40c022 [SelectionDAG] Disallow indirect "i" constraint
This allows us to delete InlineAsm::Constraint_i workarounds in
SelectionDAGISel::SelectInlineAsmMemoryOperand overrides and
TargetLowering::getInlineAsmMemConstraint overrides.

They were introduced to X86 in r237517 to prevent crashes for
constraints like "=*imr". They were later copied to other targets.
2019-12-29 16:50:42 -08:00
Hideto Ueno ef4febd85b [Attributor] AAUndefinedBehavior: Check for branches on undef value.
A branch is considered UB if it depends on an undefined / uninitialized value.
At this point this handles simple UB branches in the form: `br i1 undef, ...`
We query `AAValueSimplify` to get a value for the branch condition, so the branch
can be more complicated than just: `br i1 undef, ...`.

Patch By: Stefanos Baziotis (@baziotis)

Reviewers: jdoerfert, sstefan1, uenoku

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D71799
2019-12-29 17:43:00 +09:00
Brian Gesiak 0bc7665d98 [ADT] Fix FoldingSet documentation typos
* "If found then M with be non-NULL" should be "will be non-NULL".
* The documentation examples (1) and (2) declare and use a variable
  `MyNode *M`, but examples (3) and (4) switch midway to using a
  variable named `N`. Unify the examples to all use `M`.
* The examples demonstrate the use of member functions of
  `FoldingSet`, but (3) and (4) invoke these as if they were free
  functions. Modify them to call member functions on the `MyFoldingSet`
  object constructed in the code above example (1).
2019-12-27 21:27:59 -05:00
Fangrui Song f7910496c8 [Intrinsic] Delete tablegen rules of llvm.{sig,}{setjmp,longjmp} 2019-12-27 18:04:39 -08:00
Matt Arsenault e29ae3799b TII: Fix using Register for a subregister index argument 2019-12-27 16:53:29 -05:00
Danilo Carvalho Grael 2abda66848 [NFC][DA] Remove duplicate code in checkSrcSubscript and checkDstSubscript
Summary:
[DA] Move common code in checkSrcSubscript and checkDstSubscript to a
new function checkSubscript. This avoids duplicate code and the risk of
the two getting out of sync in the future.

Reviewers: sebpop, jmolloy, reames

Reviewed By: sebpop

Subscribers: bmahjour, hiraditya, llvm-commits, amehsan

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71087

Patch by zhongduo.
2019-12-27 10:06:19 -05:00
Fangrui Song 7a7334663c Delete llvm.{sig,}{setjmp,longjmp} remnant after r136821
Intrinsic has incorrect argument type!
  i32 (i32*)* @llvm.setjmp

*wipes tear*
2019-12-27 00:00:14 -08:00
Hideto Ueno cb5eb13eaf [Attributor] Add helper to change an instruction to `unreachable` inst
Summary: Calling `changeToUnreachable` in `manifest` from different places might cause really unpredictable problems. As other deleting functions are doing, we need to change these instructions after all `manifest`.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71910
2019-12-27 02:39:37 +09:00
Johannes Doerfert 6c5d1f40ff [OpenMP][NFCI] Use the libFrontend ProcBindKind in Clang
This removes the OpenMPProcBindClauseKind enum in favor of
llvm::omp::ProcBindKind which lives in OpenMPConstants.h and was
introduced in D70109.

No change in behavior is expected.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D70289
2019-12-26 11:04:07 -06:00
Johannes Doerfert e4add9727b [OpenMP][IR-Builder] Introduce "pragma omp parallel" code generation
This patch combines the `emitParallel` logic prototyped in D61953 with
the OpenMPIRBuilder (D69785) and introduces `CreateParallel`.

Reviewed By: fghanim

Differential Revision: https://reviews.llvm.org/D70109
2019-12-25 18:02:23 -06:00
Johannes Doerfert f9c3c5da19 [OpenMP][IR-Builder] Introduce the finalization stack
As a permanent and generic solution to the problem of variable
finalization (destructors, lastprivate, ...), this patch introduces the
finalization stack. The objects on the stack describe (1) the
(structured) regions the OpenMP-IR-Builder is currently constructing,
(2) if these are cancellable, and (3) the callback that will perform the
finalization (=cleanup) when necessary.

As the finalization can be necessary multiple times, at different source
locations, the callback takes the position at which code is currently
generated. This position will also encode the destination of the "region
exit" block *iff* the finalization call was issued for a region
generated by the OpenMPIRBuilder. For regions generated through the old
Clang OpenMP code generation, the "region exit" is determined by Clang
inside the finalization call instead (see getOMPCancelDestination).
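
The shape of such a stack can be sketched as follows. This is a Python
model under stated assumptions: `FinalizationInfo`, `ToyBuilder` and the
string used as a "position" are invented for illustration and do not
mirror the OpenMPIRBuilder API:

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FinalizationInfo:
    region_kind: str                  # (1) the region being constructed
    is_cancellable: bool              # (2) whether it can be cancelled
    finalize: Callable[[str], None]   # (3) cleanup callback, takes a position


class ToyBuilder:
    def __init__(self):
        self.stack: List[FinalizationInfo] = []

    def push_finalization(self, info):
        self.stack.append(info)

    def pop_finalization(self):
        return self.stack.pop()

    def emit_cancel(self, position):
        # Finalization may be needed several times, each time at the
        # position where code is currently being generated.
        info = self.stack[-1]
        if not info.is_cancellable:
            raise ValueError("innermost region is not cancellable")
        info.finalize(position)


calls = []
builder = ToyBuilder()
builder.push_finalization(FinalizationInfo("parallel", True, calls.append))
builder.emit_cancel("cancel at A")
builder.emit_cancel("cancel barrier at B")
builder.pop_finalization()
```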

As a first user, the parallel + cancel barrier interaction is changed.
In contrast to the temporary solution before, the barrier generation in
Clang does not need to be aware of the "CancelDestination" block.
Instead, the finalization callback is and, as described above, later
even that one does not need to be.

D70109 will be updated to use this scheme.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D70258
2019-12-25 16:57:08 -06:00
Johannes Doerfert 58f324a468 [Attributor] Function level undefined behavior attribute
_Eventually_, this attribute will be assigned to a function if it
contains undefined behavior. As a first small step, I tried to make it
loop through the load instructions in a function (eventually, the plan
is to check if a load instruction causes undefined behavior, because
it e.g. dereferences a null pointer - Also eventually, this won't happen in
initialize() but in updateImpl()).

Patch By: Stefanos Baziotis (@baziotis)

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D71435
2019-12-24 19:23:08 -06:00
Matt Arsenault 1aa763a4a0 GlobalISel: Define equivalent node for G_INTRINSIC_ROUND 2019-12-24 10:36:54 -05:00
Matt Arsenault e351256c0d GlobalISel: Define equivalent node for G_INTRINSIC_TRUNC 2019-12-24 09:53:01 -05:00
Russell Gallop 2e9bfa12ff Revert "[Support] Extend TimeProfiler to support multiple threads"
and "[Support] Try to fix bot failure after 8ddcd1dc26"

This reverts commits f70f180148 and 8ddcd1dc26 as this was breaking the
MacOS build, which doesn't support thread_local.
2019-12-24 11:31:48 +00:00
Fangrui Song e0d855b399 [SelectionDAG] Change SelectionDAGISel::{funcInfo,SDB} to use unique_ptr
CurDAG is referenced more than 2000 times and used in many generated .cpp
files. Don't touch it for now.
2019-12-23 22:41:05 -08:00
Shengchen Kan 70fa4c4f88 [NFC] Style cleanups
1. Remove duplicate function or class name at the beginning of the
comment.
2. Use auto where the type is already obvious from the context.
2019-12-23 17:02:36 +08:00
Simon Pilgrim 3654ed21ee Fix case style warnings in DIBuilder. NFC. 2019-12-23 07:27:18 +00:00
Yonghong Song e3d8ee35e4 reland "[DebugInfo] Support to emit debugInfo for extern variables"
Commit d77ae1552f
("[DebugInfo] Support to emit debugInfo for extern variables")
added debugInfo for extern variables for BPF target.
The commit was reverted by 891e25b02d
because the committed tests used %clang instead of %clang_cc1, causing
test failures in certain scenarios, as reported by Reid Kleckner.

This patch fixed the tests by using %clang_cc1.

Differential Revision: https://reviews.llvm.org/D71818
2019-12-22 18:28:50 -08:00
Shengchen Kan fb53396c49 [NFC] Remove unnecessary blank and rename align-branch-64-5b.s to align-branch-64-6a.s 2019-12-23 10:22:02 +08:00
Reid Kleckner 891e25b02d Revert "[DebugInfo] Support to emit debugInfo for extern variables"
This reverts commit d77ae1552f.

The tests committed along with this change do not pass, and should be
changed to use %clang_cc1.
2019-12-22 12:54:06 -08:00
Eric Astor dc5b614fa9 [ms] [X86] Use "P" modifier on operands to call instructions in inline X86 assembly.
Summary:
This is documented as the appropriate template modifier for call operands.
Fixes PR44272, and adds a regression test.

Also adds support for operand modifiers in Intel-style inline assembly.

Reviewers: rnk

Reviewed By: rnk

Subscribers: merge_guards_bot, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71677
2019-12-22 09:16:34 -05:00
Matt Arsenault 4af6866708 AMDGPU: Fix repeated word in comment 2019-12-21 04:57:35 -05:00
Lang Hames 9f4f237e29 [ORC] De-register eh-frames in the RTDyldObjectLinkingLayer destructor.
This matches the behavior of the legacy layer, which automatically deregistered
frames.
2019-12-20 21:10:49 -08:00
Yury Delendik adf7a0a558 [WebAssembly] Use TargetIndex operands in DbgValue to track WebAssembly operands locations
Extends DWARF expression language to express locals/globals locations. (via
target-index operands atm) (possible variants are: non-virtual registers
or address spaces)

The WebAssemblyExplicitLocals can replace virtual registers with the target-index
operand type at the time when WebAssembly backend introduces
{get,set,tee}_local instead of corresponding virtual registers.

Reviewed By: aprantl, dschuff

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D52634
2019-12-20 14:39:05 -08:00
Sam Clegg 538b485c59 Fix name of InitLibcalls() function in comment
Differential Revision: https://reviews.llvm.org/D71781
2019-12-20 14:35:05 -08:00
Adrian Prantl 44b4b833ad Rename DW_AT_LLVM_isysroot to DW_AT_LLVM_sysroot
This is a purely cosmetic change that is NFC in terms of the binary
output. It bugs me that I called the attribute DW_AT_LLVM_isysroot
since the "i" is an artifact of GCC command line option syntax
(-isysroot is in the category of -i options) and doesn't carry any
useful information otherwise.

This attribute only appears in Clang module debug info.

Differential Revision: https://reviews.llvm.org/D71722
2019-12-20 13:11:17 -08:00
Philip Reames 8b725f0459 Comment and adjust style in the newly introduced MCBoundaryAlignFragment infrastructure. More to follow. 2019-12-20 12:04:07 -08:00
Philip Reames 14fc20ca62 Align branches within 32-Byte boundary (NOP padding)
WARNING: If you're looking at this patch because you're looking for a full
performance mitigation of the Intel JCC Erratum, this is not it!

This is a preliminary patch on the path towards mitigating the performance
regressions caused by Intel's microcode update for Jump Conditional Code
Erratum.  For context, see:
https://www.intel.com/content/www/us/en/support/articles/000055650.html

The patch adds the required assembler infrastructure and command line options
needed to exercise the logic for INTERNAL TESTING.  These are NOT public flags,
and should not be used for anything other than LLVM's own testing/debugging
purposes.  They are likely to change both in spelling and meaning.

WARNING: This patch is knowingly incorrect in some corner cases.  We need, and
do not yet provide, a mechanism to selectively enable/disable the padding.
Conversation on this will continue in parallel with work on extending this
infrastructure to support prefix padding.

The goal here is to have the assembler align specific instructions such that
they neither cross nor end at a 32 byte boundary.  The impacted instructions are:
a. Conditional jump.
b. Fused conditional jump.
c. Unconditional jump.
d. Indirect jump.
e. Ret.
f. Call.
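
The alignment condition can be checked with a little arithmetic. The
Python sketch below computes how many bytes of padding would move a single
instruction so that it neither crosses nor ends at a boundary; the helper
names are hypothetical, and a fused pair would be treated as one unit
whose size is the sum of both instructions:

```python
def needs_padding(offset, size, boundary=32):
    end = offset + size  # one past the last byte of the instruction
    crosses = offset // boundary != (end - 1) // boundary
    ends_at_boundary = end % boundary == 0
    return crosses or ends_at_boundary


def nop_padding(offset, size, boundary=32):
    """Bytes of NOP to insert before an instruction at `offset`."""
    if not 0 < size < boundary:
        raise ValueError("size must be between 1 and boundary - 1")
    if not needs_padding(offset, size, boundary):
        return 0
    # Push the instruction to the start of the next boundary-sized window.
    return boundary - offset % boundary


print(nop_padding(30, 4))  # instruction would cross the 32 byte boundary
print(nop_padding(28, 4))  # instruction would end exactly at the boundary
print(nop_padding(10, 6))  # already fine
```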

The new options for llvm-mc are:
    -x86-align-branch-boundary=NUM aligns branches within NUM byte boundary.
    -x86-align-branch=TYPE[+TYPE...] specifies types of branches to align.

A new MCFragment type, MCBoundaryAlignFragment, is added, which may emit
NOP to align the fused/unfused branch.

alignBranchesBegin inserts MCBoundaryAlignFragment before instructions,
alignBranchesEnd marks the end of the branch to be aligned,
relaxBoundaryAlign grows or shrinks sizes of NOP to align the target branch.

Nop padding is disabled when the instruction may be rewritten by the linker,
such as TLS Call.

Process Note: I am landing a patch by skan as it has been LGTMed, and
continuing to iterate on the review is simply slowing us down at this point.
We can and will continue to iterate in tree.

Patch By: skan
Differential Revision: https://reviews.llvm.org/D70157
2019-12-20 11:35:50 -08:00
Danilo Carvalho Grael 15bfd2cd54 [AArch64][SVE] Replace integer immediate intrinsics with splat vector variant
Summary: Replace the integer immediate intrinsics with splat vector variants so they can be applied as optimizations for the C/C++ intrinsics.

Reviewers: sdesmalen, huntergr, rengolin, efriedma, c-rhodes, mgudim, kmclaughlin

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits, amehsan

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71614
2019-12-20 13:52:19 -05:00
Paul Walker 6cba90dc4d [AArch64][SVE] Correct intrinsics and patterns for logical predicate instructions
In general SVE intrinsics are considered predicated and merging
with everything else having suitable decoration.  For predicated
zeroing operations (like the predicate logical instructions) we
use the "_z" suffix.  After this change all intrinsics use their
expected names (i.e. orr instead of or and eor instead of xor).

I've removed intrinsics and patterns for condition code setting
instructions as that data is not returned as part of the intrinsic.
The expectation is to ask for a cc flag explicitly.

For example:
  a = and_z(pg, p1, p2)
  cc = ptest_<flag>(pg, a)

With the code generator expected to use "s" variants of instructions
when available.

Differential Revision: https://reviews.llvm.org/D71715
2019-12-20 14:22:27 +00:00
Andrzej Warzynski be2b7ea89a [AArch64][SVE] Add intrinsics for saturating scalar arithmetic
Summary:
The following intrinsics are added:
  * @llvm.aarch64.sve.sqdec{b|h|w|d|p}
  * @llvm.aarch64.sve.sqinc{b|h|w|d|p}
  * @llvm.aarch64.sve.uqdec{b|h|w|d|p}
  * @llvm.aarch64.sve.uqinc{b|h|w|d|p}

For every intrinsic there are scalar variants (with n32 or n64 suffix) and
vector variants (no suffix).

Reviewers: sdesmalen, rengolin, efriedma

Reviewed By: sdesmalen, efriedma

Subscribers: eli.friedman, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71252
2019-12-20 11:09:40 +00:00
Cullen Rhodes 3f9005eb89 Recommit "[AArch64][SVE] Add permutation and selection intrinsics"
Recommit 23c28c4043 (reverted in
dcb48f50bd) with a fix for an assert
"Request for a fixed size on a scalable object" being triggered in
`LowerSVEIntrinsicEXT`. The fix is to call `getKnownMinSize` on the
TypeSize object.
2019-12-20 10:45:17 +00:00
Andrzej Warzynski 88a973cf68 [AArch64][SVE] Add intrinsics for binary narrowing operations
Summary:
The following intrinsics for binary narrowing shift right operations are
added:
  * @llvm.aarch64.sve.shrnb
  * @llvm.aarch64.sve.uqshrnb
  * @llvm.aarch64.sve.sqshrnb
  * @llvm.aarch64.sve.sqshrunb
  * @llvm.aarch64.sve.uqrshrnb
  * @llvm.aarch64.sve.sqrshrnb
  * @llvm.aarch64.sve.sqrshrunb
  * @llvm.aarch64.sve.shrnt
  * @llvm.aarch64.sve.uqshrnt
  * @llvm.aarch64.sve.sqshrnt
  * @llvm.aarch64.sve.sqshrunt
  * @llvm.aarch64.sve.uqrshrnt
  * @llvm.aarch64.sve.sqrshrnt
  * @llvm.aarch64.sve.sqrshrunt

Reviewers: sdesmalen, rengolin, efriedma

Reviewed By: efriedma

Subscribers: tschuett, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71552
2019-12-20 10:20:30 +00:00
Sam Parker acbc9aed72 [ARM][MVE] Fixes for tail predication.
1) Fix an issue with the incorrect value being used for the number of
   elements being passed to [d|w]lstp. We were trying to check that
   the value was available at LoopStart, but this doesn't consider
   that the last instruction in the block could also define the
   register. Two helpers have been added to RDA for this.
2) Insert some code to now try to move the element count def or the
   insertion point so that we can perform more tail predication.
3) Related to (1), the same off-by-one could prevent us from
   generating a low-overhead loop when a mov lr could have been
   the last instruction in the block.
4) Fix up some instruction attributes so that not all the
   low-overhead loop instructions are labelled as branches and
   terminators - as this is not true for dls/dlstp.

Differential Revision: https://reviews.llvm.org/D71609
2019-12-20 09:34:18 +00:00
Lang Hames 07ac3145cc [Orc][LLJIT] Re-apply 298e183e81 (use JITLink for LLJIT where supported).
Patch d9220b580b fixed the underlying issue that caused 298e183e81 to fail.
2019-12-19 20:42:26 -08:00
Lang Hames d9220b580b [JITLink][MachO] Fix common symbol size plumbing.
This fixes the underlying bug that was exposed by 298e183e81.
2019-12-19 20:41:59 -08:00
River Riddle a77a290a4d [CommandLine] Add template instantiations of cl::parser for long and long long.
This allows cl::opt<int64_t>.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D71729
2019-12-19 17:01:22 -08:00
Philip Reames 8277c91cf3 [StackMaps] Be explicit about label formation [NFC] (try 2)
Recommit after making the same API change in non-x86 targets.  This has been built for all targets, and tested for affected ones.  Why the difference?  Because my disk filled up when I tried make check for all.

For auto-padding assembler support, we'll need to bundle the label with the instructions (nops or call sequences) so that they don't get separated.  This just rearranges the code to make the upcoming change more obvious.
2019-12-19 14:05:30 -08:00
Tim Northover 85cb560b8a ConstrainedFP: use API compatible with opaque pointers.
This just updates an IRBuilder interface to take Functions instead of
Values so the type can be derived, and fixes some callsites in Clang to
call the updated API.
2019-12-19 21:50:47 +00:00
Eric Christopher 3075cd5c9f Temporarily Revert "[Dsymutil][Debuginfo][NFC] Refactor dsymutil to separate DWARF optimizing part 2."
as it causes a layering violation/dependency cycle:

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp -> llvm/DebugInfo/DWARF/DWARFExpression.h
llvm/include/llvm/DebugInfo/DWARF/DWARFOptimizer.h -> llvm/CodeGen/NonRelocatableStringpool.h

This reverts commit abc7f6800d.
2019-12-19 13:29:02 -08:00
Eric Christopher add710eb23 Temporarily Revert "[StackMaps] Be explicit about label formation [NFC]"
as it broke the aarch64 build.

This reverts commit bc7595d934.
2019-12-19 12:52:40 -08:00
Philip Reames bc7595d934 [StackMaps] Be explicit about label formation [NFC]
For auto-padding assembler support, we'll need to bundle the label with the instructions (nops or call sequences) so that they don't get separated.  This just rearranges the code to make the upcoming change more obvious.
2019-12-19 12:38:44 -08:00
Philip Reames cf6aafa47c [FaultMaps] Make label formation a bit more explicit [NFC]
This is in advance of assembler padding directives support where we'll need to bundle the label w/the corresponding faulting instruction to avoid padding being inserted between.
2019-12-19 12:38:44 -08:00
Guillaume Chatelet b4982d6ecd [Alignment][NFC] Align compatible methods for CreateElementUnorderedAtomicMemSet
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71703
2019-12-19 20:03:35 +01:00
Bardia Mahjour 86acaa9457 [DDG] Data Dependence Graph - Ordinals
Summary:
This patch associates ordinal numbers to the DDG Nodes allowing
the builder to order nodes within a pi-block in program order. The
algorithm works by simply assuming the order in which the BBList
is fed into the builder. The builder already relies on the blocks being
in program order so that it can compute the dependencies correctly.
Similarly the order of instructions in their parent basic blocks
determine their program order.
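
The ordinal scheme described above can be sketched roughly as follows. This is a toy C++ model, not the DDG builder itself: instructions are plain strings assumed to be unique, and `Block`, `assignOrdinals` and `sortByOrdinal` are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a basic block: instruction names in program order.
using Block = std::vector<std::string>;

// Walk the blocks in the order given, then the instructions inside each
// block, handing out increasing ordinals. Assumes unique instruction names.
inline std::map<std::string, unsigned>
assignOrdinals(const std::vector<Block> &blocks) {
  std::map<std::string, unsigned> ordinal;
  unsigned next = 0;
  for (const Block &bb : blocks)
    for (const std::string &inst : bb)
      ordinal[inst] = next++;
  return ordinal;
}

// Put the members of a group (e.g. a pi-block) back into program order.
inline void sortByOrdinal(std::vector<std::string> &members,
                          const std::map<std::string, unsigned> &ordinal) {
  std::sort(members.begin(), members.end(),
            [&](const std::string &a, const std::string &b) {
              return ordinal.at(a) < ordinal.at(b);
            });
}
```

The sketch relies on the same assumption the summary states: the block list must already be in program order, since the ordinals simply mirror the traversal order.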

Authored By: bmahjour

Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert

Reviewed By: Meinersbur

Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto, ppc-slack

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70986
2019-12-19 10:57:33 -05:00
Cullen Rhodes dcb48f50bd Revert "[AArch64][SVE] Add permutation and selection intrinsics"
This reverts commit 23c28c4043.

It caused build failures in the following expensive checks builders:

    http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-ubuntu/builds/1295
    http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/700

Reverting for now whilst I figure out what the issue is.
2019-12-19 14:26:14 +00:00
Cullen Rhodes 23c28c4043 [AArch64][SVE] Add permutation and selection intrinsics
Summary:
Adds the following intrinsics:

    * @llvm.aarch64.sve.clasta
    * @llvm.aarch64.sve.clasta_n
    * @llvm.aarch64.sve.clastb
    * @llvm.aarch64.sve.clastb_n
    * @llvm.aarch64.sve.compact
    * @llvm.aarch64.sve.ext
    * @llvm.aarch64.sve.lasta
    * @llvm.aarch64.sve.lastb
    * @llvm.aarch64.sve.rev
    * @llvm.aarch64.sve.splice
    * @llvm.aarch64.sve.tbl
    * @llvm.aarch64.sve.trn1
    * @llvm.aarch64.sve.trn2
    * @llvm.aarch64.sve.uzp1
    * @llvm.aarch64.sve.uzp2
    * @llvm.aarch64.sve.zip1
    * @llvm.aarch64.sve.zip2

Reviewers: sdesmalen, efriedma, dancgr, mgudim, huntergr, rengolin

Reviewed By: sdesmalen, efriedma

Subscribers: kmclaughlin, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71401
2019-12-19 13:18:40 +00:00
Alexey Lapshin abc7f6800d [Dsymutil][Debuginfo][NFC] Refactor dsymutil to separate DWARF optimizing part 2.
That patch is extracted from the D70709. It moves CompileUnit, DeclContext
into llvm/DebugInfo/DWARF. It also adds new file DWARFOptimizer with
AddressesMap class. AddressesMap generalizes functionality
from RelocationManager.

Differential Revision: https://reviews.llvm.org/D71271
2019-12-19 15:41:48 +03:00
Jay Foad c5c935ab66 Make more use of MachineInstr::mayLoadOrStore. 2019-12-19 11:51:52 +00:00
Cullen Rhodes eca0c97a6b [AArch64][SVE] Implement pfirst and pnext intrinsics
Reviewers: sdesmalen, efriedma, dancgr, mgudim, cameron.mcinally

Reviewed By: cameron.mcinally

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl,
llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71472
2019-12-19 11:03:32 +00:00
Cullen Rhodes 49199465a3 [AArch64][SVE] Implement ptrue intrinsic
Reviewers: sdesmalen, eli.friedman, dancgr, mgudim, cameron.mcinally,
huntergr, efriedma

Reviewed By: sdesmalen

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl,
llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71457
2019-12-19 11:02:05 +00:00
David Blaikie eed0242330 DebugInfo: Don't use implicit zero addr_base
(found when LLVM fails to emit addr_base for gmlt+DWARFv5)
2019-12-18 16:28:19 -08:00
Thomas Lively 71eb8023d8 [WebAssembly] Add avgr_u intrinsics and require nuw in patterns
Summary:
The vector pattern `(a + b + 1) / 2` was previously selected to an
avgr_u instruction regardless of nuw flags, but this is incorrect in
the case where either addition may have an unsigned wrap. This CL
changes the existing pattern to require both adds to have nuw flags
and adds builtin functions and intrinsics for the avgr_u instructions
because the corrected pattern is not representable in C.
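
To illustrate why the nuw requirement matters, here is a minimal scalar sketch in C++ of a single 8-bit lane. It is not the WebAssembly lowering; `avgr_u_lane` and `wrapping_avg_lane` are hypothetical names, and the widened form is only assumed to model what an avgr_u lane produces.

```cpp
#include <cassert>
#include <cstdint>

// Rounding average of one u8 lane, computed by widening first so that no
// intermediate value can overflow.
inline std::uint8_t avgr_u_lane(std::uint8_t a, std::uint8_t b) {
  return static_cast<std::uint8_t>((static_cast<unsigned>(a) + b + 1u) / 2u);
}

// The same expression with every addition truncated back to 8 bits, i.e.
// adds without nuw that are allowed to wrap.
inline std::uint8_t wrapping_avg_lane(std::uint8_t a, std::uint8_t b) {
  std::uint8_t sum = static_cast<std::uint8_t>(a + b);     // may wrap
  std::uint8_t sum1 = static_cast<std::uint8_t>(sum + 1);  // may wrap
  return static_cast<std::uint8_t>(sum1 / 2);
}
```

For a = 200 and b = 100 the widened form gives 150 while the wrapping form gives 22, so the two only agree when neither addition wraps.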

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71648
2019-12-18 15:31:38 -08:00
Lang Hames 5ea91bea15 Revert "[Orc][LLJIT] Use JITLink even if a custom JITTargetMachineBuilder is supplied."
This reverts commit 298e183e81.

This commit caused some build failures -- reverting while I investigate.
2019-12-18 15:13:35 -08:00
Lang Hames 298e183e81 [Orc][LLJIT] Use JITLink even if a custom JITTargetMachineBuilder is supplied.
LLJITBuilder will now use JITLink on supported platforms even if a custom
JITTargetMachineBuilder is supplied, provided that neither the code model,
nor the relocation model, nor the ObjectLinkingLayerCreator is set.
2019-12-18 14:17:25 -08:00
Ulrich Weigand 1946461344 [FPEnv] Strict versions of llvm.minimum/llvm.maximum
Add new intrinsics
   llvm.experimental.constrained.minimum
   llvm.experimental.constrained.maximum
as strict versions of llvm.minimum and llvm.maximum.

Includes SystemZ back-end support.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D71624
2019-12-18 21:35:28 +01:00
Jakub Kuderski 3d29c41ad5 [InstCombine] Insert instructions before adding them to worklist
Summary:
This patch adds instructions to the InstCombine worklist after they are properly inserted. This way we don't get `<badref>`s printed when logging added instructions.
It also adds a check in `Worklist::Add` that ensures that all added instructions have parents.

Simple test case that illustrates the difference when run with `--debug-only=instcombine`:

```
define i32 @test35(i32 %a, i32 %b) {
  %1 = or i32 %a, 1135
  %2 = or i32 %1, %b
  ret i32 %2
}
```

Before this patch:
```
INSTCOMBINE ITERATION #1 on test35
IC: ADDING: 3 instrs to worklist
IC: Visiting:   %1 = or i32 %a, 1135
IC: Visiting:   %2 = or i32 %1, %b
IC: ADD:   %2 = or i32 %a, %b
IC: Old =   %3 = or i32 %1, %b
    New =   <badref> = or i32 %2, 1135
IC: ADD:   <badref> = or i32 %2, 1135
...
```

With this patch:
```
INSTCOMBINE ITERATION #1 on test35
IC: ADDING: 3 instrs to worklist
IC: Visiting:   %1 = or i32 %a, 1135
IC: Visiting:   %2 = or i32 %1, %b
IC: ADD:   %2 = or i32 %a, %b
IC: Old =   %3 = or i32 %1, %b
    New =   <badref> = or i32 %2, 1135
IC: ADD:   %3 = or i32 %2, 1135
...
```

Reviewers: fhahn, davide, spatel, foad, grosser, nikic

Reviewed By: nikic

Subscribers: nikic, lebedev.ri, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71093
2019-12-18 14:55:41 -05:00
Adrian McCarthy 738b5c9639 Fix more VFS tests on Windows
Since VFS paths can be in either Posix or Windows style, we have to use
a more flexible definition of "absolute" path.

The key here is that FileSystem::makeAbsolute is now virtual, and the
RedirectingFileSystem override checks for either concept of absolute
before trying to make the path absolute by combining it with the current
directory.
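
A rough sketch of the "either concept of absolute" check, in standalone C++. It does not use the LLVM path API; `isAbsoluteEitherStyle` and `makeAbsoluteSketch` are hypothetical helpers, and the exact rules RedirectingFileSystem applies may differ.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// True if `path` looks absolute under POSIX rules (leading '/') or under
// Windows rules (drive letter plus separator, or a leading double separator).
inline bool isAbsoluteEitherStyle(const std::string &path) {
  auto isSep = [](char c) { return c == '/' || c == '\\'; };
  if (path.empty())
    return false;
  if (path[0] == '/') // POSIX style, e.g. /usr/include
    return true;
  if (path.size() >= 3 && std::isalpha(static_cast<unsigned char>(path[0])) &&
      path[1] == ':' && isSep(path[2])) // drive style, e.g. C:\foo or C:/foo
    return true;
  if (path.size() >= 2 && isSep(path[0]) && isSep(path[1])) // UNC style
    return true;
  return false;
}

// Only prepend the current directory when the path is relative in both styles.
inline std::string makeAbsoluteSketch(const std::string &cwd,
                                      const std::string &path) {
  if (isAbsoluteEitherStyle(path))
    return path;
  return cwd + "/" + path;
}
```

With this check, a Windows-style path such as `C:\a.h` is left untouched even when the host treats only `/`-rooted paths as absolute.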

Differential Revision: https://reviews.llvm.org/D70701
2019-12-18 11:38:04 -08:00
Danilo Carvalho Grael c7abf88411 Revert "[AArch64][SVE] Replace integer immediate intrinsics with splat vector variant"
This reverts commit 830e08b98b and eb1857ce0d.

This commit leads to an unexpected failure on test/CodeGen/AArch64/sve-gather-scatter-dag-combine.ll.

The review will need more changes before it's re-committed.
2019-12-18 14:14:10 -05:00
Jakub Kuderski 406b6019cd [InstCombine] Allow to limit the max number of iterations
Summary:
This patch teaches InstCombine to accept a new parameter: maximum number of iterations over functions.

InstCombine tries to simplify instructions by iterating over the whole function until the function stops changing. As a consequence, the last iteration before reaching a fixpoint visits all instructions in the worklist and never performs any rewrites.

Bounding the number of iterations can have 2 benefits:
* In case the users of the pass can make a good guess about the number of required iterations, we can save the time normally spent on the last iteration that doesn't change anything.
* When the user wants to use InstCombine as a cleanup pass, it may be enough to run just a few iterations and stop even before reaching a fixpoint. This can be also useful for implementing a lightweight pass pipeline (think `-O1`).

This patch does not change the behavior of opt or Clang -- limiting the number of iterations is entirely opt-in.
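
The idea of a bounded fixpoint loop can be sketched generically as below. This is not InstCombine's code; `runToFixpoint` and its `step` callback are hypothetical and only model "iterate until nothing changes or the cap is hit".

```cpp
#include <cassert>
#include <functional>

// Calls `step` repeatedly. `step` returns true if it changed anything.
// Stops at the first iteration that changes nothing (the fixpoint) or after
// `maxIterations` iterations, whichever comes first. Returns the number of
// iterations that were executed.
inline unsigned runToFixpoint(const std::function<bool()> &step,
                              unsigned maxIterations) {
  unsigned iterations = 0;
  while (iterations < maxIterations) {
    ++iterations;
    if (!step())
      break; // this iteration performed no rewrites
  }
  return iterations;
}
```

With a toy step that halves a value starting at 8 until it reaches 1, an effectively unbounded run takes 4 iterations, the last of which changes nothing; a cap of 3 produces the same final value and skips that last pass.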

Reviewers: fhahn, davide, spatel, foad, nlopes, grosser, lebedev.ri, nikic, xbolva00

Reviewed By: spatel

Subscribers: craig.topper, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71145
2019-12-18 13:48:54 -05:00
Danilo Carvalho Grael 830e08b98b [AArch64][SVE] Replace integer immediate intrinsics with splat vector variant
Summary: Replace the integer immediate intrinsics with splat vector variants so they can be applied as optimizations for the C/C++ intrinsics.

Reviewers: sdesmalen, huntergr, rengolin, efriedma, c-rhodes, mgudim, kmclaughlin

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits, amehsan

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71614
2019-12-18 13:11:21 -05:00
Michael Trent 6f95d33e2b [ MC ] Match labels to existing fragments even when switching sections.
(This commit restores the original branch (4272372c57) and applies an
additional change dropped from the original in a bad merge. This change
should address the previous bot failures. Both changes reviewed by pete.)

Summary:
This commit builds upon Derek Schuff's 2014 commit for attaching labels to
existing fragments ( Diff Revision: http://reviews.llvm.org/D5915 )

When temporary labels appear ahead of a fragment, MCObjectStreamer will
track the temporary label symbol in a "Pending Labels" list. Labels are
associated with fragments when a real fragment arrives; otherwise, an empty
data fragment will be created if the streamer's section changes or if the
stream finishes.

This commit moves the "Pending Labels" list into each MCStream, so that
this label-fragment matching process is resilient to section changes. If
the streamer emits a label in a new section, switches to another section to
do other work, then switches back to the first section and emits a
fragment, that initial label will be associated with this new fragment.
Labels will only receive empty data fragments in the case where no other
fragment exists for that section.
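
The per-section bookkeeping described above can be modelled with a small toy in C++. Nothing here is the real MC code; `ToyStreamer` and its members are hypothetical names, and sections and labels are plain strings.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy streamer that keeps one pending-label list per section instead of a
// single list shared across sections.
struct ToyStreamer {
  std::map<std::string, std::vector<std::string>> pending; // section -> labels
  std::map<std::string, int> labelToFragment;              // label -> fragment
  std::string current = ".text";
  int nextFragment = 0;

  void switchSection(const std::string &name) { current = name; }

  // Labels wait in the list of the section they were emitted in.
  void emitLabel(const std::string &label) { pending[current].push_back(label); }

  // A real fragment arrives: attach only the labels pending in this section.
  int emitFragment() {
    int id = nextFragment++;
    for (const std::string &label : pending[current])
      labelToFragment[label] = id;
    pending[current].clear();
    return id;
  }
};
```

In this model, a label emitted in section A stays pending while a fragment is emitted in section B, and is attached once the streamer switches back to A and emits a fragment there.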

The downstream effects of this can be seen in Mach-O relocations. The
previous approach could produce local section relocations and external
symbol relocations for the same data in an object file, and this mix of
relocation types resulted in problems in the ld64 Mach-O linker. This
commit ensures relocations triggered by temporary labels are consistent.

Reviewers: pete, ab, dschuff

Reviewed By: pete, dschuff

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71368
2019-12-18 09:55:54 -08:00
Raphael Isemann 9a8c803771 Fix modules build by adding missing includes to LTO/Config.h 2019-12-18 17:43:19 +01:00
stozer 89d19d60ad Reapply: [DebugInfo] Correctly handle salvaged casts and split fragments at ISel
This reverts commit 1f3dd83cc1, reapplying
commit bb1b0bc4e5.

The original commit failed on some builds seemingly due to the use of a
bracketed constructor with an std::array, i.e. `std::array<> arr({...})`.
2019-12-18 16:26:42 +00:00
evgeny ad364956ed [ThinLTO] Show preserved symbols in DOT files
Differential revision: https://reviews.llvm.org/D71608
2019-12-18 18:33:15 +03:00
Daniel Sanders c3cb089a87 [gicombiner] Import tryCombineIndexedLoadStore()
Summary:
Now that arbitrary data is supported, import tryCombineIndexedLoadStore()

Depends on D69147

Reviewers: bogner, volkan

Reviewed By: volkan

Subscribers: hiraditya, arphaman, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69151
2019-12-18 14:41:38 +00:00
Daniel Sanders 55c57408b0 [gicombiner] Add support for arbitrary match data being passed from match to apply
Summary:
This is used by the extending_loads combine to tell the apply step which
use is the preferred one to fold and the other uses should be re-written
to consume.

Depends on D69117

Reviewers: volkan, bogner

Reviewed By: volkan

Subscribers: hiraditya, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69147
2019-12-18 12:27:29 +00:00
stozer 1f3dd83cc1 Revert "[DebugInfo] Correctly handle salvaged casts and split fragments at ISel"
Reverted due to build failure on windows bots.

This reverts commit bb1b0bc4e5.
2019-12-18 11:46:10 +00:00
Daniel Sanders 7ea2e5195a Revert "Temporarily Revert "[gicombiner] Add the MatchDag structure and parse instruction DAG's from the input""
This reverts commit e62e760f29.

The issue @uweigand raised should have been fixed by iterating over the
vector that owns the operand list data instead of the FoldingSet.

The MSVC issue raised by @thakis should have been fixed by relaxing the
regexes a little. I don't have a Windows machine available to test that so
I tested it by using `perl -p -e 's/0x([0-9a-f]+)/\U\1\E/g'` to convert the
output of %p to the windows style.

I've guessed at the issue @phosek raised as there wasn't enough information
to investigate it. What I think is happening on that bot is the -debug
option isn't available because the second stage build is a release build.
I'm not sure why other release-mode bots didn't report it though.
2019-12-18 11:37:12 +00:00
stozer bb1b0bc4e5 [DebugInfo] Correctly handle salvaged casts and split fragments at ISel
Previously, LLVM had no functional way of performing casts inside of a
DIExpression(), which made salvaging cast instructions other than Noop
casts impossible. This patch enables the salvaging of casts by using the
DW_OP_LLVM_convert operator for SExt and Trunc instructions.

There is another issue which is exposed by this fix, in which fragment
DIExpressions (which are preserved more readily by this patch) for
values that must be split across registers in ISel trigger an assertion,
as the 'split' fragments extend beyond the bounds of the fragment
DIExpression causing an error. This patch also fixes this issue by
checking the fragment status of DIExpressions which are to be split, and
dropping fragments that are invalid.
2019-12-18 11:09:18 +00:00
Anna Welker 7cd1cfdd6b [NFC][TTI] Add Alignment for isLegalMasked[Gather/Scatter]
Add an extra parameter so alignment can be taken into
consideration in gather/scatter legalization.

Differential Revision: https://reviews.llvm.org/D71610
2019-12-18 09:14:39 +00:00
Eric Christopher e62e760f29 Temporarily Revert "[gicombiner] Add the MatchDag structure and parse instruction DAG's from the input"
and follow-on patches.

This is breaking a few build bots and local builds with follow-up already
on the patch thread.

This reverts commits 390c8baa54 and
520e3d66e7.
2019-12-17 16:23:29 -08:00
Mitch Phillips f827aff859 Revert "[ MC ] Match labels to existing fragments even when switching sections."
This reverts commit 4272372c57.

Caused an MSan buildbot failure. More information available in the patch
that introduced the bug: https://reviews.llvm.org/D71368
2019-12-17 15:04:26 -08:00
Whitney Tsang 36bdc3dc35 [LoopFusion] Move instructions from FC0.Latch to FC1.Latch.
Summary: This PR moves instructions from FC0.Latch bottom up to the
beginning of FC1.Latch as long as they are proven safe.

To illustrate why this is beneficial, let's consider the following
example:
Before Fusion:
header1:
  br header2
header2:
  br header2, latch1
latch1:
  br header1, preheader3
preheader3:
  br header3
header3:
  br header4
header4:
  br header4, latch3
latch3:
  br header3, exit3

After Fusion (before this PR):
header1:
  br header2
header2:
  br header2, latch1
latch1:
  br header3
header3:
  br header4
header4:
  br header4, latch3
latch3:
  br header1, exit3

Note that preheader3 is removed during fusion before this PR.
Notice that we cannot fuse loop2 with loop4 as there exists block latch1
in between.
This PR moves instructions from latch1 to the beginning of latch3 and removes
block latch1. LoopFusion is now able to fuse loop nests recursively.

After Fusion (after this PR):
header1:
  br header2
header2:
  br header3
header3:
  br header4
header4:
  br header2, latch3
latch3:
  br header1, exit3

Reviewer: kbarton, jdoerfert, Meinersbur, dmgreen, fhahn, hfinkel,
bmahjour, etiotto
Reviewed By: kbarton, Meinersbur
Subscribers: hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D71165
2019-12-17 22:10:23 +00:00
Ulrich Weigand 1e89188d35 [FPEnv] Remove unnecessary rounding mode argument for constrained intrinsics
The following intrinsics currently carry a rounding mode metadata argument:

    llvm.experimental.constrained.minnum
    llvm.experimental.constrained.maxnum
    llvm.experimental.constrained.ceil
    llvm.experimental.constrained.floor
    llvm.experimental.constrained.round
    llvm.experimental.constrained.trunc

This is not useful since the semantics of those intrinsics do not in any way
depend on the rounding mode. In similar cases, other constrained intrinsics
do not have the rounding mode argument. Remove it here as well.
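
The independence from the rounding mode can be checked with the C++ library counterparts of some of these operations. This is only an illustration with `<cfenv>` and `<cmath>`, not the constrained intrinsics themselves; `underRounding` and the small wrappers are hypothetical helpers.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Thin wrappers so the operations can be passed as plain function pointers.
inline double ceilOf(double v) { return std::ceil(v); }
inline double floorOf(double v) { return std::floor(v); }
inline double truncOf(double v) { return std::trunc(v); }
inline double roundOf(double v) { return std::round(v); }

// Evaluates op(x) with the given rounding mode active, then restores the
// rounding mode that was in effect before the call.
inline double underRounding(int mode, double (*op)(double), double x) {
  const int saved = std::fegetround();
  std::fesetround(mode);
  volatile double input = x; // discourage compile-time evaluation
  double result = op(input);
  std::fesetround(saved);
  return result;
}
```

For instance, `underRounding(FE_DOWNWARD, ceilOf, 2.3)` and `underRounding(FE_UPWARD, ceilOf, 2.3)` both yield 3.0, which matches the observation that these operations are defined without reference to the current rounding mode.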

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D71218
2019-12-17 21:10:36 +01:00
Alex Lorenz 25ce33a6e4 [driver][darwin] Pass -platform_version flag to the linker instead of the -<platform>_version_min flag
In Xcode 11, ld added a new flag called -platform_version that can be used instead of the old -<platform>_version_min flags.
The new flag allows Clang to pass the SDK version from the driver to the linker.
This patch adopts the new -platform_version flag in Clang, and starts using it by default,
unless a linker version < 520 is passed to the driver.
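
The selection between the two flag styles might look roughly like the sketch below. It is not the Clang driver code: `versionArgs` is a hypothetical helper, it assumes the new flag takes platform, minimum version and SDK version in that order, and it reuses one platform spelling for both styles, which the real driver may not do.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns the linker arguments that carry the deployment target.
// Linker versions below 520 get the old -<platform>_version_min flag,
// newer ones get -platform_version, which can also carry the SDK version.
inline std::vector<std::string> versionArgs(unsigned linkerVersion,
                                            const std::string &platform,
                                            const std::string &minVersion,
                                            const std::string &sdkVersion) {
  if (linkerVersion < 520)
    return {"-" + platform + "_version_min", minVersion};
  return {"-platform_version", platform, minVersion, sdkVersion};
}
```

Only the new form has a slot for the SDK version, which is why the old form drops `sdkVersion` in this sketch.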

Differential Revision: https://reviews.llvm.org/D71579
2019-12-17 10:26:32 -08:00
Kevin P. Neal 2f40f5681d [FPEnv] IRBuilder support for constrained sitofp/uitofp. 2019-12-17 12:32:28 -05:00
Ulrich Weigand d1c0f14be8 [SystemZ][FPEnv] Back-end support for STRICT_[SU]INT_TO_FP
As of b1d8576 there is middle-end support for STRICT_[SU]INT_TO_FP,
so this patch adds SystemZ back-end support as well.

The patch is SystemZ target specific except for adding SD patterns
strict_[su]int_to_fp and any_[su]int_to_fp to TargetSelectionDAG.td
as usual.
2019-12-17 18:24:05 +01:00
Daniel Sanders 520e3d66e7 [gicombiner] Process the MatchDag such that every node is reachable from the roots
Summary:
When we build the walk across these DAG's we need to be able to reach every node
from the roots. Flip any traversal edges (so that use->def becomes def->uses)
that make nodes unreachable. Note that early on we'll just error out on these
flipped edges as def->uses edges are more complicated to match due to their
one->many nature.

Depends on D69077

Reviewers: volkan, bogner

Subscribers: llvm-commits
2019-12-17 17:03:24 +00:00