Commit Graph

191156 Commits

Author SHA1 Message Date
Fangrui Song ba3a1774a9 [Transforms] Simplify with make_early_inc_range 2020-02-02 00:54:32 -08:00
Fangrui Song ecd2aaee06 [DebugInfo] Merge DebugInfoFinder::{processDeclare,processValue} into processVariable
The two functions are identical.
2020-02-01 23:00:21 -08:00
Fangrui Song 5932f7b8f2 [PatchableFunction] Use an empty DebugLoc
The current FirstMI.getDebugLoc() is actually null in almost all cases.
If it isn't, the generated .loc will be considered initial. The .loc
will have the prologue_end flag and terminate the prologue prematurely.

Also use an overload of BuildMI that will not prepend
PATCHABLE_FUNCTION_ENTRY to a MachineInstr bundle.
2020-02-01 14:12:06 -08:00
Brian Gesiak d82e993cd3 [ADT] 'PointerUnion::is' returns 'bool'
Summary:
The return type of 'PointerUnion::is' has been 'int' since it was first
added in March 2009, in SVN r67987, or
https://github.com/llvm/llvm-project/commit/a9c6de15fb3.

The only other change to this member function was a clang-format applied
in December 2015, in SVN r256513, or
https://github.com/llvm/llvm-project/commit/548a49aacc0.

However, since the return value is the result of a `==` comparison, an
implicit cast must be made converting the boolean result to an `int`.
Change the return type to `bool` to remove the need for such a cast.
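
A minimal sketch of the point above, using a hypothetical `TinyUnion` class rather than LLVM's actual `PointerUnion` (the class, its `Tag` member, and `isAsInt` are illustrative assumptions). It contrasts a member function that returns the result of `==` as `int` with one that returns it as `bool`:

```cpp
#include <type_traits>

// Hypothetical stand-in for a tagged union; this is not LLVM's PointerUnion.
class TinyUnion {
  int Tag;

public:
  explicit TinyUnion(int Tag) : Tag(Tag) {}

  // Old shape: the bool produced by == is implicitly converted to int.
  int isAsInt(int Wanted) const { return Tag == Wanted; }

  // New shape: the return type matches the comparison, no conversion needed.
  bool is(int Wanted) const { return Tag == Wanted; }
};

static_assert(std::is_same<decltype(TinyUnion(0).is(0)), bool>::value,
              "is() yields bool directly");
static_assert(std::is_same<decltype(TinyUnion(0).isAsInt(0)), int>::value,
              "isAsInt() yields int via an implicit conversion");
```

Callers that only test the result in a condition behave the same with either signature; the `bool` form simply states the intent in the type.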

Test Plan:
I ran llvm-project `check-all` under ASAN, no failures were reported
(other than obviously unrelated tests that were already failing in
ASAN buildbots).

Reviewers: gribozavr, gribozavr2, rsmith, bkramer, dblaikie

Subscribers: dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73836
2020-02-01 16:50:20 -05:00
Nicolai Hähnle ba8110161d AMDGPU/GFX10: Fix NSA reassign pass when operands are undef
Summary:
Virtual registers that are undef have an empty LiveInterval at this
point, which means beginIndex() and endIndex() cannot be used. We
only need those indices to determine the range in which to scan for
affected other NSA instructions, and undef operands cannot contribute
to that range.

Reviewers: arsenm, rampitec, mareko

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73831
2020-02-01 22:41:40 +01:00
Craig Topper a57dd66d5e [X86] In X86FastEmitSSESelect, fall back to SelectionDAG if the inputs to the compare can't be found in registers.
We were checking that the original Value * for the compare operands
were null. But that can never happen.

I believe we intended to check for 0 registers here instead.

Fixes PR44749.
2020-02-01 12:24:55 -08:00
Craig Topper d975910c50 [X86] Don't exit from foldOffsetIntoAddress if the Offset is 0, but AM.Disp is non-zero.
This is an alternate fix for the issue D73606 was trying to
solve.

The main issue here is that we bailed out of
foldOffsetIntoAddress if Offset is 0. But if we just found a
symbolic displacement and AM.Disp became non-zero
earlier, we still need to validate that AM.Disp with the symbolic
displacement.

This is my second attempt at committing this after failing
build bots previously. One thing I realized about the previous
attempt is that it's possible that AM.Disp is already non-zero
and the new Offset changes it back to zero. In that case my
previous attempt failed to update AM.Disp to zero. So this patch
removes the early out for 0 and appropriately handles the 0 case
in each check so we still update AM.Disp at the end.
2020-02-01 11:26:17 -08:00
Stefan Gränitz 234f3b1691 Add ThinLtoJIT example
Summary:
Prototype of a JIT compiler that utilizes ThinLTO summaries to compile modules ahead of time. This is an implementation of the concept I presented in my "ThinLTO Summaries in JIT Compilation" talk at the 2018 Developers' Meeting: http://llvm.org/devmtg/2018-10/talk-abstracts.html#lt8

Upfront the JIT first populates the *combined ThinLTO module index*, which provides fast access to the global call-graph and module paths by function. Next, it loads the main function's module and compiles it. All functions in the module will be emitted with prolog instructions that *fire a discovery flag* once execution reaches them. In parallel, the *discovery thread* is busy-watching the existing flags. Once it detects one has fired, it uses the module index to find all functions that are reachable from it within a given number of calls and submits their defining modules to the compilation pipeline.

While execution continues, more flags are fired and further modules added. Ideally the JIT can be tuned in a way, so that in the majority of cases the code on the execution path can be compiled ahead of time. In cases where it doesn't work, the JIT has a *definition generator* in place that loads modules if missing functions are reached.
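
The lookahead step described above ("all functions that are reachable from it within a given number of calls") can be sketched as a depth-bounded breadth-first walk. This is an illustration under stated assumptions, not the example's actual code: the `CallGraph` alias and `reachableWithin` are hypothetical names, and a plain map of strings stands in for the combined ThinLTO module index.

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical call graph: caller name -> callee names.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Returns every function reachable from Root within MaxCalls call edges
// (Root itself included), using a breadth-first walk.
std::set<std::string> reachableWithin(const CallGraph &Graph,
                                      const std::string &Root,
                                      unsigned MaxCalls) {
  std::set<std::string> Seen{Root};
  std::queue<std::pair<std::string, unsigned>> Work;
  Work.push({Root, 0});
  while (!Work.empty()) {
    std::pair<std::string, unsigned> Item = Work.front();
    Work.pop();
    if (Item.second == MaxCalls)
      continue; // Lookahead budget used up on this path.
    auto It = Graph.find(Item.first);
    if (It == Graph.end())
      continue; // Leaf or unknown function.
    for (const std::string &Callee : It->second)
      if (Seen.insert(Callee).second)
        Work.push({Callee, Item.second + 1});
  }
  return Seen;
}
```

In the design described above, the modules defining the returned functions would then be handed to the compilation pipeline; that mapping from function to module path is left out of the sketch.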

Reviewers: lhames, dblaikie, jfb, tejohnson, pree-jackie, AlexDenisov, kavon

Subscribers: mgorny, mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, arphaman, jfb, merge_guards_bot, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72486
2020-02-01 20:25:09 +01:00
Craig Topper 943b5561d6 [LegalizeTypes][X86] Add a new strategy for type legalizing f16 type that softens it to i16, but promotes to f32 around arithmetic ops.
This is based on this llvm-dev thread http://lists.llvm.org/pipermail/llvm-dev/2019-December/137521.html

The current strategy for f16 is to promote the type to float everywhere except where the specific width is required, like loads, stores, and bitcasts. This results in rounding occurring in odd places instead of immediately after arithmetic operations. This interacts in weird ways with the __fp16 type in clang which is a storage only type where arithmetic is always promoted to float. InstCombine can remove some fpext/fptruncs around such arithmetic and turn it into arithmetic on half. This wouldn't be so bad if SelectionDAG was able to put those fpext/fpround back in when it promotes.

It is also not obvious how to make the existing strategy work with STRICT fp. We need to use STRICT versions of the conversions which require chain operands. But if the conversions are created for a bitcast, there is no place to get an appropriate chain from.

This patch implements a different strategy where conversions are emitted directly around arithmetic operations. Otherwise the value is passed around as an i16, including in arguments and return values. This can result in more conversions between arithmetic operations, but is closer to matching the IR the frontend generates for __fp16. And it will allow us to use the chain from constrained arithmetic nodes to link the STRICT_FP_TO_FP16/STRICT_FP16_TO_FP that will need to be added. I've set it up so that each target can opt into the new behavior. Converting all the targets myself was more than I was able to handle.

Differential Revision: https://reviews.llvm.org/D73749
2020-02-01 11:21:04 -08:00
Alex Richardson 24ee9c8496 Don't mark MIPS TRAP as isTerminator
This was causing machine verifier errors when compiling libunwind.

Reviewed By: atanasyan
Differential Revision: https://reviews.llvm.org/D73648
2020-02-01 15:50:22 +00:00
Matt Arsenault c0b12916a7 AMDGPU/GlobalISel: Use more wide vector load/stores
This improves the type breakdown for some large vectors. For example,
we now get a <4 x s32> and s32 store instead of 5 s32 stores for
<5 x s32>.
2020-02-01 10:47:21 -05:00
Matt Arsenault e3117e5c30 AMDGPU/GlobalISel: Improve legalization of wide stores
This fixes legalizations of global stores > 128-bits. It seems work is
needed on how this split actually occurs. For example, we get the
right code for s160, with an s128 and s32 load, but get 5 s32 loads
for <5 x s32>.
2020-02-01 10:47:03 -05:00
Matt Arsenault bc101ffd77 GlobalISel: Support widening unmerge results with pointer source 2020-02-01 10:47:03 -05:00
Sylvestre Ledru 2eb80a99a2 Make StringRef's std::string conversion operator explicit
The build is currently broken when perf or ffi are enabled for llvm.

Just like in https://reviews.llvm.org/rG777180a32b61070a10dd330b4f038bf24e916af1
2020-02-01 15:43:45 +01:00
Simon Pilgrim a3485301d4 Remove unused function. NFCI. 2020-02-01 13:01:58 +00:00
Simon Pilgrim 105e5c940c [ValueTracking] Add DemandedElts support to computeKnownBits/ComputeNumSignBits (PR36319)
This patch adds initial support for a DemandedElts mask to the internal computeKnownBits/ComputeNumSignBits methods, matching the SelectionDAG and GlobalISel equivalents.

So far only a couple of instructions have been set up to handle the DemandedElts, with the remainder still using the existing 'all elements' default. The plan is to extend support as we have test coverage.
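
To illustrate what a DemandedElts mask buys, here is a simplified sketch for constant vectors only. It is not the ValueTracking implementation: `KnownBits8` and `knownBitsOfConstantVector` are hypothetical names, lanes are fixed at 8 bits, and the caller is assumed to pass a non-empty mask. Bits are known only if they agree across every demanded lane, so ignoring lanes can only keep or increase what is known.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified known-bits record for 8-bit lanes: a bit set in Zero (One) is
// known to be 0 (1) in every lane that was considered.
struct KnownBits8 {
  uint8_t Zero;
  uint8_t One;
};

// Intersects the bits of the constant lanes selected by DemandedElts
// (bit I of the mask selects lane I). Lanes that are not demanded are ignored.
// Assumes at least one lane is demanded.
KnownBits8 knownBitsOfConstantVector(const std::vector<uint8_t> &Lanes,
                                     uint32_t DemandedElts) {
  KnownBits8 Known{0xFF, 0xFF}; // Start with "everything known" and narrow.
  for (std::size_t I = 0; I < Lanes.size() && I < 32; ++I) {
    if (!((DemandedElts >> I) & 1u))
      continue;
    Known.Zero &= static_cast<uint8_t>(~Lanes[I]);
    Known.One &= Lanes[I];
  }
  return Known;
}
```

With lanes `{0x0F, 0x0E, 0xF0}`, demanding all three leaves no known bits, while demanding only the first two leaves the high nibble known zero and bits 1 to 3 known one.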

Differential Revision: https://reviews.llvm.org/D73435
2020-02-01 12:45:46 +00:00
Nico Weber fac4bd26c3 [gn build] unbreak mac build after 133a31cef6 2020-01-31 21:25:56 -05:00
Nico Weber 133a31cef6 [gn build] add asan runtime on linux and mac
This produces a seemingly-working dynamic (x64-only) asan dylib on macOS
and static libraries on Linux.

I've had this sitting in a branch for a long time and wanted to get
check-asan working before landing it, but smaller patches and fewer
local branches is probably better.
2020-01-31 21:23:43 -05:00
Matt Arsenault 98aaed2980 AMDGPU/GlobalISel: Fix forming G_TRUNC with vcc result
This somehow got lost when I fixed the boolean handling.
2020-01-31 20:29:41 -05:00
Matt Arsenault c28f1faaff AMDGPU: Switch some tests to use generated checks
Control flow tests are particularly annoying, and it's probably better
to have comprehensive check lines for them.
2020-01-31 20:29:41 -05:00
Reid Kleckner b074acb82f [Support] Don't modify the current EH context during stack unwinding
Copy it instead. Otherwise, key registers (such as RBP) may get zeroed
out by the stack unwinder.

Fixes CrashRecoveryTest.DumpStackCleanup with MSVC in release builds.

Reviewed By: stella.stamenova

Differential Revision: https://reviews.llvm.org/D73809
2020-01-31 17:04:01 -08:00
Reid Kleckner a1daa7d079 Avoid std::tie in TypeSize.h
std::tie isn't saving much here, just use == && ==. No numbers to
support this, but std::tie is one of the most expensive instantiations.
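
A small sketch of the two equivalent spellings, using a hypothetical `SizeLike` struct (the struct and its field names are assumptions for illustration, not the contents of TypeSize.h). Both functions compute the same result; the second avoids instantiating `std::tie` and `std::tuple`:

```cpp
#include <cstdint>
#include <tuple>

// Hypothetical two-field value type; the field names are illustrative.
struct SizeLike {
  uint64_t MinSize;
  bool IsScalable;
};

// Comparison via std::tie: builds two tuples of references and compares them.
bool equalWithTie(const SizeLike &A, const SizeLike &B) {
  return std::tie(A.MinSize, A.IsScalable) == std::tie(B.MinSize, B.IsScalable);
}

// Comparison spelled out field by field: no template instantiation involved.
bool equalDirect(const SizeLike &A, const SizeLike &B) {
  return A.MinSize == B.MinSize && A.IsScalable == B.IsScalable;
}
```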
2020-01-31 16:57:33 -08:00
Reid Kleckner 4b606b4af5 Move DenseMapInfo traits to TypeSize.h
Saves 2427 unneeded includes of TypeSize.h, which instantiates
std::tie<uint64_t, bool>, which instantiates std::tuple<uint64_t, bool>,
which is slow.

I'll remove the tie in a follow-up, since it's just for operator==.
2020-01-31 16:50:11 -08:00
David Blaikie 338beff4dc DwarfDebug.cpp: Fix some indentation 2020-01-31 16:01:57 -08:00
David Blaikie b33e5f3c3e DebugInfo: Split DWARF: Hash non-member function child DIEs
Significant missing hashing - as per the comment this was only meant to
skip member functions (unspecified, but I think it's legible as member
function declarations, not definitions) but was skipping all named
subprograms (so only hashed child DIEs for member function definitions -
because they didn't have a direct name, but only a name given indirectly
in the DW_AT_specification-referenced DIE)
2020-01-31 15:32:03 -08:00
Matt Arsenault 792d9b5719 DAG: Check if a value is divergent before requiresUniformRegister
This avoids a potentially expensive scan if we already know it doesn't
matter.
2020-01-31 15:27:18 -08:00
Matt Arsenault b4275bcbe4 Move target tests to target subdirectories 2020-01-31 15:27:18 -08:00
Artur Pilipenko 34547ac959 NFC. Comments cleanup in DSE::memoryIsNotModifiedBetween
Separated from https://reviews.llvm.org/D68006 review.
2020-01-31 15:22:33 -08:00
Luís Marques 24cba3312f [RISCV] Implement jump pseudo-instruction
Summary:
Implements the jump pseudo-instruction, which is used in e.g. the Linux kernel.

Reviewers: asb, lenary
Reviewed By: lenary
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73178
2020-01-31 22:28:26 +00:00
David Blaikie dce2193358 DebugInfo: Simplify debug-macinfo-split-dwarf.ll
This test didn't need any local variables or parameters, and didn't need
to be checking the DWO ID or more detailed forms.

It was using -v to print the macro sections, but now that macro sections
are emitted when requested (-debug-macro) that's not needed either.
2020-01-31 13:09:58 -08:00
David Blaikie 9e8bff71d0 DebugInfo: Allow dumping macinfo and macinfo.dwo from the same file
If dumping a Split DWARF file that hasn't been split into separate
files (such as from llc - that includes the plain and .dwo sections in
the same file) allow both macinfo and macinfo.dwo sections to be dumped.
2020-01-31 12:47:50 -08:00
Nikita Popov ff17da3f75 [InstCombine] Push negation through multiply (PR44234)
Fixes https://bugs.llvm.org/show_bug.cgi?id=44234 by adding
multiply support to freelyNegateValue(). Only one of the operands
needs to be negatible, so this still fits within the framework.
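
The identity behind this, -(A * B) == (-A) * B under wrapping arithmetic, can be checked with a short sketch. The function names are hypothetical and unsigned 32-bit integers are used so that wraparound is well defined in C++; this only demonstrates the arithmetic, not the InstCombine code.

```cpp
#include <cstdint>

// -(A * B), computed the straightforward way with a separate negation.
uint32_t negateProduct(uint32_t A, uint32_t B) { return 0u - (A * B); }

// (-A) * B: the negation is pushed into one operand of the multiply.
uint32_t productOfNegated(uint32_t A, uint32_t B) { return (0u - A) * B; }
```

If negating `A` is free (for example because `A` is a constant), the second form needs no separate negate instruction, which is why only one operand has to be negatible.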

Differential Revision: https://reviews.llvm.org/D73410
2020-01-31 20:58:55 +01:00
Dominic Chen 562a19e079 [Typo fix] RNG: Take pass name as argument instead of pass pointer.
Summary: With the new pass manager, it is not possible to obtain a pointer to the pass.

Reviewers: jfb, rinon, yln

Subscribers: hiraditya, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73390
2020-01-31 14:40:45 -05:00
Dominic Chen 73713f3e5e RNG: Take pass name as argument instead of pass pointer.
Summary: With the new pass manager, it is not possible to obtain a pointer to the pass.

Reviewers: jfb, rinon, yln

Subscribers: hiraditya, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73390
2020-01-31 14:21:40 -05:00
Jay Foad f465b1aff4 [GlobalISel] Tweak lowering of G_SMULO/G_UMULO
Summary:
Applying this cleanup:

    -      MIRBuilder.buildInstr(TargetOpcode::G_ASHR)
    -        .addDef(Shifted)
    -        .addUse(Res)
    -        .addUse(ShiftAmt);
    +      MIRBuilder.buildAShr(Shifted, Res, ShiftAmt);

caused an assertion failure here:

    llc: /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/MachineRegisterInfo.cpp:404: llvm::MachineInstr *llvm::MachineRegisterInfo::getVRegDef(unsigned int) const: Assertion `(I.atEnd() || std::next(I) == def_instr_end()) && "getVRegDef assumes a single definition or no definition"' failed.

    #4  0x00000000050a6d96 in llvm::MachineRegisterInfo::getVRegDef (this=0x74606a0, Reg=2147483650) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/MachineRegisterInfo.cpp:403
    #5  0x00000000066148f6 in llvm::getConstantVRegValWithLookThrough (VReg=2147483650, MRI=..., LookThroughInstrs=false, HandleFConstant=true) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/Utils.cpp:244
    #6  0x00000000066147da in llvm::getConstantVRegVal (VReg=2147483650, MRI=...) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/Utils.cpp:210
    #7  0x0000000006615367 in llvm::ConstantFoldBinOp (Opcode=101, Op1=2147483650, Op2=2147483656, MRI=...) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/Utils.cpp:341
    #8  0x000000000657eee0 in llvm::CSEMIRBuilder::buildInstr (this=0x7465010, Opc=101, DstOps=..., SrcOps=..., Flag=...) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/CSEMIRBuilder.cpp:160
    #9  0x0000000003645958 in llvm::MachineIRBuilder::buildAShr (this=0x7465010, Dst=..., Src0=..., Src1=..., Flags=...) at /home/jayfoad2/git/llvm-project/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h:1298
    #10 0x00000000065c35b1 in llvm::LegalizerHelper::lower (this=0x7fffffffb5f8, MI=..., TypeIdx=0, Ty=...) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2020

because at this point there are two instructions defining Res: the
original G_SMULO/G_UMULO and the new G_MUL that we built. The fix is
to modify the original mul in place, so that there is only ever one
definition of Res.

Reviewers: arsenm, aditya_nandakumar

Subscribers: wdng, rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72842
2020-01-31 19:21:01 +00:00
Jessica Paquette b9bf9305d1 [AArch64][GlobalISel] Walk through G_TRUNC in getTestBitReg
When you encounter a G_TRUNC, you are moving from a larger type to a smaller
type.

Asking for the i-th bit on a larger value is the same as asking for the i-th
bit on a smaller value.

So, we should always be able to walk through G_TRUNC when computing the bit
for a TB(N)Z.
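
A minimal sketch of the reasoning above, with hypothetical helper names and plain integers standing in for the 64-bit source and 32-bit result of a G_TRUNC. It assumes the tested bit index lies within the narrower type, which is the only case a test of the truncated value can express:

```cpp
#include <cstdint>

// Returns bit number Bit of Value (Bit must be less than 64).
bool testBit(uint64_t Value, unsigned Bit) { return (Value >> Bit) & 1u; }

// Truncation keeps the low 32 bits and discards the rest.
uint32_t truncTo32(uint64_t Wide) { return static_cast<uint32_t>(Wide); }
```

For any `Bit` below 32, `testBit(truncTo32(W), Bit)` and `testBit(W, Bit)` agree, so the test can be performed on the wider source directly.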

Differential Revision: https://reviews.llvm.org/D73748
2020-01-31 11:09:55 -08:00
Simon Pilgrim 8fbc7fd567 [DAG] SimplifyMultipleUseDemandedBits - peek through unused ISD::INSERT_SUBVECTOR subvectors
If we don't demand any elements of the inserted subvector then just skip it.
2020-01-31 18:57:22 +00:00
David Blaikie d379253ca1 Orc: Remove an unnecessary explicit scope
(was useful at some point in the past for scoping some error handling
that's since been tidied up a bit)
2020-01-31 10:30:25 -08:00
Fangrui Song 84f0a8626e [yaml2obj] Internalize DocNum. NFC 2020-01-31 10:11:42 -08:00
David Blaikie 53bb183a9d Orc: Remove redundant std::move 2020-01-31 10:11:24 -08:00
Simon Pilgrim 5702dadf6f [DAG] Enable ISD::INSERT_SUBVECTOR SimplifyMultipleUseDemandedBits handling
This allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits to create a simpler ISD::INSERT_SUBVECTOR, which is particularly useful for cases where we're splitting into subvectors anyhow.
2020-01-31 18:02:34 +00:00
alex-t 5df1ac7846 [AMDGPU] fixed divergence driven shift operations selection
Differential Revision: https://reviews.llvm.org/D73483

Reviewers: rampitec
2020-01-31 20:49:56 +03:00
Hiroshi Yamauchi ac8da31a0f [PGO][PGSO] Handle MBFIWrapper
Some code gen passes use MBFIWrapper to keep track of the frequency of new
blocks. This was not taken into account and could lead to incorrect frequencies
as MBFI silently returns zero frequency for unknown/new blocks.

Add a variant for MBFIWrapper in the PGSO query interface.

Depends on D73494.
2020-01-31 09:36:55 -08:00
Jay Foad 2a1b5af299 [GlobalISel] Tidy up unnecessary calls to createGenericVirtualRegister
Summary:
As a side effect some redundant copies of constant values are removed by
CSEMIRBuilder.

Reviewers: aemerson, arsenm, dsanders, aditya_nandakumar

Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, hiraditya, jrtc27, atanasyan, volkan, Petar.Avramovic, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73789
2020-01-31 17:07:16 +00:00
Nathan James f99133e853 - Update .clang-tidy to ignore parameters of main like functions for naming violations in clang and llvm directory
Summary: Every call to a main like function in llvm and clang lib violates the naming convention for parameters. This prevents clang-tidy warning on such breaches.

Reviewers: alexfh, hokein

Reviewed By: hokein

Subscribers: merge_guards_bot, aheejin, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73715
2020-01-31 16:49:45 +00:00
Danilo Carvalho Grael 44a4f5fc6a [AArch64][SVE] Add SVE2 mla unpredicated intrinsics.
Summary:
Add intrinsics for the MLA unpredicated sve2 instructions:
- smlalb, smlalt, umlalb, umlalt, smlslb, smlslt, umlslb, umlslt
- sqdmlalb, sqdmlalt, sqdmlslb, sqdmlslt
- sqdmlalbt, sqdmlslbt

Reviewers: efriedma, sdesmalen, cameron.mcinally, c-rhodes, rengolin, kmclaughlin

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits, amehsan

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73746
2020-01-31 11:39:12 -05:00
Guillaume Chatelet 3c89b75f23 [NFC] Introduce a type to model memory operation
Summary: This is a first step before changing the types to llvm::Align and introduce functions to ease client code.

Reviewers: courbet

Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73785
2020-01-31 17:29:01 +01:00
Matt Arsenault b3726ecea4 AMDGPU: Fix potential use of undefined value 2020-01-31 10:38:58 -05:00
Matt Arsenault 6fb544d1d2 AMDGPU/GlobalISel: Combine FMIN_LEGACY/FMAX_LEGACY
Try out using combine definition rules.

This really should be a post-legalizer combine, but the combiner pass
is currently pre-legalize. Most of the target combines are really
post-legalize, so we should probably move the pass.
2020-01-31 06:58:04 -08:00
Sanjay Patel bc1148e7bc [PATCH] D73727: [SLP] drop poison-generating flags for shuffle reduction ops (PR44536)
We may calculate reassociable math ops in arbitrary order when creating a shuffle reduction,
so there's no guarantee that things like 'nsw' hold on those intermediate values. Drop all
poison-generating flags for safety.
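
To see why a flag like 'nsw' may not survive reordering, consider this sketch. The helper names are hypothetical, and 32-bit signed addition is modelled with a 64-bit intermediate so the check itself never overflows. The same three values sum without signed overflow in one order but overflow in another:

```cpp
#include <cstdint>
#include <limits>

// True if X + Y does not fit in int32_t, i.e. an 'add nsw' of these values
// would produce poison.
bool addOverflows(int32_t X, int32_t Y) {
  int64_t Wide = static_cast<int64_t>(X) + static_cast<int64_t>(Y);
  return Wide > std::numeric_limits<int32_t>::max() ||
         Wide < std::numeric_limits<int32_t>::min();
}

// True if summing A, B, C left to right overflows at either step.
bool chainOverflows(int32_t A, int32_t B, int32_t C) {
  if (addOverflows(A, B))
    return true;
  return addOverflows(A + B, C);
}
```

With `Max` as the largest `int32_t`, `chainOverflows(Max, -1, 1)` is false while `chainOverflows(Max, 1, -1)` is true, so an intermediate value created by a reordered reduction cannot safely inherit 'nsw'.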

This change is limited to shuffle reductions because I don't think we have a problem in the
general case (where we intersect flags of each scalar op that goes into a vector op), but if
there's evidence of other cases being wrong, we can extend this fix to cover those cases.

https://bugs.llvm.org/show_bug.cgi?id=44536

Differential Revision: https://reviews.llvm.org/D73727
2020-01-31 09:54:35 -05:00
Matt Arsenault 49e424e08e AMDGPU/GlobalISel: Select global MUBUF atomicrmw 2020-01-31 06:05:41 -08:00
Matt Arsenault 0426c2d07d Reapply "AMDGPU: Cleanup and fix SMRD offset handling"
This reverts commit 6a4acb9d80.
2020-01-31 06:01:28 -08:00
Jay Foad 31e29d4afe AMDGPU/GlobalISel: Make use of MachineIRBuilder helper functions. NFC. 2020-01-31 13:53:39 +00:00
serge-sans-paille fd09f12f32 Implement -fsemantic-interposition
First attempt at implementing -fsemantic-interposition.

Rely on GlobalValue::isInterposable that already captures most of the expected
behavior.

Rely on a ModuleFlag to state whether we should respect SemanticInterposition or
not. The default remains no.

So this should be a no-op if -fsemantic-interposition isn't used, and if it is,
most optimisations already use isInterposable, so they should honor it
properly.

Note that it only impacts architecture compiled with -fPIC and no pie.

Differential Revision: https://reviews.llvm.org/D72829
2020-01-31 14:02:33 +01:00
Sjoerd Meijer 24f0b6b6d8 [llvm-objdump] avoid crash disassembling unknown instruction
Disassembly of instructions can fail when llvm-objdump is not given the right set of
architecture features, for example when the source is compiled with:

  clang -march=..+ext1+ext2

and disassembly is attempted with:

  llvm-objdump -mattr=+ext1

This patch avoids further analysing unknown instructions (as was happening
before) when disassembly has failed.

Differential Revision: https://reviews.llvm.org/D73531
2020-01-31 12:41:31 +00:00
Kerry McLaughlin 69558c8487 [AArch64][SVE] Add remaining SVE2 intrinsics for uniform DSP operations
Summary:
Implements the following intrinsics:

 - @llvm.aarch64.sve.[s|u]qadd
 - @llvm.aarch64.sve.[s|u]qsub
 - @llvm.aarch64.sve.suqadd
 - @llvm.aarch64.sve.usqadd
 - @llvm.aarch64.sve.[s|u]qsubr
 - @llvm.aarch64.sve.[s|u]rshl
 - @llvm.aarch64.sve.[s|u]qshl
 - @llvm.aarch64.sve.[s|u]qrshl
 - @llvm.aarch64.sve.[s|u]rshr
 - @llvm.aarch64.sve.sqshlu
 - @llvm.aarch64.sve.sri
 - @llvm.aarch64.sve.sli
 - @llvm.aarch64.sve.[s|u]sra
 - @llvm.aarch64.sve.[s|u]rsra
 - @llvm.aarch64.sve.[s|u]aba

Reviewers: efriedma, sdesmalen, dancgr, cameron.mcinally, c-rhodes, rengolin

Reviewed By: sdesmalen

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73551
2020-01-31 10:51:57 +00:00
Sam Parker e014de3a16 [NFC][ARM] Add test 2020-01-31 10:32:15 +00:00
Georgii Rymar 0654005ab2 [llvm-readobj] - Don't crash when dumping invalid dynamic relocation.
Currently when we dump dynamic relocation with use of
DT_RELA/DT_RELASZ/DT_RELAENT tags, we crash when a symbol index
is larger than the number of dynamic symbols or
when there is no dynamic symbol table.

This patch adds test cases and fixes the issues.

Differential revision: https://reviews.llvm.org/D73560
2020-01-31 13:20:51 +03:00
Georgii Rymar cf6037b561 [llvm-readobj][test] - Cleanup testing of the --sections command line option.
We have the `ELF\sections.test` to test --sections.

`ELF\sections.test` uses precompiled objects, it has a bug (does not test -s alias properly).
Also, we test machine specific section types in `ELF\machine-specific-section-types.test`,
so we probably do not need to test `--sections` for a MIPS object in `ELF\sections.test`.
I think it is enough to test ELF32 and ELF64 (we do not test ELF64 in this test).

`Object/readobj-shared-object.test` also tests how llvm-readobj handles
`--sections`. Its location is wrong, it is not complete, it uses precompiled binaries
and it duplicates the `ELF\sections.test` partially (it tests both ELF32 and ELF64).

We have `ELF\readelf-s-alias.test` that tests the `-s` alias for `--sections` in llvm-readobj
and `-s` as an alias for `--symbols` in llvm-readelf.
There is no need to have a separate test for such things.
The test for the `-s` alias for `--sections` can be included into the `ELF\sections.test`.
And the test for `-s` for llvm-readelf is already included into `ELF\symbols.test`.

So, this patch:
1) Removes `Object/readobj-shared-object.test`.
2) Removes `ELF\readelf-s-alias.test`
3) Rewrites the `ELF\sections.test`.
4) Removes ELF/Inputs/trivial.obj.elf-mipsel.

Differential revision: https://reviews.llvm.org/D73686
2020-01-31 12:58:12 +03:00
Markus Böck 3f6a2f1ec5 [Support] Wrap extern TLS variable in getter function
This patch wraps an external thread local storage variable inside of a
getter function and makes it have internal linkage. This allows LLVM to
be built with BUILD_SHARED_LIBS on windows with MinGW. Additionally it
allows Clang versions prior to 10 to compile current trunk for MinGW.
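
The general pattern can be sketched as follows. The variable and function names are hypothetical and do not reflect the names used in the patch; the sketch only shows the shape of the change, where a thread-local variable is kept private to one translation unit and reached through an ordinary function:

```cpp
// Internal linkage: the thread-local variable is not exported from this
// translation unit, so other modules never reference its symbol directly.
static thread_local int CurrentDepth = 0; // hypothetical variable

// Exported accessor; callers go through a normal function call instead of
// referring to an extern thread_local declared in a header.
int &getCurrentDepth() { return CurrentDepth; }
```

Callers read and write through the returned reference, for example `getCurrentDepth() = 3;`, and each thread still sees its own copy.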

Differential Revision: https://reviews.llvm.org/D73639
2020-01-31 11:32:36 +02:00
LLVM GN Syncbot bf8357d420 [gn build] Port 16a0313ee3 2020-01-31 09:18:27 +00:00
Igor Kudrin 16a0313ee3 [DWARF] Add support for 64-bit DWARF in .debug_names.
Differential Revision: https://reviews.llvm.org/D72900
2020-01-31 16:12:35 +07:00
Sebastian Neubauer adc0217416 Fix typo
Summary: Fix typo

Subscribers: jvesely, nhaehnle, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73458
2020-01-31 08:48:22 +01:00
Jonas Devlieghere a5f479473b [SmallString] Use data() instead of begin() (NFC)
Both begin() and data() do the same thing for the SmallString case, but
the std::string and llvm::StringRef constructors that are being called
are defined as taking a pointer and size.

Addresses Craig Topper's feedback in https://reviews.llvm.org/D73640
2020-01-30 20:15:38 -08:00
Quentin Colombet cfebd77742 [GISel][KnownBits] Fix a bug where we could run out of stack space
One of the exit criteria of computeKnownBits is whether we reach the max
recursive call depth. Before this patch we would check that the
depth is exactly equal to max depth to exit.

Depth may get bigger than max depth if it gets passed to a different
GISelKnownBits object.
This may happen when say a generic part uses a GISelKnownBits object
with some max depth, but then we hit TL.computeKnownBitsForTargetInstr
which creates a new GISelKnownBits object with a different and smaller
depth. In that situation, when we hit the max depth check for the first
time in the target specific GISelKnownBits object, depth may already
be bigger than the current max depth. Hence we would continue to compute
the known bits, until we ran through the full depth of the chain of
computation or ran out of stack space.

For instance, let's say we have:
GISelKnownBits Info(/*MaxDepth*/ = 10);
Info.getKnownBits(Foo)
// 9 recursive calls to computeKnownBitsImpl.
// Then we hit a target specific instruction.
// The target specific GISelKnownBits does this:
  GISelKnownBits TargetSpecificInfo(/*MaxDepth*/ = 6)
  TargetSpecificInfo.computeKnownBitsImpl() // <-- next max depth checks would
                                            // always return false.

This commit does not have any test case, none of the in-tree targets
use computeKnownBitsForTargetInstr.
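
The failure mode can be reproduced with a small standalone sketch. `stepsTaken` is a hypothetical function, not part of GISelKnownBits, and the `>=` comparison is an assumption about the natural fix rather than a quote from the patch. It counts how many recursive steps run when the depth check uses exact equality versus greater-or-equal:

```cpp
// Walks a chain of Remaining dependent values and returns how many steps were
// taken before the depth check (or the end of the chain) stopped the walk.
unsigned stepsTaken(unsigned Depth, unsigned MaxDepth, unsigned Remaining,
                    bool UseGreaterEqual) {
  bool AtLimit = UseGreaterEqual ? Depth >= MaxDepth : Depth == MaxDepth;
  if (AtLimit || Remaining == 0)
    return 0;
  return 1 + stepsTaken(Depth + 1, MaxDepth, Remaining - 1, UseGreaterEqual);
}
```

Entering with a depth of 9 and a max depth of 6, the equality check never fires and the walk runs the whole chain, whereas the greater-or-equal check stops immediately. When the walk starts at depth 0, both checks stop after 6 steps.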
2020-01-30 19:30:39 -08:00
Fangrui Song 200ac6c3d8 [llvm-objcopy][test] Fix tests when path contains "bar"
Differential Revision: https://reviews.llvm.org/D72358
2020-01-30 17:56:12 -08:00
Fangrui Song 5b22bcc2b7 [X86][ELF] Prefer to lower MC_GlobalAddress operands to .Lfoo$local
For a MC_GlobalAddress reference to a dso_local external GlobalValue with a definition, emit .Lfoo$local to avoid a relocation.

-fno-pic and -fpie can infer dso_local but -fpic cannot.  In the future,
we can explore the possibility of inferring dso_local with -fpic. As the
description of D73228 says, LLVM's existing IPO optimization behaviors
(like -fno-semantic-interposition) and a previous assembly behavior give
us enough license to be aggressive here.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D73230
2020-01-30 17:52:35 -08:00
Leonard Chan 2d3174c4df [SafeStack][DebugInfo] Insert DW_OP_deref in correct location
This patch addresses the issue found in https://bugs.llvm.org/show_bug.cgi?id=44585
where a DW_OP_deref was placed at the end of a dwarf expression, resulting in corrupt
symbols when debugging.

This is an attempt to reland with a few fixes for buildbot since I
haven't merged from master in a bit.

Differential Revision: https://reviews.llvm.org/D73526
2020-01-30 17:09:42 -08:00
Amara Emerson 84bd851108 [GlobalISel][IRTranslator] When translating vector geps, splat the base pointer if required.
We can have geps that have a scalar base pointer, and a vector index value, which
means that the base pointer must be splatted into a vector of pointers.
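
As an illustration of the splat, here is a sketch that models pointers as 64-bit integers. `splat` and `vectorGep` are hypothetical helpers, not IRTranslator code, and the element size is passed explicitly. The scalar base is first broadcast to every lane so that the address computation can proceed lane by lane:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Broadcasts one scalar into every lane of an N-element vector.
template <std::size_t N>
std::array<uint64_t, N> splat(uint64_t Scalar) {
  std::array<uint64_t, N> Result;
  Result.fill(Scalar);
  return Result;
}

// Lane-wise address computation: Bases[I] + Indices[I] * ElementSize.
template <std::size_t N>
std::array<uint64_t, N> vectorGep(const std::array<uint64_t, N> &Bases,
                                  const std::array<int64_t, N> &Indices,
                                  uint64_t ElementSize) {
  std::array<uint64_t, N> Result;
  for (std::size_t I = 0; I < N; ++I)
    Result[I] = Bases[I] + static_cast<uint64_t>(Indices[I]) * ElementSize;
  return Result;
}
```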

This fixes crashes on arm64 GlobalISel with optimizations enabled.
2020-01-30 16:27:27 -08:00
Leonard Chan 3b23453b6c Revert "[SafeStack][DebugInfo] Insert DW_OP_deref in correct location"
This reverts commit fff6a1b0f1.

This was breaking a bunch of buildbots.
2020-01-30 16:18:41 -08:00
Mehdi Amini 5f940220bf MSVC Buggy version detection: turn pre-processor error into CMake configuration time check
This allows consumers to override the check in a cleaner way while still
preventing them from hitting the bug without knowing they run an unsupported
configuration.

Recommit after fix by Christopher Tetreault to add parens and ${} to
cmake check to work around CMake configure time "unknown arguments
specified" issue

Differential Revision: https://reviews.llvm.org/D73677
Differential Revision: https://reviews.llvm.org/D73751
2020-01-31 00:11:55 +00:00
Leonard Chan fff6a1b0f1 [SafeStack][DebugInfo] Insert DW_OP_deref in correct location
This patch addresses the issue found in https://bugs.llvm.org/show_bug.cgi?id=44585
where a DW_OP_deref was placed at the end of a dwarf expression, resulting in
corrupt symbols when debugging.

Differential Revision: https://reviews.llvm.org/D73526
2020-01-30 15:58:37 -08:00
Matt Arsenault 6a4acb9d80 Revert "AMDGPU: Cleanup and fix SMRD offset handling"
This reverts commit 17dbc6611d.

A test is failing on some bots
2020-01-30 15:39:51 -08:00
Mehdi Amini 1e417ba2d4 Revert "MSVC Buggy version detection: turn pre-processor error into CMake configuration time check"
This reverts commit b4fac78246.
It broke the MSVC bot
2020-01-30 23:38:36 +00:00
Matt Arsenault 17dbc6611d AMDGPU: Cleanup and fix SMRD offset handling
I believe this also fixes bugs with CI 32-bit handling, which was
incorrectly skipping offsets that look like signed 32-bit values. Also
validate the offsets are dword aligned before folding.
2020-01-30 15:04:21 -08:00
Matt Arsenault eb7f74e300 CodeGen: Use Register 2020-01-30 15:01:56 -08:00
Jessica Paquette c8c987d310 [AArch64][GlobalISel] Fold in G_ANYEXT/G_ZEXT into TB(N)Z
This is similar to the code in getTestBitOperand in AArch64ISelLowering. Instead
of implementing all of the TB(N)Z optimizations at once, this patch implements
the simplest case first. The way that this is set up should make it fairly easy
to add the rest as we go along.

The idea here is that after determining that we can use a TB(N)Z, we can
continue looking through instructions and perform further folding.

In this case, when we have a G_ZEXT or G_ANYEXT where the extended bits are not
used, we can fold it into the TB(N)Z.
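
The reasoning behind this fold can be sketched in plain C++ rather than in terms of the GlobalISel API. The helper names below (`testBit`, `anyExt8`, `canLookThroughExt`) are hypothetical; the sketch only illustrates that testing a bit below the source width gives the same answer whatever the extended bits contain.

```cpp
#include <cstdint>

// Returns the value of bit `Bit` in `Value` (Bit must be < 64).
inline bool testBit(std::uint64_t Value, unsigned Bit) {
  return (Value >> Bit) & 1u;
}

// Models an any-extend from 8 bits: the upper bits are unspecified, which is
// simulated here by filling them from an arbitrary `Junk` value.
inline std::uint64_t anyExt8(std::uint8_t V, std::uint64_t Junk) {
  return (Junk << 8) | V;
}

// The extension can be looked through only if the tested bit lies within the
// original (unextended) value.
inline bool canLookThroughExt(unsigned Bit, unsigned SrcBits) {
  return Bit < SrcBits;
}
```

For any bit index below 8, `testBit(anyExt8(V, Junk), Bit)` equals `testBit(V, Bit)` regardless of `Junk`, which is why the extension does not affect the test.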

Differential Revision: https://reviews.llvm.org/D73673
2020-01-30 14:51:26 -08:00
Amara Emerson 6170272ab9 [AArch64][GlobalISel] Disallow vectors in convertPtrAddToAdd.
Found by inspection, but there's no test for this yet because G_PTR_ADD is
currently illegal for vectors. I'll add the test at a later time when the
legalizer support has landed.
2020-01-30 14:50:44 -08:00
Nikita Popov 480391035c [InstCombine] Remove unnecessary worklist add; NFCI
Again, this will already be added by IRBuilder.
2020-01-30 23:24:59 +01:00
David Tenty 809c872aae [NFC] Fix check prefix added in fcanonicalize-elimination.ll
The test fix added by "D39306: Fix
CodeGen/AMDGPU/fcanonicalize-elimination.ll on FreeBSD 11.0" uses a test
prefix which is not actually used in the FileCheck stanza. Thus the
problem originally encountered still exists and the test fails for host
triples that contain "1.0", including AIX 7.1.0.
2020-01-30 17:19:49 -05:00
Mehdi Amini b4fac78246 MSVC Buggy version detection: turn pre-processor error into CMake configuration time check
This allows consumers to override the check in a cleaner way while still
preventing them from hitting the bug without knowing they are running an
unsupported configuration.

Differential Revision: https://reviews.llvm.org/D73677
2020-01-30 22:17:21 +00:00
Matt Arsenault f7521dc292 AMDGPU: Replace subtarget check with an assert
This is already checked by the pattern subtarget predicate.
2020-01-30 14:15:26 -08:00
Matt Arsenault 97a1d4bc02 AMDGPU: Don't use separate cache arguments for s_buffer_load node
There's not much value to this separate node from the intrinsic. Make
the operand structure the same as the intrinsic, so we can reuse the
same pattern for GlobalISel.
2020-01-30 14:15:26 -08:00
Nikita Popov 90b5ed996b [InstCombine] Remove unnecessary worklist add; NFCI
The IRBuilder will automatically add instructions to the worklist.
Adding it manually is unnecessary, but may mess up worklist order.
2020-01-30 23:06:28 +01:00
Nikita Popov cad91074a6 [InstCombine] Create new insts in foldICmpEqIntrinsicWithConstant; NFCI
In line with current conventions, create new instructions rather
than modifying two operands in place and performing manual worklist
management.

This should be NFC apart from possible worklist order changes.
2020-01-30 23:03:16 +01:00
hsmahesha 1d9e08ec35 [AMDGPU] Add file headers to a few files where they are missing.
Summary:
Added file headers to the files that implement the iterative lightweight
scheduling strategies. This is basically an exercise I undertook in order to
get used to the LLVM development process.

Reviewers: arsenm, vpykhtin, cdevadas

Reviewed By: vpykhtin

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, javed.absar, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73417
2020-01-31 02:06:41 +05:30
Sean Fertile 8b737688c2 [AIX] Minor cleanup in AsmPrinter. [NFC]
- Extends the comments related to function descriptors, noting how they
are only used on AIX.

- Changes the condition used to gate the creation of the current function
symbol in AsmPrinter::SetupMachineFunction to reflect being AIX
specific. The creation of the symbol is different because of AIX's
linkage conventions, not because AIX uses function descriptors.

Differential Revision: https://reviews.llvm.org/D73115
2020-01-30 14:15:02 -05:00
Fangrui Song 06b8e32d4f [AArch64] -fpatchable-function-entry=N,0: place patch label after BTI
Summary:
For -fpatchable-function-entry=N,0 -mbranch-protection=bti, after
9a24488cb6, we place the NOP sled after
the initial BTI.

```
.Lfunc_begin0:
bti c
nop
nop

.section __patchable_function_entries,"awo",@progbits,f,unique,0
.p2align 3
.xword .Lfunc_begin0
```

This patch adds a label after the initial BTI and changes the __patchable_function_entries entry to reference the label:

```
.Lfunc_begin0:
bti c
.Lpatch0:
nop
nop

.section __patchable_function_entries,"awo",@progbits,f,unique,0
.p2align 3
.xword .Lpatch0
```

This placement is compatible with the resolution in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424 .

A local linkage function whose address is not taken does not need a BTI.
Placing the patch label after BTI has the advantage that code does not
need to differentiate whether the function has an initial BTI.

Reviewers: mrutland, nickdesaulniers, nsz, ostannard

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73680
2020-01-30 11:11:52 -08:00
Huihui Zhang b0d25fff9b [ConstantFold][SVE][NFC] Add test for select instruction in scalable vector.
Side notes from D73669, no need to guard the iteration on vectors, as
it is explicitly looking for a ConstantVector/ConstantDataVector, which
is not expected to be scalable at the moment. So, add the test only.
2020-01-30 10:56:12 -08:00
Huihui Zhang 34e6552dcb [ConstantFold][SVE] Fix constant folding for scalable vector unary operations.
Summary:
Similar to issue D71445. Scalable vectors should not be evaluated element by element.
Add support to handle scalable vector UndefValue.

Reviewers: sdesmalen, efriedma, apazos, huntergr, willlovett

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73678
2020-01-30 10:45:15 -08:00
Danilo Carvalho Grael 0610637aac [AArch64][SVE] Add remaining SVE2 mla indexed intrinsics.
Summary:
Add remaining SVE2 mla indexed intrinsics:
- sqdmlalb, sqdmlalt, sqdmlslb, sqdmlslt

Add suffix _lanes and switch immediate types to i32 for all mla indexed intrinsics to align with ACLE builtin definitions.

Reviewers: efriedma, sdesmalen, cameron.mcinally, c-rhodes, rengolin, kmclaughlin

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, arphaman, psnobl, llvm-commits, amehsan

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73633
2020-01-30 13:32:11 -05:00
Teresa Johnson c45bb326a6 [ThinLTO] Disable "Always import constants" due to compile time issues
Summary:
Disable the always importing of constants introduced in D70404 by
default under a new internal option, since it is causing order of
magnitude compile time regressions during the thin link. Will continue
investigating why the regressions occur.

Reviewers: evgeny777, wmi

Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73724
2020-01-30 10:12:48 -08:00
Whitney Tsang e44f4a8a54 [LoopFusion] Move instructions from FC1.GuardBlock to FC0.GuardBlock and
from FC0.ExitBlock to FC1.ExitBlock when proven safe.

Summary:
Currently LoopFusion gives up when the second loop nest's guard
block or the first loop nest's exit block is not empty. For example:

if (0 < N) {
  for (int i = 0; i < N; ++i) {}
  x+=1;
}
y+=1;
if (0 < N) {
  for (int i = 0; i < N; ++i) {}
}
The above example should be safe to fuse.
This PR moves instructions in the FC1 guard block (e.g. y+=1;) to the
FC0 guard block, or instructions in the FC0 exit block (e.g. x+=1;) to the
FC1 exit block, after which LoopFusion is able to fuse the loops.
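
As a rough source-level illustration (the pass itself works on LLVM IR, not C++), the sketch below compares the example above with one possible outcome after both code motions and the fusion. The `State` struct and the function names are hypothetical and exist only to make the comparison checkable.

```cpp
// Hypothetical container for the two variables touched by the example.
struct State {
  int x;
  int y;
};

// The example as written: x+=1 sits in the first loop's exit block and
// y+=1 sits in the second loop's guard block.
inline State original(int N, State S) {
  if (0 < N) {
    for (int i = 0; i < N; ++i) {}
    S.x += 1;
  }
  S.y += 1;
  if (0 < N) {
    for (int i = 0; i < N; ++i) {}
  }
  return S;
}

// One possible result: y+=1 hoisted to the first guard block, x+=1 sunk to
// the second exit block, and the two guarded loops merged into one.
inline State afterMotionAndFusion(int N, State S) {
  S.y += 1;
  if (0 < N) {
    for (int i = 0; i < N; ++i) {}
    S.x += 1;
  }
  return S;
}
```

Both functions return the same `x` and `y` for every `N`, which is the sense in which the motion is safe for this example.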
Reviewer: kbarton, jdoerfert, Meinersbur, dmgreen, fhahn, hfinkel,
bmahjour, etiotto
Reviewed By: jdoerfert
Subscribers: hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D73641
2020-01-30 18:02:22 +00:00
Nikita Popov 70d345e687 [AArch64][ARM] Always expand ordered vector reductions (PR44600)
fadd/fmul reductions without reassoc are lowered to
VECREDUCE_STRICT_FADD/FMUL nodes, which don't have legalization
support. Until that is in place, expand these intrinsics on
ARM and AArch64. Other targets always expand the vector reduction
intrinsics.

Additionally expand fmax/fmin reductions without nonan flag on
AArch64, as the backend asserts that the flag is present when
lowering VECREDUCE_FMIN/FMAX.

This fixes https://bugs.llvm.org/show_bug.cgi?id=44600.

Differential Revision: https://reviews.llvm.org/D73135
2020-01-30 18:40:24 +01:00
Roman Lebedev 8d2e9bca7e [NFC][IndVarSimplify] Autogenerate exit_value_test2.ll check lines 2020-01-30 20:11:02 +03:00
Yonghong Song 795bbb3662 [BPF] fix a bug in BPFMISimplifyPatchable pass with -O0
The recommended optimization level for BPF programs
is -O2 because (1) BPF runs inside the kernel and the
Linux kernel won't work at -O0, and (2) the verifier
is not able to handle -O0 code properly, e.g., potentially
large stack sizes and a lot of spills.

But we should at least keep -O0 compiling.
This patch fixes a bug in the BPFMISimplifyPatchable pass
where, with -O0, a segmentation fault happens for a
simple program like:
  int test(int a, int b) { return a + b; }

A test case is added to capture such a case.

Differential Revision: https://reviews.llvm.org/D73681
2020-01-30 08:28:39 -08:00
jasonliu 3bbe7a681e [XCOFF][AIX] Support basic relocation type on AIX
Summary:

This patch intends to support the three most common relocation types
on AIX: R_POS, R_TOC, R_RBR.
These three relocation types will be needed for object file generation
on AIX for the small code model.
We will have follow-up patches to bring relocation support for the
large code model on AIX.

Reviewers: hubert.reinterpretcast, daltenty, DiggerLin

Differential Revision: https://reviews.llvm.org/D72027
2020-01-30 15:59:09 +00:00
LLVM GN Syncbot 8bb9642fd7 [gn build] Port 601687bf73 2020-01-30 15:06:10 +00:00
Alex Richardson 523896f64a Bring back the tests for update_cc_tests_checks.py
The tests were removed in 287307a0c6 to
avoid a dependency on python3. update_cc_tests_checks.py also works with
python2 so restore the tests without the python3 dependency.
2020-01-30 14:58:25 +00:00
Stefan Pintilie 9de1241bb2 [PowerPC][Future] Branch Distance Estimation For Prefixed Instructions
By adding the prefixed instructions the branch distances are no longer
computed correctly. Since prefixed instructions cannot cross a 64 byte
boundary we have to assume that a prefixed instruction may have a nop
prepended to it. This patch tries to take that nop into consideration
when computing the size of basic blocks.
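
The size estimate described above can be sketched as follows. This is not the code from the patch: the constants and function names are hypothetical, and the sketch assumes an 8-byte prefixed instruction, a 4-byte nop, and instruction offsets that are multiples of 4.

```cpp
#include <cstdint>

// Assumed sizes, in bytes (labelled assumptions, not taken from the patch).
constexpr std::uint64_t BoundaryBytes = 64;
constexpr std::uint64_t PrefixedInstrBytes = 8;
constexpr std::uint64_t NopBytes = 4;

// True if an 8-byte instruction starting at Offset would straddle a
// 64-byte boundary.
constexpr bool crossesBoundary(std::uint64_t Offset) {
  return Offset / BoundaryBytes !=
         (Offset + PrefixedInstrBytes - 1) / BoundaryBytes;
}

// Size of the prefixed instruction when its offset is known: a nop is
// prepended only if the instruction would otherwise cross the boundary.
constexpr std::uint64_t sizeAtOffset(std::uint64_t Offset) {
  return crossesBoundary(Offset) ? NopBytes + PrefixedInstrBytes
                                 : PrefixedInstrBytes;
}

// Conservative size when the final offset is not yet known.
constexpr std::uint64_t worstCaseSize() {
  return NopBytes + PrefixedInstrBytes;
}
```

When offsets are not final, an estimate has to use the conservative figure, which is why a basic block containing prefixed instructions may be assumed larger than its encoded size.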

Differential Revision: https://reviews.llvm.org/D72572
2020-01-30 08:54:33 -06:00