Commit Graph

136235 Commits

Author SHA1 Message Date
Eli Friedman b5740105d2 [BitcodeReader] Fix DelayedShuffle handling for ConstantExpr shuffles.
The indexing was messed up, so the result was completely broken.

Shuffle constant exprs are rare in practice; without vscale types,
constant folding generally elminates them. So sort of hard to trip over.

Fixes regression from D72467.

Differential Revision: https://reviews.llvm.org/D80330
2020-06-23 19:50:30 -07:00
Amara Emerson fceadbcb33 [AArch64][GlobalISel] Improve codegen for some constant vectors by using constant pool loads.
There's more smarts in AArch64ISelLowering that we don't have yet, but this
change incrementally improves some of the more common patterns. I think future
iterations will want to use some combination of PostLegalizerCombiner and the
selector to catch the other cases.

Differential Revision: https://reviews.llvm.org/D82340
2020-06-23 19:23:47 -07:00
Eli Friedman a2caa3b614 Remove GlobalValue::getAlignment().
This function is deceptive at best: it doesn't return what you'd expect.
If you have an arbitrary GlobalValue and you want to determine the
alignment of that pointer, Value::getPointerAlignment() returns the
correct value.  If you want the actual declared alignment of a function
or variable, GlobalObject::getAlignment() returns that.

This patch switches all the users of GlobalValue::getAlignment to an
appropriate alternative.

Differential Revision: https://reviews.llvm.org/D80368
2020-06-23 19:13:42 -07:00
Vedant Kumar f8bd6a75ed [SimplifyCFG] Drop debug loc in SpeculativelyExecuteBB
Summary:
According to HowToUpdateDebugInfo.rst:

```
Preserving the debug locations of speculated instructions can make
it seem like a condition is true when it's not (or vice versa), which
leads to a confusing single-stepping experience
```

This patch follows the recommendation to drop debug locations on
speculated instructions.

Reviewers: aprantl, davide

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82420
2020-06-23 18:25:52 -07:00
Matt Arsenault a162048a47 AMDGPU/GlobalISel: Fix fixed ABI special VGPR function arguments
I forgot to copy the new fixed function ABI into GlobalISel, so this
was mismatched with the DAG compiled calling function. This was
allocating part of the argument list to v31, which was supposed to be
reserved for the workitem IDs.
2020-06-23 21:21:35 -04:00
Eli Friedman e9d4e34ab8 [AArch64][SVE] Add legalization support for i32/i64 vector srem/urem
Implement them on top of sdiv/udiv, similar to what we do for integer
types.

Potential future work: implementing i8/i16 srem/urem, optimizations for
constant divisors, optimizing the mul+sub to mls.

Differential Revision: https://reviews.llvm.org/D81511
2020-06-23 16:27:52 -07:00
Eli Friedman 90ad786947 [IR] Prefer scalar type for struct indexes in GEP constant expressions.
This has two advantages: one, it's simpler, and two, it doesn't require
heroic pattern matching with scalable vectors.

Also includes a small fix to DataLayout to allow the scalable vector
testcase to work correctly.

Differential Revision: https://reviews.llvm.org/D82061
2020-06-23 16:14:36 -07:00
Sam Clegg e49584a34a [WebAssembly] Fix for use of uninitialized member in WasmObjectWriter.cpp
Currently, section indices may be passed uninitialized by value if
writing the section fails. Removes section indices form class
initialization and returns them from the write{Code,Data}Section
function calls instead.

Patch by Gui Andrade!

Differential Revision: https://reviews.llvm.org/D81702
2020-06-23 15:26:18 -07:00
David Green d604cc6e9a [ARM] Mark more integer instructions as not having side effects.
LDRD and STRD along with UBFX and SBFX are selected from DAGToDAG
transforms, so do not have tblgen patterns. They don't get marked as
having side effects so cannot be scheduled as efficiently as you would
like.

This specifically marks then as not having side effects.

Differential Revision: https://reviews.llvm.org/D82358
2020-06-23 22:45:51 +01:00
Christopher Tetreault 433c9adf7b [SVE] Remove calls to VectorType::getNumElements from AsmParser
Reviewers: efriedma, RKSimon, c-rhodes, fpetrogalli

Reviewed By: fpetrogalli

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82208
2020-06-23 14:31:49 -07:00
Zequan Wu 6a822e20ce [ASan][MSan] Remove EmptyAsm and set the CallInst to nomerge to avoid from merging.
Summary: `nomerge` attribute was added at D78659. So, we can remove the EmptyAsm workaround in ASan the MSan and use this attribute.

Reviewers: vitalybuka

Reviewed By: vitalybuka

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82322
2020-06-23 14:22:53 -07:00
Ryan Santhiraraja f64dc4e686 Preserve GlobalsAA analysis result in InjectTLIMappings
InjectTLIMappings fails to preserve the analysis result of GlobalsAA. Not preserving the analysis might affect benchmark performance. This change fixes this issue.

Patch by: Ryan Santhiraraja <rsanthir@quicinc.com>

Reviewers: fpetrogalli, joerg, fhahn

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D82343
2020-06-23 22:05:42 +01:00
Nikita Popov 6904c7129b [IR] Remove MSVC warning workaround (NFC)
While LLVM does fold this to x+1, GCC does not. As this is hot
code, let's try to avoid that.

According to
https://developercommunity.visualstudio.com/content/problem/211134/unsigned-integer-overflows-in-constexpr-functionsa.html
this spurious warning in MSVC has been fixed in Visual Studio 2019
Version 16.4. Let's see if there are any build bots running old
MSVC versions with warnings treated as errors...
2020-06-23 22:33:57 +02:00
Christopher Tetreault e6d8636935 [SVE] Remove calls to VectorType::getNumElements from Bitcode
Reviewers: efriedma, evgeny777, tejohnson, david-arm, kmclaughlin

Reviewed By: david-arm

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82209
2020-06-23 13:21:40 -07:00
Nikita Popov 52e86797ba [IR] Remove unnecessary uint64_t casts (NFC)
As pointed out by foad, it's not necessary to work on uint64_t
here. The values used here fit uint8_t.
2020-06-23 22:20:15 +02:00
Florian Hahn ff4de8683a [DSE,MSSA] Treat `store 0` after calloc as noop stores.
This patch extends storeIsNoop to also detect stores of 0 to an calloced
object. This basically ports the logic from legacy DSE to the MemorySSA
backed version.

It triggers in a few cases on MultiSource, SPEC2000, SPEC2006 with -O3
LTO:

Same hash: 218 (filtered out)
Remaining: 19
Metric: dse.NumNoopStores

Program                                        base   patch2 diff
 test-suite...CFP2000/177.mesa/177.mesa.test     1.00  15.00 1400.0%
 test-suite...6/482.sphinx3/482.sphinx3.test     1.00  14.00 1300.0%
 test-suite...lications/ClamAV/clamscan.test     2.00  28.00 1300.0%
 test-suite...CFP2006/433.milc/433.milc.test     1.00   8.00 700.0%
 test-suite...pplications/oggenc/oggenc.test     2.00   9.00 350.0%
 test-suite.../CINT2000/176.gcc/176.gcc.test     6.00   6.00  0.0%
 test-suite.../CINT2006/403.gcc/403.gcc.test    NaN   137.00  nan%
 test-suite...libquantum/462.libquantum.test    NaN     3.00  nan%
 test-suite...6/464.h264ref/464.h264ref.test    NaN     7.00  nan%
 test-suite...decode/alacconvert-decode.test    NaN     2.00  nan%
 test-suite...encode/alacconvert-encode.test    NaN     2.00  nan%
 test-suite...ications/JM/ldecod/ldecod.test    NaN     9.00  nan%
 test-suite...ications/JM/lencod/lencod.test    NaN    39.00  nan%
 test-suite.../Applications/lemon/lemon.test    NaN     2.00  nan%
 test-suite...pplications/treecc/treecc.test    NaN     4.00  nan%
 test-suite...hmarks/McCat/08-main/main.test    NaN     4.00  nan%
 test-suite...nsumer-lame/consumer-lame.test    NaN     3.00  nan%
 test-suite.../Prolangs-C/bison/mybison.test    NaN     1.00  nan%
 test-suite...arks/mafft/pairlocalalign.test    NaN    30.00  nan%

Reviewers: efriedma, zoecarver, asbirlea

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D82204
2020-06-23 21:01:39 +01:00
Your Name cc9d693856 [AMDGPU/MemOpsCluster] Implement new heuristic for computing max mem ops cluster size
Summary:
Make use of both the - (1) clustered bytes and (2) cluster length, to decide on
the max number of mem ops that can be clustered. On an average, when loads
are dword or smaller, consider `5` as max threshold, otherwise `4`. This
heuristic is purely based on different experimentation conducted, and there is
no analytical logic here.

Reviewers: foad, rampitec, arsenm, vpykhtin

Reviewed By: rampitec

Subscribers: llvm-commits, kerbowa, hiraditya, t-tye, Anastasia, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, kzhuravl, thakis

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82393
2020-06-24 00:39:41 +05:30
Christopher Tetreault 4d1fd33561 [SVE] Remove calls to VectorType::getNumElements from FuzzMutate
Reviewers: efriedma, bkramer, kmclaughlin, sdesmalen

Reviewed By: sdesmalen

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82212
2020-06-23 11:02:20 -07:00
Simon Pilgrim e7e204a373 [X86][AVX] Attempt to lower v16i32/v16f32 shuffles with lowerShuffleAsRepeatedMaskAndLanePermute
Avoids prematurely creating permps/permd variable shuffles.

Fixes PR46249
2020-06-23 18:33:50 +01:00
Simon Pilgrim ddc6ec9470 WithColor.h - reduce CommandLine.h include to forward declaration. NFC.
WithColor.h is one of the most common headers, we can severely reduce its frontend impact (in ClangBuildAnalyzer reports) by removing the bulky CommandLine.h include, forward declaring llvm:🆑:OptionCategory and just including raw_ostream.h instead.
2020-06-23 17:07:53 +01:00
Xing GUO 45fa936855 [ObjectYAML][DWARF] Remove unused context. NFC.
The context is unused. This patch helps remove it.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D82351
2020-06-24 00:02:51 +08:00
Xing GUO fad54c50e4 [ObjectYAML][ELF] Add support for emitting the .debug_pubtypes section.
This patch helps add support for emitting the .debug_pubtypes section.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D82347
2020-06-24 00:01:07 +08:00
Momchil Velikov adf7973fd3 [ARM] Describe defs/uses of VLLDM and VLSTM
The VLLDM and VLSTM instructions are incompletely specified.  They
(potentially) write (or read, respectively) registers Q0-Q7, VPR, and
FPSCR, but the compiler is unaware of it.

In the new test case `cmse-vlldm-no-reorder.ll` case the compiler
missed an anti-dependency and reordered a `VLLDM` ahead of the
instruction, which stashed the return value from the non-secure call,
effectively clobbering said value.

This test case does not fail with upstream LLVM, because of scheduling
differences and I couldn't find a test case for the VLSTM either.

Differential Revision: https://reviews.llvm.org/D81586
2020-06-23 16:04:23 +01:00
Valentin Clement d90443b1d9 [openmp] Base of tablegen generated OpenMP common declaration
Summary:
As discussed previously when landing patch for OpenMP in Flang, the idea is
to share common part of the OpenMP declaration between the different Frontend.
While doing this it was thought that moving to tablegen instead of Macros will also
give a cleaner and more powerful way of generating these declaration.
This first part of a future series of patches is setting up the base .td file for
DirectiveLanguage as well as the OpenMP version of it. The base file is meant to
be used by other directive language such as OpenACC.
In this first patch, the Directive and Clause enums are generated with tablegen
instead of the macros on OMPConstants.h. The next pacth will extend this
to other enum and move the Flang frontend to use it.

Reviewers: jdoerfert, DavidTruby, fghanim, ABataev, jdenny, hfinkel, jhuber6, kiranchandramohan, kiranktp

Reviewed By: jdoerfert, jdenny

Subscribers: arphaman, martong, cfe-commits, mgorny, yaxunl, hiraditya, guansong, jfb, sstefan1, aaron.ballman, llvm-commits

Tags: #llvm, #openmp, #clang

Differential Revision: https://reviews.llvm.org/D81736
2020-06-23 10:32:32 -04:00
Mikhail Maltsev 3f353a2e5a [BFloat] Add convert/copy instrinsic support
This patch is part of a series implementing the Bfloat16 extension of the Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

Specifically it adds intrinsic support in clang and llvm for Arm and AArch64.

The bfloat type, and its properties are specified in the Arm Architecture Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

The following people contributed to this patch:
  - Alexandros Lamprineas
  - Luke Cheeseman
  - Mikhail Maltsev
  - Momchil Velikov
  - Luke Geeson

Differential Revision: https://reviews.llvm.org/D80928
2020-06-23 14:27:05 +00:00
Matt Arsenault db777eaea3 AMDGPU/GlobalISel: Fix asserts on non-s32 sitofp/uitofp sources
The combine to form cvt_f32_ubyte0 was assuming the source type was
always 32-bit, but this needs to tolerate any legal source type.
2020-06-23 10:00:35 -04:00
Xing GUO 8c7775e9a7 [ObjectYAML][ELF] Add support for emitting the .debug_pubnames section.
This patch helps add support for emitting the .debug_pubnames section to yaml2elf.

Known issues:
- Current implementation doesn't support emitting multiple sets of entries.
- Doesn't support DWARF64.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D82296
2020-06-23 20:40:33 +08:00
Mikhail Maltsev 9c579540ff [ARM] BFloat MatMul Intrinsics&CodeGen
Summary:
This patch adds support for BFloat Matrix Multiplication Intrinsics
and Code Generation from __bf16 to AArch32. This includes IR intrinsics. Tests are
provided as needed.

This patch is part of a series implementing the Bfloat16 extension of
the
Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type and its properties are specified in the Arm
Architecture
Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

The following people contributed to this patch:

 - Luke Geeson
 - Momchil Velikov
 - Mikhail Maltsev
 - Luke Cheeseman
 - Simon Tatham

Reviewers: stuij, t.p.northover, SjoerdMeijer, sdesmalen, fpetrogalli, LukeGeeson, simon_tatham, dmgreen, MarkMurrayARM

Reviewed By: MarkMurrayARM

Subscribers: MarkMurrayARM, danielkiss, kristof.beyls, hiraditya, cfe-commits, llvm-commits, chill, miyuki

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D81740
2020-06-23 12:06:37 +00:00
hsmahesha 5832950adb [AMDGPU/MemOpsCluster] Compute `width` for `MIMG` instruction class.
Summary:
`width` computation is missing for newly added `MIMG`
instruction class. Add it.

Reviewers: foad, rampitec, arsenm

Reviewed By: foad

Subscribers: MatzeB, javed.absar, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81649
2020-06-23 17:32:17 +05:30
Georgii Rymar 1e820e82b1 [DebugInfo/DWARF] - Do not hang when CFI are truncated.
Currently when the .eh_frame section is truncated so that
CFI instructions can't be read, it is possible to enter
an infinite loop.

It happens because `CFIProgram::parse` does not handle errors properly.
This patch fixes the issue.

Differential revision: https://reviews.llvm.org/D82017
2020-06-23 14:39:24 +03:00
Simon Pilgrim cdceef4a4f [Analysis] Ensure we include CommandLine.h if we declare any cl::opt flags. NFC. 2020-06-23 12:29:51 +01:00
Sander de Smalen 121e585ec8 [AArch64][SVE] ACLE: Add bfloat16 to struct load/stores.
This patch contains:
- Support in LLVM CodeGen for bfloat16 types for ld2/3/4 and st2/3/4.
- New bfloat16 ACLE builtins for svld(2|3|4)[_vnum] and svst(2|3|4)[_vnum]

Reviewers: stuij, efriedma, c-rhodes, fpetrogalli

Reviewed By: fpetrogalli

Tags: #clang, #lldb, #llvm

Differential Revision: https://reviews.llvm.org/D82187
2020-06-23 12:12:35 +01:00
Simon Pilgrim 36bc10e74a [Transforms] Ensure we include CommandLine.h if we declare any cl::opt flags 2020-06-23 12:11:51 +01:00
Kerry McLaughlin 5080503174 [SVE][CodeGen] Legalisation of vsetcc with scalable types
Summary: Changes SplitVecOp_VSETCC to use getVectorElementCount()

Reviewers: sdesmalen, efriedma, dancgr

Reviewed By: efriedma

Subscribers: david-arm, tschuett, hiraditya, rkruppe, psnobl, huihuiz, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79167
2020-06-23 11:56:29 +01:00
Roman Lebedev d57e9aca01
[IndVarSimplify] Don't replace IV user with unsafe loop-invariant (PR45360)
Summary:
As [[ https://bugs.llvm.org/show_bug.cgi?id=45360 | PR45360 ]] reports,
with new cost-model we can sometimes end up being able to expand `udiv`/`urem` instructions.
And that exposes at least one instance of when we do that
regardless of whether or not it is safe to do.
In this particular case, it's `SimplifyIndvar::replaceIVUserWithLoopInvariant()`.

It seems to me, we simply need to check with `isSafeToExpandAt()` first.

The test isn't great. I'm not sure how to make it only run `-indvars`.

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=45360 | PR45360 ]].

Reviewers: mkazantsev, reames, helloqirun

Reviewed By: mkazantsev

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82108
2020-06-23 13:53:15 +03:00
Chen Zheng 7ab05d9a60 [PowerPC] fold addi's imm operand to its imm form consumer's displacement
This patch adds a function to do following transformation:

%0:g8rc_and_g8rc_nox0 = ADDI8 %5:g8rc_and_g8rc_nox0, 144
STD killed %7:g8rc, 16, %0:g8rc_and_g8rc_nox0 :: (store 8 into %ir.8)

------>

STD killed %7:g8rc, 160, %5:g8rc_and_g8rc_nox0 :: (store 8 into %ir.8)

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D81723
2020-06-23 06:28:18 -04:00
Simon Pilgrim 4c257bb44e [X86] truncateVectorWithPACK - fix outdated comment. NFC.
We perform PACKSS/PACKUS on AVX512 targets if the calling function wants to.
2020-06-23 10:45:27 +01:00
Simon Pilgrim bcc0dc3832 [DAG] visitSIGN_EXTEND_INREG - rename EVT variable. NFCI.
We had a EVT type variable called EVT, which isn't a good idea....
2020-06-23 10:45:27 +01:00
Paul Walker 499c63288f [SVE] Code generation for fixed length vector loads & stores.
Summary:
This patch adds base support for code generating fixed length
vector operations targeting a known SVE vector length. To achieve
this we lower fixed length vector operations to equivalent scalable
vector operations, whereby SVE predication is used to limit the
elements processed to those present within the fixed length vector.

Specifically this patch implements load and store operations, which
get lowered to their masked counterparts thusly:

  V = load(Addr) =>
    V = extract_fixed_vector(masked_load(make_pred(V.NumElts), Addr))

  store(V, (Addr)) =>
    masked_store(insert_fixed_vector(V), make_pred(V.NumElts), Addr))

Reviewers: rengolin, efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80385
2020-06-23 09:39:03 +00:00
Dylan McKay b9c26a9cfe [AVR] Rewrite the function calling convention.
Summary:
The previous version relied on the standard calling convention using
std::reverse() to try to force the AVR ABI. But this only works for
simple cases, it fails for example with aggregate types.

This patch rewrites the calling convention with custom C++ code, that
implements the ABI defined in https://gcc.gnu.org/wiki/avr-gcc.

To do that it adds a few 16-bit pseudo registers for unaligned argument
passing, such as R24R23. For example this function:

    define void @fun({ i8, i16 } %a)

will pass %a.0 in R22 and %a.1 in R24R23.

There are no instructions that can use these pseudo registers, so a new
register class, DREGSMOVW, is defined to make them apart.

Also the ArgCC_AVR_BUILTIN_DIV is no longer necessary, as it is
identical to the C++ behavior (actually the clobber list is more strict
for __div* functions, but that is currently unimplemented).

Reviewers: dylanmckay

Subscribers: Gaelan, Sh4rK, indirect, jwagen, efriedma, dsprenkels, hiraditya, Jim, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68524

Patch by Rodrigo Rivas Costa.
2020-06-23 21:36:18 +12:00
James Henderson 9782c922cb [DebugInfo] Print line table extended opcode bytes if parsing fails
Previously, if there was an error whilst parsing the operands of an
extended opcode, the operands would be treated as zero and printed. This
could potentially be slightly confusing. This patch changes the
behaviour to print the raw bytes instead.

Reviewed by: ikudrin

Differential Revision: https://reviews.llvm.org/D81570
2020-06-23 10:04:02 +01:00
Simon Pilgrim 7a55d98497 ProfileSummary.cpp - fix implicit Format.h dependency. NFC.
ProfileSummary was depending on other headers (notably WithColor.h) to define format().
2020-06-23 09:43:40 +01:00
Simon Pilgrim 0acd22b8fb StatepointLowering.cpp - fix implicit CommandLine.h dependency. NFC.
StatepointLowering defines a cl::opt but don't include CommandLine.h.
2020-06-23 09:43:39 +01:00
Florian Hahn a822ec75cc [DSE,MSSA] Treat passed by value args as invisible to caller.
This updates the MemorySSA backed implementation to treat arguments
passed by value similar to allocas: in they are assumed to be invisible
in the caller. This is similar to how they are treated in legacy DSE.

Reviewers: efriedma, asbirlea, george.burgess.iv

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D82222
2020-06-23 08:58:51 +01:00
Alex Lorenz 1c4a42a4d8 [Triple] support macOS 11 os version number
macOS goes to 11! This commit adds support for the new version number by ensuring
that existing version comparison routines, and the 'darwin' OS identifier
understands the new numbering scheme. It also adds a new utility method
'getCanonicalVersionForOS', which lets users translate some uses of
macOS 10.16 into macOS 11. This utility method will be used in upcoming
clang and swift commits.

Differential Revision: https://reviews.llvm.org/D82337
2020-06-22 23:03:47 -07:00
Michael Liao f95850ce9c [SROA] Teach SROA to perform no-op pointer conversion.
Summary:
- When promoting a pointer from memory to register, SROA skips pointers
  from different address spaces. However, as `ptrtoint` and `inttoptr`
  are defined as no-op casts if that integer type has the same as the
  pointer value, generate the pair of `ptrtoint`/`inttoptr` (no-op cast)
  sequence to convert pointers from different address spaces if they
  have the same size.

Reviewers: arsenm, chandlerc, lebedev.ri

Subscribers:

Differential Revision: https://reviews.llvm.org/D81943
2020-06-23 01:49:27 -04:00
Max Kazantsev 9bff376e5c [InstCombine] Replace selects with Phis
We can sometimes replace a select with a Phi node if all of its values
are available on respective incoming edges.

Differential Revision: https://reviews.llvm.org/D82005
Reviewed By: nikic
2020-06-23 12:12:59 +07:00
Michael Liao b1360caa82 [SDAG] Add new AssertAlign ISD node.
Summary:
- AssertAlign node records the guaranteed alignment on its source node,
  where these alignments are retrieved from alignment attributes in LLVM
  IR. These tracked alignments could help DAG combining and lowering
  generating efficient code.
- In this patch, the basic support of AssertAlign node is added. So far,
  we only generate AssertAlign nodes on return values from intrinsic
  calls.
- Addressing selection in AMDGPU is revised accordingly to capture the
  new (base + offset) patterns.

Reviewers: arsenm, bogner

Subscribers: jvesely, wdng, nhaehnle, tpr, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81711
2020-06-23 00:51:11 -04:00
Amy Kwan 19df9e2959 [PowerPC][Power10] Implement VSX PCV Generate Operations in LLVM/Clang
This patch implements builtins for the following prototypes for the VSX Permute
Control Vector Generate with Mask Instructions:

vector unsigned char vec_genpcvm (vector unsigned char, const int);
vector unsigned short vec_genpcvm (vector unsigned short, const int);
vector unsigned int vec_genpcvm (vector unsigned int, const int);
vector unsigned long long vec_genpcvm (vector unsigned long long, const int);

Differential Revision: https://reviews.llvm.org/D81774
2020-06-22 21:09:34 -05:00
Sanjay Patel 8953ecf22b [InstCombine] reassociate diff of sums into sum of diffs
This is the integer sibling to D81491.

(a[0] + a[1] + a[2] + a[3]) - (b[0] + b[1] + b[2] +b[3]) -->
(a[0] - b[0]) + (a[1] - b[1]) + (a[2] - b[2]) + (a[3] - b[3])

Removing the "experimental" from these intrinsics is likely
not too far away.
2020-06-22 20:47:09 -04:00
Sanjay Patel 54143e2bd5 [VectorCombine] do not use magic number for undef mask element; NFC 2020-06-22 20:47:09 -04:00
Ayke van Laethem eac4a60154
[AVR] Disassemble double register instructions
Add disassembly support for the movw, adiw, and sbiw instructions.

I had previously committed test cases for the adiw and sbiw
instructions, but had accidentally made them not runnable so they were
skipped all this time. Oops. This patch fixes that by adding support for
disassembling those instructions.

Differential Revision: https://reviews.llvm.org/D82093
2020-06-23 02:18:04 +02:00
Ayke van Laethem 9f09c29f01
[AVR] Disassemble instructions with fixed Z operand
Some instructions have a fixed Z register and don't have an explicit
register operand. This can be worked around by simply printing the
operand directly if the particular register class is detected.

The LPM and ELPM instructions also needed a custom decoder, which is
also included in this patch.

Differential Revision: https://reviews.llvm.org/D82088
2020-06-23 02:17:53 +02:00
Ayke van Laethem ec9efb856c
[AVR] Disassemble multiplication instructions
These can often only use a limited range of registers, and apparently
need special decoding support.

Differential Revision: https://reviews.llvm.org/D81971
2020-06-23 02:17:37 +02:00
Ayke van Laethem 01c2209d51
[AVR] Decode single register instructions
This is a set of instructions that take just a single register as an
operand, with no immediates. Because all instructions share the same
format, I haven't added exhaustive bit testing to all instructions but
just to the inc instruction.

Differential Revision: https://reviews.llvm.org/D81968
2020-06-23 02:17:15 +02:00
Ayke van Laethem ff4817ec2a
[AVR] Don't adjust for instruction size
I'm not entirely sure why this was ever needed, but when I remove both
adjustments all tests still pass.

This fixes a bug where a long branch (using the `jmp` instead of the
`rjmp` instruction) was incorrectly adjusted by 2 because it jumps to an
absolute address instead of a PC-relative address. I could have added
AVR::fixup_call to the list of exceptions, but it seemed more sensible
to me to just remove this code.

Differential Revision: https://reviews.llvm.org/D78459
2020-06-23 02:15:42 +02:00
Sam Clegg 79aad89d8d [WebAssembly] Add support for externalref to MC and wasm-ld
This allows code for handling externref values to be processed by the
assembler and linker.

Differential Revision: https://reviews.llvm.org/D81977
2020-06-22 15:57:24 -07:00
Christopher Tetreault cd6848f6e1 [SVE] Remove calls to VectorType::getNumElements from ARM
Reviewers: efriedma, greened, c-rhodes, david-arm, dmgreen

Reviewed By: dmgreen

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, dmgreen, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82216
2020-06-22 15:18:58 -07:00
Arthur Eubanks d335c1317b Fix dynamic alloca detection in CloneBasicBlock
Summary:
Simply check AI->isStaticAlloca instead of reimplementing checks for
static/dynamic allocas.

Reviewers: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82328
2020-06-22 15:06:28 -07:00
Craig Topper 23654d9e7a Recommit "[X86] Calculate the needed size of the feature arrays in _cpu_indicator_init and getHostCPUName using the size of the feature enum."
Hopefully this version will fix the previously buildbot failure
2020-06-22 13:32:03 -07:00
Greg Clayton ccf5a44917 Fix the verification of DIEs with DW_AT_ranges.
Summary: Previous code would try to verify DW_AT_ranges and if any ranges would overlap, it would stop attributing any ranges after this to the DIE which caused incorrect errors to be reported that a DIE's address ranges were not contained in the parent DIE's ranges. Added a fix and a test.

Reviewers: aprantl, labath, probinson, JDevlieghere, jhenderson

Subscribers: hiraditya, MaskRay, cmtice, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79962
2020-06-22 13:13:48 -07:00
Lei Zhang 315bd96437 Use std::make_tuple instead initializer list
Hopefully this pleases GCC-5 and fixes the build error:

LowerExpectIntrinsic.cpp:62:53: error: converting to
'std::tuple<unsigned int, unsigned int>' from initializer list would use
explicit constructor 'constexpr std::tuple<_T1, _T2>::tuple(_U1&&,
_U2&&) [with _U1 = llvm:🆑:opt<unsigned int>&; _U2 =
llvm:🆑:opt<unsigned int>&; <template-parameter-2-3> = void; _T1 =
unsigned int; _T2 = unsigned int]'
     return {LikelyBranchWeight, UnlikelyBranchWeight};

Differential Revision: https://reviews.llvm.org/D82325
2020-06-22 15:43:40 -04:00
Hans Wennborg 1357c06578 Revert "[X86][SSE] MatchVectorAllZeroTest - handle OR vector reductions"
This caused a Chromium test to miscompile. See discussion on the Phabricator
review.

> This patch extends MatchVectorAllZeroTest to handle OR vector reduction patterns where the result is compared against zero.
>
> Fixes PR45378
>
> Differential Revision: https://reviews.llvm.org/D81547

This reverts 057c9c7ee0
2020-06-22 21:27:11 +02:00
Christopher Tetreault 5e2c736395 [SVE] Remove calls to VectorType::getNumElements from WebASM
Summary:
The getNumElements in base VectorType is being deprecated.

See: http://lists.llvm.org/pipermail/llvm-dev/2020-March/139811.html

Reviewers: efriedma, tlively, fpetrogalli, c-rhodes, dschuff

Reviewed By: tlively, dschuff

Subscribers: dschuff, sbc100, tschuett, jgravelle-google, hiraditya, aheejin, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82217
2020-06-22 12:25:08 -07:00
Craig Topper bebea4221d Revert "[X86] Calculate the needed size of the feature arrays in _cpu_indicator_init and getHostCPUName using the size of the feature enum."
Seems to breaking build.

This reverts commit 5ac144fe64.
2020-06-22 12:20:40 -07:00
Craig Topper 5ac144fe64 [X86] Calculate the needed size of the feature arrays in _cpu_indicator_init and getHostCPUName using the size of the feature enum.
Move 0 initialization up to the caller so we don't need to know
the size.
2020-06-22 11:46:20 -07:00
Zhi Zhuang 37fb860301 Add support of __builtin_expect_with_probability
Add a new builtin-function __builtin_expect_with_probability and
intrinsic llvm.expect.with.probability.
The interface is __builtin_expect_with_probability(long expr, long
expected, double probability).
It is mainly the same as __builtin_expect besides one more argument
indicating the probability of expression equal to expected value. The
probability should be a constant floating-point expression and be in
range [0.0, 1.0] inclusive.
It is similar to builtin-expect-with-probability function in GCC
built-in functions.

Differential Revision: https://reviews.llvm.org/D79830
2020-06-22 10:21:28 -07:00
Hiroshi Yamauchi 9e1decf743 [PGO][PGSO] Enable non-cold size opts under partial profile sample PGO.
Summary: Similar to D81020. Follow up D78949.

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82053
2020-06-22 10:12:48 -07:00
Francesco Petrogalli ef597eda8e [sve][acle] Add SVE BFloat16 extensions.
Summary:
List of intrinsics:

svfloat32_t svbfdot[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3)
svfloat32_t svbfdot[_n_f32](svfloat32_t op1, svbfloat16_t op2, bfloat16_t op3)
svfloat32_t svbfdot_lane[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3, uint64_t imm_index)

svfloat32_t svbfmmla[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3)

svfloat32_t svbfmlalb[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3)
svfloat32_t svbfmlalb[_n_f32](svfloat32_t op1, svbfloat16_t op2, bfloat16_t op3)
svfloat32_t svbfmlalb_lane[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3, uint64_t imm_index)

svfloat32_t svbfmlalt[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3)
svfloat32_t svbfmlalt[_n_f32](svfloat32_t op1, svbfloat16_t op2, bfloat16_t op3)
svfloat32_t svbfmlalt_lane[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3, uint64_t imm_index)

svbfloat16_t svcvt_bf16[_f32]_m(svbfloat16_t inactive, svbool_t pg, svfloat32_t op)
svbfloat16_t svcvt_bf16[_f32]_x(svbool_t pg, svfloat32_t op)
svbfloat16_t svcvt_bf16[_f32]_z(svbool_t pg, svfloat32_t op)

svbfloat16_t svcvtnt_bf16[_f32]_m(svbfloat16_t even, svbool_t pg, svfloat32_t op)
svbfloat16_t svcvtnt_bf16[_f32]_x(svbfloat16_t even, svbool_t pg, svfloat32_t op)

For reference, see section 7.2 of "Arm C Language Extensions for SVE - Version 00bet4"

Reviewers: sdesmalen, ctetreau, efriedma, david-arm, rengolin

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D82141
2020-06-22 16:53:02 +00:00
Sanjay Patel 9934cc544c [VectorCombine] make helper function for shift-shuffle; NFC
This will probably be useful for other extract patterns.
2020-06-22 12:23:52 -04:00
Florian Hahn 328c8642e2 [DSE,MSSA] Reorder DSE blocking checks.
Currently we stop exploring candidates too early in some cases.

In particular, we can continue checking the defining accesses of
non-removable MemoryDefs and defs without analyzable write location
(read clobbers are already ruled out using MemorySSA at this point).
2020-06-22 17:16:34 +01:00
Fangrui Song c52bee61e9 [MCParser] Support quoted section name for COFF
This features matches ELFAsmParser and makes it possible to use `.section ".llvm.call-graph-profile","n"`

Reviewed By: zequanwu

Differential Revision: https://reviews.llvm.org/D82240
2020-06-22 09:11:44 -07:00
stozer 539381da26 [DebugInfo] Update MachineInstr to help support variadic DBG_VALUE instructions
Following on from this RFC[0] from a while back, this is the first patch towards
implementing variadic debug values.

This patch specifically adds a set of functions to MachineInstr for performing
operations specific to debug values, and replacing uses of the more general
functions where appropriate. The most prevalent of these is replacing
getOperand(0) with getDebugOperand(0) for debug-value-specific code, as the
operands corresponding to values will no longer be at index 0, but index 2 and
upwards: getDebugOperand(x) == getOperand(x+2). Similar replacements have been
added for the other operands, along with some helper functions to replace
oft-repeated code and operate on a variable number of value operands.

[0] http://lists.llvm.org/pipermail/llvm-dev/2020-February/139376.html<Paste>

Differential Revision: https://reviews.llvm.org/D81852
2020-06-22 16:01:12 +01:00
Guillaume Chatelet 597a9070b5 [ARC] Add missing return statement 2020-06-22 14:57:29 +00:00
Sanjay Patel 98c2f4eea5 [VectorCombine] add helper to replace uses and rename
The tests are regenerated to show a path that missed renaming,
but there should be no functional difference from this patch.
2020-06-22 09:58:49 -04:00
Valentin Clement 8383ac6197 Revert commit 9e52530 because of dependencies issue
This reverts commit 9e525309fb.
2020-06-22 09:56:14 -04:00
Valentin Clement 9e525309fb [openmp] Base of tablegen generated OpenMP common declaration
Summary:
As discussed previously when landing patch for OpenMP in Flang, the idea is
to share common part of the OpenMP declaration between the different Frontend.
While doing this it was thought that moving to tablegen instead of Macros will also
give a cleaner and more powerful way of generating these declaration.
This first part of a future series of patches is setting up the base .td file for
DirectiveLanguage as well as the OpenMP version of it. The base file is meant to
be used by other directive language such as OpenACC.
In this first patch, the Directive and Clause enums are generated with tablegen
instead of the macros on OMPConstants.h. The next pacth will extend this
to other enum and move the Flang frontend to use it.

Reviewers: jdoerfert, DavidTruby, fghanim, ABataev, jdenny, hfinkel, jhuber6, kiranchandramohan, kiranktp

Reviewed By: jdoerfert, jdenny

Subscribers: cfe-commits, mgorny, yaxunl, hiraditya, guansong, jfb, sstefan1, aaron.ballman, llvm-commits

Tags: #llvm, #openmp, #clang

Differential Revision: https://reviews.llvm.org/D81736
2020-06-22 09:34:53 -04:00
Xing GUO 03480c80d3 [DWARFYAML][debug_info] Add support for error handling.
This patch helps add support for error handling in `DWARFYAML::emitDebugInfo()`.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D82275
2020-06-22 21:36:13 +08:00
Xing GUO 3a48a632d0 [DWARFYAML][debug_info] Use 'AbbrCode' to index the abbreviation.
Before this patch, we use `(uint32_t)AbbrCode - (uint32_t)FirstAbbrCode` to index the abbreviation. It's impossible for we to use the preceeding abbreviation of the previous one (e.g., if the previous DIE's `AbbrCode` is 2, we are unable to use the abbreviation with index 1). In this patch, we use `AbbrCode` to index the abbreviation directly.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D82173
2020-06-22 21:34:02 +08:00
Simon Pilgrim 48d1a2d6d0 [DAG] Add SimplifyMultipleUseDemandedVectorElts helper for SimplifyMultipleUseDemandedBits. NFCI.
We have many cases where we call SimplifyMultipleUseDemandedBits and demand specific vector elements, but all the bits from them - this adds a helper wrapper to handle this.
2020-06-22 14:24:39 +01:00
Sanjay Patel de65b356dc [VectorCombine] add/use pass-level IRBuilder
This saves creating/destroying a builder every time we
perform some transform.

The tests show instruction ordering diffs resulting from
always inserting at the root instruction now, but those
should be benign.
2020-06-22 09:01:29 -04:00
Jay Foad 9761d3cf9c [AMDGPU] Update more live intervals in SIWholeQuadMode
This fixes various assertion failures that would otherwise be triggered
by a later patch to move SIWholeQuadMode later in the pass pipeline.

Differential Revision: https://reviews.llvm.org/D82190
2020-06-22 13:50:15 +01:00
Sanjay Patel cce625f73d [VectorCombine] improve IR debugging by providing/salvaging value names
The tests are regenerated to show the diffs, but there should be no
functional change from this patch.
2020-06-22 08:35:47 -04:00
Tim Corringham 96ecead5a2 [AMDGPU] clang-format of SIModeRegister.cpp
Ran clang-format just to ease future reviews. No functional changes.
2020-06-22 13:31:52 +01:00
Simon Pilgrim ecc5d7ee0d [DAG] SimplifyMultipleUseDemandedBits - drop unnecessary *_EXTEND_VECTOR_INREG cases
For little endian targets, if we only need the lowest element and none of the extended bits then we can just use the (bitcasted) source vector directly.

We already do this in SimplifyDemandedBits, this adds the SimplifyMultipleUseDemandedBits equivalent.
2020-06-22 12:35:32 +01:00
Tres Popp 09d72ad399 Revert "[CGP] Enable CodeGenPrepares phi type convertion."
This reverts commit 67121d7b82.

This is causing compile times to be 2x slower on some large binaries.
2020-06-22 13:06:18 +02:00
Serguei Katkov eae0d2e9b2 Revert "[Peeling] Extend the scope of peeling a bit"
This reverts commit 29b2c1ca72.

The patch causes the DT verifier failure like:
DominatorTree is different than a freshly computed one!

Not sure the patch itself it wrong but revert to investigate the failure.
2020-06-22 17:48:29 +07:00
Vitaly Buka 5d964e262f [StackSafety] Check variable lifetime
We can't consider variable safe if out-of-lifetime access is possible.
So if StackLifetime can't prove that the instruction always uses
the variable when it's still alive, we consider it unsafe.
2020-06-22 03:45:29 -07:00
Vitaly Buka 8f592ed333 [StackSafety] Ignore unreachable instructions
Usually DominatorTree provides this info, but here we use
StackLifetime. The reason is that in the next patch StackLifetime
will be used for actual lifetime checks and we can avoid
forwarding the DominatorTree into this code.
2020-06-22 03:45:29 -07:00
Anton Korobeynikov 6cb80fbe40 Revert "[MSP430] Update register names"
This reverts commit 8f6620f663.
2020-06-22 13:37:22 +03:00
Anatoly Trosinenko 8f6620f663 [MSP430] Update register names
When writing a unit test on replacing standard epilogue sequences with `BR __mspabi_func_epilog_<N>`, by manually asm-clobbering `rN` - `r10` for N = 4..10, everything worked well except for seeming inability to clobber r4.

The problem was that MSP430 code generator of LLVM used an obsolete name FP for that register. Things were worse because when `llc` read an unknown register name, it silently ignored it.

Differential Revision: https://reviews.llvm.org/D82184
2020-06-22 13:24:03 +03:00
Momchil Velikov 75b0bbca1d [LTO] Use StringRef instead of C-style strings in setCodeGenDebugOptions
Fixes an issue with missing nul-terminators and saves us some string
copying, compared to a version which would insert nul-terminators.

Differential Revision: https://reviews.llvm.org/D82033
2020-06-22 11:22:18 +01:00
Anatoly Trosinenko a5bd75aab8 [MSP430] Enable some basic support for debug information
This commit technically permits LLVM to emit the debug information for ELF files for MSP430 architecture. Aside from this, it only defines the register numbers as defined by part 10.1 of MSP430 EABI specification (assuming the 1-byte subregisters share the register numbers with corresponding full-size registers).

This commit was basically tested by me with TI-provided GCC 8.3.1 toolchain by compiling an example program with `clang` (please note manual linking may be required due to upstream `clang` not yet handling the `-msim` option necessary to run binaries on the GDB-provided simulator) and then running it and single-stepping with `msp430-elf-gdb` like this:

```
$sysroot/bin/msp430-elf-gdb ./test -ex "target sim" -ex "load ./test"
(gdb) ... traditional GDB commands follow ...
```

While this implementation is most probably far from completeness and is considered experimental, it can already help with debugging MSP430 programs as well as finding issues in LLVM debug info support for MSP430 itself.

One of the use cases includes trying to find a point where UBSan check in a trap-on-error mode was triggered.

The expected debug information format is described in the [MSP430 Embedded Application Binary Interface](http://www.ti.com/lit/an/slaa534/slaa534.pdf) specification, part 10.

Differential Revision: https://reviews.llvm.org/D81488
2020-06-22 13:14:07 +03:00
Anatoly Trosinenko 359fae6eb0 [DebugInfo] Explicitly permit addr_size = 0x02 when parsing DWARF data
Current LLVM implementation uses `MCAsmInfo::CodePointerSize` as addr_size when emitting the DWARF data. llvm-dwarfdump, on the other hand, handles `addr_size`s of 4 and 8 properly and considers all other sizes as an error. This works for most of mainline targets except for MSP430 and AVR.

msp430-gcc v8.3.1 emits DWARF32 with addr_size = 4 (DWARF32 does not imply addr_size = 4, 32 refers to internal offset width of 4 bytes) that is handled by llvm-dwarfdump already. Still, emitting 2-byte target pointers on MSP430 seems correct as well (but not for MSP430X that is supported by msp430-gcc but not by LLVM and has 20-bit address space).

This patch make it possible for MSP430 debug info support to be tested with llvm-dwarfdump.

Differential Revision: https://reviews.llvm.org/D82055
2020-06-22 13:11:55 +03:00
Florian Hahn 0e19ff02d8 [DSE,MSSA] Remove unused arguments for isDSEBarrier (NFC). 2020-06-22 10:58:53 +01:00
Djordje Todorovic 792786e34d [CSInfo][MIPS] Don't describe parameters loaded by sub/super reg copy
When describing parameter value loaded by a COPY instruction, consider
case where needed Reg value is a sub- or super- register of the COPY
instruction's destination register. Without this patch, compile process
will crash with the assertion "TargetInstrInfo::describeLoadedValue
can't describe super- or sub-regs for copy instructions".

Patch by Nikola Tesic

Differential revision: https://reviews.llvm.org/D82000
2020-06-22 10:49:02 +02:00
Serguei Katkov 29b2c1ca72 [Peeling] Extend the scope of peeling a bit
Currently we allow peeling of the loops if there is a exiting latch block
and all other exits are blocks ending with deopt.

Actually we want that exit would end up with deopt unconditionally but
it is not required that exit itself ends with deopt.

Reviewers: reames, ashlykov, fhahn, apilipenko, fedor.sergeev
Reviewed By: apilipenko
Subscribers: hiraditya, zzheng, dantrushin, llvm-commits
Differential Revision: https://reviews.llvm.org/D81140
2020-06-22 12:17:44 +07:00
Michael Liao 20a1700293 [amdgpu] Fix REL32 relocations with negative offsets.
Summary: - The offset should be treated as a signed one.

Reviewers: rampitec, arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82234
2020-06-21 23:09:03 -04:00
Sanjay Patel 6bdd531af5 [VectorCombine] create class for pass to hold analyses, etc; NFC
This doesn't change anything currently, but it would make sense
to create a class-level IRBuilder instead of recreating that
everywhere. As we expand to more optimizations, we will probably
also want to hold things like the DataLayout or other constant
refs in here too.
2020-06-21 16:07:33 -04:00
David Green 67121d7b82 [CGP] Enable CodeGenPrepares phi type convertion. 2020-06-21 16:46:16 +01:00
Florian Hahn 40569db7b3 [DSE,MSSA] Move reachability check to main loop.
As we traverse the CFG backwards, we could end up reaching unreachable
blocks. For unreachable blocks, we won't have computed post order
numbers and because DomAccess is reachable, unreachable blocks cannot be
on any path from it.

This fixes a crash with unreachable blocks.
2020-06-21 16:38:10 +01:00
David Green 730ecb63ec [CGP] Convert phi types
If a collection of interconnected phi nodes is only ever loaded, stored
or bitcast then we can convert the whole set to the bitcast type,
potentially helping to reduce the number of register moves needed as the
phi's are passed across basic block boundaries. This has to be done in
CodegenPrepare as it naturally straddles basic blocks.

The alorithm just looks from phi nodes, looking at uses and operands for
a collection of nodes that all together are bitcast between float and
integer types. We record visited phi nodes to not have to process them
more than once. The whole subgraph is then replaced with a new type.
Loads and Stores are bitcast to the correct type, which should then be
folded into the load/store, changing it's type.

This comes up in the biquad testcase due to the way MVE needs to keep
values in integer registers. I have also seen it come up from aarch64
partner example code, where a complicated set of sroa/inlining produced
integer phis, where float would have been a better choice.

I also added undef and extract element handling which increased the
potency in some cases.

This adds it with an option that defaults to off, and disabled for 32bit
X86 due to potential issues around canonicalizing NaNs.

Differential Revision: https://reviews.llvm.org/D81827
2020-06-21 15:54:17 +01:00
Nikita Popov 37d3030711 [ValueTracking, BasicAA] Don't simplify instructions
GetUnderlyingObject() (and by required symmetry
DecomposeGEPExpression()) will call SimplifyInstruction() on the
passed value if other checks fail. This simplification is very
expensive, but has little effect in practice. This patch removes
the SimplifyInstruction call(), and replaces it with a check for
single-argument phis (which can occur in canonical IR in LCSSA
form), which is the only useful simplification case I was able to
identify.

At O3 the geomean CTMark improvement is -1.7%. The largest
improvement is SPASS with ThinLTO at -6%.

In test-suite, I see only two tests with a hash difference and
no code size difference (PAQ8p, Ptrdist), which indicates that
the simplification only ends up being useful very rarely. (I would
have liked to figure out which simplification is responsible here,
but wasn't able to spot it looking at transformation logs.)

The AMDGPU test case that is update was using two selects with
undef condition, in which case GetUnderlyingObject will return
the first select operand as the underlying object. This will of
course not happen with non-undef conditions, so this was not
testing anything realistic. Additionally this illustrates potential
unsoundness: While GetUnderlyingObject will pick the first operand,
the select might be later replaced by the second operand, resulting
in inconsistent assumptions about the undef value.

Differential Revision: https://reviews.llvm.org/D82261
2020-06-21 16:31:07 +02:00
Sanjay Patel 2ad42c2653 [ValueTracking] improve analysis for fdiv with same operands
(The 'nnan' variant of this pattern is already tested to produce '1.0'.)

https://alive2.llvm.org/ce/z/D4hPBy

define i1 @src(float %x, i32 %y) {
%0:
  %d = fdiv float %x, %x
  %uge = fcmp uge float %d, 0.000000
  ret i1 %uge
}
=>
define i1 @tgt(float %x, i32 %y) {
%0:
  ret i1 1
}
Transformation seems to be correct!
2020-06-21 09:07:59 -04:00
Simon Pilgrim fb9f9dc318 [X86][SSE] Add SimplifyDemandedVectorEltsForTargetShuffle to handle target shuffle variable masks
Pulled out from the ongoing work on D66004, currently we don't do a good job of simplifying variable shuffle masks that have already lowered to constant pool entries.

This patch adds SimplifyDemandedVectorEltsForTargetShuffle (a custom x86 helper) to first try SimplifyDemandedVectorElts (which we already do) and then constant pool simplification to help mark undefined elements.

To prevent lowering/combines infinite loops, we only handle basic constant pool loads instead of creating new BUILD_VECTOR nodes for lowering - e.g. we don't try to convert them to broadcast/vzext_load - there might be some benefit to this but if so I'd rather we come up with some way to reuse existing code than reimplement a lot of BUILD_VECTOR code.

Differential Revision: https://reviews.llvm.org/D81791
2020-06-21 11:16:07 +01:00
clfbbn 10b0539772 [Attributor][NFC] Fix indentation
Summary: The patch D81022 seems to break the indentation of the `cleanupIR()` function. This patch fixes this problem

Reviewers: jdoerfert, sstefan1, uenoku

Reviewed By: jdoerfert

Subscribers: hiraditya, uenoku, kuter, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82260
2020-06-21 15:43:32 +08:00
Wenlei He 7c8a6936bf [Remarks] Add callsite locations to inline remarks
Summary:
Add call site location info into inline remarks so we can differentiate inline sites.
This can be useful for inliner tuning. We can also reconstruct full hierarchical inline
tree from parsing such remarks. The messege of inline remark is also tweaked so we can
differentiate SampleProfileLoader inline from CGSCC inline.

Reviewers: wmi, davidxl, hoy

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D82213
2020-06-20 23:32:10 -07:00
Amy Kwan cc95635b1b [PowerPC][Power10] Implement Vector Clear Left/Rightmost Bytes Builtins in LLVM/Clang
This patch implements builtins for the following prototypes:
```
vector signed char vec_clrl (vector signed char a, unsigned int n);
vector unsigned char vec_clrl (vector unsigned char a, unsigned int n);
vector signed char vec_clrr (vector signed char a, unsigned int n);
vector signed char vec_clrr (vector unsigned char a, unsigned int n);
```

Differential Revision: https://reviews.llvm.org/D81707
2020-06-20 18:29:16 -05:00
Eric Christopher dc20419351 Rename function to more accurately reflect what it does. 2020-06-20 14:37:29 -07:00
Eric Christopher 8116d01905 Typos around a -> an. 2020-06-20 14:04:48 -07:00
Sanjay Patel 741e20f3d6 [VectorCombine] fix assert for type of compare operand
As shown in the post-commit comment for D81661 - we need to
loosen the type assertion to allow scalarization of a compare
for vectors of pointers.
2020-06-20 15:20:17 -04:00
Sanjay Patel 7b201bfcac [InstCombine] remove unused parameter and add assert; NFC 2020-06-20 11:47:00 -04:00
Sanjay Patel d84cdb81ed [InstCombine] fabs(X) / fabs(X) -> X / X
Also, consolidate related folds so we don't miss/repeat these.
2020-06-20 10:20:21 -04:00
Simon Pilgrim 89dcbdfcfd [X86] combineSetCCMOVMSK - consistently use CmpBits variable. NFCI.
The comparison value should be the same size - I've added an assert to be absolutely certain.
2020-06-20 12:35:24 +01:00
Simon Pilgrim 56a9332328 [X86][SSE] Fold MOVMSK(PCMPEQ(X,0)) != -1 -> !PTESTZ(X,X) allof patterns 2020-06-20 12:17:32 +01:00
Nikita Popov d3d4e4bcb7 [LVI] Extract addValueHandle() method (NFC)
There will be more places registering value handles.
2020-06-20 13:05:42 +02:00
Nikita Popov 64ecf85f63 [LVI] Use find_as() where possible (NFC)
This prevents us from creating temporary PoisoningVHs and
AssertingVHs while performing hashmap lookups. As such, it only
matters in assertion-enabled builds.
2020-06-20 13:05:42 +02:00
Florian Hahn 9a7d80a32c Revert "[BasicAA] Use known lower bounds for index values for size based check."
This potentially related to https://bugs.llvm.org/show_bug.cgi?id=46335
and causes a slight compile-time regression. Revert while investigating.

This reverts commit d99a1848c4.
2020-06-20 10:06:05 +01:00
Eric Christopher 10563e16aa [Analysis/Transforms/Sanitizers] As part of using inclusive language
within the llvm project, migrate away from the use of blacklist and
whitelist.
2020-06-20 00:42:26 -07:00
Eric Christopher 858d385578 As part of using inclusive language within the llvm project,
migrate away from the use of blacklist and whitelist.
2020-06-20 00:24:57 -07:00
Eric Christopher cf23852587 [Target] As part of using inclusive language within the llvm project,
migrate away from the use of blacklist and whitelist.

This change affects an internal llvm command line option.
2020-06-20 00:06:39 -07:00
Xing GUO 6770349592 [DWARFYAML][debug_info] Fix array index out of bounds error
This patch is trying to fix the array index out of bounds error. I observed it in (https://reviews.llvm.org/harbormaster/unit/view/99638/).

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D82139
2020-06-20 15:08:24 +08:00
Craig Topper c721bc081e [X86] Correct the implementation of ud1(a.k.a. ud2b) instruction.
We were missing the modrm byte this instruction has according
to current Intel SDM. Experiments with gcc indicate that different
modrm values are chosen based on 2 operands so I've added those
as well.

I think our previous implementation was based on an older behavior of
binutils that has since been changed.
2020-06-19 23:57:48 -07:00
Craig Topper 0dda5e4ce2 [X86] Ignore bits 2:0 of the modrm byte when disassembling lfence, mfence, and sfence.
These are documented as using modrm byte of 0xe8, 0xf0, and 0xf8
respectively. But hardware ignore bits 2:0. So 0xe9-0xef is treated
the same as 0xe8. Similar for the other two.

Fixing this required adding 8 new formats to the X86 instructions
to convey this information. Could have gotten away with 3, but
adding all 8 made for a more logical conversion from format to
modrm encoding.

I renumbered the format encodings to keep the register modrm
formats grouped together.
2020-06-19 22:24:24 -07:00
Fangrui Song 2a4317bfb3 [SanitizeCoverage] Rename -fsanitize-coverage-{white,black}list to -fsanitize-coverage-{allow,block}list
Keep deprecated -fsanitize-coverage-{white,black}list as aliases for compatibility for now.

Reviewed By: echristo

Differential Revision: https://reviews.llvm.org/D82244
2020-06-19 22:22:47 -07:00
Yevgeny Rouban 6429471e8b [IR] Convert profile metadata in createCallMatchingInvoke()
When an invoke instruction is converted to a call its
profile metadata is dropped because it has incompatible
format (see commit 16ad6eeb94).
This patch adds an attempt to convert profile data to
format of the call instruction. This used to work well
before the commit dcfa78a4cc.

Reviewers: reames
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82071
2020-06-20 12:10:31 +07:00
Wang Rui dd48c57da3 [Mips] Error if a non-immediate operand is used while an immediate is expected
The 32-bit type relocation (R_MIPS_32) cannot be used for instructions below:

ori $4, $4, start
ori $4, $4, (start - .)

We should print an error instead.

Reviewed By: atanasyan, MaskRay

Differential Revision: https://reviews.llvm.org/D81908
2020-06-19 22:08:59 -07:00
Vitaly Buka 3d8149db3c [StackSafety,NFC] Don't rerun on LiveIn change 2020-06-19 21:29:31 -07:00
Xing GUO 1cfdda57fa [ObjectYAML][ELF] Add support for emitting the .debug_info section.
This patch helps add support for emitting the .debug_info section to yaml2elf.

Reviewed By: jhenderson, grimar, MaskRay

Differential Revision: https://reviews.llvm.org/D82073
2020-06-20 12:13:01 +08:00
Carl Ritson 4a7de36afc [AMDGPU] Avoid use of V_READLANE into EXEC in SGPR spills
Always prefer to clobber input SGPRs and restore them after the
spill.  This applies to both spills to VGPRs and scratch.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D81914
2020-06-20 12:10:47 +09:00
romanova-ekaterina d7fad626e9 Error related to ThinLTO caching needs to be downgraded to a remark
This is a fix for PR #46392 (Diagnostic message (error) related to
ThinLTO caching needs to be downgraded to a remark).

There are diagnostic messages related to ThinLTO caching that contain
the word "error", but they are really just notices/remarks for users,
and they don't cause a build failure. The word "error" appearing can be
confusing to users, and may even cause deeper problems.

User's build system might be designed to interpret any error messages
(even a benign error message as the one above) reported by the compiler
as a build failure, thus causing the build to fail "needlessly". In
short, the term "error" in this diagnostic is misleading at best, and
may be causing build systems to fail at worst.

Differential Revision: https://reviews.llvm.org/D82138
2020-06-19 16:03:29 -07:00
Eric Christopher b6536e549d As part of using inclusive language within the llvm project,
migrate away from the use of blacklist and whitelist.
2020-06-19 15:12:18 -07:00
Heejin Ahn 83c26eae23 [WebAssembly] Remove TEEs when dests are unstackified
When created in RegStackify pass, `TEE` has two destinations, where
op0 is stackified and op1 is not. But it is possible that
op0 becomes unstackified in `fixUnwindMismatches` function in
CFGStackify pass when a nested try-catch-end is introduced, violating
the invariant of `TEE`s destinations.

In this case we convert the `TEE` into two `COPY`s, which will
eventually be resolved in ExplicitLocals.

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D81851
2020-06-19 14:55:21 -07:00
Martin Storsjö cdbd299800 [Support] Fix building for mingw on a case sensitive file system
This fixes cross building on a case sensitive file system after
2e613d2ded. (The official Windows
SDKs don't have self-consistent casing and can't be used as such on
case sentisive file systems without case fixups, while mingw headers
consistently use lower case.)
2020-06-20 00:39:22 +03:00
Amara Emerson 1feeecf224 [AArch64][GlobalISel] Make G_SEXT_INREG legal and add selection support.
We were defaulting to the lower action for this, resulting in SHL+ASHR
sequences. On AArch64 we can do this in one instruction for an arbitrary
extension using SBFM as we do for G_SEXT.

Differential Revision: https://reviews.llvm.org/D81992
2020-06-19 13:20:41 -07:00
Sanjay Patel 216a37bb46 [VectorCombine] refactor extract-extract logic; NFCI 2020-06-19 14:52:27 -04:00
Lang Hames bf783a6aa8 [JITLink] Display host -> target address mapping in debugging output.
This can be helpful for sanity checking JITLink memory manager behavior.
2020-06-19 10:05:02 -07:00
Sanjay Patel 6d864097a2 [VectorCombine] fix crash while transforming constants
This is a variation of the proposal in D82049 with an extra test.
2020-06-19 12:30:32 -04:00
Stanislav Mekhanoshin 2b87a44c49 [AMDGPU] Some formatting fixes. NFC. 2020-06-19 09:02:59 -07:00
Piotr Sobczak 6d9565d6d5 Revert "[AMDGPU] Select s_cselect"
This caused some failures detected by the buildbot with
expensive checks enabled.

This reverts commit 4067de569f.
2020-06-19 16:41:04 +02:00
dfukalov 129388ddc4 [AMDGPU][CostModel] Add fneg cost estimation
Summary: The estimation uses AMDGPUTargetLowering::isFNegFree()

Reviewers: rampitec

Reviewed By: rampitec

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82065
2020-06-19 17:31:35 +03:00
Piotr Sobczak 4067de569f [AMDGPU] Select s_cselect
Summary:
Add patterns to select s_cselect in the isel.

Handle more cases of implicit SCC accesses in si-fix-sgpr-copies
to allow new patterns to work.

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, asbirlea, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81925
2020-06-19 16:17:46 +02:00
Sjoerd Meijer 4aa893b8f2 [ARM][MVE] tail-predication: renamed internal option.
Renamed -force-tail-predication to -force-mve-tail-predication because
that's more descriptive and consistent.
2020-06-19 15:07:06 +01:00
Mikhail Maltsev 490f78c038 [ARM][BFloat] Implement lowering of bf16 load/store intrinsics
Reviewers: labrinea, dmgreen, pratlucas, LukeGeeson

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81486
2020-06-19 14:02:35 +00:00
Mikhail Maltsev 7526881246 [ARM][BFloat] Lowering of create/get/set/dup intrinsics
This patch adds codegen for the following BFloat
operations to the ARM backend:
* concatenation of bf16 vectors
* bf16 vector element extraction
* bf16 vector element insertion
* duplication of a bf16 value into each lane of a vector
* duplication of a bf16 vector lane into each lane

Differential Revision: https://reviews.llvm.org/D81411
2020-06-19 12:52:40 +00:00
Simon Pilgrim c143db3b10 [X86][SSE] combineHorizontalPredicateResult - improve all_of(X == 0) for vXi64 on pre-SSE41 targets
Without SSE41 we don't have the PCMPEQQ instruction, making cmp-with-zero reductions more complicated than necessary. We can compare as vXi32 (PCMPEQD) and tweak the MOVMSK comparison to test upper/lower DWORD comparisons.

This pre-fixes something that occurs with null tests for vectors of (64-bit) pointers such as in PR35129.
2020-06-19 11:43:25 +01:00
Vitaly Buka 0e1bdeafc9 [StackSafety,NFC] Fix comment 2020-06-19 03:11:13 -07:00
Tyker 67448a8ccc try to fix build bot after b7338fb1a6 2020-06-19 12:02:09 +02:00
Simon Pilgrim cad2038700 [X86][SSE] combineSetCCMOVMSK - fold MOVMSK(SHUFFLE(X,u)) -> MOVMSK(X)
If we're permuting ALL the elements of a single vector, then for allof/anyof MOVMSK tests we can avoid the shuffle entirely.
2020-06-19 10:57:52 +01:00
David Sherwood 584d0d5c17 [SVE] Fall back on DAG ISel at -O0 when encountering scalable types
At the moment we use Global ISel by default at -O0, however it is
currently not capable of dealing with scalable vectors for two
reasons:

1. The register banks know nothing about SVE registers.
2. The LLT (Low Level Type) class knows nothing about scalable
   vectors.

For now, the easiest way to avoid users hitting issues when using
the SVE ACLE is to fall back on normal DAG ISel when encountering
instructions that operate on scalable vector types.

I've added a couple of RUN lines to existing SVE tests to ensure
we can compile at -O0. I've also added some new tests to

  CodeGen/AArch64/GlobalISel/arm64-fallback.ll

that demonstrate we correctly fallback to DAG ISel at -O0 when
lowering formal arguments or translating instructions that involve
scalable vector types.

Differential Revision: https://reviews.llvm.org/D81557
2020-06-19 10:57:00 +01:00
David Sherwood 0dc28af219 [CodeGen,AArch64] Fix up warnings in performExtendCombine
Try to avoid calling getVectorNumElements() or relying upon the
TypeSize conversion to uin64_t.

Differential Revision: https://reviews.llvm.org/D81573
2020-06-19 10:34:51 +01:00
Vitaly Buka f224f3d0f2 [StackSafety] Add StackLifetime::isAliveAfter
This function is going to be added into StackSafety checks.
This patch uses function in ::print implementation to make sure
that it works as expected.
2020-06-19 02:32:17 -07:00
Vitaly Buka 306c257b00 [SafeStack,NFC] Print liveness for all instrunctions 2020-06-19 02:32:17 -07:00
Vitaly Buka 20b1094a04 [StackSafety,NFC] Replace map with vector
We don't need to lookup InstructionNumbering by number, so
we can use vector with index as assigned number.
2020-06-19 02:32:17 -07:00
Vitaly Buka 7b27c09f63 [StackSafety,NFC] Don't test terminators
Code does not track terminators and do not expose them through interface.
State there is just a state of the last instruction or entry.
So this information is just redundant and doesn't need to be tested.
2020-06-19 02:32:17 -07:00
Florian Hahn f9d8e33c32 [SCCP] Turn sext into zext for non-negative ranges.
This patch updates SCCP/IPSCCP to use the computed range info to turn
sexts into zexts, if the value is known to be non-negative. We already
to a similar transform in CorrelatedValuePropagation, but it seems like
we can catch a lot of additional cases by doing it in SCCP/IPSCCP as
well.

The transform is limited to ranges that are known to not include undef.

Currently constant ranges from conditions are treated as potentially
containing undef, due to PR46144. Once we flip this, the transform will
be more effective in practice.

Reviewers: efriedma, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D81756
2020-06-19 10:17:55 +01:00
Jay Foad 7cdf4326a8 [LiveIntervals] Fix early-clobber handling in handleMoveUp
Without this fix, handleMoveUp can create an invalid live range like
this:

[98904e,98908r:0)[98908e,227504r:1)

where the two segments overlap, but only because we have lost the "e"
(early-clobber) on the end point of the first segment.

Differential Revision: https://reviews.llvm.org/D82110
2020-06-19 10:17:04 +01:00
Tyker b7338fb1a6 [AssumeBundles] add cannonicalisation to the assume builder
Summary:
this reduces significantly the number of assumes generated without aftecting too much
the information that is preserved. this improves the compile-time cost
of enable-knowledge-retention significantly.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, asbirlea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79650
2020-06-19 10:32:26 +02:00
David Sherwood 7edc7f6edb [CodeGen] Fix SimplifyDemandedBits for scalable vectors
For now I have changed SimplifyDemandedBits and it's various callers
to assume we know nothing for scalable vectors and to ignore the
demanded bits completely. I have also done something similar for
SimplifyDemandedVectorElts. These changes fix up lots of warnings
due to calls to EVT::getVectorNumElements() for types with scalable
vectors. These functions are all used for optimisations, rather than
functional requirements. In future we can revisit this code if
there is a need to improve code quality for SVE.

Differential Revision: https://reviews.llvm.org/D80537
2020-06-19 07:59:35 +01:00
David Sherwood 9e811b0d93 [CodeGen] Fix ComputeNumSignBits for scalable vectors
When trying to calculate the number of sign bits for scalable vectors
we should just bail out for now and pretend we know nothing.

Differential Revision: https://reviews.llvm.org/D81093
2020-06-19 07:58:42 +01:00
Kristof Beyls d938ec4509 [AArch64] Avoid incompatibility between SLSBLR mitigation and BTI codegen.
A "BTI c" instruction only allows jumping/calling to using a BLR* instruction.
However, the SLSBLR mitigation changes a BLR to a BR to implement the
function call. Therefore, a "BTI c" check that passed before could
trigger after the BLR->BL change done by the SLSBLR mitigation.
However, if the register used in BR is X16 or X17, this trigger will not
fire (see ArmARM for further details).

Therefore, this patch simply changes the function stubs for the SLSBLR
mitigation from
__llvm_slsblr_thunk_x<N>:
    br x<N>
    SpeculationBarrier
to
__llvm_slsblr_thunk_x<N>:
    mov x16, x<N>
    br  x16
    SpeculationBarrier

Differential Revision: https://reviews.llvm.org/D81405
2020-06-19 06:21:54 +01:00
Ronak Chauhan 5bd33de9c8 [MC] Pass the symbol rather than its name to onSymbolStart()
Summary: This allows targets to also consider the symbol's type and/or address if needed.

Reviewers: scott.linder, jhenderson, MaskRay, aardappel

Reviewed By: scott.linder, MaskRay

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, aheejin, rupprecht, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82090
2020-06-19 09:30:12 +05:30
Francesco Petrogalli d32c134648 [llvm][SVE] Reg + reg addressing mode for LD1RO.
Reviewers: efriedma, sdesmalen

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80741
2020-06-19 03:56:10 +00:00
Nemanja Ivanovic 1fed131660 [PowerPC] Canonicalize shuffles to match more single-instruction masks on LE
We currently miss a number of opportunities to emit single-instruction
VMRG[LH][BHW] instructions for shuffles on little endian subtargets. Although
this in itself is not a huge performance opportunity since loading the permute
vector for a VPERM can always be pulled out of loops, producing such merge
instructions is useful to downstream optimizations.
Since VPERM is essentially opaque to all subsequent optimizations, we want to
avoid it as much as possible. Other permute instructions have semantics that can
be reasoned about much more easily in later optimizations.

This patch does the following:
- Canonicalize shuffles so that the first element comes from the first vector
  (since that's what most of the mask matching functions want)
- Switch the elements that come from splat vectors so that they match the
  corresponding elements from the other vector (to allow for merges)
- Adds debugging messages for when a shuffle is matched to a VPERM so that
  anyone interested in improving this further can get the info for their code

Differential revision: https://reviews.llvm.org/D77448
2020-06-18 21:54:22 -05:00
Carl Ritson 8f3b2c8aa3 AMDGPU/GlobalISel: Remove selection of MAD/MAC when not available
Add code to respect mad-mac-f32-insts target feature.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D81990
2020-06-19 10:30:19 +09:00
Vitaly Buka fcd67665a8 [StackSafety] Add "Must Live" logic
Summary:
Extend StackLifetime with option to calculate liveliness
where alloca is only considered alive on basic block entry
if all non-dead predecessors had it alive at terminators.

Depends on D82043.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82124
2020-06-18 16:53:37 -07:00
Nathan James 8b0df1c1a9
[NFC] Refactor Registry loops to range for 2020-06-19 00:40:10 +01:00
Vitaly Buka f672791e08 [StackSafety] Add pass for StackLifetime testing
Summary: lifetime.ll is a copy of SafeStack/X86/coloring2.ll

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: hiraditya, mgrang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82043
2020-06-18 16:34:18 -07:00
Matt Arsenault bbd78519f9 ARC: Enforce function alignment at code emission time
Don't do this in the MachineFunctionInfo constructor. Also, ensure the
alignment rather than overwriting it outright. I vaguely remember
there was another place to enforce the target minimum alignment, but I
couldn't find it (it's there for instructions).
2020-06-18 17:40:49 -04:00
Matt Arsenault 95605b784b AMDGPU/GlobalISel: Implement computeKnownAlignForTargetInstr
We probably need to move where intrinsics are lowered to copies to
make this useful.
2020-06-18 17:28:00 -04:00
Matt Arsenault b13f6b0fe0 BypassSlowDivision: Fix dropping debug info
I don't know anything about debug info, but this seems like more work
should be necessary. This constructs a new IRBuilder and reconstructs
the original divides rather than moving the original.

One problem this has is if a div/rem pair are handled, both end up
with the same debugloc. I'm not sure how to fix this, since this uses
a cache when it sees the same input operands again, which will have
the first instance's location attached.
2020-06-18 17:27:19 -04:00
Amy Kwan c45c161130 [PowerPC][Power10] Implement Parallel Bits Deposit/Extract Builtins in LLVM/Clang
This patch implements builtins for the following prototypes:

vector unsigned long long vec_pdep(vector unsigned long long, vector unsigned long long);
vector unsigned long long vec_pext(vector unsigned long long, vector unsigned long long __b);
unsigned long long __builtin_pdepd (unsigned long long, unsigned long long);
unsigned long long __builtin_pextd (unsigned long long, unsigned long long);

Revision Depends on D80758

Differential Revision: https://reviews.llvm.org/D80935
2020-06-18 16:23:56 -05:00
Matt Arsenault 7f8b2e1b91 GlobalISel: Pass LegalizerHelper to custom legalize callbacks
This was passing in all the parameters needed to construct a
LegalizerHelper in the custom legalization, when it's simpler to just
pass in the existing helper.

This is slightly more annoying to use in the common case where you
don't need the legalizer helper, but we could add back the common
parameters back in addition to the helper.

I didn't propagate this to all the internal target changes that this
logically implies, but did update a sample one for
legalizeMinNumMaxNum.

This is in preparation for moving AMDGPU load/store legalization
entirely into custom lowering. The current set of legalization actions
is really constraining and not really capable of expressing all the
actions needed to legalize loads/stores. In particular there's no way
to express when the memory access itself needs to change size vs. the
result type. There's also a lot of redundancy since the same
split/widen actions need to be applied in both vector and scalar
cases. All of the sub-cases logically belong as steps in the legalizer
helper, but it will be easier to consider everything at once in custom
lowering.
2020-06-18 17:17:38 -04:00
Christopher Tetreault 8d11ec66b6 [SVE] Remove calls to VectorType::getNumElements from Transforms/Utils
Reviewers: efriedma, c-rhodes, david-arm, Tyker, asbirlea

Reviewed By: david-arm

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82057
2020-06-18 13:39:14 -07:00
Alexandre Ganea 2ae0df5be7 [CodeView] Revert 8374bf4363 and 403f953792
This reverts:
8374bf4363 [CodeView] Fix generated command-line expansion in LF_BUILDINFO. Fix the 'pdb' entry which was previously a null reference, now an empty string.
403f953792 [CodeView] Add full repro to LF_BUILDINFO record

This is causing the lld/test/COFF/pdb-relative-source-lines.test to fail: http://lab.llvm.org:8011/builders/lld-x86_64-win/builds/1096/steps/test-check-all/logs/FAIL%3A%20lld%3A%3Apdb-relative-source-lines.test
And clang/test/CodeGen/debug-info-codeview-buildinfo.c fails as well: http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/33346/steps/ninja%20check%201/logs/FAIL%3A%20Clang%3A%3Adebug-info-codeview-buildinfo.c
2020-06-18 16:18:46 -04:00
Kirill Naumov 41d53194fb [BasicBlock] Added AnnotationWriter functionality to BasicBlock class
This functionality is very similar to Function compatibility with
AnnotationWriter. This change allows us to use AnnotationWriter with
BasicBlock through BB.print() method.

Reviewed-By: apilipenko
Differntial Revision: https://reviews.llvm.org/D81321
2020-06-18 19:49:58 +00:00
Sanjay Patel 46a285ad9e [IRBuilder] add/use wrapper to create a generic compare based on predicate type; NFC
The predicate can always be used to distinguish between icmp and fcmp,
so we don't need to keep repeating this check in the callers.
2020-06-18 15:47:06 -04:00
Davide Italiano 8cdd2a158c [SimplifyCFG] Update debug location when folding branch to common destination
Sometimes a dead block gets folded and the debug information is still
retained. This manifests as jumpy stepping in lldb, see the bugzilla PR
for an end-to-end C testcase.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46008

Differential Revision:  https://reviews.llvm.org/D82062
2020-06-18 12:33:32 -07:00
Michael Liao 2defe55722 [TTI] Expose isNoopAddrSpaceCast in TTI.
Reviewers: arsenm

Subscribers: wdng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82025
2020-06-18 14:40:47 -04:00
serge-sans-paille 4dd332723d Fix return status of LoopDistribute
Move code that may update the IR after precondition, so that if precondition
fail, the IR isn't modified.

Differential Revision: https://reviews.llvm.org/D81225
2020-06-18 20:13:18 +02:00
Matt Arsenault 779cba79ec AMDGPU: Remove mayLoad/mayStore from some side effecting intrinsics
These don't really modify any memory, and should not expect memory
operands.
2020-06-18 14:12:19 -04:00
Stanislav Mekhanoshin 6c7e1b16fa [AMDGPU] Added new encoding to getMCOpcodeGen
Nothing breaks yet, but all encodings shall be in the map.

Differential Revision: https://reviews.llvm.org/D81974
2020-06-18 10:11:33 -07:00
Arthur Eubanks 91ef930526 [GlobalOpt] Remove preallocated calls when possible
When possible (e.g. internal linkage), strip preallocated attribute off
parameters/arguments.
This requires removing the "preallocated" operand bundle from the call
site, replacing @llvm.call.preallocated.arg() with an alloca and a
bitcast to i8*, and removing the @llvm.call.preallocated.setup(). Since
@llvm.call.preallocated.arg() can be called multiple times with the same
arg index, we create an alloca per arg index.
We add a @llvm.stacksave() where the @llvm.call.preallocated.setup() was
and a @llvm.stackrestore() after the preallocated call to prevent the
stack from blowing up. This is valid because the argument would normally
not exist on the stack after the call before the transformation.

This does not currently handle all possible preallocated calls. We will
need to figure out where to put @llvm.stackrestore() in the cases where
there is no obvious place to put it, for example conditional
preallocated calls, invokes.

This sort of transformation may need to be moved to somewhere more
accessible to accomodate similar transformations (like inlining) in the
future.

Reviewers: efriedma, hans

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80951
2020-06-18 09:56:13 -07:00
Alexandros Lamprineas ecdf48f15b [ARM] Basic bfloat support
This patch adds basic support for BFloat in the Arm backend.
For now the code generation relies on fullfp16 being present.

Briefly:
* adds the bfloat scalar and vector types in the necessary register classes,
* adjusts the calling convention to cope with bfloat argument passing and return,
* adds codegen patterns for moves, loads and stores.

It's tested mostly by the intrinsic patches that depend on it (load/store, convert/copy).

The following people contributed to this patch:

 * Alexandros Lamprineas
 * Ties Stuij

Differential Revision: https://reviews.llvm.org/D81373
2020-06-18 17:26:24 +01:00
Simon Pilgrim 2474421398 [TargetLowering] SimplifyMultipleUseDemandedBits - drop already extended ISD::SIGN_EXTEND_INREG nodes.
If the source of the SIGN_EXTEND_INREG node is already sign extended, use the source directly.
2020-06-18 16:41:08 +01:00
Matt Arsenault 6f09bb7da2 AMDGPU: Don't pass MachineFunction if only the IR Function is used 2020-06-18 11:06:46 -04:00
Ayke van Laethem b4c91462e8
[AVR] Fix miscompilation of zext + add
Code like the following:

    define i32 @foo(i32 %a, i1 zeroext %b) addrspace(1) {
    entry:
      %conv = zext i1 %b to i32
      %add = add nsw i32 %conv, %a
      ret i32 %add
    }

Would compile to the following (incorrect) code:

    foo:
        mov     r18, r20
        clr     r19
        add     r22, r18
        adc     r23, r19
        sbci    r24, 0
        sbci    r25, 0
        ret

Those sbci instructions are clearly wrong, they should have been adc
instructions.

This commit improves codegen to use adc instead:

    foo:
        mov     r18, r20
        clr     r19
        ldi     r20, 0
        ldi     r21, 0
        add     r22, r18
        adc     r23, r19
        adc     r24, r20
        adc     r25, r21
        ret

This code is not optimal (it could be just 5 instructions instead of the
current 9) but at least it doesn't miscompile.

Differential Revision: https://reviews.llvm.org/D78439
2020-06-18 16:51:37 +02:00
Matt Arsenault 243303f8d7 Lanai: Remove unused method
This was depending on the MachineFunction at MachineFunctionInfo
construction, which will soon be disallowed.
2020-06-18 10:48:14 -04:00
Simon Pilgrim fe0a85faf4 [X86][SSE] Fold MOVMSK(PCMPEQ(X,0)) == -1 -> PTESTZ(X,X)
Allow combineSetCCMOVMSK to handle 'allof' X == 0 patterns to be replaced with PTESTZ

This is a preliminary patch before properly handling PR35129
2020-06-18 15:38:32 +01:00
Alexandre Ganea 8374bf4363 [CodeView] Fix generated command-line expansion in LF_BUILDINFO. Fix the 'pdb' entry which was previously a null reference, now an empty string.
Previously, the DIA SDK didn't like the empty reference in the 'pdb' entry.
2020-06-18 10:07:30 -04:00
Kamlesh Kumar 7622ea5835 [RISCV64] Emit correct lib call for fp(float/double) to ui/si
Since i32 is not legal in riscv64,
it always promoted to i64 before emitting lib call and
for conversions like float/double to int and float/double to unsigned int
wrong lib call was emitted. This commit fix it using custom lowering.

Differential Revision: https://reviews.llvm.org/D80526
2020-06-18 19:34:16 +05:30
Igor Kudrin 6853cc7221 [MC] Rename a misnamed function. NFC.
The patch renames MakeStartMinusEndExpr() to makeEndMinusStartExpr() to
better reflect an expression it creates and fix a naming style issue.

Differential Revision: https://reviews.llvm.org/D82079
2020-06-18 20:18:19 +07:00
Alexandre Ganea 403f953792 [CodeView] Add full repro to LF_BUILDINFO record
This patch adds some missing information to the LF_BUILDINFO which allows for rebuilding an .OBJ without any external dependency but the .OBJ itself (other than the compiler executable).

Some tools need this information to reproduce a build without any knowledge of the build system. The LF_BUILDINFO therefore stores a full path to the compiler, the PWD (which is the CWD at program startup), a relative or absolute path to the TU, and the full CC1 command line. The command line needs to be freestanding (not depend on any environment variable). In the same way, MSVC doesn't store the provided command-line, but an expanded version (somehow their equivalent of CC1) which is also freestanding.

For more information see PR36198 and D43002.

Differential Revision: https://reviews.llvm.org/D80833
2020-06-18 09:17:15 -04:00
Alexandre Ganea 24eff42ba4 [CodeView] Add TypeCollection::replaceType to replace type records post-merging
The API is not called in this patch. This is to simply/support https://reviews.llvm.org/D80833
2020-06-18 09:17:14 -04:00
Alexandre Ganea a45409d885 [Clang] Move clang::Job::printArg to llvm::sys::printArg. NFCI.
This patch is to support/simplify https://reviews.llvm.org/D80833
2020-06-18 09:17:13 -04:00
Florian Hahn 1669fddc9f [Matrix] Use alignment info when lowering loads/stores.
This patch updates LowerMatrixIntrinsics to preserve the alignment
specified at the original load/stores and the align attribute for the
pointer argument of the column.major.load/store intrinsics.

We can always use the specified alignment for the load of the first
column. For subsequent columns, the alignment may need to be reduced.

For ConstantInt strides, compute the offset for the start of the column in
bytes and use commonAlignment to get the largest valid alignment.

For non-ConstantInt strides, we need to take the common alignment of the
initial alignment and the element size in bytes.

Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, rjmccall

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D81960
2020-06-18 13:19:31 +01:00
Lucas Prates 92ad6d57c2 [ARM] Moving CMSE handling of half arguments and return to the backend
Summary:
As half-precision floating point arguments and returns were previously
coerced to either float or int32 by clang's codegen, the CMSE handling
of those was also performed in clang's side by zeroing the unused MSBs
of the coercer values.

This patch moves this handling to the backend's calling convention
lowering, making sure the high bits of the registers used by
half-precision arguments and returns are zeroed.

Reviewers: chill, rjmccall, ostannard

Reviewed By: ostannard

Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D81428
2020-06-18 13:16:29 +01:00
Lucas Prates a255931c40 [ARM] Supporting lowering of half-precision FP arguments and returns in AArch32's backend
Summary:
Half-precision floating point arguments and returns are currently
promoted to either float or int32 in clang's CodeGen and there's
no existing support for the lowering of `half` arguments and returns
from IR in AArch32's backend.

Such frontend coercions, implemented as coercion through memory
in clang, can cause a series of issues in argument lowering, as causing
arguments to be stored on the wrong bits on big-endian architectures
and incurring in missing overflow detections in the return of certain
functions.

This patch introduces the handling of half-precision arguments and returns in
the backend using the actual "half" type on the IR. Using the "half"
type the backend is able to properly enforce the AAPCS' directions for
those arguments, making sure they are stored on the proper bits of the
registers and performing the necessary floating point convertions.

Reviewers: rjmccall, olista01, asl, efriedma, ostannard, SjoerdMeijer

Reviewed By: ostannard

Subscribers: stuij, hiraditya, dmgreen, llvm-commits, chill, dnsampaio, danielkiss, kristof.beyls, cfe-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D75169
2020-06-18 13:15:13 +01:00
Paul Walker 4612f39120 [SVE] Add flag to specify SVE register size, using this to calculate legal vector types.
Adds aarch64-sve-vector-bits-{min,max} to allow the size of SVE
data registers (in bits) to be specified. This allows the code
generator to make assumptions it normally couldn't. As a starting
point this information is used to mark fixed length vector types
that can fit within the specified size as legal.

Reviewers: rengolin, efriedma

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80384
2020-06-18 12:11:16 +00:00
Sameer Sahasrabuddhe 7aad220795 [DA] conservatively mark the join of every divergent branch
For a loop, a join block is a block that is reachable along multiple
disjoint paths from the exiting block of a loop. If the exit condition
of the loop is divergent, then such join blocks must also be marked
divergent. This currently fails in some cases because not all join
blocks are identified correctly.

The workaround is to conservatively mark every join block of any
branch (not necessarily the exiting block of a loop) as divergent.

https://bugs.llvm.org/show_bug.cgi?id=46372

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D81806
2020-06-18 17:39:20 +05:30
Florian Hahn d88acd8f7d [Matrix] Preserve volatile when loading loads/stores.
Currently the matrix lowering turns volatile loads/stores into
non-volatile ones. This patch updates the lowering to preserve the
volatile bit.

Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, nicolasvasilache

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D81498
2020-06-18 12:14:19 +01:00
Jeremy Morse 3626eba11f [NFC][LiveDebugValues] Document how LiveDebugValues operates
We're missing a plain English explanation of how this pass is supposed
to operate -- add one to the file comment.

Differential Revision: https://reviews.llvm.org/D80929
2020-06-18 10:54:09 +01:00
Ayke van Laethem 15bf42d503
[AVR] Implement disassembly of 32-bit instructions
This needed two fixes:

  * 32-bit instructions were read in the wrong order. The machine code
    swaps the two 16-bit instruction words, which wasn't undone when
    decoding instructions.
  * Jump and call instructions don't encode the lowest address bit,
    which is always zero. Therefore, the address needed to be shifted by
    one to fix that.

Differential Revision: https://reviews.llvm.org/D81961
2020-06-18 11:26:58 +02:00
David Sherwood 7e30ef77f6 [CodeGen] Fix warnings in getVectorTypeBreakdown
Added NextPowerOf2() routine to TypeSize and rewritten the code
in getVectorTypeBreakdown to avoid warnings being generated.

Differential Revision: https://reviews.llvm.org/D81578
2020-06-18 09:54:16 +01:00
Florian Hahn 6d18c2067e [Matrix] Update load/store intrinsics.
This patch adjust the load/store matrix intrinsics, formerly known as
llvm.matrix.columnwise.load/store, to improve the naming and allow
passing of extra information (volatile).

The patch performs the following changes:
 * Rename columnwise.load/store to column.major.load/store. This is more
   expressive and also more in line with the naming in Clang.
 * Changes the stride arguments from i32 to i64. The stride can be
   larger than i32 and this makes things more uniform with the way
   things are handled in Clang.
 * A new boolean argument is added to indicate whether the load/store
   is volatile. The lowering respects that when emitting vector
   load/store instructions
 * MatrixBuilder is updated to require both Alignment and IsVolatile
   arguments, which are passed through to the generated intrinsic. The
   alignment is set using the `align` attribute.

The changes are grouped together in a single patch, to have a single
commit that breaks the compatibility. We probably should be fine with
updating the intrinsics, as we did not yet officially support them in
the last stable release. If there are any concerns, we can add
auto-upgrade rules for the columnwise intrinsics though.

Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, nicolasvasilache, rjmccall, ftynse

Reviewed By: anemet, nicolasvasilache

Differential Revision: https://reviews.llvm.org/D81472
2020-06-18 09:44:52 +01:00
David Sherwood 65912a9768 [CodeGen] Fix warnings in foldCONCAT_VECTORS
Instead of asserting the number of elements is the same, we should be
comparing the element counts instead. In addition, when looking at
concats of extract_subvectors it's fine to use getVectorMinNumElements()
for scalable vectors.

I discovered these warnings when compiling the structured loads tests in
this file:

  test/CodeGen/AArch64/sve-intrinsics-loads.ll

Differential Revision: https://reviews.llvm.org/D81936
2020-06-18 09:29:37 +01:00
serge-sans-paille f9c7e3136e Correctly report modified status for HWAddressSanitizer
Differential Revision: https://reviews.llvm.org/D81238
2020-06-18 10:27:44 +02:00
David Green 158e734af1 [ARM] Adjust AND/OR combines to not call isConstantSplat on i1 vectors. NFC.
The rearranges PerformANDCombine and PerformORCombine to try and make
sure we don't call isConstantSplat on any i1 vectors. As pointed out in
D81860 it may not be very well defined in those cases.
2020-06-18 08:25:44 +01:00
Kristof Beyls 832cfc7672 [IndirectThunks] Make generated MF structure as expected by all instruction selectors.
This also enables running the AArch64 SLSHardening pass with GlobalISel,
so add a test for that.

Differential Revision: https://reviews.llvm.org/D81403
2020-06-18 06:44:53 +01:00
Kristof Beyls 3f0cc96a96 [AArch64] SLSHardening: compute correct thunk name for X29.
The enum values for AArch64 registers are not all consecutive.
Therefore, the computation
  "__llvm_slsblr_thunk_x" + utostr(Reg - AArch64::X0)
is not always correct. utostr(Reg - AArch64::X0) will not generate the
expected string for the registers that do not have consecutive values in
the enum.
This happened to work for most registers, but does not for AArch64::FP
(i.e. register X29).
This can get triggered when the X29 is not used as a frame pointer.

Differential Revision: https://reviews.llvm.org/D81997
2020-06-18 06:36:49 +01:00
Xing GUO d261a1c0e0 [DWARFYAML][debug_abbrev] Make the abbreviation code optional.
This patch helps make the `Code` optional in abbreviations table.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D81826
2020-06-18 13:02:54 +08:00
Mehdi Amini 77b79d79c0 Remove "unused" member ModuleSlice from `struct OpenMPOpt`
This is fixing warning from clang:

 warning: private field 'ModuleSlice' is not used [-Wunused-private-field]
  SmallPtrSetImpl<Function *> &ModuleSlice;
                               ^

Differential Revision: https://reviews.llvm.org/D82027
2020-06-18 03:02:26 +00:00
Kang Zhang 58e19d465a [PowerPC] Don't convert Loop to CTR Loop for fp128 BinaryOperator
Summary:
For PPC BinaryOperator of fp128 will become libcall, we shouldn't
convert loop to CTR loop if the loop contain libCall.

But currently, in the PPCTTIImpl::mightUseCTR() function, we only deal
with BinaryOperator for ppc_fp128, don't deal with the fp128.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D81353
2020-06-18 02:54:19 +00:00
Xing GUO 1f391afbf4 [ObjectYAML][ELF] Add support for emitting the .debug_abbrev section.
This patch enables yaml2elf emit the .debug_abbrev section.

The generated .debug_abbrev is verified using `llvm-dwarfdump`.

Known issues that will be addressed later:
- Current implementation doesn't support generating multiple abbreviation tables in one .debug_abbrev section.

Reviewed By: jhenderson, grimar

Differential Revision: https://reviews.llvm.org/D81820
2020-06-18 10:50:38 +08:00
Esme-Yi ad6024e29f [PowerPC] Custom lower rotl v1i128 to vector_shuffle.
Summary: A bug is reported in bugzilla-45628, where the swap_with_shift case can’t be matched to a single HW instruction xxswapd as expected.
In fact the case matches the idiom of rotate. We have MatchRotate to handle an ‘or’ of two operands and generate a rot[lr] if the case matches the idiom of rotate. While PPC doesn’t support ROTL v1i128. We can custom lower ROTL v1i128 to the vector_shuffle. The vector_shuffle will be matched to a single HW instruction during the phase of instruction selection.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D81076
2020-06-18 01:32:23 +00:00
Sam Clegg 7ee758d691 [WebAssembly] MC: Fix for data aliases with offsets (getelementptr)
For some reason we hadn't seen such cases in the wild which makes
me think that clang and rustc don't generate these.  In the bug which
reproduces it only occurs with LTO so my guess is that some LTO pass
is creating this alias + gep.

See: https://github.com/emscripten-core/emscripten/issues/8731

Differential Revision: https://reviews.llvm.org/D79462
2020-06-17 16:25:50 -07:00
Matt Arsenault 5f5f566b26 AMDGPU: Don't use 16-bit FP inline constants in integer operands
It seems to be a hardware defect that the half inline constants do not
work as expected for the 16-bit integer operations (the inverse does
work correctly). Experimentation seems to show these are really
reading the 32-bit inline constants, which can be observed by writing
inline asm using op_sel to see what's in the high half of the
constant. Theoretically we could fold the high halves of the 32-bit
constants using op_sel.

The *_asm_all.s MC tests are broken, and I don't know where the script
to autogenerate these are. I started manually fixing it, but there's
just too many cases to fix. This also does break the
assembler/disassembler support for these values, and I'm not sure what
to do about it. These are still valid encodings, so it seems like you
should be able to use them in some way. If you wrote assembly using
them, you could have really meant it (perhaps to read the high bits
with op_sel?). The disassembler will print the invalid literal
constant which will fail to re-assemble. The behavior is also
different depending on the use context. Consider this example, which
was previously accepted and encoded using the inline constant:

  v_mad_i16 v5, v1, -4.0, v3
  ; encoding: [0x05,0x00,0xec,0xd1,0x01,0xef,0x0d,0x04]

In contexts where an inline immediate is required (such as on gfx8/9),
this will now be rejected. For gfx10, this will produce the literal
encoding and change the printed format:
  v_mad_i16 v5, v1, 0xc400, v3
  ; encoding: [0x05,0x00,0x5e,0xd7,0x01,0xff,0x0d,0x04,0x00,0xc4,0x00,0x00]

This is just another variation of the issue that we don't perfectly
handle round trip assembly/disassembly due to not tracking how
immediates were encoded. This doesn't matter much in practice, since
compilers don't emit the suboptimal encoding. I doubt any users are
relying on this behavior (although I did make use of the old behavior
to figure out what was wrong).

Fixes bug 46302.
2020-06-17 19:14:10 -04:00
Yonghong Song 89648eb16d [BPF] fix a bug for BTF pointee type pruning
In BTF, pointee type pruning is used to reduce cluttering
too many unused types into prog BTF. For example,
   struct task_struct {
      ...
      struct mm_struct *mm;
      ...
   }
If bpf program does not access members of "struct mm_struct",
there is no need to bring types for "struct mm_struct" to BTF.

This patch fixed a bug where an incorrect pruning happened.
The test case like below:
    struct t;
    typedef struct t _t;
    struct s1 { _t *c; };
    int test1(struct s1 *arg) { ... }

    struct t { int a; int b; };
    struct s2 { _t c; }
    int test2(struct s2 *arg) { ... }

After processing test1(), among others, BPF backend generates BTF types for
    "struct s1", "_t" and a placeholder for "struct t".
Note that "struct t" is not really generated. If later a direct access
to "struct t" member happened, "struct t" BTF type will be generated
properly.

During processing test2(), when processing member type "_t c",
BPF backend sees type "_t" already generated, so returned.
This caused the problem that "struct t" BTF type is never generated and
eventually causing incorrect type definition for "struct s2".

To fix the issue, during DebugInfo type traversal, even if a
typedef/const/volatile/restrict derived type has been recorded in BTF,
if it is not a type pruning candidate, type traversal of its base type continues.

Differential Revision: https://reviews.llvm.org/D82041
2020-06-17 15:13:46 -07:00
Eric Christopher a8dad30388 Revert "Remove unused class variable ModuleSlice." as it was
used in debug only code.

This reverts commit 07a1749081.
2020-06-17 14:45:17 -07:00
Eric Christopher 07a1749081 Remove unused class variable ModuleSlice. 2020-06-17 14:33:29 -07:00
Christopher Tetreault 8819202dfd [SVE] Eliminate bad VectorType::getNumElements() calls from ConstantFold
Summary:
Assume all usages of this function are explicitly fixed-width operations
and cast to FixedVectorType

Reviewers: efriedma, sdesmalen, c-rhodes, majnemer, dblaikie

Reviewed By: sdesmalen

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80262
2020-06-17 14:19:56 -07:00
Christopher Tetreault 4b776a98f1 [SVE] Fix invalid usages of getNumElements in ShuffleVectorInstruction
Summary:
Fix invalid usages of getNumElements identified by test case
LLVM.Transforms/InstCombine::vscale_extractelement.ll.

changesLength: Since the length of the llvm::SmallVector shufflemask
is related to the minimum number of elements in a scalable vector, it is
fine to just get the Min field of the ElementCount

isIdentityWithExtract: Since it is not possible to express the mask
needed for this pattern for scalable vectors, we can just bail before
calling getNumElements()

Reviewers: efriedma, sdesmalen, fpetrogalli, gchatelet, yrouban, craig.topper

Reviewed By: sdesmalen

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81969
2020-06-17 13:45:34 -07:00
Roman Lebedev 84b4f5a6a6
[InstCombine] Negator: while there, add detection for cycles during negation
I don't have any testcases showing it happening,
and i haven't succeeded in creating one,
but i'm also not positive it can't ever happen,
and i recall having something that looked like
that in the very beginning of Negator creation.

But since we now already have a negation cache,
we can now detect such cases practically for free.

Let's do so instead of "relying" on stack overflow :D
2020-06-17 22:47:20 +03:00
Roman Lebedev e3d8cb1e1d
[InstCombine] Negator: cache negation results (PR46362)
It is possible that we can try to negate the same value multiple times.
For example, PHI nodes may happen to have multiple incoming values
(all of which must be the same value) for the same incoming basic block.
It may happen that we try to negate such a PHI node, and succeed,
and that might result in having now-different incoming values..

To avoid that, and in general to reduce the amount of duplicated
work we might be doing, let's introduce a cache where
we'll track results of negating each value.

The added test was previously failing -verify after -instcombine.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46362
2020-06-17 22:47:20 +03:00
Roman Lebedev c4166f3d84
[NFC][InstCombine] Negator: add thin negate() wrapped before visit() 2020-06-17 22:47:20 +03:00
Roman Lebedev 2b85147337
[NFC][InstCombine] Negator: do not include unneeded "llvm/IR/DerivedTypes.h" header 2020-06-17 22:47:19 +03:00
Thomas Lively 49754dcf22 [WebAssembly] Fix bug in FixBrTables and use branch analysis utils
Summary:
This commit fixes a bug in the FixBrTables pass in which an
unconditional branch from the switch header block to the jump table
block was not removed before the blocks were combined. The result was
an invalid CFG in the MachineFunction. This commit also switches from
using bespoke branch analysis and deletion code to using the standard
utilities for the same.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81909
2020-06-17 12:34:45 -07:00
Nick Desaulniers e7816f263b [InlineSpiller] add assert about spills post terminators
Summary:
This invariant is being violated in the test case
https://reviews.llvm.org/D77849, related to the use of the relatively
new ability for callbr to have return values, and MachineBasicBlocks
with INLINEASM_BR terminators to emit live out register defs.

As noted in the comment, this triggers invariant violations in
MachineVerifier via `llc -verify-machineinstrs` or
`llc -verify-regalloc`, since only MachineInstrs that are terminators
are allowed to follow the first terminator.

https://reviews.llvm.org/D75098 may rework this very assertion if we're
spilling via a (proposed) TCOPY MachineInstr.

Reviewers: void, efriedma, arsenm

Reviewed By: efriedma

Subscribers: qcolombet, wdng, hiraditya, llvm-commits, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78166
2020-06-17 11:51:58 -07:00
Nick Desaulniers 88c965ba14 BreakCriticalEdges for callbr indirect dests
Summary:
llvm::SplitEdge was failing an assertion that the BasicBlock only had
one successor (for BasicBlocks terminated by CallBrInst, we typically
have multiple successors).  It was surprising that the earlier call to
SplitCriticalEdge did not handle the critical edge (there was an early
return).  Removing that triggered another assertion relating to creating
a BlockAddress for a BasicBlock that did not (yet) have a parent, which
is a simple order of operations issue in llvm::SplitCriticalEdge (a
freshly constructed BasicBlock must be inserted into a Function's basic
block list to have a parent).

Thanks to @nathanchance for the report.
Fixes: https://github.com/ClangBuiltLinux/linux/issues/1018

Reviewers: craig.topper, jyknight, void, fhahn, efriedma

Reviewed By: efriedma

Subscribers: eli.friedman, rnk, efriedma, fhahn, hiraditya, llvm-commits, nathanchance, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81607
2020-06-17 11:45:06 -07:00
Davide Italiano 1cbaf847ab [CGP] Reset the debug location when promoting zext(s).
When the zext gets promoted, it used to retain the original location,
which pessimizes the debugging experience causing an unexpected
jump in stepping at -Og.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46120 (which also
contains a full C repro).

Differential Revision:  https://reviews.llvm.org/D81437
2020-06-17 11:13:13 -07:00
Ian Levesque 7c7c8e0da4 [xray] Option to omit the function index
Summary:
Add a flag to omit the xray_fn_idx to cut size overhead and relocations
roughly in half at the cost of reduced performance for single function
patching.  Minor additions to compiler-rt support per-function patching
without the index.

Reviewers: dberris, MaskRay, johnislarry

Subscribers: hiraditya, arphaman, cfe-commits, #sanitizers, llvm-commits

Tags: #clang, #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D81995
2020-06-17 13:49:01 -04:00
Alexandre Ganea acb30f6856 [X86] For 32-bit targets, emit two-byte NOP when possible
In order to support hot-patching, we need to make sure the first emitted instruction in a function is a two-byte+ op. This is already the case on x86_64, which seems to always emit two-byte+ ops. However on 32-bit targets this wasn't the case.

PATCHABLE_OP now lowers to a XCHG AX, AX, (66 90) like MSVC does. However when targetting pentium3 (/arch:SSE) or i386 (/arch:IA32) targets, we generate MOV EDI,EDI (8B FF) like MSVC does. This is for compatiblity reasons with older tools that rely on this two byte pattern.

Differential Revision: https://reviews.llvm.org/D81301
2020-06-17 13:44:38 -04:00
Alexandre Ganea ad879b31f0 [X86] Change signature of EmitNops. NFC.
This is to support https://reviews.llvm.org/D81301.
2020-06-17 13:44:37 -04:00
Fangrui Song c8b082a3ab [llvm-cov gcov] Support clang<11 fake 4.2 format
Test cases are restored from a3bed4bd37
2020-06-17 10:17:15 -07:00
Michał Górny 5c621900a6 [llvm] [CommandLine] Do not suggest really hidden opts in nearest lookup
Skip 'really hidden' options when performing lookup of the nearest
option when invalid option was passed.  Since these options aren't even
documented in --help-hidden, it seems inconsistent to suggest them
to users.

This fixes clang-tools-extra test failures due to unexpected suggestions
when linking the tools to LLVM dylib (that provides more options than
the subset of LLVM libraries linked directly).

Differential Revision: https://reviews.llvm.org/D82001
2020-06-17 19:00:26 +02:00
Scott Linder 691ff4682f [AMDGPU] Skip CFIInstructions in SIInsertWaitcnts
Summary:
CFI emitted during PEI at the beginning of the prologue needs to apply
to any inserted waitcnts on function entry.

Reviewers: arsenm, t-tye, RamNalamothu

Reviewed By: arsenm

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D76881
2020-06-17 12:41:03 -04:00
vnalamot 2e28009981 [NFC] Move getAll{S,V}GPR{32,128} methods to SIFrameLowering
Summary:
Future patch needs some of these in multiple places.

The definitions of these can't be in the header and be eligible for
inlining without making the full declaration of GCNSubtarget visible.
I'm not sure what the right trade-off is, but I opted to not bloat
SIRegisterInfo.h

Reviewers: arsenm, cdevadas

Reviewed By: arsenm

Subscribers: RamNalamothu, qcolombet, jvesely, wdng, nhaehnle, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79878
2020-06-17 12:08:09 -04:00
sstefan1 7cfd267c51 [OpenMPOPT][NFC] Introducing OMPInformationCache.
Summary:
Introduction of OpenMP-specific information cache based on Attributor's `InformationCache`. This should make it easier to share information between them.

Reviewers: jdoerfert, JonChesterfield, hamax97, jhuber6, uenoku

Subscribers: yaxunl, hiraditya, guansong, uenoku, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81798
2020-06-17 16:56:45 +02:00
Jay Foad def2e4c47f [AMDGPU] Simplify GCNPassConfig::addOptimizedRegAlloc. NFC. 2020-06-17 15:56:15 +01:00
Simon Pilgrim a5f1f9c9b8 ScalarEvolution.h - reduce LoopInfo.h include to forward declarations. NFC.
Move ScalarEvolution::forgetLoopDispositions implementation to ScalarEvolution.cpp to remove the dependency.

Add implicit header dependency to source files where necessary.
2020-06-17 15:48:23 +01:00
Sjoerd Meijer d1522513d4 [ARM] Reimplement MVE Tail-Predication pass using @llvm.get.active.lane.mask
To set up a tail-predicated loop, we need to to calculate the number of
elements processed by the loop. We can now use intrinsic
@llvm.get.active.lane.mask() to do this, which is emitted by the vectoriser in
D79100. This intrinsic generates a predicate for the masked loads/stores, and
consumes the Backedge Taken Count (BTC) as its second argument. We can now use
that to reconstruct the loop tripcount, instead of the IR pattern match
approach we were using before.

Many thanks to Eli Friedman and Sam Parker for all their help with this work.

This also adds overflow checks for the different, new expressions that we
create: the loop tripcount, and the sub expression that calculates the
remaining elements to be processed. For the latter, SCEV is not able to
calculate precise enough bounds, so we work around that at the moment, but is
not entirely correct yet, it's conservative. The overflow checks can be
overruled with a force flag, which is thus potentially unsafe (but not really
because the vectoriser is the only place where this intrinsic is emitted at the
moment). It's also good to mention that the tail-predication pass is not yet
enabled by default.  We will follow up to see if we can implement these
overflow checks better, either by a change in SCEV or we may want revise the
definition of llvm.get.active.lane.mask.

Differential Revision: https://reviews.llvm.org/D79175
2020-06-17 15:17:42 +01:00
Kirill Naumov ea844c7520 Revert "[InlineCost] InlineCostAnnotationWriterPass introduced"
This reverts commit 37e06e8f5c.
2020-06-17 14:02:34 +00:00
Kirill Naumov dcf2a9f2ee Revert "[InlineCost] PrinterPass prints constants to which instructions are simplified"
This reverts commit 52b0db22f8.
2020-06-17 14:02:29 +00:00
Kirill Naumov 39a4505e34 Revert "[InlineCost] GetElementPtr with constant operands"
This reverts commit 34fba68d80.
2020-06-17 14:02:18 +00:00
Kirill Naumov 34fba68d80 [InlineCost] GetElementPtr with constant operands
If the GEP instruction contanins only constants as its arguments,
then it should be recognized as a constant. For now, there was
also added a flag to turn off this simplification if it causes
any regressions ("disable-gep-const-evaluation") which is off
by default. Once I gather needed data of the effectiveness of
this simplification, the flag will be deleted.

Reviewers: apilipenko, davidxl, mtrofin

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D81026
2020-06-17 13:40:19 +00:00
Kirill Naumov 52b0db22f8 [InlineCost] PrinterPass prints constants to which instructions are simplified
This patch enables printing of constants to see which instructions were
constant-folded. Needed for tests and better visiual analysis of
inliner's work.

Reviewers: apilipenko, mtrofin, davidxl, fedor.sergeev

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D81024
2020-06-17 13:40:18 +00:00
Kirill Naumov 37e06e8f5c [InlineCost] InlineCostAnnotationWriterPass introduced
This class allows to see the inliner's decisions for better
optimization verifications and tests. To use, use flag
"-passes="print<inline-cost>"".

Reviewers: apilipenko, mtrofin, davidxl, fedor.sergeev

Reviewed By: mtrofin

Differential revision: https://reviews.llvm.org/D81743
2020-06-17 13:40:17 +00:00
Benjamin Kramer df9a51dab3 Remove global std::strings. NFCI. 2020-06-17 14:29:42 +02:00
Sjoerd Meijer c1034d044a Follow up of rGe345d547a0d5, and attempt to pacify buildbot:
"error: 'get' is deprecated: The base class version of get with the scalable
argument defaulted to false is deprecated."

Changed VectorType::get() -> FixedVectorType::get().
2020-06-17 13:24:09 +01:00
Sjoerd Meijer e345d547a0 Recommit "[LV] Emit @llvm.get.active.lane.mask for tail-folded loops"
Fixed ARM regression test.

Please see the original commit message rG47650451738c for details.
2020-06-17 13:12:15 +01:00
David Green 076e08aa45 [LSR] Filter for postinc formulae
In more complicated loops we can easily hit the complexity limits of
loop strength reduction. If we do and filtering occurs, it's all too
easy to remove the wrong formulae for post-inc preferring accesses due
to it attempting to maximise register re-use. The patch adds an
alternative filtering step when the target is preferring postinc to pick
postinc formulae instead, hopefully lowering the complexity to below the
limit so that aggressive filtering is not needed.

There is also a change in here to stop considering existing addrecs as
free under postinc. We should already be modelling them as a reg so
don't want it to cause us to get the cost wrong. (I'm not sure that code
makes sense in general, but there are X86 tests specifically for it
where it seems to be helping so have left it around for the standard
non-post-inc case).

Differential Revision: https://reviews.llvm.org/D80273
2020-06-17 12:32:04 +01:00
Carl Ritson ac8a2f132b [AMDGPU] Fix failure in VCC spilling
Spills of VCC (SGPR64) will fail with new SGPR spill code,
because super register is not correctly resolved.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D81224
2020-06-17 20:11:15 +09:00
Benjamin Kramer 547b6da73c [CallPrinter] Remove static constructor.
No need to have std::string here. NFC.
2020-06-17 13:02:58 +02:00
Sam Parker 5bf0858c0b Return "[InstCombine] Simplify compare of Phi with constant inputs against a constant"
I originally reverted the patch because it was causing performance
issues, but now I think it's just enabling simplify-cfg to do
something that I don't want instead :)

Sorry for the noise.

This reverts commit 3e39760f8e.
2020-06-17 11:38:59 +01:00
Paul Walker 95db1e7fb9 [FileCheck] Implement * and / operators for ExpressionValue.
Subscribers: arichardson, hiraditya, thopre, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80915
2020-06-17 09:39:17 +00:00
Hans Wennborg 16ad6eeb94 [IR] Don't copy profile metadata in createCallMatchingInvoke()
The invoke instruction can have profile metadata with branch_weights,
which does not make sense for a call instruction and will be
rejected by the verifier.

Differential revision: https://reviews.llvm.org/D81996
2020-06-17 11:18:23 +02:00
serge-sans-paille 1cafd8a5d1 Fix LoopIdiomRecognize pass return status
Introduce an helper class to aggregate the cleanup in case of rollback.

Differential Revision: https://reviews.llvm.org/D81230
2020-06-17 11:12:03 +02:00
Sjoerd Meijer d4e183f686 Revert "[LV] Emit @llvm.get.active.mask for tail-folded loops"
This reverts commit 4765045173
while I investigate the build bot failures.
2020-06-17 10:09:54 +01:00
Max Kazantsev 4ac9a6902f [NFC] Add API for edge domination check in dom tree 2020-06-17 16:05:05 +07:00
Florian Hahn 773353be4e [SCCP] Move common code to simplify basic block to helper (NFC).
Reviewers: efriedma, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D81755
2020-06-17 10:03:43 +01:00
Sjoerd Meijer 4765045173 [LV] Emit @llvm.get.active.mask for tail-folded loops
This emits new IR intrinsic @llvm.get.active.mask for tail-folded vectorised
loops if the intrinsic is supported by the backend, which is checked by
querying TargetTransform hook emitGetActiveLaneMask.

This intrinsic creates a mask representing active and inactive vector lanes,
which is used by the masked load/store instructions that are created for
tail-folded loops. The semantics of @llvm.get.active.mask are described here in
LangRef:

https://llvm.org/docs/LangRef.html#llvm-get-active-lane-mask-intrinsics

This intrinsic is also used to provide a hint to the backend. That is, the
second argument of the intrinsic represents the back-edge taken count of the
loop. For MVE, for example, we use that to set up tail-predication, which is a
new form of predication in MVE for vector loops that implicitely predicates the
last vector loop iteration by implicitely setting active/inactive lanes, i.e.
the tail loop is predicated. In order to set up a tail-predicated vector loop,
we need to know the number of data elements processed by the vector loop, which
corresponds the the tripcount of the scalar loop, which we can now reconstruct
using @llvm.get.active.mask.

Differential Revision: https://reviews.llvm.org/D79100
2020-06-17 09:53:58 +01:00
Sjoerd Meijer 20835cff27 [TTI] Refactor emitGetActiveLaneMask
Refactor TTI hook emitGetActiveLaneMask and remove the unused arguments
as suggested in D79100.
2020-06-17 09:53:58 +01:00
Kirill Bobyrev 3847737fa4
[CallPrinter] Handle freq = 0 case
Improvement of the following revision:
bbc629ebd6

This might still be problematic if freq = 0, so it's better to check for
that.
2020-06-17 10:52:18 +02:00
Kirill Bobyrev bbc629ebd6
[CallPrinter] Fix maxFreq = 0 case
llvm::getHeatColor becomes a problem when maxFreq = 0 -> freq = 0 =>
log2(double(freq)) / log2(maxFreq) -> log2(0.) / log2(0.) which
results in illegal instruction on some architectures.

Problematic revision: https://reviews.llvm.org/D77172
2020-06-17 10:44:28 +02:00
Florian Hahn e4b58ea8c1 [MemDep] Also remove load instructions from NonLocalDesCache.
Currently load instructions are added to the cache for invariant pointer
group dependencies, but only pointer values are removed currently. That
leads to dangling AssertingVHs in the test case below, where we delete a
load from an invariant pointer group. We should also remove the entries
from the cache.

Fixes PR46054.

Reviewers: efriedma, hfinkel, asbirlea

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D81726
2020-06-17 09:36:53 +01:00
James Henderson b21794a91c [DebugInfo] Unify Cursor usage for all debug line opcodes
This is a natural extension of the previous changes to use the Cursor
class independently in the standard and extended opcode paths, and in
turn allows delaying error handling until the entire line has been
printed in verbose mode, removing interleaved output in some cases.

Reviewed by: MaskRay, JDevlieghere

Differential Revision: https://reviews.llvm.org/D81562
2020-06-17 09:19:24 +01:00
Vitaly Buka d812efb121 [SafeStack,NFC] Fix names after files move
Summary: Depends on D81831.

Reviewers: eugenis, pcc

Reviewed By: eugenis

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81832
2020-06-17 01:08:40 -07:00
Vitaly Buka 6754a0e2ed [SafeStack,NFC] Move SafeStackColoring code
Summary:
This code is going to be used in StackSafety.
This patch is file move with minimal changes. Identifiers
will be fixed in the followup patch.

Reviewers: eugenis, pcc

Reviewed By: eugenis

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81831
2020-06-17 01:07:47 -07:00
Jonas Paulsson d3f7448e3c [SystemZ] Bugfix in storeLoadCanUseBlockBinary().
Check that the MemoryVT of LoadA matches that of LoadB.

This fixes https://bugs.llvm.org/show_bug.cgi?id=46239.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D81671
2020-06-17 09:49:31 +02:00
Kang Zhang c2574dc9f7 [NFC]][PowerPC] Remove unused intrinsic for old CTR loop pass
Summary:

In the patch D62907 the PPC CTRLoops pass has been replaced by Generic
Hardware Loop pass, and it has imported some new intrinsic for Generic
Hardware Loop.

The old intrinsic used in PPC CTRLoops int_ppc_mtctr and
int_ppc_is_decremented_ctr_nonzero is been replaced by
int_set_loop_iterations and loop_decrement.

This patch is to remove above unused two instrinsic.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D81539
2020-06-17 07:06:46 +00:00
Serge Pavlov 2e613d2ded [Support] Get process statistics in ExecuteAndWait and Wait
The functions sys::ExcecuteAndWait and sys::Wait now have additional
argument of type pointer to structure, which is filled with process
execution statistics upon process termination. These are total and user
execution times and peak memory consumption. By default this argument is
nullptr so existing users of these function must not change behavior.

Differential Revision: https://reviews.llvm.org/D78901
2020-06-17 13:39:59 +07:00
Igor Kudrin ccbd7e8d46 [DebugInfo] Support parsing and dumping of DWARF64 macro units.
Differential Revision: https://reviews.llvm.org/D81844
2020-06-17 12:57:54 +07:00
Sameer Sahasrabuddhe d3963b3a5f [DA] propagate loop live-out values that get used in a branch
Values that are uniform within a loop but appear divergent to uses
outside the loop are "tainted" so that such uses are marked
divergent. But if such a use is a branch, then it's divergence needs
to be propagated. The simplest way to do that is to put the branch
back in the main worklist so that it is processed appropriately.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D81822
2020-06-17 09:21:00 +05:30
Itay Bookstein df9d64ed9c [IR] Add missing GlobalAlias copying of ThreadLocalMode attribute
Summary:
Previously, GlobalAlias::copyAttributesFrom did not preserve ThreadLocalMode,
causing incorrect IR generation in IR linking flows. This patch pushes the code
responsible for copying this attribute from GlobalVariable::copyAttributesFrom
down to GlobalValue::copyAttributesFrom so that it is shared by GlobalAlias.
Fixes PR46297.

Reviewers: tejohnson, pcc, hans

Reviewed By: tejohnson, hans

Subscribers: hiraditya, ibookstein, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81605
2020-06-16 20:15:27 -07:00
Matt Arsenault 3b34f3fcca AMDGPU/GlobalISel: Fix obvious bug in ported 32-bit udiv/urem
This was hidden by the IR expansion in AMDGPUCodeGenPrepare, which I
forgot to turn off.
2020-06-16 22:46:35 -04:00
Xing GUO 9aaa32cfcb [ObjectYAML][DWARF] Let writeVariableSizedInteger() return Error.
This patch helps change the return type of `writeVariableSizedInteger()` from `void` to `Error`.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D81915
2020-06-17 09:30:14 +08:00
Matt Arsenault c5c58fd6b5 AMDGPU: Remove intermediate DAG node for trig_preop intrinsic
We weren't doing anything with this, and keeping it would just add
more boilerplate for GlobalISel.
2020-06-16 21:06:25 -04:00
Christopher Tetreault 8e204f807b [SVE] Generalize size checks in Verifier to use getElementCount
Summary:
Attempts to call getNumElements on scalable vectors identified by test
LLVM.Other::scalable-vectors-core-ir.ll. Since these checks are all
attempting to find if two vectors are the same size, calling
getElementCount will only increase safety.

Reviewers: efriedma, aprantl, reames, kmclaughlin, sdesmalen

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81895
2020-06-16 16:03:36 -07:00
Aaron Smith 7e01675ea5 [SelectionDAG] Add MVT::bf16 to getConstantFP()
Summary:
This was probably overlooked in recent bfloat patches.
Needed to handle bf16 constants in SelectionDAG.

  ConstantFP:bf16<APFloat(0)>

Reviewers: stuij

Reviewed By: stuij

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81779
2020-06-16 15:10:05 -07:00
Fangrui Song 7f7cb79b57 [llvm-cov gcov] Don't suppress .gcov output if .gcda is corrupted
If .gcda is corrupted, gcov continues to produce a .gcov and just
assumes execution counts are zeros. This is reasonable, because the
program can corrupt its .gcda output. The code path should be similar to
the code path without .gcda.
2020-06-16 14:55:38 -07:00
Daniel Sanders e35ba09961 [gicombiner] Allow generated combiners to store additional members
Summary:
Adds the ability to add members to a generated combiner via
a State base class. In the current AArch64PreLegalizerCombiner
this is used to make Helper available without having to
provide it to every call.

As part of this, split the command line processing into a
separate object so that it still only runs once even though
the generated combiner is constructed more frequently.

Depends on D81862

Reviewers: aditya_nandakumar, bogner, volkan, aemerson, paquette, arsenm

Reviewed By: arsenm

Subscribers: jvesely, wdng, nhaehnle, kristof.beyls, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81863
2020-06-16 14:47:04 -07:00
Kirill Naumov 369d00df60 [CallPrinter] Adding heat coloring to CallPrinter
This patch introduces the heat coloring of the Call Printer which is based
on the relative "hotness" of each function. The patch is a part of sequence of
three patches, related to graphs Heat Coloring.
Another feature added is the flag similar to "-cfg-dot-filename-prefix",
which allows to write the graph into a named .pdf

Reviewers: rcorcs, apilipenko, davidxl, sfertile, fedor.sergeev, eraman, bollu

Differential Revision: https://reviews.llvm.org/D77172
2020-06-16 21:15:29 +00:00
Fangrui Song def2156389 [gcov] Add -i --intermediate-format
Between gcov 4.9~8, `gcov -i $file` prints coverage information to
$file.gcov in an intermediate text format (single file, instead of
$source.gcov for each source file).

lcov newer than 2019-05-24 detects -i support and uses it to increase
processing speed.  gcov 9 (GCC r265587) removed --intermediate-format
and -i was changed to mean --json-format. However, we consider this
format still useful and support it. geninfo (part of lcov) supports this
format even if we announce that we are compatible with gcov 9.0.0
2020-06-16 14:14:28 -07:00
Fangrui Song 4cd7ba7eca [gcov] Refactor llvm-cov gcov and add SourceInfo 2020-06-16 14:14:26 -07:00
Christopher Tetreault 616d8d942b [SVE] Eliminate calls to default-false VectorType::get() from AArch64
Reviewers: efriedma, c-rhodes, david-arm, samparker, greened

Reviewed By: efriedma

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81518
2020-06-16 13:53:25 -07:00
Christopher Tetreault b265cad93e [NFC] Bail out for scalable vectors before calling getNumElements
Summary:
Move the bail out logic to before constructing the Result and Lane
vectors. This is both potentially faster, and avoids calling
getNumElements on a potentially scalable vector

Reviewers: efriedma, sunfish, chandlerc, c-rhodes, fpetrogalli

Reviewed By: fpetrogalli

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81619
2020-06-16 13:41:29 -07:00
Christopher Tetreault 747486991c [SVE] Fix bad FixedVectorType cast in simplifyDivRem
Summary:
simplifyDivRem attempts to walk a VectorType elementwise. Ensure that it
only does so for FixedVectorType

Reviewers: efriedma, spatel, lebedev.ri, david-arm, kmclaughlin

Reviewed By: spatel, david-arm

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81856
2020-06-16 13:17:05 -07:00
Christopher Tetreault ff628f5f5e [SVE] Eliminate calls to default-false VectorType::get() from Vectorize
Reviewers: efriedma, fhahn, spatel, sdesmalen, kmclaughlin

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81521
2020-06-16 12:50:13 -07:00
Matt Arsenault e4f19d1dda GlobalISel: Fix not failing on widening G_INSERT_VECTOR_ELT
This doesn't actually handled type idx 0, but was reporting Legalized
on it. No test changes because nothing was trying to use this.
2020-06-16 15:48:57 -04:00
Ahsan Saghir 37e72f47a4 [PowerPC] Add -m[no-]power10-vector clang and llvm option
Summary: This patch adds command line option for enabling power10-vector support.

Reviewers: hfinkel, nemanjai, lei, amyk, #powerpc

Reviewed By: lei, amyk, #powerpc

Subscribers: wuzish, kbarton, hiraditya, shchenz, cfe-commits, llvm-commits

Tags: #llvm, #clang, #powerpc

Differential Revision: https://reviews.llvm.org/D80758
2020-06-16 14:47:35 -05:00
Matt Arsenault 8a3340d25d GlobalISel: Use early return and reduce indentation 2020-06-16 14:47:08 -04:00
Stanislav Mekhanoshin 3f0c9c1634 Fix ubsan error in tblgen with signed left shift
UBSAN complains when tblgen performs SHL of a negative
value.

Differential Revision: https://reviews.llvm.org/D81952
2020-06-16 11:15:09 -07:00
Hiroshi Yamauchi 6bc2b042f4 [TLI] Add four C++17 delete variants.
Summary:
delete(void*, unsigned int, align_val_t)
delete(void*, unsigned long, align_val_t)
delete[](void*, unsigned int, align_val_t)
delete[](void*, unsigned long, align_val_t)

Differential Revision: https://reviews.llvm.org/D81853
2020-06-16 11:12:02 -07:00
Sanjay Patel ed67f5e7ab [VectorCombine] scalarize compares with insertelement operand(s)
Generalize scalarization (recently enhanced with D80885)
to allow compares as well as binops.
Similar to binops, we are avoiding scalarization of a loaded
value because that could avoid a register transfer in codegen.
This requires 1 extra predicate that I am aware of: we do not
want to scalarize the condition value of a vector select. That
might also invert a transform that we do in instcombine that
prefers a vector condition operand for a vector select.

I think this is the final step in solving PR37463:
https://bugs.llvm.org/show_bug.cgi?id=37463

Differential Revision: https://reviews.llvm.org/D81661
2020-06-16 13:48:10 -04:00
Jessica Paquette 7caa9caa80 [AArch64][GlobalISel] Avoid creating redundant ubfx when selecting G_ZEXT
When selecting 32 b -> 64 b G_ZEXTs, we don't have to always emit the extend.

If the instruction feeding into the G_ZEXT implicitly zero extends the high
half of the register, we can just emit a SUBREG_TO_REG instead.

Differential Revision: https://reviews.llvm.org/D81897
2020-06-16 09:50:47 -07:00
Fangrui Song 4799fb63b5 [GlobalISel] Delete unused variable after r353432 2020-06-16 08:32:09 -07:00
Leandro Vaz 56262a74c3 Fix debug line info when line markers are present inside macros.
Compiling assembly files when newlines are reduced to line markers within a `.macro` context will generate wrong information in `.debug_line` section.
This patch fixes this issue by evaluating line markers within the macro scope but not when they are used and evaluated.

Reviewed By: probinson

Differential Revision: https://reviews.llvm.org/D80381
2020-06-16 16:13:11 +01:00
Luke Geeson 10b6567f49 [AArch64]: BFloat MatMul Intrinsics&CodeGen
This patch upstreams support for BFloat Matrix Multiplication Intrinsics
and Code Generation from __bf16 to AArch64. This includes IR intrinsics. Unittests are
provided as needed. AArch32 Intrinsics + CodeGen will come after this
patch.

This patch is part of a series implementing the Bfloat16 extension of
the
Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type, and its properties are specified in the Arm
Architecture
Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

The following people contributed to this patch:

Luke Geeson
 - Momchil Velikov
 - Mikhail Maltsev
 - Luke Cheeseman

Reviewers: SjoerdMeijer, t.p.northover, sdesmalen, labrinea, miyuki,
stuij

Reviewed By: miyuki, stuij

Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits,
llvm-commits, miyuki, chill, pbarrio, stuij

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80752

Change-Id: I174f0fd0f600d04e3799b06a7da88973c6c0703f
2020-06-16 15:23:30 +01:00
Luke Geeson 508a4764c0 [AArch64]: BFloat Load/Store Intrinsics&CodeGen
This patch upstreams support for ld / st variants of BFloat intrinsics
in from __bf16 to AArch64. This includes IR intrinsics. Unittests are
provided as needed.

This patch is part of a series implementing the Bfloat16 extension of
the
Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type, and its properties are specified in the Arm
Architecture
Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

The following people contributed to this patch:

 - Luke Geeson
 - Momchil Velikov
 - Luke Cheeseman

Reviewers: fpetrogalli, SjoerdMeijer, sdesmalen, t.p.northover, stuij

Reviewed By: stuij

Subscribers: arsenm, pratlucas, simon_tatham, labrinea, kristof.beyls,
hiraditya, danielkiss, cfe-commits, llvm-commits, pbarrio, stuij

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80716

Change-Id: I22e1dca2a8a9ec25d1e4f4b200cb50ea493d2575
2020-06-16 15:23:30 +01:00
Georgii Rymar 66fb3c39cb [DebugInfo/DWARF] - Report .eh_frame sections of version != 1.
Specification (https://refspecs.linuxbase.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/ehframechpt.html#AEN1349)
says that the value of Version field for .eh_frame should be 1.

Though we accept other values and might perform an attempt to read
it as a .debug_frame because of that, what is wrong.

This patch adds a version check.

Differential revision: https://reviews.llvm.org/D81469
2020-06-16 15:46:26 +03:00
Tyker d7deef1206 Revert "[AssumeBundles] add cannonicalisation to the assume builder"
This reverts commit 90c50cad19.
2020-06-16 14:34:55 +02:00
Ayke van Laethem 5aa8014ca8
[AVR] Remove faulty stack pushing behavior
An instruction like this will need to allocate some stack space for the
last parameter:

  %x = call addrspace(1) i16 @bar(i64 undef, i64 undef, i16 undef, i16 0)

This worked fine when passing an actual value (in this case 0). However,
when passing undef, no value was pushed to the stack and therefore no
push instructions were created. This caused an unbalanced stack leading
to interesting results.

This commit fixes that by replacing the push logic with a regular stack
adjustment and stack-relative load/stores. This is less efficient but at
least it correctly compiles the code.

I can think of a few improvements in the future:

  * The stack should have been adjusted in the function prologue when
    there are no allocas in the function.
  * Many (if not most) stack adjustments can be replaced by
    pushing/popping the values directly. Exactly like the previous code
    attempted but didn't do correctly.
  * Small stack adjustments can be done more efficiently with a few
    push/pop instructions (pushing/popping bogus values), both for code
    size and for speed.

All in all, as long as there are no allocas in the function I think that
it is almost always more efficient to emit regular push/pop
instructions. This is however left for future optimizations.

Differential Revision: https://reviews.llvm.org/D78581
2020-06-16 13:53:32 +02:00
Ayke van Laethem 3ab1c97e35
[AVR] Fix stack size in functions with a frame pointer
This patch fixes a bug in stack save/restore code. Because the frame
pointer was saved/restored manually (not by marking it as clobbered) the
StackSize variable was not updated accordingly. Most code still worked,
but code that tried to load a parameter passed on the stack did not.

This commit fixes this by marking the frame pointer as a
callee-clobbered register. This will let it be saved without any effort
in prolog/epilog code and will make sure the correct address is
calculated for loading parameters that are passed on the stack.

This approach is used by most other targets (such as X86, AArch64 and
RISC-V).

Differential Revision: https://reviews.llvm.org/D78579
2020-06-16 13:53:32 +02:00
David Green f269bb7da0 [ARM] Fix crash trying to generate i1 immediates
These code patterns attempt to call isVMOVModifiedImm on a splat of i1
values, leading to an unreachable being hit. I've guarded the call on a
more specific set of sizes, as i1 vectors are legal under MVE.

Differential Revision: https://reviews.llvm.org/D81860
2020-06-16 12:27:24 +01:00
Simon Pilgrim 9d11822f09 Fix comment typo - Uexpected -> Unexpected. NFC. 2020-06-16 12:14:51 +01:00
Tyker 90c50cad19 [AssumeBundles] add cannonicalisation to the assume builder
Summary:
this reduces significantly the number of assumes generated without aftecting too much
the information that is preserved. this improves the compile-time cost
of enable-knowledge-retention significantly.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, asbirlea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79650
2020-06-16 13:12:35 +02:00
Kristof Beyls 503a26d8e4 Silence GCC 7 warning
GCC 7 was reporting "enumeral and non-enumeral type in conditional expression"
as a warning.
The code casts an instruction opcode enum to unsigned implicitly, in
line with intentions; so this commit silences the warning by making the
cast to unsigned explicit.
2020-06-16 11:42:52 +01:00
sstefan1 e099c7b64a [NFC][OpenMPOpt] Provide function-specific foreachUse. 2020-06-16 12:33:15 +02:00
Alexandros Lamprineas f6189da938 [ARM][NFC] Explicitly specify the fp16 value type in codegen patterns.
We are planning to add the bf16 value type in the HPR register class
and this will make the codegen patterns ambiguous.

Differential Revision: https://reviews.llvm.org/D81505
2020-06-16 11:32:17 +01:00
Jay Foad 6fdd5a28b7 Revert "[IR] Clean up dead instructions after simplifying a conditional branch"
This reverts commit 69bdfb075b.

Reverting to investigate https://bugs.llvm.org/show_bug.cgi?id=46343
2020-06-16 10:32:15 +01:00
Simon Pilgrim 379c5b31f7 [X86][SSE] combineVectorSizedSetCCEquality - remove unused AVX2 MOVMSK path. NFCI.
If PTEST is not available, then we're guaranteed to be performing a 128-bit vector comparison using MOVMSK(PCMPEQB(v16i8)).
2020-06-16 10:07:41 +01:00
Igor Kudrin ffc5d98d2c [MC] Generate .debug_frame in the 64-bit DWARF format [7/7]
Note that .eh_frame sections are generated in the 32-bit format even
when debug sections are 64-bit, for compatibility reasons. They use
relative references between entries, so they hardly benefit from the
64-bit format.

Differential Revision: https://reviews.llvm.org/D81149
2020-06-16 15:50:14 +07:00
Igor Kudrin 1e081342d4 [MC] Fix DWARF forms for 64-bit DWARFv3 files [6/7]
DW_FORM_sec_offset was introduced in DWARFv4, so, for 64-bit DWARFv3,
DW_FORM_data8 should be used instead.

Differential Revision: https://reviews.llvm.org/D81148
2020-06-16 15:50:14 +07:00
Igor Kudrin ab7458fb04 [MC] Generate .debug_rnglists in the 64-bit DWARF format [5/7]
In addition, the patch fixes referencing the section within
a compilation unit.

Differential Revision: https://reviews.llvm.org/D81147
2020-06-16 15:50:13 +07:00
Igor Kudrin b5f8959bcd [MC] Generate .debug_aranges in the 64-bit DWARF format [4/7]
Differential Revision: https://reviews.llvm.org/D81146
2020-06-16 15:50:13 +07:00
Igor Kudrin 1dfcce5395 [MC] Generate a compilation unit in the 64-bit DWARF format [3/7]
The patch enables producing DWARF64 compilation units and fixes
generating references to .debug_abbrev and .debug_line sections.
A similar change for .debug_ranges/.debug_rnglists will be added
in a forthcoming patch.

Differential Revision: https://reviews.llvm.org/D81145
2020-06-16 15:50:13 +07:00
Igor Kudrin 64c049595b [MC] Generate .debug_line in the 64-bit DWARF format [2/7]
Differential Revision: https://reviews.llvm.org/D81144
2020-06-16 15:50:13 +07:00
Igor Kudrin a8ec9de406 [MC] Add --dwarf64 to generate DWARF64 debug info [1/7]
The patch adds an option `--dwarf64` to instruct a tool to generate
debug information in the 64-bit DWARF format. There is no real
implementation yet, only a few compatibility checks.

Differential Revision: https://reviews.llvm.org/D81143
2020-06-16 15:50:13 +07:00
Simon Pilgrim 057c9c7ee0 [X86][SSE] MatchVectorAllZeroTest - handle OR vector reductions
This patch extends MatchVectorAllZeroTest to handle OR vector reduction patterns where the result is compared against zero.

Fixes PR45378

Differential Revision: https://reviews.llvm.org/D81547
2020-06-16 09:42:34 +01:00
Simon Pilgrim 65c3fa849b [X86][SSE] combineVectorSizedSetCCEquality - move single Subtarget.hasAVX() use into condition. NFC.
We already have Subtarget.hasSSE2() and Subtarget.useAVX512Regs() in the condition - seems to be a legacy from when we had multiple uses.
2020-06-16 09:42:33 +01:00
Sam Parker 7158f285a8 [CostModel] Unify getCFInstrCost
Have TTI::getInstructionThroughput call getUserCost for Br, Ret and
PHI. This now means that eveything in getInstructionThroughput is
handled by getUserCost.

Differential Revision: https://reviews.llvm.org/D79849
2020-06-16 08:40:54 +01:00
Fangrui Song a3b5f428c1 [AArch64] Print the immediate operand for SPACE pseudo instruction
Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D81814
2020-06-15 20:55:53 -07:00
Amara Emerson 1035a416a6 [AArch64][GlobalISel] Emit constant pool loads for 64 bit fp immediates.
Note: don't do this for integer 64 bit materialization to match SDAG.

Differential Revision: https://reviews.llvm.org/D81893
2020-06-15 20:53:09 -07:00
Qiu Chaofan e62912b190 [LLParser] Delete temp CallInst when error occurs
Only functions with floating-point return type accepts fast-math flags.
When adding such flags to function returning integer, we'll see a crash,
because there's still an undeleted value referencing the argument. This
patch manually removes the temporary instruction when error occurs.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D78355
2020-06-16 11:41:25 +08:00
Xing GUO 8aaeaddec8 [ObjectYAML][DWARF] Implement the .debug_addr section.
This patch implements the .debug_addr section.

Reviewed By: jhenderson, grimar

Differential Revision: https://reviews.llvm.org/D81541
2020-06-16 10:53:10 +08:00
Mircea Trofin 296e47734e [llvm][NFC] Fix license on InlineFeaturesAnalysis.{h|cpp}
Summary: Also fixed the InlineAdvisor.cpp license.

Reviewers: rriddle

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81896
2020-06-15 19:34:33 -07:00
Craig Topper 255d5dbae1 [X86] Add support for inline assembly 'x' constraint for i128.
Limiting to x86-64 since that's when __int128 is legal in clang.

Differential Revision: https://reviews.llvm.org/D81817
2020-06-15 19:34:02 -07:00
Gui Andrade b0ffa8befe [MSAN] Pass Origin by parameter to __msan_warning functions
Summary:
Normally, the Origin is passed over TLS, which seems like it introduces unnecessary overhead. It's in the (extremely) cold path though, so the only overhead is in code size.

But with eager-checks, calls to __msan_warning functions are extremely common, so this becomes a useful optimization.

This can save ~5% code size.

Reviewers: eugenis, vitalybuka

Reviewed By: eugenis, vitalybuka

Subscribers: hiraditya, #sanitizers, llvm-commits

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D81700
2020-06-15 17:49:18 -07:00
Stanislav Mekhanoshin 576fa5a50c [AMDGPU] make ubsan happy with unsigned left shift
Fixes UBSAN error after rG9ee272f13d88f090817235ef4f91e56bb2a153d6
A trivial signed/unsigned shift.
2020-06-15 17:21:10 -07:00
Amy Huang f8170d8715 [NativeSession] Implement findLineNumbersByAddress in NativeSession,
which takes an address and a length and returns all lines within that
address range.
2020-06-15 17:05:39 -07:00
Jessica Paquette 5a4c3f6b06 [GlobalISel] Look through extends etc in CombinerHelper::matchConstantOp
It's possible to end up with a zext or something in the way of a G_CONSTANT,
even pre-legalization. This can happen with memsets.

e.g.

https://godbolt.org/z/Bjc8cw

To make sure we can catch these cases, use `getConstantVRegValWithLookThrough`
instead of `mi_match`.

Differential Revision: https://reviews.llvm.org/D81875
2020-06-15 16:34:25 -07:00
Stanislav Mekhanoshin 9ee272f13d [AMDGPU] Add gfx1030 target
Differential Revision: https://reviews.llvm.org/D81886
2020-06-15 16:18:05 -07:00
Amara Emerson fc905ae003 [GlobalISel] Don't emit multiply by magic constant for zero memset values. 2020-06-15 14:42:14 -07:00
Mircea Trofin e2cc854015 [llvm][NFC] Move content of ML subdirectory into Analysis
The initial intent was to organize ML stuff in its own directory, but
it turns out that conflicts with llvm component layering policies: it
is not a component, because subsequent changes want to rely on other
analyses, which would create a cycle; and we don't have a reliable,
cross-platform mechanism to compile files in a subdirectory, and fit in
the existing LLVM build structure.

This change moves the files into Analysis, and subsequent changes will
leverage conditional compilation for those that have optional
dependencies.
2020-06-15 14:35:33 -07:00
Nick Desaulniers 2d8e105db6 [PPCAsmPrinter] support 'L' output template for memory operands
Summary:
L is meant to support the second word used by 32b calling conventions for 64b arguments.

This is required for build 32b PowerPC Linux kernels after upstream
commit 334710b1496a ("powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'")

Thanks for the report from @nathanchance, and reference to GCC's
implementation from @segher.

Fixes: pr/46186
Fixes: https://github.com/ClangBuiltLinux/linux/issues/1044

Reviewers: echristo, hfinkel, MaskRay

Reviewed By: MaskRay

Subscribers: MaskRay, wuzish, nemanjai, hiraditya, kbarton, steven.zhang, llvm-commits, segher, nathanchance, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81767
2020-06-15 14:31:44 -07:00
Davide Italiano c2dccf9d5e [CodeGenPrepare] Reset the debug location when promoting trunc(s)
The promotion machinery in CGP moves instructions retaining
debug locations. When the transformation is local, this is mostly
correct, but when instructions are moved cross-BBs, this is not
always true and causes jumpiness in line tables. This is the first
of a series of commits. sext(s) and zext(s) need to be treated
similarly.

Differential Revision:  https://reviews.llvm.org/D81879
2020-06-15 14:25:43 -07:00
Nikita Popov 35651fdd45 [IR] Add AttributeBitSet wrapper (NFC)
This wraps the uint8_t[12] type used in two places, because I
plan to introduce a third use of the same pattern.
2020-06-15 21:28:25 +02:00
Jessica Paquette 3495b884de [AArch64][GlobalISel] Add G_EXT and select ext using it
Add selection support for ext via a new opcode, G_EXT and a post-legalizer
combine which matches it.

Add an `applyEXT` function, because the AArch64ext patterns require a register
for the immediate. So, we have to create a G_CONSTANT to get these without
writing new patterns or modifying the existing ones.

Tests are the same as arm64-ext.ll.

Also prevent ext from firing on the zip test. It has higher priority, so we
don't want it potentially getting in the way of mask tests.

Also fix up the shuffle-splat test, because ext is now selected there. The
test was incorrectly regbank selected before, which could cause a verifier
failure when you emit copies.

Differential Revision: https://reviews.llvm.org/D81436
2020-06-15 12:20:59 -07:00
Mircea Trofin 29e5722949 Revert "[llvm] Added support for stand-alone cmake object libraries."
This reverts commit 695c7d6313.

Breaks windows (e.g.
http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/16497)

Likely to cause problems with XCode.
2020-06-15 12:15:39 -07:00
Davide Italiano e51e82745e [Target/PPC] Fold inside an assertion.
Pointed out by dblaikie.
2020-06-15 12:08:57 -07:00
Mircea Trofin 695c7d6313 [llvm] Added support for stand-alone cmake object libraries.
Summary:
Currently, add_llvm_library would create an OBJECT library alongside
of a STATIC / SHARED library, but losing the link interface (its
elements would become dependencies instead). To support scenarios
where linking an object library also brings in its usage
requirements, this patch adds support for 'stand-alone' OBJECT
libraries - i.e. without an accompanying SHARED/STATIC library, and
maintaining the link interface defined by the user.

The support is via a new option, OBJECT_ONLY, to avoid breaking changes
- since just specifying "OBJECT" would currently imply also STATIC or
SHARED, depending on BUILD_SHARED_LIBS.

This is useful for cases where, for example, we want to build a part
of a component separately. Using a STATIC target would incur the risk
that symbols not referenced in the consumer would be dropped (which may
be undesirable).

The current application is the ML part of Analysis. It should be part
of the Analysis component, so it may reference other analyses; and (in
upcoming changes) it has dependencies on optional libraries.

Reviewers: karies, davidxl

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81447
2020-06-15 12:01:43 -07:00
Matt Arsenault e07cf92377 AMDGPU/GlobalISel: Don't hardcode maximum register size
This is a somewhat artifical limit, so avoid repeating it many places
in case it changes.
2020-06-15 15:01:19 -04:00
Matt Arsenault 1a7f115dce AMDGPU/GlobalISel: Extend load/store workaround to i128 vectors 2020-06-15 14:55:11 -04:00
Rahul Joshi 72d20b9604 [LLVM] Change isa<> to a variadic function template
Change isa<> to a variadic function template, so that it can be used to test against one of multiple types as follows:
   isa<Type0, Type1, Type2>(Val)

Differential Revision: https://reviews.llvm.org/D81045
2020-06-15 18:46:57 +00:00
Lang Hames 5682f192bd [RuntimeDyld] Add dependence on Core.
Commit 498dd745f5 introduced a dependence on Core. This patch updates
LLVMbuild.txt to reflect this.
2020-06-15 11:14:27 -07:00
Davide Italiano 5cb44196aa [Target/PPC] Silence an unused variable warning. NFC. 2020-06-15 11:05:01 -07:00
Craig Topper d72cb4ce21 Recommit "[X86] Separate imm from relocImm handling."
Fix the copy/paste mistake that caused it to fail previously
2020-06-15 10:59:43 -07:00
Florian Hahn 120c059292 [DSE,MSSA] Port partial store merging.
Port partial constant store merging logic to MemorySSA backed DSE. The
heavy lifting is done by the existing helper function. It is used in
context where we already ensured that the later instruction can
eliminate the earlier one, if it is a complete overwrite.
2020-06-15 18:41:46 +01:00
Lang Hames 498dd745f5 [ORC] Honor linker private global prefix on symbol names.
If a symbol name begins with the linker private global prefix (as
described by the DataLayout) then it should be treated as non-exported,
regardless of its LLVM IR visibility value.
2020-06-15 10:28:36 -07:00
Wouter van Oortmerssen 3b29376e3f [WebAssembly] Adding 64-bit version of R_WASM_MEMORY_ADDR_* relocs
This adds 4 new reloc types.

A lot of code that previously assumed any memory or offset values could be contained in a uint32_t (and often truncated results from functions returning 64-bit values) have been upgraded to uint64_t. This is not comprehensive: it is only the values that come in contact with the new relocation values and their dependents.

A new tablegen mapping was added to automatically upgrade loads/stores in the assembler, which otherwise has no way to select for these instructions (since they are indentical other than for the offset immediate). It follows a similar technique to https://reviews.llvm.org/D53307

Differential Revision: https://reviews.llvm.org/D81704
2020-06-15 10:07:42 -07:00