Commit Graph

136081 Commits

Author SHA1 Message Date
Ayke van Laethem 01c2209d51
[AVR] Decode single register instructions
This is a set of instructions that take just a single register as an
operand, with no immediates. Because all instructions share the same
format, I haven't added exhaustive bit testing to all instructions but
just to the inc instruction.

Differential Revision: https://reviews.llvm.org/D81968
2020-06-23 02:17:15 +02:00
Ayke van Laethem ff4817ec2a
[AVR] Don't adjust for instruction size
I'm not entirely sure why this was ever needed, but when I remove both
adjustments all tests still pass.

This fixes a bug where a long branch (using the `jmp` instead of the
`rjmp` instruction) was incorrectly adjusted by 2 because it jumps to an
absolute address instead of a PC-relative address. I could have added
AVR::fixup_call to the list of exceptions, but it seemed more sensible
to me to just remove this code.

Differential Revision: https://reviews.llvm.org/D78459
2020-06-23 02:15:42 +02:00
Sam Clegg 79aad89d8d [WebAssembly] Add support for externalref to MC and wasm-ld
This allows code for handling externref values to be processed by the
assembler and linker.

Differential Revision: https://reviews.llvm.org/D81977
2020-06-22 15:57:24 -07:00
Christopher Tetreault cd6848f6e1 [SVE] Remove calls to VectorType::getNumElements from ARM
Reviewers: efriedma, greened, c-rhodes, david-arm, dmgreen

Reviewed By: dmgreen

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, dmgreen, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82216
2020-06-22 15:18:58 -07:00
Arthur Eubanks d335c1317b Fix dynamic alloca detection in CloneBasicBlock
Summary:
Simply check AI->isStaticAlloca instead of reimplementing checks for
static/dynamic allocas.

Reviewers: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82328
2020-06-22 15:06:28 -07:00
Craig Topper 23654d9e7a Recommit "[X86] Calculate the needed size of the feature arrays in _cpu_indicator_init and getHostCPUName using the size of the feature enum."
Hopefully this version will fix the previously buildbot failure
2020-06-22 13:32:03 -07:00
Greg Clayton ccf5a44917 Fix the verification of DIEs with DW_AT_ranges.
Summary: Previous code would try to verify DW_AT_ranges and if any ranges would overlap, it would stop attributing any ranges after this to the DIE which caused incorrect errors to be reported that a DIE's address ranges were not contained in the parent DIE's ranges. Added a fix and a test.

Reviewers: aprantl, labath, probinson, JDevlieghere, jhenderson

Subscribers: hiraditya, MaskRay, cmtice, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79962
2020-06-22 13:13:48 -07:00
Lei Zhang 315bd96437 Use std::make_tuple instead initializer list
Hopefully this pleases GCC-5 and fixes the build error:

LowerExpectIntrinsic.cpp:62:53: error: converting to
'std::tuple<unsigned int, unsigned int>' from initializer list would use
explicit constructor 'constexpr std::tuple<_T1, _T2>::tuple(_U1&&,
_U2&&) [with _U1 = llvm:🆑:opt<unsigned int>&; _U2 =
llvm:🆑:opt<unsigned int>&; <template-parameter-2-3> = void; _T1 =
unsigned int; _T2 = unsigned int]'
     return {LikelyBranchWeight, UnlikelyBranchWeight};

Differential Revision: https://reviews.llvm.org/D82325
2020-06-22 15:43:40 -04:00
Hans Wennborg 1357c06578 Revert "[X86][SSE] MatchVectorAllZeroTest - handle OR vector reductions"
This caused a Chromium test to miscompile. See discussion on the Phabricator
review.

> This patch extends MatchVectorAllZeroTest to handle OR vector reduction patterns where the result is compared against zero.
>
> Fixes PR45378
>
> Differential Revision: https://reviews.llvm.org/D81547

This reverts 057c9c7ee0
2020-06-22 21:27:11 +02:00
Christopher Tetreault 5e2c736395 [SVE] Remove calls to VectorType::getNumElements from WebASM
Summary:
The getNumElements in base VectorType is being deprecated.

See: http://lists.llvm.org/pipermail/llvm-dev/2020-March/139811.html

Reviewers: efriedma, tlively, fpetrogalli, c-rhodes, dschuff

Reviewed By: tlively, dschuff

Subscribers: dschuff, sbc100, tschuett, jgravelle-google, hiraditya, aheejin, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82217
2020-06-22 12:25:08 -07:00
Craig Topper bebea4221d Revert "[X86] Calculate the needed size of the feature arrays in _cpu_indicator_init and getHostCPUName using the size of the feature enum."
Seems to breaking build.

This reverts commit 5ac144fe64.
2020-06-22 12:20:40 -07:00
Craig Topper 5ac144fe64 [X86] Calculate the needed size of the feature arrays in _cpu_indicator_init and getHostCPUName using the size of the feature enum.
Move 0 initialization up to the caller so we don't need to know
the size.
2020-06-22 11:46:20 -07:00
Zhi Zhuang 37fb860301 Add support of __builtin_expect_with_probability
Add a new builtin-function __builtin_expect_with_probability and
intrinsic llvm.expect.with.probability.
The interface is __builtin_expect_with_probability(long expr, long
expected, double probability).
It is mainly the same as __builtin_expect besides one more argument
indicating the probability of expression equal to expected value. The
probability should be a constant floating-point expression and be in
range [0.0, 1.0] inclusive.
It is similar to builtin-expect-with-probability function in GCC
built-in functions.

Differential Revision: https://reviews.llvm.org/D79830
2020-06-22 10:21:28 -07:00
Hiroshi Yamauchi 9e1decf743 [PGO][PGSO] Enable non-cold size opts under partial profile sample PGO.
Summary: Similar to D81020. Follow up D78949.

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82053
2020-06-22 10:12:48 -07:00
Francesco Petrogalli ef597eda8e [sve][acle] Add SVE BFloat16 extensions.
Summary:
List of intrinsics:

svfloat32_t svbfdot[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3)
svfloat32_t svbfdot[_n_f32](svfloat32_t op1, svbfloat16_t op2, bfloat16_t op3)
svfloat32_t svbfdot_lane[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3, uint64_t imm_index)

svfloat32_t svbfmmla[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3)

svfloat32_t svbfmlalb[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3)
svfloat32_t svbfmlalb[_n_f32](svfloat32_t op1, svbfloat16_t op2, bfloat16_t op3)
svfloat32_t svbfmlalb_lane[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3, uint64_t imm_index)

svfloat32_t svbfmlalt[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3)
svfloat32_t svbfmlalt[_n_f32](svfloat32_t op1, svbfloat16_t op2, bfloat16_t op3)
svfloat32_t svbfmlalt_lane[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3, uint64_t imm_index)

svbfloat16_t svcvt_bf16[_f32]_m(svbfloat16_t inactive, svbool_t pg, svfloat32_t op)
svbfloat16_t svcvt_bf16[_f32]_x(svbool_t pg, svfloat32_t op)
svbfloat16_t svcvt_bf16[_f32]_z(svbool_t pg, svfloat32_t op)

svbfloat16_t svcvtnt_bf16[_f32]_m(svbfloat16_t even, svbool_t pg, svfloat32_t op)
svbfloat16_t svcvtnt_bf16[_f32]_x(svbfloat16_t even, svbool_t pg, svfloat32_t op)

For reference, see section 7.2 of "Arm C Language Extensions for SVE - Version 00bet4"

Reviewers: sdesmalen, ctetreau, efriedma, david-arm, rengolin

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D82141
2020-06-22 16:53:02 +00:00
Sanjay Patel 9934cc544c [VectorCombine] make helper function for shift-shuffle; NFC
This will probably be useful for other extract patterns.
2020-06-22 12:23:52 -04:00
Florian Hahn 328c8642e2 [DSE,MSSA] Reorder DSE blocking checks.
Currently we stop exploring candidates too early in some cases.

In particular, we can continue checking the defining accesses of
non-removable MemoryDefs and defs without analyzable write location
(read clobbers are already ruled out using MemorySSA at this point).
2020-06-22 17:16:34 +01:00
Fangrui Song c52bee61e9 [MCParser] Support quoted section name for COFF
This features matches ELFAsmParser and makes it possible to use `.section ".llvm.call-graph-profile","n"`

Reviewed By: zequanwu

Differential Revision: https://reviews.llvm.org/D82240
2020-06-22 09:11:44 -07:00
stozer 539381da26 [DebugInfo] Update MachineInstr to help support variadic DBG_VALUE instructions
Following on from this RFC[0] from a while back, this is the first patch towards
implementing variadic debug values.

This patch specifically adds a set of functions to MachineInstr for performing
operations specific to debug values, and replacing uses of the more general
functions where appropriate. The most prevalent of these is replacing
getOperand(0) with getDebugOperand(0) for debug-value-specific code, as the
operands corresponding to values will no longer be at index 0, but index 2 and
upwards: getDebugOperand(x) == getOperand(x+2). Similar replacements have been
added for the other operands, along with some helper functions to replace
oft-repeated code and operate on a variable number of value operands.

[0] http://lists.llvm.org/pipermail/llvm-dev/2020-February/139376.html<Paste>

Differential Revision: https://reviews.llvm.org/D81852
2020-06-22 16:01:12 +01:00
Guillaume Chatelet 597a9070b5 [ARC] Add missing return statement 2020-06-22 14:57:29 +00:00
Sanjay Patel 98c2f4eea5 [VectorCombine] add helper to replace uses and rename
The tests are regenerated to show a path that missed renaming,
but there should be no functional difference from this patch.
2020-06-22 09:58:49 -04:00
Valentin Clement 8383ac6197 Revert commit 9e52530 because of dependencies issue
This reverts commit 9e525309fb.
2020-06-22 09:56:14 -04:00
Valentin Clement 9e525309fb [openmp] Base of tablegen generated OpenMP common declaration
Summary:
As discussed previously when landing patch for OpenMP in Flang, the idea is
to share common part of the OpenMP declaration between the different Frontend.
While doing this it was thought that moving to tablegen instead of Macros will also
give a cleaner and more powerful way of generating these declaration.
This first part of a future series of patches is setting up the base .td file for
DirectiveLanguage as well as the OpenMP version of it. The base file is meant to
be used by other directive language such as OpenACC.
In this first patch, the Directive and Clause enums are generated with tablegen
instead of the macros on OMPConstants.h. The next pacth will extend this
to other enum and move the Flang frontend to use it.

Reviewers: jdoerfert, DavidTruby, fghanim, ABataev, jdenny, hfinkel, jhuber6, kiranchandramohan, kiranktp

Reviewed By: jdoerfert, jdenny

Subscribers: cfe-commits, mgorny, yaxunl, hiraditya, guansong, jfb, sstefan1, aaron.ballman, llvm-commits

Tags: #llvm, #openmp, #clang

Differential Revision: https://reviews.llvm.org/D81736
2020-06-22 09:34:53 -04:00
Xing GUO 03480c80d3 [DWARFYAML][debug_info] Add support for error handling.
This patch helps add support for error handling in `DWARFYAML::emitDebugInfo()`.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D82275
2020-06-22 21:36:13 +08:00
Xing GUO 3a48a632d0 [DWARFYAML][debug_info] Use 'AbbrCode' to index the abbreviation.
Before this patch, we use `(uint32_t)AbbrCode - (uint32_t)FirstAbbrCode` to index the abbreviation. It's impossible for we to use the preceeding abbreviation of the previous one (e.g., if the previous DIE's `AbbrCode` is 2, we are unable to use the abbreviation with index 1). In this patch, we use `AbbrCode` to index the abbreviation directly.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D82173
2020-06-22 21:34:02 +08:00
Simon Pilgrim 48d1a2d6d0 [DAG] Add SimplifyMultipleUseDemandedVectorElts helper for SimplifyMultipleUseDemandedBits. NFCI.
We have many cases where we call SimplifyMultipleUseDemandedBits and demand specific vector elements, but all the bits from them - this adds a helper wrapper to handle this.
2020-06-22 14:24:39 +01:00
Sanjay Patel de65b356dc [VectorCombine] add/use pass-level IRBuilder
This saves creating/destroying a builder every time we
perform some transform.

The tests show instruction ordering diffs resulting from
always inserting at the root instruction now, but those
should be benign.
2020-06-22 09:01:29 -04:00
Jay Foad 9761d3cf9c [AMDGPU] Update more live intervals in SIWholeQuadMode
This fixes various assertion failures that would otherwise be triggered
by a later patch to move SIWholeQuadMode later in the pass pipeline.

Differential Revision: https://reviews.llvm.org/D82190
2020-06-22 13:50:15 +01:00
Sanjay Patel cce625f73d [VectorCombine] improve IR debugging by providing/salvaging value names
The tests are regenerated to show the diffs, but there should be no
functional change from this patch.
2020-06-22 08:35:47 -04:00
Tim Corringham 96ecead5a2 [AMDGPU] clang-format of SIModeRegister.cpp
Ran clang-format just to ease future reviews. No functional changes.
2020-06-22 13:31:52 +01:00
Simon Pilgrim ecc5d7ee0d [DAG] SimplifyMultipleUseDemandedBits - drop unnecessary *_EXTEND_VECTOR_INREG cases
For little endian targets, if we only need the lowest element and none of the extended bits then we can just use the (bitcasted) source vector directly.

We already do this in SimplifyDemandedBits, this adds the SimplifyMultipleUseDemandedBits equivalent.
2020-06-22 12:35:32 +01:00
Tres Popp 09d72ad399 Revert "[CGP] Enable CodeGenPrepares phi type convertion."
This reverts commit 67121d7b82.

This is causing compile times to be 2x slower on some large binaries.
2020-06-22 13:06:18 +02:00
Serguei Katkov eae0d2e9b2 Revert "[Peeling] Extend the scope of peeling a bit"
This reverts commit 29b2c1ca72.

The patch causes the DT verifier failure like:
DominatorTree is different than a freshly computed one!

Not sure the patch itself it wrong but revert to investigate the failure.
2020-06-22 17:48:29 +07:00
Vitaly Buka 5d964e262f [StackSafety] Check variable lifetime
We can't consider variable safe if out-of-lifetime access is possible.
So if StackLifetime can't prove that the instruction always uses
the variable when it's still alive, we consider it unsafe.
2020-06-22 03:45:29 -07:00
Vitaly Buka 8f592ed333 [StackSafety] Ignore unreachable instructions
Usually DominatorTree provides this info, but here we use
StackLifetime. The reason is that in the next patch StackLifetime
will be used for actual lifetime checks and we can avoid
forwarding the DominatorTree into this code.
2020-06-22 03:45:29 -07:00
Anton Korobeynikov 6cb80fbe40 Revert "[MSP430] Update register names"
This reverts commit 8f6620f663.
2020-06-22 13:37:22 +03:00
Anatoly Trosinenko 8f6620f663 [MSP430] Update register names
When writing a unit test on replacing standard epilogue sequences with `BR __mspabi_func_epilog_<N>`, by manually asm-clobbering `rN` - `r10` for N = 4..10, everything worked well except for seeming inability to clobber r4.

The problem was that MSP430 code generator of LLVM used an obsolete name FP for that register. Things were worse because when `llc` read an unknown register name, it silently ignored it.

Differential Revision: https://reviews.llvm.org/D82184
2020-06-22 13:24:03 +03:00
Momchil Velikov 75b0bbca1d [LTO] Use StringRef instead of C-style strings in setCodeGenDebugOptions
Fixes an issue with missing nul-terminators and saves us some string
copying, compared to a version which would insert nul-terminators.

Differential Revision: https://reviews.llvm.org/D82033
2020-06-22 11:22:18 +01:00
Anatoly Trosinenko a5bd75aab8 [MSP430] Enable some basic support for debug information
This commit technically permits LLVM to emit the debug information for ELF files for MSP430 architecture. Aside from this, it only defines the register numbers as defined by part 10.1 of MSP430 EABI specification (assuming the 1-byte subregisters share the register numbers with corresponding full-size registers).

This commit was basically tested by me with TI-provided GCC 8.3.1 toolchain by compiling an example program with `clang` (please note manual linking may be required due to upstream `clang` not yet handling the `-msim` option necessary to run binaries on the GDB-provided simulator) and then running it and single-stepping with `msp430-elf-gdb` like this:

```
$sysroot/bin/msp430-elf-gdb ./test -ex "target sim" -ex "load ./test"
(gdb) ... traditional GDB commands follow ...
```

While this implementation is most probably far from completeness and is considered experimental, it can already help with debugging MSP430 programs as well as finding issues in LLVM debug info support for MSP430 itself.

One of the use cases includes trying to find a point where UBSan check in a trap-on-error mode was triggered.

The expected debug information format is described in the [MSP430 Embedded Application Binary Interface](http://www.ti.com/lit/an/slaa534/slaa534.pdf) specification, part 10.

Differential Revision: https://reviews.llvm.org/D81488
2020-06-22 13:14:07 +03:00
Anatoly Trosinenko 359fae6eb0 [DebugInfo] Explicitly permit addr_size = 0x02 when parsing DWARF data
Current LLVM implementation uses `MCAsmInfo::CodePointerSize` as addr_size when emitting the DWARF data. llvm-dwarfdump, on the other hand, handles `addr_size`s of 4 and 8 properly and considers all other sizes as an error. This works for most of mainline targets except for MSP430 and AVR.

msp430-gcc v8.3.1 emits DWARF32 with addr_size = 4 (DWARF32 does not imply addr_size = 4, 32 refers to internal offset width of 4 bytes) that is handled by llvm-dwarfdump already. Still, emitting 2-byte target pointers on MSP430 seems correct as well (but not for MSP430X that is supported by msp430-gcc but not by LLVM and has 20-bit address space).

This patch make it possible for MSP430 debug info support to be tested with llvm-dwarfdump.

Differential Revision: https://reviews.llvm.org/D82055
2020-06-22 13:11:55 +03:00
Florian Hahn 0e19ff02d8 [DSE,MSSA] Remove unused arguments for isDSEBarrier (NFC). 2020-06-22 10:58:53 +01:00
Djordje Todorovic 792786e34d [CSInfo][MIPS] Don't describe parameters loaded by sub/super reg copy
When describing parameter value loaded by a COPY instruction, consider
case where needed Reg value is a sub- or super- register of the COPY
instruction's destination register. Without this patch, compile process
will crash with the assertion "TargetInstrInfo::describeLoadedValue
can't describe super- or sub-regs for copy instructions".

Patch by Nikola Tesic

Differential revision: https://reviews.llvm.org/D82000
2020-06-22 10:49:02 +02:00
Serguei Katkov 29b2c1ca72 [Peeling] Extend the scope of peeling a bit
Currently we allow peeling of the loops if there is a exiting latch block
and all other exits are blocks ending with deopt.

Actually we want that exit would end up with deopt unconditionally but
it is not required that exit itself ends with deopt.

Reviewers: reames, ashlykov, fhahn, apilipenko, fedor.sergeev
Reviewed By: apilipenko
Subscribers: hiraditya, zzheng, dantrushin, llvm-commits
Differential Revision: https://reviews.llvm.org/D81140
2020-06-22 12:17:44 +07:00
Michael Liao 20a1700293 [amdgpu] Fix REL32 relocations with negative offsets.
Summary: - The offset should be treated as a signed one.

Reviewers: rampitec, arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82234
2020-06-21 23:09:03 -04:00
Sanjay Patel 6bdd531af5 [VectorCombine] create class for pass to hold analyses, etc; NFC
This doesn't change anything currently, but it would make sense
to create a class-level IRBuilder instead of recreating that
everywhere. As we expand to more optimizations, we will probably
also want to hold things like the DataLayout or other constant
refs in here too.
2020-06-21 16:07:33 -04:00
David Green 67121d7b82 [CGP] Enable CodeGenPrepares phi type convertion. 2020-06-21 16:46:16 +01:00
Florian Hahn 40569db7b3 [DSE,MSSA] Move reachability check to main loop.
As we traverse the CFG backwards, we could end up reaching unreachable
blocks. For unreachable blocks, we won't have computed post order
numbers and because DomAccess is reachable, unreachable blocks cannot be
on any path from it.

This fixes a crash with unreachable blocks.
2020-06-21 16:38:10 +01:00
David Green 730ecb63ec [CGP] Convert phi types
If a collection of interconnected phi nodes is only ever loaded, stored
or bitcast then we can convert the whole set to the bitcast type,
potentially helping to reduce the number of register moves needed as the
phi's are passed across basic block boundaries. This has to be done in
CodegenPrepare as it naturally straddles basic blocks.

The alorithm just looks from phi nodes, looking at uses and operands for
a collection of nodes that all together are bitcast between float and
integer types. We record visited phi nodes to not have to process them
more than once. The whole subgraph is then replaced with a new type.
Loads and Stores are bitcast to the correct type, which should then be
folded into the load/store, changing it's type.

This comes up in the biquad testcase due to the way MVE needs to keep
values in integer registers. I have also seen it come up from aarch64
partner example code, where a complicated set of sroa/inlining produced
integer phis, where float would have been a better choice.

I also added undef and extract element handling which increased the
potency in some cases.

This adds it with an option that defaults to off, and disabled for 32bit
X86 due to potential issues around canonicalizing NaNs.

Differential Revision: https://reviews.llvm.org/D81827
2020-06-21 15:54:17 +01:00
Nikita Popov 37d3030711 [ValueTracking, BasicAA] Don't simplify instructions
GetUnderlyingObject() (and by required symmetry
DecomposeGEPExpression()) will call SimplifyInstruction() on the
passed value if other checks fail. This simplification is very
expensive, but has little effect in practice. This patch removes
the SimplifyInstruction call(), and replaces it with a check for
single-argument phis (which can occur in canonical IR in LCSSA
form), which is the only useful simplification case I was able to
identify.

At O3 the geomean CTMark improvement is -1.7%. The largest
improvement is SPASS with ThinLTO at -6%.

In test-suite, I see only two tests with a hash difference and
no code size difference (PAQ8p, Ptrdist), which indicates that
the simplification only ends up being useful very rarely. (I would
have liked to figure out which simplification is responsible here,
but wasn't able to spot it looking at transformation logs.)

The AMDGPU test case that is update was using two selects with
undef condition, in which case GetUnderlyingObject will return
the first select operand as the underlying object. This will of
course not happen with non-undef conditions, so this was not
testing anything realistic. Additionally this illustrates potential
unsoundness: While GetUnderlyingObject will pick the first operand,
the select might be later replaced by the second operand, resulting
in inconsistent assumptions about the undef value.

Differential Revision: https://reviews.llvm.org/D82261
2020-06-21 16:31:07 +02:00
Sanjay Patel 2ad42c2653 [ValueTracking] improve analysis for fdiv with same operands
(The 'nnan' variant of this pattern is already tested to produce '1.0'.)

https://alive2.llvm.org/ce/z/D4hPBy

define i1 @src(float %x, i32 %y) {
%0:
  %d = fdiv float %x, %x
  %uge = fcmp uge float %d, 0.000000
  ret i1 %uge
}
=>
define i1 @tgt(float %x, i32 %y) {
%0:
  ret i1 1
}
Transformation seems to be correct!
2020-06-21 09:07:59 -04:00
Simon Pilgrim fb9f9dc318 [X86][SSE] Add SimplifyDemandedVectorEltsForTargetShuffle to handle target shuffle variable masks
Pulled out from the ongoing work on D66004, currently we don't do a good job of simplifying variable shuffle masks that have already lowered to constant pool entries.

This patch adds SimplifyDemandedVectorEltsForTargetShuffle (a custom x86 helper) to first try SimplifyDemandedVectorElts (which we already do) and then constant pool simplification to help mark undefined elements.

To prevent lowering/combines infinite loops, we only handle basic constant pool loads instead of creating new BUILD_VECTOR nodes for lowering - e.g. we don't try to convert them to broadcast/vzext_load - there might be some benefit to this but if so I'd rather we come up with some way to reuse existing code than reimplement a lot of BUILD_VECTOR code.

Differential Revision: https://reviews.llvm.org/D81791
2020-06-21 11:16:07 +01:00
clfbbn 10b0539772 [Attributor][NFC] Fix indentation
Summary: The patch D81022 seems to break the indentation of the `cleanupIR()` function. This patch fixes this problem

Reviewers: jdoerfert, sstefan1, uenoku

Reviewed By: jdoerfert

Subscribers: hiraditya, uenoku, kuter, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82260
2020-06-21 15:43:32 +08:00
Wenlei He 7c8a6936bf [Remarks] Add callsite locations to inline remarks
Summary:
Add call site location info into inline remarks so we can differentiate inline sites.
This can be useful for inliner tuning. We can also reconstruct full hierarchical inline
tree from parsing such remarks. The messege of inline remark is also tweaked so we can
differentiate SampleProfileLoader inline from CGSCC inline.

Reviewers: wmi, davidxl, hoy

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D82213
2020-06-20 23:32:10 -07:00
Amy Kwan cc95635b1b [PowerPC][Power10] Implement Vector Clear Left/Rightmost Bytes Builtins in LLVM/Clang
This patch implements builtins for the following prototypes:
```
vector signed char vec_clrl (vector signed char a, unsigned int n);
vector unsigned char vec_clrl (vector unsigned char a, unsigned int n);
vector signed char vec_clrr (vector signed char a, unsigned int n);
vector signed char vec_clrr (vector unsigned char a, unsigned int n);
```

Differential Revision: https://reviews.llvm.org/D81707
2020-06-20 18:29:16 -05:00
Eric Christopher dc20419351 Rename function to more accurately reflect what it does. 2020-06-20 14:37:29 -07:00
Eric Christopher 8116d01905 Typos around a -> an. 2020-06-20 14:04:48 -07:00
Sanjay Patel 741e20f3d6 [VectorCombine] fix assert for type of compare operand
As shown in the post-commit comment for D81661 - we need to
loosen the type assertion to allow scalarization of a compare
for vectors of pointers.
2020-06-20 15:20:17 -04:00
Sanjay Patel 7b201bfcac [InstCombine] remove unused parameter and add assert; NFC 2020-06-20 11:47:00 -04:00
Sanjay Patel d84cdb81ed [InstCombine] fabs(X) / fabs(X) -> X / X
Also, consolidate related folds so we don't miss/repeat these.
2020-06-20 10:20:21 -04:00
Simon Pilgrim 89dcbdfcfd [X86] combineSetCCMOVMSK - consistently use CmpBits variable. NFCI.
The comparison value should be the same size - I've added an assert to be absolutely certain.
2020-06-20 12:35:24 +01:00
Simon Pilgrim 56a9332328 [X86][SSE] Fold MOVMSK(PCMPEQ(X,0)) != -1 -> !PTESTZ(X,X) allof patterns 2020-06-20 12:17:32 +01:00
Nikita Popov d3d4e4bcb7 [LVI] Extract addValueHandle() method (NFC)
There will be more places registering value handles.
2020-06-20 13:05:42 +02:00
Nikita Popov 64ecf85f63 [LVI] Use find_as() where possible (NFC)
This prevents us from creating temporary PoisoningVHs and
AssertingVHs while performing hashmap lookups. As such, it only
matters in assertion-enabled builds.
2020-06-20 13:05:42 +02:00
Florian Hahn 9a7d80a32c Revert "[BasicAA] Use known lower bounds for index values for size based check."
This potentially related to https://bugs.llvm.org/show_bug.cgi?id=46335
and causes a slight compile-time regression. Revert while investigating.

This reverts commit d99a1848c4.
2020-06-20 10:06:05 +01:00
Eric Christopher 10563e16aa [Analysis/Transforms/Sanitizers] As part of using inclusive language
within the llvm project, migrate away from the use of blacklist and
whitelist.
2020-06-20 00:42:26 -07:00
Eric Christopher 858d385578 As part of using inclusive language within the llvm project,
migrate away from the use of blacklist and whitelist.
2020-06-20 00:24:57 -07:00
Eric Christopher cf23852587 [Target] As part of using inclusive language within the llvm project,
migrate away from the use of blacklist and whitelist.

This change affects an internal llvm command line option.
2020-06-20 00:06:39 -07:00
Xing GUO 6770349592 [DWARFYAML][debug_info] Fix array index out of bounds error
This patch is trying to fix the array index out of bounds error. I observed it in (https://reviews.llvm.org/harbormaster/unit/view/99638/).

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D82139
2020-06-20 15:08:24 +08:00
Craig Topper c721bc081e [X86] Correct the implementation of ud1(a.k.a. ud2b) instruction.
We were missing the modrm byte this instruction has according
to current Intel SDM. Experiments with gcc indicate that different
modrm values are chosen based on 2 operands so I've added those
as well.

I think our previous implementation was based on an older behavior of
binutils that has since been changed.
2020-06-19 23:57:48 -07:00
Craig Topper 0dda5e4ce2 [X86] Ignore bits 2:0 of the modrm byte when disassembling lfence, mfence, and sfence.
These are documented as using modrm byte of 0xe8, 0xf0, and 0xf8
respectively. But hardware ignore bits 2:0. So 0xe9-0xef is treated
the same as 0xe8. Similar for the other two.

Fixing this required adding 8 new formats to the X86 instructions
to convey this information. Could have gotten away with 3, but
adding all 8 made for a more logical conversion from format to
modrm encoding.

I renumbered the format encodings to keep the register modrm
formats grouped together.
2020-06-19 22:24:24 -07:00
Fangrui Song 2a4317bfb3 [SanitizeCoverage] Rename -fsanitize-coverage-{white,black}list to -fsanitize-coverage-{allow,block}list
Keep deprecated -fsanitize-coverage-{white,black}list as aliases for compatibility for now.

Reviewed By: echristo

Differential Revision: https://reviews.llvm.org/D82244
2020-06-19 22:22:47 -07:00
Yevgeny Rouban 6429471e8b [IR] Convert profile metadata in createCallMatchingInvoke()
When an invoke instruction is converted to a call its
profile metadata is dropped because it has incompatible
format (see commit 16ad6eeb94).
This patch adds an attempt to convert profile data to
format of the call instruction. This used to work well
before the commit dcfa78a4cc.

Reviewers: reames
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82071
2020-06-20 12:10:31 +07:00
Wang Rui dd48c57da3 [Mips] Error if a non-immediate operand is used while an immediate is expected
The 32-bit type relocation (R_MIPS_32) cannot be used for instructions below:

ori $4, $4, start
ori $4, $4, (start - .)

We should print an error instead.

Reviewed By: atanasyan, MaskRay

Differential Revision: https://reviews.llvm.org/D81908
2020-06-19 22:08:59 -07:00
Vitaly Buka 3d8149db3c [StackSafety,NFC] Don't rerun on LiveIn change 2020-06-19 21:29:31 -07:00
Xing GUO 1cfdda57fa [ObjectYAML][ELF] Add support for emitting the .debug_info section.
This patch helps add support for emitting the .debug_info section to yaml2elf.

Reviewed By: jhenderson, grimar, MaskRay

Differential Revision: https://reviews.llvm.org/D82073
2020-06-20 12:13:01 +08:00
Carl Ritson 4a7de36afc [AMDGPU] Avoid use of V_READLANE into EXEC in SGPR spills
Always prefer to clobber input SGPRs and restore them after the
spill.  This applies to both spills to VGPRs and scratch.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D81914
2020-06-20 12:10:47 +09:00
romanova-ekaterina d7fad626e9 Error related to ThinLTO caching needs to be downgraded to a remark
This is a fix for PR #46392 (Diagnostic message (error) related to
ThinLTO caching needs to be downgraded to a remark).

There are diagnostic messages related to ThinLTO caching that contain
the word "error", but they are really just notices/remarks for users,
and they don't cause a build failure. The word "error" appearing can be
confusing to users, and may even cause deeper problems.

User's build system might be designed to interpret any error messages
(even a benign error message as the one above) reported by the compiler
as a build failure, thus causing the build to fail "needlessly". In
short, the term "error" in this diagnostic is misleading at best, and
may be causing build systems to fail at worst.

Differential Revision: https://reviews.llvm.org/D82138
2020-06-19 16:03:29 -07:00
Eric Christopher b6536e549d As part of using inclusive language within the llvm project,
migrate away from the use of blacklist and whitelist.
2020-06-19 15:12:18 -07:00
Heejin Ahn 83c26eae23 [WebAssembly] Remove TEEs when dests are unstackified
When created in RegStackify pass, `TEE` has two destinations, where
op0 is stackified and op1 is not. But it is possible that
op0 becomes unstackified in `fixUnwindMismatches` function in
CFGStackify pass when a nested try-catch-end is introduced, violating
the invariant of `TEE`s destinations.

In this case we convert the `TEE` into two `COPY`s, which will
eventually be resolved in ExplicitLocals.

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D81851
2020-06-19 14:55:21 -07:00
Martin Storsjö cdbd299800 [Support] Fix building for mingw on a case sensitive file system
This fixes cross building on a case sensitive file system after
2e613d2ded. (The official Windows
SDKs don't have self-consistent casing and can't be used as such on
case sentisive file systems without case fixups, while mingw headers
consistently use lower case.)
2020-06-20 00:39:22 +03:00
Amara Emerson 1feeecf224 [AArch64][GlobalISel] Make G_SEXT_INREG legal and add selection support.
We were defaulting to the lower action for this, resulting in SHL+ASHR
sequences. On AArch64 we can do this in one instruction for an arbitrary
extension using SBFM as we do for G_SEXT.

Differential Revision: https://reviews.llvm.org/D81992
2020-06-19 13:20:41 -07:00
Sanjay Patel 216a37bb46 [VectorCombine] refactor extract-extract logic; NFCI 2020-06-19 14:52:27 -04:00
Lang Hames bf783a6aa8 [JITLink] Display host -> target address mapping in debugging output.
This can be helpful for sanity checking JITLink memory manager behavior.
2020-06-19 10:05:02 -07:00
Sanjay Patel 6d864097a2 [VectorCombine] fix crash while transforming constants
This is a variation of the proposal in D82049 with an extra test.
2020-06-19 12:30:32 -04:00
Stanislav Mekhanoshin 2b87a44c49 [AMDGPU] Some formatting fixes. NFC. 2020-06-19 09:02:59 -07:00
Piotr Sobczak 6d9565d6d5 Revert "[AMDGPU] Select s_cselect"
This caused some failures detected by the buildbot with
expensive checks enabled.

This reverts commit 4067de569f.
2020-06-19 16:41:04 +02:00
dfukalov 129388ddc4 [AMDGPU][CostModel] Add fneg cost estimation
Summary: The estimation uses AMDGPUTargetLowering::isFNegFree()

Reviewers: rampitec

Reviewed By: rampitec

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82065
2020-06-19 17:31:35 +03:00
Piotr Sobczak 4067de569f [AMDGPU] Select s_cselect
Summary:
Add patterns to select s_cselect in the isel.

Handle more cases of implicit SCC accesses in si-fix-sgpr-copies
to allow new patterns to work.

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, asbirlea, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81925
2020-06-19 16:17:46 +02:00
Sjoerd Meijer 4aa893b8f2 [ARM][MVE] tail-predication: renamed internal option.
Renamed -force-tail-predication to -force-mve-tail-predication because
that's more descriptive and consistent.
2020-06-19 15:07:06 +01:00
Mikhail Maltsev 490f78c038 [ARM][BFloat] Implement lowering of bf16 load/store intrinsics
Reviewers: labrinea, dmgreen, pratlucas, LukeGeeson

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81486
2020-06-19 14:02:35 +00:00
Mikhail Maltsev 7526881246 [ARM][BFloat] Lowering of create/get/set/dup intrinsics
This patch adds codegen for the following BFloat
operations to the ARM backend:
* concatenation of bf16 vectors
* bf16 vector element extraction
* bf16 vector element insertion
* duplication of a bf16 value into each lane of a vector
* duplication of a bf16 vector lane into each lane

Differential Revision: https://reviews.llvm.org/D81411
2020-06-19 12:52:40 +00:00
Simon Pilgrim c143db3b10 [X86][SSE] combineHorizontalPredicateResult - improve all_of(X == 0) for vXi64 on pre-SSE41 targets
Without SSE41 we don't have the PCMPEQQ instruction, making cmp-with-zero reductions more complicated than necessary. We can compare as vXi32 (PCMPEQD) and tweak the MOVMSK comparison to test upper/lower DWORD comparisons.

This pre-fixes something that occurs with null tests for vectors of (64-bit) pointers such as in PR35129.
2020-06-19 11:43:25 +01:00
Vitaly Buka 0e1bdeafc9 [StackSafety,NFC] Fix comment 2020-06-19 03:11:13 -07:00
Tyker 67448a8ccc try to fix build bot after b7338fb1a6 2020-06-19 12:02:09 +02:00
Simon Pilgrim cad2038700 [X86][SSE] combineSetCCMOVMSK - fold MOVMSK(SHUFFLE(X,u)) -> MOVMSK(X)
If we're permuting ALL the elements of a single vector, then for allof/anyof MOVMSK tests we can avoid the shuffle entirely.
2020-06-19 10:57:52 +01:00
David Sherwood 584d0d5c17 [SVE] Fall back on DAG ISel at -O0 when encountering scalable types
At the moment we use Global ISel by default at -O0, however it is
currently not capable of dealing with scalable vectors for two
reasons:

1. The register banks know nothing about SVE registers.
2. The LLT (Low Level Type) class knows nothing about scalable
   vectors.

For now, the easiest way to avoid users hitting issues when using
the SVE ACLE is to fall back on normal DAG ISel when encountering
instructions that operate on scalable vector types.

I've added a couple of RUN lines to existing SVE tests to ensure
we can compile at -O0. I've also added some new tests to

  CodeGen/AArch64/GlobalISel/arm64-fallback.ll

that demonstrate we correctly fallback to DAG ISel at -O0 when
lowering formal arguments or translating instructions that involve
scalable vector types.

Differential Revision: https://reviews.llvm.org/D81557
2020-06-19 10:57:00 +01:00
David Sherwood 0dc28af219 [CodeGen,AArch64] Fix up warnings in performExtendCombine
Try to avoid calling getVectorNumElements() or relying upon the
TypeSize conversion to uin64_t.

Differential Revision: https://reviews.llvm.org/D81573
2020-06-19 10:34:51 +01:00
Vitaly Buka f224f3d0f2 [StackSafety] Add StackLifetime::isAliveAfter
This function is going to be added into StackSafety checks.
This patch uses function in ::print implementation to make sure
that it works as expected.
2020-06-19 02:32:17 -07:00
Vitaly Buka 306c257b00 [SafeStack,NFC] Print liveness for all instrunctions 2020-06-19 02:32:17 -07:00
Vitaly Buka 20b1094a04 [StackSafety,NFC] Replace map with vector
We don't need to lookup InstructionNumbering by number, so
we can use vector with index as assigned number.
2020-06-19 02:32:17 -07:00
Vitaly Buka 7b27c09f63 [StackSafety,NFC] Don't test terminators
Code does not track terminators and do not expose them through interface.
State there is just a state of the last instruction or entry.
So this information is just redundant and doesn't need to be tested.
2020-06-19 02:32:17 -07:00
Florian Hahn f9d8e33c32 [SCCP] Turn sext into zext for non-negative ranges.
This patch updates SCCP/IPSCCP to use the computed range info to turn
sexts into zexts, if the value is known to be non-negative. We already
to a similar transform in CorrelatedValuePropagation, but it seems like
we can catch a lot of additional cases by doing it in SCCP/IPSCCP as
well.

The transform is limited to ranges that are known to not include undef.

Currently constant ranges from conditions are treated as potentially
containing undef, due to PR46144. Once we flip this, the transform will
be more effective in practice.

Reviewers: efriedma, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D81756
2020-06-19 10:17:55 +01:00
Jay Foad 7cdf4326a8 [LiveIntervals] Fix early-clobber handling in handleMoveUp
Without this fix, handleMoveUp can create an invalid live range like
this:

[98904e,98908r:0)[98908e,227504r:1)

where the two segments overlap, but only because we have lost the "e"
(early-clobber) on the end point of the first segment.

Differential Revision: https://reviews.llvm.org/D82110
2020-06-19 10:17:04 +01:00
Tyker b7338fb1a6 [AssumeBundles] add cannonicalisation to the assume builder
Summary:
this reduces significantly the number of assumes generated without aftecting too much
the information that is preserved. this improves the compile-time cost
of enable-knowledge-retention significantly.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, asbirlea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79650
2020-06-19 10:32:26 +02:00
David Sherwood 7edc7f6edb [CodeGen] Fix SimplifyDemandedBits for scalable vectors
For now I have changed SimplifyDemandedBits and it's various callers
to assume we know nothing for scalable vectors and to ignore the
demanded bits completely. I have also done something similar for
SimplifyDemandedVectorElts. These changes fix up lots of warnings
due to calls to EVT::getVectorNumElements() for types with scalable
vectors. These functions are all used for optimisations, rather than
functional requirements. In future we can revisit this code if
there is a need to improve code quality for SVE.

Differential Revision: https://reviews.llvm.org/D80537
2020-06-19 07:59:35 +01:00
David Sherwood 9e811b0d93 [CodeGen] Fix ComputeNumSignBits for scalable vectors
When trying to calculate the number of sign bits for scalable vectors
we should just bail out for now and pretend we know nothing.

Differential Revision: https://reviews.llvm.org/D81093
2020-06-19 07:58:42 +01:00
Kristof Beyls d938ec4509 [AArch64] Avoid incompatibility between SLSBLR mitigation and BTI codegen.
A "BTI c" instruction only allows jumping/calling to using a BLR* instruction.
However, the SLSBLR mitigation changes a BLR to a BR to implement the
function call. Therefore, a "BTI c" check that passed before could
trigger after the BLR->BL change done by the SLSBLR mitigation.
However, if the register used in BR is X16 or X17, this trigger will not
fire (see ArmARM for further details).

Therefore, this patch simply changes the function stubs for the SLSBLR
mitigation from
__llvm_slsblr_thunk_x<N>:
    br x<N>
    SpeculationBarrier
to
__llvm_slsblr_thunk_x<N>:
    mov x16, x<N>
    br  x16
    SpeculationBarrier

Differential Revision: https://reviews.llvm.org/D81405
2020-06-19 06:21:54 +01:00
Ronak Chauhan 5bd33de9c8 [MC] Pass the symbol rather than its name to onSymbolStart()
Summary: This allows targets to also consider the symbol's type and/or address if needed.

Reviewers: scott.linder, jhenderson, MaskRay, aardappel

Reviewed By: scott.linder, MaskRay

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, aheejin, rupprecht, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82090
2020-06-19 09:30:12 +05:30
Francesco Petrogalli d32c134648 [llvm][SVE] Reg + reg addressing mode for LD1RO.
Reviewers: efriedma, sdesmalen

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80741
2020-06-19 03:56:10 +00:00
Nemanja Ivanovic 1fed131660 [PowerPC] Canonicalize shuffles to match more single-instruction masks on LE
We currently miss a number of opportunities to emit single-instruction
VMRG[LH][BHW] instructions for shuffles on little endian subtargets. Although
this in itself is not a huge performance opportunity since loading the permute
vector for a VPERM can always be pulled out of loops, producing such merge
instructions is useful to downstream optimizations.
Since VPERM is essentially opaque to all subsequent optimizations, we want to
avoid it as much as possible. Other permute instructions have semantics that can
be reasoned about much more easily in later optimizations.

This patch does the following:
- Canonicalize shuffles so that the first element comes from the first vector
  (since that's what most of the mask matching functions want)
- Switch the elements that come from splat vectors so that they match the
  corresponding elements from the other vector (to allow for merges)
- Adds debugging messages for when a shuffle is matched to a VPERM so that
  anyone interested in improving this further can get the info for their code

Differential revision: https://reviews.llvm.org/D77448
2020-06-18 21:54:22 -05:00
Carl Ritson 8f3b2c8aa3 AMDGPU/GlobalISel: Remove selection of MAD/MAC when not available
Add code to respect mad-mac-f32-insts target feature.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D81990
2020-06-19 10:30:19 +09:00
Vitaly Buka fcd67665a8 [StackSafety] Add "Must Live" logic
Summary:
Extend StackLifetime with option to calculate liveliness
where alloca is only considered alive on basic block entry
if all non-dead predecessors had it alive at terminators.

Depends on D82043.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82124
2020-06-18 16:53:37 -07:00
Nathan James 8b0df1c1a9
[NFC] Refactor Registry loops to range for 2020-06-19 00:40:10 +01:00
Vitaly Buka f672791e08 [StackSafety] Add pass for StackLifetime testing
Summary: lifetime.ll is a copy of SafeStack/X86/coloring2.ll

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: hiraditya, mgrang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82043
2020-06-18 16:34:18 -07:00
Matt Arsenault bbd78519f9 ARC: Enforce function alignment at code emission time
Don't do this in the MachineFunctionInfo constructor. Also, ensure the
alignment rather than overwriting it outright. I vaguely remember
there was another place to enforce the target minimum alignment, but I
couldn't find it (it's there for instructions).
2020-06-18 17:40:49 -04:00
Matt Arsenault 95605b784b AMDGPU/GlobalISel: Implement computeKnownAlignForTargetInstr
We probably need to move where intrinsics are lowered to copies to
make this useful.
2020-06-18 17:28:00 -04:00
Matt Arsenault b13f6b0fe0 BypassSlowDivision: Fix dropping debug info
I don't know anything about debug info, but this seems like more work
should be necessary. This constructs a new IRBuilder and reconstructs
the original divides rather than moving the original.

One problem this has is if a div/rem pair are handled, both end up
with the same debugloc. I'm not sure how to fix this, since this uses
a cache when it sees the same input operands again, which will have
the first instance's location attached.
2020-06-18 17:27:19 -04:00
Amy Kwan c45c161130 [PowerPC][Power10] Implement Parallel Bits Deposit/Extract Builtins in LLVM/Clang
This patch implements builtins for the following prototypes:

vector unsigned long long vec_pdep(vector unsigned long long, vector unsigned long long);
vector unsigned long long vec_pext(vector unsigned long long, vector unsigned long long __b);
unsigned long long __builtin_pdepd (unsigned long long, unsigned long long);
unsigned long long __builtin_pextd (unsigned long long, unsigned long long);

Revision Depends on D80758

Differential Revision: https://reviews.llvm.org/D80935
2020-06-18 16:23:56 -05:00
Matt Arsenault 7f8b2e1b91 GlobalISel: Pass LegalizerHelper to custom legalize callbacks
This was passing in all the parameters needed to construct a
LegalizerHelper in the custom legalization, when it's simpler to just
pass in the existing helper.

This is slightly more annoying to use in the common case where you
don't need the legalizer helper, but we could add back the common
parameters back in addition to the helper.

I didn't propagate this to all the internal target changes that this
logically implies, but did update a sample one for
legalizeMinNumMaxNum.

This is in preparation for moving AMDGPU load/store legalization
entirely into custom lowering. The current set of legalization actions
is really constraining and not really capable of expressing all the
actions needed to legalize loads/stores. In particular there's no way
to express when the memory access itself needs to change size vs. the
result type. There's also a lot of redundancy since the same
split/widen actions need to be applied in both vector and scalar
cases. All of the sub-cases logically belong as steps in the legalizer
helper, but it will be easier to consider everything at once in custom
lowering.
2020-06-18 17:17:38 -04:00
Christopher Tetreault 8d11ec66b6 [SVE] Remove calls to VectorType::getNumElements from Transforms/Utils
Reviewers: efriedma, c-rhodes, david-arm, Tyker, asbirlea

Reviewed By: david-arm

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82057
2020-06-18 13:39:14 -07:00
Alexandre Ganea 2ae0df5be7 [CodeView] Revert 8374bf4363 and 403f953792
This reverts:
8374bf4363 [CodeView] Fix generated command-line expansion in LF_BUILDINFO. Fix the 'pdb' entry which was previously a null reference, now an empty string.
403f953792 [CodeView] Add full repro to LF_BUILDINFO record

This is causing the lld/test/COFF/pdb-relative-source-lines.test to fail: http://lab.llvm.org:8011/builders/lld-x86_64-win/builds/1096/steps/test-check-all/logs/FAIL%3A%20lld%3A%3Apdb-relative-source-lines.test
And clang/test/CodeGen/debug-info-codeview-buildinfo.c fails as well: http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/33346/steps/ninja%20check%201/logs/FAIL%3A%20Clang%3A%3Adebug-info-codeview-buildinfo.c
2020-06-18 16:18:46 -04:00
Kirill Naumov 41d53194fb [BasicBlock] Added AnnotationWriter functionality to BasicBlock class
This functionality is very similar to Function compatibility with
AnnotationWriter. This change allows us to use AnnotationWriter with
BasicBlock through BB.print() method.

Reviewed-By: apilipenko
Differntial Revision: https://reviews.llvm.org/D81321
2020-06-18 19:49:58 +00:00
Sanjay Patel 46a285ad9e [IRBuilder] add/use wrapper to create a generic compare based on predicate type; NFC
The predicate can always be used to distinguish between icmp and fcmp,
so we don't need to keep repeating this check in the callers.
2020-06-18 15:47:06 -04:00
Davide Italiano 8cdd2a158c [SimplifyCFG] Update debug location when folding branch to common destination
Sometimes a dead block gets folded and the debug information is still
retained. This manifests as jumpy stepping in lldb, see the bugzilla PR
for an end-to-end C testcase.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46008

Differential Revision:  https://reviews.llvm.org/D82062
2020-06-18 12:33:32 -07:00
Michael Liao 2defe55722 [TTI] Expose isNoopAddrSpaceCast in TTI.
Reviewers: arsenm

Subscribers: wdng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82025
2020-06-18 14:40:47 -04:00
serge-sans-paille 4dd332723d Fix return status of LoopDistribute
Move code that may update the IR after precondition, so that if precondition
fail, the IR isn't modified.

Differential Revision: https://reviews.llvm.org/D81225
2020-06-18 20:13:18 +02:00
Matt Arsenault 779cba79ec AMDGPU: Remove mayLoad/mayStore from some side effecting intrinsics
These don't really modify any memory, and should not expect memory
operands.
2020-06-18 14:12:19 -04:00
Stanislav Mekhanoshin 6c7e1b16fa [AMDGPU] Added new encoding to getMCOpcodeGen
Nothing breaks yet, but all encodings shall be in the map.

Differential Revision: https://reviews.llvm.org/D81974
2020-06-18 10:11:33 -07:00
Arthur Eubanks 91ef930526 [GlobalOpt] Remove preallocated calls when possible
When possible (e.g. internal linkage), strip preallocated attribute off
parameters/arguments.
This requires removing the "preallocated" operand bundle from the call
site, replacing @llvm.call.preallocated.arg() with an alloca and a
bitcast to i8*, and removing the @llvm.call.preallocated.setup(). Since
@llvm.call.preallocated.arg() can be called multiple times with the same
arg index, we create an alloca per arg index.
We add a @llvm.stacksave() where the @llvm.call.preallocated.setup() was
and a @llvm.stackrestore() after the preallocated call to prevent the
stack from blowing up. This is valid because the argument would normally
not exist on the stack after the call before the transformation.

This does not currently handle all possible preallocated calls. We will
need to figure out where to put @llvm.stackrestore() in the cases where
there is no obvious place to put it, for example conditional
preallocated calls, invokes.

This sort of transformation may need to be moved to somewhere more
accessible to accomodate similar transformations (like inlining) in the
future.

Reviewers: efriedma, hans

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80951
2020-06-18 09:56:13 -07:00
Alexandros Lamprineas ecdf48f15b [ARM] Basic bfloat support
This patch adds basic support for BFloat in the Arm backend.
For now the code generation relies on fullfp16 being present.

Briefly:
* adds the bfloat scalar and vector types in the necessary register classes,
* adjusts the calling convention to cope with bfloat argument passing and return,
* adds codegen patterns for moves, loads and stores.

It's tested mostly by the intrinsic patches that depend on it (load/store, convert/copy).

The following people contributed to this patch:

 * Alexandros Lamprineas
 * Ties Stuij

Differential Revision: https://reviews.llvm.org/D81373
2020-06-18 17:26:24 +01:00
Simon Pilgrim 2474421398 [TargetLowering] SimplifyMultipleUseDemandedBits - drop already extended ISD::SIGN_EXTEND_INREG nodes.
If the source of the SIGN_EXTEND_INREG node is already sign extended, use the source directly.
2020-06-18 16:41:08 +01:00
Matt Arsenault 6f09bb7da2 AMDGPU: Don't pass MachineFunction if only the IR Function is used 2020-06-18 11:06:46 -04:00
Ayke van Laethem b4c91462e8
[AVR] Fix miscompilation of zext + add
Code like the following:

    define i32 @foo(i32 %a, i1 zeroext %b) addrspace(1) {
    entry:
      %conv = zext i1 %b to i32
      %add = add nsw i32 %conv, %a
      ret i32 %add
    }

Would compile to the following (incorrect) code:

    foo:
        mov     r18, r20
        clr     r19
        add     r22, r18
        adc     r23, r19
        sbci    r24, 0
        sbci    r25, 0
        ret

Those sbci instructions are clearly wrong, they should have been adc
instructions.

This commit improves codegen to use adc instead:

    foo:
        mov     r18, r20
        clr     r19
        ldi     r20, 0
        ldi     r21, 0
        add     r22, r18
        adc     r23, r19
        adc     r24, r20
        adc     r25, r21
        ret

This code is not optimal (it could be just 5 instructions instead of the
current 9) but at least it doesn't miscompile.

Differential Revision: https://reviews.llvm.org/D78439
2020-06-18 16:51:37 +02:00
Matt Arsenault 243303f8d7 Lanai: Remove unused method
This was depending on the MachineFunction at MachineFunctionInfo
construction, which will soon be disallowed.
2020-06-18 10:48:14 -04:00
Simon Pilgrim fe0a85faf4 [X86][SSE] Fold MOVMSK(PCMPEQ(X,0)) == -1 -> PTESTZ(X,X)
Allow combineSetCCMOVMSK to handle 'allof' X == 0 patterns to be replaced with PTESTZ

This is a preliminary patch before properly handling PR35129
2020-06-18 15:38:32 +01:00
Alexandre Ganea 8374bf4363 [CodeView] Fix generated command-line expansion in LF_BUILDINFO. Fix the 'pdb' entry which was previously a null reference, now an empty string.
Previously, the DIA SDK didn't like the empty reference in the 'pdb' entry.
2020-06-18 10:07:30 -04:00
Kamlesh Kumar 7622ea5835 [RISCV64] Emit correct lib call for fp(float/double) to ui/si
Since i32 is not legal in riscv64,
it always promoted to i64 before emitting lib call and
for conversions like float/double to int and float/double to unsigned int
wrong lib call was emitted. This commit fix it using custom lowering.

Differential Revision: https://reviews.llvm.org/D80526
2020-06-18 19:34:16 +05:30
Igor Kudrin 6853cc7221 [MC] Rename a misnamed function. NFC.
The patch renames MakeStartMinusEndExpr() to makeEndMinusStartExpr() to
better reflect an expression it creates and fix a naming style issue.

Differential Revision: https://reviews.llvm.org/D82079
2020-06-18 20:18:19 +07:00
Alexandre Ganea 403f953792 [CodeView] Add full repro to LF_BUILDINFO record
This patch adds some missing information to the LF_BUILDINFO which allows for rebuilding an .OBJ without any external dependency but the .OBJ itself (other than the compiler executable).

Some tools need this information to reproduce a build without any knowledge of the build system. The LF_BUILDINFO therefore stores a full path to the compiler, the PWD (which is the CWD at program startup), a relative or absolute path to the TU, and the full CC1 command line. The command line needs to be freestanding (not depend on any environment variable). In the same way, MSVC doesn't store the provided command-line, but an expanded version (somehow their equivalent of CC1) which is also freestanding.

For more information see PR36198 and D43002.

Differential Revision: https://reviews.llvm.org/D80833
2020-06-18 09:17:15 -04:00
Alexandre Ganea 24eff42ba4 [CodeView] Add TypeCollection::replaceType to replace type records post-merging
The API is not called in this patch. This is to simply/support https://reviews.llvm.org/D80833
2020-06-18 09:17:14 -04:00
Alexandre Ganea a45409d885 [Clang] Move clang::Job::printArg to llvm::sys::printArg. NFCI.
This patch is to support/simplify https://reviews.llvm.org/D80833
2020-06-18 09:17:13 -04:00
Florian Hahn 1669fddc9f [Matrix] Use alignment info when lowering loads/stores.
This patch updates LowerMatrixIntrinsics to preserve the alignment
specified at the original load/stores and the align attribute for the
pointer argument of the column.major.load/store intrinsics.

We can always use the specified alignment for the load of the first
column. For subsequent columns, the alignment may need to be reduced.

For ConstantInt strides, compute the offset for the start of the column in
bytes and use commonAlignment to get the largest valid alignment.

For non-ConstantInt strides, we need to take the common alignment of the
initial alignment and the element size in bytes.

Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, rjmccall

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D81960
2020-06-18 13:19:31 +01:00
Lucas Prates 92ad6d57c2 [ARM] Moving CMSE handling of half arguments and return to the backend
Summary:
As half-precision floating point arguments and returns were previously
coerced to either float or int32 by clang's codegen, the CMSE handling
of those was also performed in clang's side by zeroing the unused MSBs
of the coercer values.

This patch moves this handling to the backend's calling convention
lowering, making sure the high bits of the registers used by
half-precision arguments and returns are zeroed.

Reviewers: chill, rjmccall, ostannard

Reviewed By: ostannard

Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D81428
2020-06-18 13:16:29 +01:00
Lucas Prates a255931c40 [ARM] Supporting lowering of half-precision FP arguments and returns in AArch32's backend
Summary:
Half-precision floating point arguments and returns are currently
promoted to either float or int32 in clang's CodeGen and there's
no existing support for the lowering of `half` arguments and returns
from IR in AArch32's backend.

Such frontend coercions, implemented as coercion through memory
in clang, can cause a series of issues in argument lowering, as causing
arguments to be stored on the wrong bits on big-endian architectures
and incurring in missing overflow detections in the return of certain
functions.

This patch introduces the handling of half-precision arguments and returns in
the backend using the actual "half" type on the IR. Using the "half"
type the backend is able to properly enforce the AAPCS' directions for
those arguments, making sure they are stored on the proper bits of the
registers and performing the necessary floating point convertions.

Reviewers: rjmccall, olista01, asl, efriedma, ostannard, SjoerdMeijer

Reviewed By: ostannard

Subscribers: stuij, hiraditya, dmgreen, llvm-commits, chill, dnsampaio, danielkiss, kristof.beyls, cfe-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D75169
2020-06-18 13:15:13 +01:00
Paul Walker 4612f39120 [SVE] Add flag to specify SVE register size, using this to calculate legal vector types.
Adds aarch64-sve-vector-bits-{min,max} to allow the size of SVE
data registers (in bits) to be specified. This allows the code
generator to make assumptions it normally couldn't. As a starting
point this information is used to mark fixed length vector types
that can fit within the specified size as legal.

Reviewers: rengolin, efriedma

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80384
2020-06-18 12:11:16 +00:00
Sameer Sahasrabuddhe 7aad220795 [DA] conservatively mark the join of every divergent branch
For a loop, a join block is a block that is reachable along multiple
disjoint paths from the exiting block of a loop. If the exit condition
of the loop is divergent, then such join blocks must also be marked
divergent. This currently fails in some cases because not all join
blocks are identified correctly.

The workaround is to conservatively mark every join block of any
branch (not necessarily the exiting block of a loop) as divergent.

https://bugs.llvm.org/show_bug.cgi?id=46372

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D81806
2020-06-18 17:39:20 +05:30
Florian Hahn d88acd8f7d [Matrix] Preserve volatile when loading loads/stores.
Currently the matrix lowering turns volatile loads/stores into
non-volatile ones. This patch updates the lowering to preserve the
volatile bit.

Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, nicolasvasilache

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D81498
2020-06-18 12:14:19 +01:00
Jeremy Morse 3626eba11f [NFC][LiveDebugValues] Document how LiveDebugValues operates
We're missing a plain English explanation of how this pass is supposed
to operate -- add one to the file comment.

Differential Revision: https://reviews.llvm.org/D80929
2020-06-18 10:54:09 +01:00
Ayke van Laethem 15bf42d503
[AVR] Implement disassembly of 32-bit instructions
This needed two fixes:

  * 32-bit instructions were read in the wrong order. The machine code
    swaps the two 16-bit instruction words, which wasn't undone when
    decoding instructions.
  * Jump and call instructions don't encode the lowest address bit,
    which is always zero. Therefore, the address needed to be shifted by
    one to fix that.

Differential Revision: https://reviews.llvm.org/D81961
2020-06-18 11:26:58 +02:00
David Sherwood 7e30ef77f6 [CodeGen] Fix warnings in getVectorTypeBreakdown
Added NextPowerOf2() routine to TypeSize and rewritten the code
in getVectorTypeBreakdown to avoid warnings being generated.

Differential Revision: https://reviews.llvm.org/D81578
2020-06-18 09:54:16 +01:00
Florian Hahn 6d18c2067e [Matrix] Update load/store intrinsics.
This patch adjust the load/store matrix intrinsics, formerly known as
llvm.matrix.columnwise.load/store, to improve the naming and allow
passing of extra information (volatile).

The patch performs the following changes:
 * Rename columnwise.load/store to column.major.load/store. This is more
   expressive and also more in line with the naming in Clang.
 * Changes the stride arguments from i32 to i64. The stride can be
   larger than i32 and this makes things more uniform with the way
   things are handled in Clang.
 * A new boolean argument is added to indicate whether the load/store
   is volatile. The lowering respects that when emitting vector
   load/store instructions
 * MatrixBuilder is updated to require both Alignment and IsVolatile
   arguments, which are passed through to the generated intrinsic. The
   alignment is set using the `align` attribute.

The changes are grouped together in a single patch, to have a single
commit that breaks the compatibility. We probably should be fine with
updating the intrinsics, as we did not yet officially support them in
the last stable release. If there are any concerns, we can add
auto-upgrade rules for the columnwise intrinsics though.

Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, nicolasvasilache, rjmccall, ftynse

Reviewed By: anemet, nicolasvasilache

Differential Revision: https://reviews.llvm.org/D81472
2020-06-18 09:44:52 +01:00
David Sherwood 65912a9768 [CodeGen] Fix warnings in foldCONCAT_VECTORS
Instead of asserting the number of elements is the same, we should be
comparing the element counts instead. In addition, when looking at
concats of extract_subvectors it's fine to use getVectorMinNumElements()
for scalable vectors.

I discovered these warnings when compiling the structured loads tests in
this file:

  test/CodeGen/AArch64/sve-intrinsics-loads.ll

Differential Revision: https://reviews.llvm.org/D81936
2020-06-18 09:29:37 +01:00
serge-sans-paille f9c7e3136e Correctly report modified status for HWAddressSanitizer
Differential Revision: https://reviews.llvm.org/D81238
2020-06-18 10:27:44 +02:00
David Green 158e734af1 [ARM] Adjust AND/OR combines to not call isConstantSplat on i1 vectors. NFC.
The rearranges PerformANDCombine and PerformORCombine to try and make
sure we don't call isConstantSplat on any i1 vectors. As pointed out in
D81860 it may not be very well defined in those cases.
2020-06-18 08:25:44 +01:00
Kristof Beyls 832cfc7672 [IndirectThunks] Make generated MF structure as expected by all instruction selectors.
This also enables running the AArch64 SLSHardening pass with GlobalISel,
so add a test for that.

Differential Revision: https://reviews.llvm.org/D81403
2020-06-18 06:44:53 +01:00
Kristof Beyls 3f0cc96a96 [AArch64] SLSHardening: compute correct thunk name for X29.
The enum values for AArch64 registers are not all consecutive.
Therefore, the computation
  "__llvm_slsblr_thunk_x" + utostr(Reg - AArch64::X0)
is not always correct. utostr(Reg - AArch64::X0) will not generate the
expected string for the registers that do not have consecutive values in
the enum.
This happened to work for most registers, but does not for AArch64::FP
(i.e. register X29).
This can get triggered when the X29 is not used as a frame pointer.

Differential Revision: https://reviews.llvm.org/D81997
2020-06-18 06:36:49 +01:00
Xing GUO d261a1c0e0 [DWARFYAML][debug_abbrev] Make the abbreviation code optional.
This patch helps make the `Code` optional in abbreviations table.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D81826
2020-06-18 13:02:54 +08:00
Mehdi Amini 77b79d79c0 Remove "unused" member ModuleSlice from `struct OpenMPOpt`
This is fixing warning from clang:

 warning: private field 'ModuleSlice' is not used [-Wunused-private-field]
  SmallPtrSetImpl<Function *> &ModuleSlice;
                               ^

Differential Revision: https://reviews.llvm.org/D82027
2020-06-18 03:02:26 +00:00
Kang Zhang 58e19d465a [PowerPC] Don't convert Loop to CTR Loop for fp128 BinaryOperator
Summary:
For PPC BinaryOperator of fp128 will become libcall, we shouldn't
convert loop to CTR loop if the loop contain libCall.

But currently, in the PPCTTIImpl::mightUseCTR() function, we only deal
with BinaryOperator for ppc_fp128, don't deal with the fp128.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D81353
2020-06-18 02:54:19 +00:00
Xing GUO 1f391afbf4 [ObjectYAML][ELF] Add support for emitting the .debug_abbrev section.
This patch enables yaml2elf emit the .debug_abbrev section.

The generated .debug_abbrev is verified using `llvm-dwarfdump`.

Known issues that will be addressed later:
- Current implementation doesn't support generating multiple abbreviation tables in one .debug_abbrev section.

Reviewed By: jhenderson, grimar

Differential Revision: https://reviews.llvm.org/D81820
2020-06-18 10:50:38 +08:00
Esme-Yi ad6024e29f [PowerPC] Custom lower rotl v1i128 to vector_shuffle.
Summary: A bug is reported in bugzilla-45628, where the swap_with_shift case can’t be matched to a single HW instruction xxswapd as expected.
In fact the case matches the idiom of rotate. We have MatchRotate to handle an ‘or’ of two operands and generate a rot[lr] if the case matches the idiom of rotate. While PPC doesn’t support ROTL v1i128. We can custom lower ROTL v1i128 to the vector_shuffle. The vector_shuffle will be matched to a single HW instruction during the phase of instruction selection.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D81076
2020-06-18 01:32:23 +00:00
Sam Clegg 7ee758d691 [WebAssembly] MC: Fix for data aliases with offsets (getelementptr)
For some reason we hadn't seen such cases in the wild which makes
me think that clang and rustc don't generate these.  In the bug which
reproduces it only occurs with LTO so my guess is that some LTO pass
is creating this alias + gep.

See: https://github.com/emscripten-core/emscripten/issues/8731

Differential Revision: https://reviews.llvm.org/D79462
2020-06-17 16:25:50 -07:00
Matt Arsenault 5f5f566b26 AMDGPU: Don't use 16-bit FP inline constants in integer operands
It seems to be a hardware defect that the half inline constants do not
work as expected for the 16-bit integer operations (the inverse does
work correctly). Experimentation seems to show these are really
reading the 32-bit inline constants, which can be observed by writing
inline asm using op_sel to see what's in the high half of the
constant. Theoretically we could fold the high halves of the 32-bit
constants using op_sel.

The *_asm_all.s MC tests are broken, and I don't know where the script
to autogenerate these are. I started manually fixing it, but there's
just too many cases to fix. This also does break the
assembler/disassembler support for these values, and I'm not sure what
to do about it. These are still valid encodings, so it seems like you
should be able to use them in some way. If you wrote assembly using
them, you could have really meant it (perhaps to read the high bits
with op_sel?). The disassembler will print the invalid literal
constant which will fail to re-assemble. The behavior is also
different depending on the use context. Consider this example, which
was previously accepted and encoded using the inline constant:

  v_mad_i16 v5, v1, -4.0, v3
  ; encoding: [0x05,0x00,0xec,0xd1,0x01,0xef,0x0d,0x04]

In contexts where an inline immediate is required (such as on gfx8/9),
this will now be rejected. For gfx10, this will produce the literal
encoding and change the printed format:
  v_mad_i16 v5, v1, 0xc400, v3
  ; encoding: [0x05,0x00,0x5e,0xd7,0x01,0xff,0x0d,0x04,0x00,0xc4,0x00,0x00]

This is just another variation of the issue that we don't perfectly
handle round trip assembly/disassembly due to not tracking how
immediates were encoded. This doesn't matter much in practice, since
compilers don't emit the suboptimal encoding. I doubt any users are
relying on this behavior (although I did make use of the old behavior
to figure out what was wrong).

Fixes bug 46302.
2020-06-17 19:14:10 -04:00
Yonghong Song 89648eb16d [BPF] fix a bug for BTF pointee type pruning
In BTF, pointee type pruning is used to reduce cluttering
too many unused types into prog BTF. For example,
   struct task_struct {
      ...
      struct mm_struct *mm;
      ...
   }
If bpf program does not access members of "struct mm_struct",
there is no need to bring types for "struct mm_struct" to BTF.

This patch fixed a bug where an incorrect pruning happened.
The test case like below:
    struct t;
    typedef struct t _t;
    struct s1 { _t *c; };
    int test1(struct s1 *arg) { ... }

    struct t { int a; int b; };
    struct s2 { _t c; }
    int test2(struct s2 *arg) { ... }

After processing test1(), among others, BPF backend generates BTF types for
    "struct s1", "_t" and a placeholder for "struct t".
Note that "struct t" is not really generated. If later a direct access
to "struct t" member happened, "struct t" BTF type will be generated
properly.

During processing test2(), when processing member type "_t c",
BPF backend sees type "_t" already generated, so returned.
This caused the problem that "struct t" BTF type is never generated and
eventually causing incorrect type definition for "struct s2".

To fix the issue, during DebugInfo type traversal, even if a
typedef/const/volatile/restrict derived type has been recorded in BTF,
if it is not a type pruning candidate, type traversal of its base type continues.

Differential Revision: https://reviews.llvm.org/D82041
2020-06-17 15:13:46 -07:00
Eric Christopher a8dad30388 Revert "Remove unused class variable ModuleSlice." as it was
used in debug only code.

This reverts commit 07a1749081.
2020-06-17 14:45:17 -07:00
Eric Christopher 07a1749081 Remove unused class variable ModuleSlice. 2020-06-17 14:33:29 -07:00
Christopher Tetreault 8819202dfd [SVE] Eliminate bad VectorType::getNumElements() calls from ConstantFold
Summary:
Assume all usages of this function are explicitly fixed-width operations
and cast to FixedVectorType

Reviewers: efriedma, sdesmalen, c-rhodes, majnemer, dblaikie

Reviewed By: sdesmalen

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80262
2020-06-17 14:19:56 -07:00
Christopher Tetreault 4b776a98f1 [SVE] Fix invalid usages of getNumElements in ShuffleVectorInstruction
Summary:
Fix invalid usages of getNumElements identified by test case
LLVM.Transforms/InstCombine::vscale_extractelement.ll.

changesLength: Since the length of the llvm::SmallVector shufflemask
is related to the minimum number of elements in a scalable vector, it is
fine to just get the Min field of the ElementCount

isIdentityWithExtract: Since it is not possible to express the mask
needed for this pattern for scalable vectors, we can just bail before
calling getNumElements()

Reviewers: efriedma, sdesmalen, fpetrogalli, gchatelet, yrouban, craig.topper

Reviewed By: sdesmalen

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81969
2020-06-17 13:45:34 -07:00
Roman Lebedev 84b4f5a6a6
[InstCombine] Negator: while there, add detection for cycles during negation
I don't have any testcases showing it happening,
and i haven't succeeded in creating one,
but i'm also not positive it can't ever happen,
and i recall having something that looked like
that in the very beginning of Negator creation.

But since we now already have a negation cache,
we can now detect such cases practically for free.

Let's do so instead of "relying" on stack overflow :D
2020-06-17 22:47:20 +03:00
Roman Lebedev e3d8cb1e1d
[InstCombine] Negator: cache negation results (PR46362)
It is possible that we can try to negate the same value multiple times.
For example, PHI nodes may happen to have multiple incoming values
(all of which must be the same value) for the same incoming basic block.
It may happen that we try to negate such a PHI node, and succeed,
and that might result in having now-different incoming values..

To avoid that, and in general to reduce the amount of duplicated
work we might be doing, let's introduce a cache where
we'll track results of negating each value.

The added test was previously failing -verify after -instcombine.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46362
2020-06-17 22:47:20 +03:00
Roman Lebedev c4166f3d84
[NFC][InstCombine] Negator: add thin negate() wrapped before visit() 2020-06-17 22:47:20 +03:00
Roman Lebedev 2b85147337
[NFC][InstCombine] Negator: do not include unneeded "llvm/IR/DerivedTypes.h" header 2020-06-17 22:47:19 +03:00
Thomas Lively 49754dcf22 [WebAssembly] Fix bug in FixBrTables and use branch analysis utils
Summary:
This commit fixes a bug in the FixBrTables pass in which an
unconditional branch from the switch header block to the jump table
block was not removed before the blocks were combined. The result was
an invalid CFG in the MachineFunction. This commit also switches from
using bespoke branch analysis and deletion code to using the standard
utilities for the same.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81909
2020-06-17 12:34:45 -07:00
Nick Desaulniers e7816f263b [InlineSpiller] add assert about spills post terminators
Summary:
This invariant is being violated in the test case
https://reviews.llvm.org/D77849, related to the use of the relatively
new ability for callbr to have return values, and MachineBasicBlocks
with INLINEASM_BR terminators to emit live out register defs.

As noted in the comment, this triggers invariant violations in
MachineVerifier via `llc -verify-machineinstrs` or
`llc -verify-regalloc`, since only MachineInstrs that are terminators
are allowed to follow the first terminator.

https://reviews.llvm.org/D75098 may rework this very assertion if we're
spilling via a (proposed) TCOPY MachineInstr.

Reviewers: void, efriedma, arsenm

Reviewed By: efriedma

Subscribers: qcolombet, wdng, hiraditya, llvm-commits, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78166
2020-06-17 11:51:58 -07:00
Nick Desaulniers 88c965ba14 BreakCriticalEdges for callbr indirect dests
Summary:
llvm::SplitEdge was failing an assertion that the BasicBlock only had
one successor (for BasicBlocks terminated by CallBrInst, we typically
have multiple successors).  It was surprising that the earlier call to
SplitCriticalEdge did not handle the critical edge (there was an early
return).  Removing that triggered another assertion relating to creating
a BlockAddress for a BasicBlock that did not (yet) have a parent, which
is a simple order of operations issue in llvm::SplitCriticalEdge (a
freshly constructed BasicBlock must be inserted into a Function's basic
block list to have a parent).

Thanks to @nathanchance for the report.
Fixes: https://github.com/ClangBuiltLinux/linux/issues/1018

Reviewers: craig.topper, jyknight, void, fhahn, efriedma

Reviewed By: efriedma

Subscribers: eli.friedman, rnk, efriedma, fhahn, hiraditya, llvm-commits, nathanchance, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81607
2020-06-17 11:45:06 -07:00
Davide Italiano 1cbaf847ab [CGP] Reset the debug location when promoting zext(s).
When the zext gets promoted, it used to retain the original location,
which pessimizes the debugging experience causing an unexpected
jump in stepping at -Og.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46120 (which also
contains a full C repro).

Differential Revision:  https://reviews.llvm.org/D81437
2020-06-17 11:13:13 -07:00
Ian Levesque 7c7c8e0da4 [xray] Option to omit the function index
Summary:
Add a flag to omit the xray_fn_idx to cut size overhead and relocations
roughly in half at the cost of reduced performance for single function
patching.  Minor additions to compiler-rt support per-function patching
without the index.

Reviewers: dberris, MaskRay, johnislarry

Subscribers: hiraditya, arphaman, cfe-commits, #sanitizers, llvm-commits

Tags: #clang, #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D81995
2020-06-17 13:49:01 -04:00
Alexandre Ganea acb30f6856 [X86] For 32-bit targets, emit two-byte NOP when possible
In order to support hot-patching, we need to make sure the first emitted instruction in a function is a two-byte+ op. This is already the case on x86_64, which seems to always emit two-byte+ ops. However on 32-bit targets this wasn't the case.

PATCHABLE_OP now lowers to a XCHG AX, AX, (66 90) like MSVC does. However when targetting pentium3 (/arch:SSE) or i386 (/arch:IA32) targets, we generate MOV EDI,EDI (8B FF) like MSVC does. This is for compatiblity reasons with older tools that rely on this two byte pattern.

Differential Revision: https://reviews.llvm.org/D81301
2020-06-17 13:44:38 -04:00
Alexandre Ganea ad879b31f0 [X86] Change signature of EmitNops. NFC.
This is to support https://reviews.llvm.org/D81301.
2020-06-17 13:44:37 -04:00
Fangrui Song c8b082a3ab [llvm-cov gcov] Support clang<11 fake 4.2 format
Test cases are restored from a3bed4bd37
2020-06-17 10:17:15 -07:00
Michał Górny 5c621900a6 [llvm] [CommandLine] Do not suggest really hidden opts in nearest lookup
Skip 'really hidden' options when performing lookup of the nearest
option when invalid option was passed.  Since these options aren't even
documented in --help-hidden, it seems inconsistent to suggest them
to users.

This fixes clang-tools-extra test failures due to unexpected suggestions
when linking the tools to LLVM dylib (that provides more options than
the subset of LLVM libraries linked directly).

Differential Revision: https://reviews.llvm.org/D82001
2020-06-17 19:00:26 +02:00
Scott Linder 691ff4682f [AMDGPU] Skip CFIInstructions in SIInsertWaitcnts
Summary:
CFI emitted during PEI at the beginning of the prologue needs to apply
to any inserted waitcnts on function entry.

Reviewers: arsenm, t-tye, RamNalamothu

Reviewed By: arsenm

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D76881
2020-06-17 12:41:03 -04:00
vnalamot 2e28009981 [NFC] Move getAll{S,V}GPR{32,128} methods to SIFrameLowering
Summary:
Future patch needs some of these in multiple places.

The definitions of these can't be in the header and be eligible for
inlining without making the full declaration of GCNSubtarget visible.
I'm not sure what the right trade-off is, but I opted to not bloat
SIRegisterInfo.h

Reviewers: arsenm, cdevadas

Reviewed By: arsenm

Subscribers: RamNalamothu, qcolombet, jvesely, wdng, nhaehnle, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79878
2020-06-17 12:08:09 -04:00
sstefan1 7cfd267c51 [OpenMPOPT][NFC] Introducing OMPInformationCache.
Summary:
Introduction of OpenMP-specific information cache based on Attributor's `InformationCache`. This should make it easier to share information between them.

Reviewers: jdoerfert, JonChesterfield, hamax97, jhuber6, uenoku

Subscribers: yaxunl, hiraditya, guansong, uenoku, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81798
2020-06-17 16:56:45 +02:00
Jay Foad def2e4c47f [AMDGPU] Simplify GCNPassConfig::addOptimizedRegAlloc. NFC. 2020-06-17 15:56:15 +01:00
Simon Pilgrim a5f1f9c9b8 ScalarEvolution.h - reduce LoopInfo.h include to forward declarations. NFC.
Move ScalarEvolution::forgetLoopDispositions implementation to ScalarEvolution.cpp to remove the dependency.

Add implicit header dependency to source files where necessary.
2020-06-17 15:48:23 +01:00
Sjoerd Meijer d1522513d4 [ARM] Reimplement MVE Tail-Predication pass using @llvm.get.active.lane.mask
To set up a tail-predicated loop, we need to to calculate the number of
elements processed by the loop. We can now use intrinsic
@llvm.get.active.lane.mask() to do this, which is emitted by the vectoriser in
D79100. This intrinsic generates a predicate for the masked loads/stores, and
consumes the Backedge Taken Count (BTC) as its second argument. We can now use
that to reconstruct the loop tripcount, instead of the IR pattern match
approach we were using before.

Many thanks to Eli Friedman and Sam Parker for all their help with this work.

This also adds overflow checks for the different, new expressions that we
create: the loop tripcount, and the sub expression that calculates the
remaining elements to be processed. For the latter, SCEV is not able to
calculate precise enough bounds, so we work around that at the moment, but is
not entirely correct yet, it's conservative. The overflow checks can be
overruled with a force flag, which is thus potentially unsafe (but not really
because the vectoriser is the only place where this intrinsic is emitted at the
moment). It's also good to mention that the tail-predication pass is not yet
enabled by default.  We will follow up to see if we can implement these
overflow checks better, either by a change in SCEV or we may want revise the
definition of llvm.get.active.lane.mask.

Differential Revision: https://reviews.llvm.org/D79175
2020-06-17 15:17:42 +01:00
Kirill Naumov ea844c7520 Revert "[InlineCost] InlineCostAnnotationWriterPass introduced"
This reverts commit 37e06e8f5c.
2020-06-17 14:02:34 +00:00
Kirill Naumov dcf2a9f2ee Revert "[InlineCost] PrinterPass prints constants to which instructions are simplified"
This reverts commit 52b0db22f8.
2020-06-17 14:02:29 +00:00
Kirill Naumov 39a4505e34 Revert "[InlineCost] GetElementPtr with constant operands"
This reverts commit 34fba68d80.
2020-06-17 14:02:18 +00:00
Kirill Naumov 34fba68d80 [InlineCost] GetElementPtr with constant operands
If the GEP instruction contanins only constants as its arguments,
then it should be recognized as a constant. For now, there was
also added a flag to turn off this simplification if it causes
any regressions ("disable-gep-const-evaluation") which is off
by default. Once I gather needed data of the effectiveness of
this simplification, the flag will be deleted.

Reviewers: apilipenko, davidxl, mtrofin

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D81026
2020-06-17 13:40:19 +00:00
Kirill Naumov 52b0db22f8 [InlineCost] PrinterPass prints constants to which instructions are simplified
This patch enables printing of constants to see which instructions were
constant-folded. Needed for tests and better visiual analysis of
inliner's work.

Reviewers: apilipenko, mtrofin, davidxl, fedor.sergeev

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D81024
2020-06-17 13:40:18 +00:00
Kirill Naumov 37e06e8f5c [InlineCost] InlineCostAnnotationWriterPass introduced
This class allows to see the inliner's decisions for better
optimization verifications and tests. To use, use flag
"-passes="print<inline-cost>"".

Reviewers: apilipenko, mtrofin, davidxl, fedor.sergeev

Reviewed By: mtrofin

Differential revision: https://reviews.llvm.org/D81743
2020-06-17 13:40:17 +00:00
Benjamin Kramer df9a51dab3 Remove global std::strings. NFCI. 2020-06-17 14:29:42 +02:00
Sjoerd Meijer c1034d044a Follow up of rGe345d547a0d5, and attempt to pacify buildbot:
"error: 'get' is deprecated: The base class version of get with the scalable
argument defaulted to false is deprecated."

Changed VectorType::get() -> FixedVectorType::get().
2020-06-17 13:24:09 +01:00
Sjoerd Meijer e345d547a0 Recommit "[LV] Emit @llvm.get.active.lane.mask for tail-folded loops"
Fixed ARM regression test.

Please see the original commit message rG47650451738c for details.
2020-06-17 13:12:15 +01:00
David Green 076e08aa45 [LSR] Filter for postinc formulae
In more complicated loops we can easily hit the complexity limits of
loop strength reduction. If we do and filtering occurs, it's all too
easy to remove the wrong formulae for post-inc preferring accesses due
to it attempting to maximise register re-use. The patch adds an
alternative filtering step when the target is preferring postinc to pick
postinc formulae instead, hopefully lowering the complexity to below the
limit so that aggressive filtering is not needed.

There is also a change in here to stop considering existing addrecs as
free under postinc. We should already be modelling them as a reg so
don't want it to cause us to get the cost wrong. (I'm not sure that code
makes sense in general, but there are X86 tests specifically for it
where it seems to be helping so have left it around for the standard
non-post-inc case).

Differential Revision: https://reviews.llvm.org/D80273
2020-06-17 12:32:04 +01:00
Carl Ritson ac8a2f132b [AMDGPU] Fix failure in VCC spilling
Spills of VCC (SGPR64) will fail with new SGPR spill code,
because super register is not correctly resolved.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D81224
2020-06-17 20:11:15 +09:00
Benjamin Kramer 547b6da73c [CallPrinter] Remove static constructor.
No need to have std::string here. NFC.
2020-06-17 13:02:58 +02:00
Sam Parker 5bf0858c0b Return "[InstCombine] Simplify compare of Phi with constant inputs against a constant"
I originally reverted the patch because it was causing performance
issues, but now I think it's just enabling simplify-cfg to do
something that I don't want instead :)

Sorry for the noise.

This reverts commit 3e39760f8e.
2020-06-17 11:38:59 +01:00
Paul Walker 95db1e7fb9 [FileCheck] Implement * and / operators for ExpressionValue.
Subscribers: arichardson, hiraditya, thopre, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80915
2020-06-17 09:39:17 +00:00
Hans Wennborg 16ad6eeb94 [IR] Don't copy profile metadata in createCallMatchingInvoke()
The invoke instruction can have profile metadata with branch_weights,
which does not make sense for a call instruction and will be
rejected by the verifier.

Differential revision: https://reviews.llvm.org/D81996
2020-06-17 11:18:23 +02:00
serge-sans-paille 1cafd8a5d1 Fix LoopIdiomRecognize pass return status
Introduce an helper class to aggregate the cleanup in case of rollback.

Differential Revision: https://reviews.llvm.org/D81230
2020-06-17 11:12:03 +02:00
Sjoerd Meijer d4e183f686 Revert "[LV] Emit @llvm.get.active.mask for tail-folded loops"
This reverts commit 4765045173
while I investigate the build bot failures.
2020-06-17 10:09:54 +01:00
Max Kazantsev 4ac9a6902f [NFC] Add API for edge domination check in dom tree 2020-06-17 16:05:05 +07:00
Florian Hahn 773353be4e [SCCP] Move common code to simplify basic block to helper (NFC).
Reviewers: efriedma, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D81755
2020-06-17 10:03:43 +01:00
Sjoerd Meijer 4765045173 [LV] Emit @llvm.get.active.mask for tail-folded loops
This emits new IR intrinsic @llvm.get.active.mask for tail-folded vectorised
loops if the intrinsic is supported by the backend, which is checked by
querying TargetTransform hook emitGetActiveLaneMask.

This intrinsic creates a mask representing active and inactive vector lanes,
which is used by the masked load/store instructions that are created for
tail-folded loops. The semantics of @llvm.get.active.mask are described here in
LangRef:

https://llvm.org/docs/LangRef.html#llvm-get-active-lane-mask-intrinsics

This intrinsic is also used to provide a hint to the backend. That is, the
second argument of the intrinsic represents the back-edge taken count of the
loop. For MVE, for example, we use that to set up tail-predication, which is a
new form of predication in MVE for vector loops that implicitely predicates the
last vector loop iteration by implicitely setting active/inactive lanes, i.e.
the tail loop is predicated. In order to set up a tail-predicated vector loop,
we need to know the number of data elements processed by the vector loop, which
corresponds the the tripcount of the scalar loop, which we can now reconstruct
using @llvm.get.active.mask.

Differential Revision: https://reviews.llvm.org/D79100
2020-06-17 09:53:58 +01:00
Sjoerd Meijer 20835cff27 [TTI] Refactor emitGetActiveLaneMask
Refactor TTI hook emitGetActiveLaneMask and remove the unused arguments
as suggested in D79100.
2020-06-17 09:53:58 +01:00
Kirill Bobyrev 3847737fa4
[CallPrinter] Handle freq = 0 case
Improvement of the following revision:
bbc629ebd6

This might still be problematic if freq = 0, so it's better to check for
that.
2020-06-17 10:52:18 +02:00
Kirill Bobyrev bbc629ebd6
[CallPrinter] Fix maxFreq = 0 case
llvm::getHeatColor becomes a problem when maxFreq = 0 -> freq = 0 =>
log2(double(freq)) / log2(maxFreq) -> log2(0.) / log2(0.) which
results in illegal instruction on some architectures.

Problematic revision: https://reviews.llvm.org/D77172
2020-06-17 10:44:28 +02:00
Florian Hahn e4b58ea8c1 [MemDep] Also remove load instructions from NonLocalDesCache.
Currently load instructions are added to the cache for invariant pointer
group dependencies, but only pointer values are removed currently. That
leads to dangling AssertingVHs in the test case below, where we delete a
load from an invariant pointer group. We should also remove the entries
from the cache.

Fixes PR46054.

Reviewers: efriedma, hfinkel, asbirlea

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D81726
2020-06-17 09:36:53 +01:00
James Henderson b21794a91c [DebugInfo] Unify Cursor usage for all debug line opcodes
This is a natural extension of the previous changes to use the Cursor
class independently in the standard and extended opcode paths, and in
turn allows delaying error handling until the entire line has been
printed in verbose mode, removing interleaved output in some cases.

Reviewed by: MaskRay, JDevlieghere

Differential Revision: https://reviews.llvm.org/D81562
2020-06-17 09:19:24 +01:00
Vitaly Buka d812efb121 [SafeStack,NFC] Fix names after files move
Summary: Depends on D81831.

Reviewers: eugenis, pcc

Reviewed By: eugenis

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81832
2020-06-17 01:08:40 -07:00
Vitaly Buka 6754a0e2ed [SafeStack,NFC] Move SafeStackColoring code
Summary:
This code is going to be used in StackSafety.
This patch is file move with minimal changes. Identifiers
will be fixed in the followup patch.

Reviewers: eugenis, pcc

Reviewed By: eugenis

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81831
2020-06-17 01:07:47 -07:00
Jonas Paulsson d3f7448e3c [SystemZ] Bugfix in storeLoadCanUseBlockBinary().
Check that the MemoryVT of LoadA matches that of LoadB.

This fixes https://bugs.llvm.org/show_bug.cgi?id=46239.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D81671
2020-06-17 09:49:31 +02:00
Kang Zhang c2574dc9f7 [NFC]][PowerPC] Remove unused intrinsic for old CTR loop pass
Summary:

In the patch D62907 the PPC CTRLoops pass has been replaced by Generic
Hardware Loop pass, and it has imported some new intrinsic for Generic
Hardware Loop.

The old intrinsic used in PPC CTRLoops int_ppc_mtctr and
int_ppc_is_decremented_ctr_nonzero is been replaced by
int_set_loop_iterations and loop_decrement.

This patch is to remove above unused two instrinsic.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D81539
2020-06-17 07:06:46 +00:00
Serge Pavlov 2e613d2ded [Support] Get process statistics in ExecuteAndWait and Wait
The functions sys::ExcecuteAndWait and sys::Wait now have additional
argument of type pointer to structure, which is filled with process
execution statistics upon process termination. These are total and user
execution times and peak memory consumption. By default this argument is
nullptr so existing users of these function must not change behavior.

Differential Revision: https://reviews.llvm.org/D78901
2020-06-17 13:39:59 +07:00
Igor Kudrin ccbd7e8d46 [DebugInfo] Support parsing and dumping of DWARF64 macro units.
Differential Revision: https://reviews.llvm.org/D81844
2020-06-17 12:57:54 +07:00
Sameer Sahasrabuddhe d3963b3a5f [DA] propagate loop live-out values that get used in a branch
Values that are uniform within a loop but appear divergent to uses
outside the loop are "tainted" so that such uses are marked
divergent. But if such a use is a branch, then it's divergence needs
to be propagated. The simplest way to do that is to put the branch
back in the main worklist so that it is processed appropriately.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D81822
2020-06-17 09:21:00 +05:30
Itay Bookstein df9d64ed9c [IR] Add missing GlobalAlias copying of ThreadLocalMode attribute
Summary:
Previously, GlobalAlias::copyAttributesFrom did not preserve ThreadLocalMode,
causing incorrect IR generation in IR linking flows. This patch pushes the code
responsible for copying this attribute from GlobalVariable::copyAttributesFrom
down to GlobalValue::copyAttributesFrom so that it is shared by GlobalAlias.
Fixes PR46297.

Reviewers: tejohnson, pcc, hans

Reviewed By: tejohnson, hans

Subscribers: hiraditya, ibookstein, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81605
2020-06-16 20:15:27 -07:00
Matt Arsenault 3b34f3fcca AMDGPU/GlobalISel: Fix obvious bug in ported 32-bit udiv/urem
This was hidden by the IR expansion in AMDGPUCodeGenPrepare, which I
forgot to turn off.
2020-06-16 22:46:35 -04:00
Xing GUO 9aaa32cfcb [ObjectYAML][DWARF] Let writeVariableSizedInteger() return Error.
This patch helps change the return type of `writeVariableSizedInteger()` from `void` to `Error`.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D81915
2020-06-17 09:30:14 +08:00
Matt Arsenault c5c58fd6b5 AMDGPU: Remove intermediate DAG node for trig_preop intrinsic
We weren't doing anything with this, and keeping it would just add
more boilerplate for GlobalISel.
2020-06-16 21:06:25 -04:00
Christopher Tetreault 8e204f807b [SVE] Generalize size checks in Verifier to use getElementCount
Summary:
Attempts to call getNumElements on scalable vectors identified by test
LLVM.Other::scalable-vectors-core-ir.ll. Since these checks are all
attempting to find if two vectors are the same size, calling
getElementCount will only increase safety.

Reviewers: efriedma, aprantl, reames, kmclaughlin, sdesmalen

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81895
2020-06-16 16:03:36 -07:00
Aaron Smith 7e01675ea5 [SelectionDAG] Add MVT::bf16 to getConstantFP()
Summary:
This was probably overlooked in recent bfloat patches.
Needed to handle bf16 constants in SelectionDAG.

  ConstantFP:bf16<APFloat(0)>

Reviewers: stuij

Reviewed By: stuij

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81779
2020-06-16 15:10:05 -07:00
Fangrui Song 7f7cb79b57 [llvm-cov gcov] Don't suppress .gcov output if .gcda is corrupted
If .gcda is corrupted, gcov continues to produce a .gcov and just
assumes execution counts are zeros. This is reasonable, because the
program can corrupt its .gcda output. The code path should be similar to
the code path without .gcda.
2020-06-16 14:55:38 -07:00
Daniel Sanders e35ba09961 [gicombiner] Allow generated combiners to store additional members
Summary:
Adds the ability to add members to a generated combiner via
a State base class. In the current AArch64PreLegalizerCombiner
this is used to make Helper available without having to
provide it to every call.

As part of this, split the command line processing into a
separate object so that it still only runs once even though
the generated combiner is constructed more frequently.

Depends on D81862

Reviewers: aditya_nandakumar, bogner, volkan, aemerson, paquette, arsenm

Reviewed By: arsenm

Subscribers: jvesely, wdng, nhaehnle, kristof.beyls, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81863
2020-06-16 14:47:04 -07:00
Kirill Naumov 369d00df60 [CallPrinter] Adding heat coloring to CallPrinter
This patch introduces the heat coloring of the Call Printer which is based
on the relative "hotness" of each function. The patch is a part of sequence of
three patches, related to graphs Heat Coloring.
Another feature added is the flag similar to "-cfg-dot-filename-prefix",
which allows to write the graph into a named .pdf

Reviewers: rcorcs, apilipenko, davidxl, sfertile, fedor.sergeev, eraman, bollu

Differential Revision: https://reviews.llvm.org/D77172
2020-06-16 21:15:29 +00:00
Fangrui Song def2156389 [gcov] Add -i --intermediate-format
Between gcov 4.9~8, `gcov -i $file` prints coverage information to
$file.gcov in an intermediate text format (single file, instead of
$source.gcov for each source file).

lcov newer than 2019-05-24 detects -i support and uses it to increase
processing speed.  gcov 9 (GCC r265587) removed --intermediate-format
and -i was changed to mean --json-format. However, we consider this
format still useful and support it. geninfo (part of lcov) supports this
format even if we announce that we are compatible with gcov 9.0.0
2020-06-16 14:14:28 -07:00
Fangrui Song 4cd7ba7eca [gcov] Refactor llvm-cov gcov and add SourceInfo 2020-06-16 14:14:26 -07:00
Christopher Tetreault 616d8d942b [SVE] Eliminate calls to default-false VectorType::get() from AArch64
Reviewers: efriedma, c-rhodes, david-arm, samparker, greened

Reviewed By: efriedma

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81518
2020-06-16 13:53:25 -07:00
Christopher Tetreault b265cad93e [NFC] Bail out for scalable vectors before calling getNumElements
Summary:
Move the bail out logic to before constructing the Result and Lane
vectors. This is both potentially faster, and avoids calling
getNumElements on a potentially scalable vector

Reviewers: efriedma, sunfish, chandlerc, c-rhodes, fpetrogalli

Reviewed By: fpetrogalli

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81619
2020-06-16 13:41:29 -07:00
Christopher Tetreault 747486991c [SVE] Fix bad FixedVectorType cast in simplifyDivRem
Summary:
simplifyDivRem attempts to walk a VectorType elementwise. Ensure that it
only does so for FixedVectorType

Reviewers: efriedma, spatel, lebedev.ri, david-arm, kmclaughlin

Reviewed By: spatel, david-arm

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81856
2020-06-16 13:17:05 -07:00
Christopher Tetreault ff628f5f5e [SVE] Eliminate calls to default-false VectorType::get() from Vectorize
Reviewers: efriedma, fhahn, spatel, sdesmalen, kmclaughlin

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81521
2020-06-16 12:50:13 -07:00
Matt Arsenault e4f19d1dda GlobalISel: Fix not failing on widening G_INSERT_VECTOR_ELT
This doesn't actually handled type idx 0, but was reporting Legalized
on it. No test changes because nothing was trying to use this.
2020-06-16 15:48:57 -04:00
Ahsan Saghir 37e72f47a4 [PowerPC] Add -m[no-]power10-vector clang and llvm option
Summary: This patch adds command line option for enabling power10-vector support.

Reviewers: hfinkel, nemanjai, lei, amyk, #powerpc

Reviewed By: lei, amyk, #powerpc

Subscribers: wuzish, kbarton, hiraditya, shchenz, cfe-commits, llvm-commits

Tags: #llvm, #clang, #powerpc

Differential Revision: https://reviews.llvm.org/D80758
2020-06-16 14:47:35 -05:00
Matt Arsenault 8a3340d25d GlobalISel: Use early return and reduce indentation 2020-06-16 14:47:08 -04:00
Stanislav Mekhanoshin 3f0c9c1634 Fix ubsan error in tblgen with signed left shift
UBSAN complains when tblgen performs SHL of a negative
value.

Differential Revision: https://reviews.llvm.org/D81952
2020-06-16 11:15:09 -07:00
Hiroshi Yamauchi 6bc2b042f4 [TLI] Add four C++17 delete variants.
Summary:
delete(void*, unsigned int, align_val_t)
delete(void*, unsigned long, align_val_t)
delete[](void*, unsigned int, align_val_t)
delete[](void*, unsigned long, align_val_t)

Differential Revision: https://reviews.llvm.org/D81853
2020-06-16 11:12:02 -07:00
Sanjay Patel ed67f5e7ab [VectorCombine] scalarize compares with insertelement operand(s)
Generalize scalarization (recently enhanced with D80885)
to allow compares as well as binops.
Similar to binops, we are avoiding scalarization of a loaded
value because that could avoid a register transfer in codegen.
This requires 1 extra predicate that I am aware of: we do not
want to scalarize the condition value of a vector select. That
might also invert a transform that we do in instcombine that
prefers a vector condition operand for a vector select.

I think this is the final step in solving PR37463:
https://bugs.llvm.org/show_bug.cgi?id=37463

Differential Revision: https://reviews.llvm.org/D81661
2020-06-16 13:48:10 -04:00
Jessica Paquette 7caa9caa80 [AArch64][GlobalISel] Avoid creating redundant ubfx when selecting G_ZEXT
When selecting 32 b -> 64 b G_ZEXTs, we don't have to always emit the extend.

If the instruction feeding into the G_ZEXT implicitly zero extends the high
half of the register, we can just emit a SUBREG_TO_REG instead.

Differential Revision: https://reviews.llvm.org/D81897
2020-06-16 09:50:47 -07:00
Fangrui Song 4799fb63b5 [GlobalISel] Delete unused variable after r353432 2020-06-16 08:32:09 -07:00
Leandro Vaz 56262a74c3 Fix debug line info when line markers are present inside macros.
Compiling assembly files when newlines are reduced to line markers within a `.macro` context will generate wrong information in `.debug_line` section.
This patch fixes this issue by evaluating line markers within the macro scope but not when they are used and evaluated.

Reviewed By: probinson

Differential Revision: https://reviews.llvm.org/D80381
2020-06-16 16:13:11 +01:00
Luke Geeson 10b6567f49 [AArch64]: BFloat MatMul Intrinsics&CodeGen
This patch upstreams support for BFloat Matrix Multiplication Intrinsics
and Code Generation from __bf16 to AArch64. This includes IR intrinsics. Unittests are
provided as needed. AArch32 Intrinsics + CodeGen will come after this
patch.

This patch is part of a series implementing the Bfloat16 extension of
the
Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type, and its properties are specified in the Arm
Architecture
Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

The following people contributed to this patch:

Luke Geeson
 - Momchil Velikov
 - Mikhail Maltsev
 - Luke Cheeseman

Reviewers: SjoerdMeijer, t.p.northover, sdesmalen, labrinea, miyuki,
stuij

Reviewed By: miyuki, stuij

Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits,
llvm-commits, miyuki, chill, pbarrio, stuij

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80752

Change-Id: I174f0fd0f600d04e3799b06a7da88973c6c0703f
2020-06-16 15:23:30 +01:00
Luke Geeson 508a4764c0 [AArch64]: BFloat Load/Store Intrinsics&CodeGen
This patch upstreams support for ld / st variants of BFloat intrinsics
in from __bf16 to AArch64. This includes IR intrinsics. Unittests are
provided as needed.

This patch is part of a series implementing the Bfloat16 extension of
the
Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type, and its properties are specified in the Arm
Architecture
Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

The following people contributed to this patch:

 - Luke Geeson
 - Momchil Velikov
 - Luke Cheeseman

Reviewers: fpetrogalli, SjoerdMeijer, sdesmalen, t.p.northover, stuij

Reviewed By: stuij

Subscribers: arsenm, pratlucas, simon_tatham, labrinea, kristof.beyls,
hiraditya, danielkiss, cfe-commits, llvm-commits, pbarrio, stuij

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80716

Change-Id: I22e1dca2a8a9ec25d1e4f4b200cb50ea493d2575
2020-06-16 15:23:30 +01:00
Georgii Rymar 66fb3c39cb [DebugInfo/DWARF] - Report .eh_frame sections of version != 1.
Specification (https://refspecs.linuxbase.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/ehframechpt.html#AEN1349)
says that the value of Version field for .eh_frame should be 1.

Though we accept other values and might perform an attempt to read
it as a .debug_frame because of that, what is wrong.

This patch adds a version check.

Differential revision: https://reviews.llvm.org/D81469
2020-06-16 15:46:26 +03:00
Tyker d7deef1206 Revert "[AssumeBundles] add cannonicalisation to the assume builder"
This reverts commit 90c50cad19.
2020-06-16 14:34:55 +02:00
Ayke van Laethem 5aa8014ca8
[AVR] Remove faulty stack pushing behavior
An instruction like this will need to allocate some stack space for the
last parameter:

  %x = call addrspace(1) i16 @bar(i64 undef, i64 undef, i16 undef, i16 0)

This worked fine when passing an actual value (in this case 0). However,
when passing undef, no value was pushed to the stack and therefore no
push instructions were created. This caused an unbalanced stack leading
to interesting results.

This commit fixes that by replacing the push logic with a regular stack
adjustment and stack-relative load/stores. This is less efficient but at
least it correctly compiles the code.

I can think of a few improvements in the future:

  * The stack should have been adjusted in the function prologue when
    there are no allocas in the function.
  * Many (if not most) stack adjustments can be replaced by
    pushing/popping the values directly. Exactly like the previous code
    attempted but didn't do correctly.
  * Small stack adjustments can be done more efficiently with a few
    push/pop instructions (pushing/popping bogus values), both for code
    size and for speed.

All in all, as long as there are no allocas in the function I think that
it is almost always more efficient to emit regular push/pop
instructions. This is however left for future optimizations.

Differential Revision: https://reviews.llvm.org/D78581
2020-06-16 13:53:32 +02:00
Ayke van Laethem 3ab1c97e35
[AVR] Fix stack size in functions with a frame pointer
This patch fixes a bug in stack save/restore code. Because the frame
pointer was saved/restored manually (not by marking it as clobbered) the
StackSize variable was not updated accordingly. Most code still worked,
but code that tried to load a parameter passed on the stack did not.

This commit fixes this by marking the frame pointer as a
callee-clobbered register. This will let it be saved without any effort
in prolog/epilog code and will make sure the correct address is
calculated for loading parameters that are passed on the stack.

This approach is used by most other targets (such as X86, AArch64 and
RISC-V).

Differential Revision: https://reviews.llvm.org/D78579
2020-06-16 13:53:32 +02:00
David Green f269bb7da0 [ARM] Fix crash trying to generate i1 immediates
These code patterns attempt to call isVMOVModifiedImm on a splat of i1
values, leading to an unreachable being hit. I've guarded the call on a
more specific set of sizes, as i1 vectors are legal under MVE.

Differential Revision: https://reviews.llvm.org/D81860
2020-06-16 12:27:24 +01:00