Commit Graph

140493 Commits

Author SHA1 Message Date
Med Ismail Bennani a3aea0193d [llvm/DebugInfo] Simplify DW_OP_implicit_value condition (NFC)
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
2020-10-27 11:25:19 +01:00
Jay Foad 6539ebe97d [AMDGPU] Use DPP instead of Ext in a couple of class names. NFC. 2020-10-27 10:22:30 +00:00
Georgii Rymar 2d59ed4e62 [yaml2obj] - Add a way to override the sh_addralign field of a section.
Imagine the following declaration of a section:
```
Sections:
  - Name:         .dynsym
    Type:         SHT_DYNSYM
    AddressAlign: 0x1111111111111111
```

The aligment is large and yaml2obj reports an error currently:
"the desired output size is greater than permitted. Use the --max-size option to change the limit"

This patch implements the "ShAddrAlign" key, which is similar to other "Sh*" keys we have.
With it it is possible to override the `sh_addralign` field, ignoring the writing of alignment bytes.

Differential revision: https://reviews.llvm.org/D90019
2020-10-27 13:03:38 +03:00
Florian Hahn f067bc3c0a [LoopRotation] Allow loop header duplication if vectorization is forced.
-Oz normally does not allow loop header duplication so this loop wouldn't be
vectorized.  However the vectorization pragma should override this and allow
for loop rotation.

rdar://problem/49281061

Original patch by Adam Nemet.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D59832
2020-10-27 09:28:01 +00:00
Craig Topper f385823e04 [X86] Alternate implementation of D88194.
This uses PreprocessISelDAG to replace the constant before
instruction selection instead of matching opcodes after.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D89178
2020-10-27 00:20:03 -07:00
Wei Wang d602e79a81 [X86] Encode global address in small code model
In small code model, program and its symbols are linked in the lower 2 GB of
the address space. Try encoding global address even when the range is unknown
in such case.

Differential Revision: https://reviews.llvm.org/D89341
2020-10-26 23:14:06 -07:00
Max Kazantsev fdc845b361 [NFC] Factor away lambda's redundant parameter 2020-10-27 12:56:52 +07:00
Serguei Katkov b69919b537 [GVN LoadPRE] Add an option to disable splitting backedge
GVN Load PRE can split the backedge causing breaking the loop structure where the latch
contains the conditional branch with for example induction variable.

Different optimizations expect this form of the loop, so it is better to preserve it for some time.
This CL adds an option to control an ability to split backedge.

Default value is true so technically it is NFC and current behavior is not changed.

Reviewers: fedor.sergeev, mkazantsev, nikic, reames, fhahn
Reviewed By: mkazasntsev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D89854
2020-10-27 11:59:52 +07:00
Max Kazantsev c6ca26c0bf [IndVars] Remove monotonic checks with unknown exit count
Even if the exact exit count is unknown, we can still prove that this
exit will not be taken. If we can prove that the predicate is monotonic,
fulfilled on first & last iteration, and no overflow happened in between,
then the check can be removed.

Differential Revision: https://reviews.llvm.org/D87832
Reviewed By: apilipenko
2020-10-27 11:35:16 +07:00
Jonas Devlieghere c4ef3115b4 Fix calls to (p)read on macOS when size > INT32_MAX
On macOS, the read and pread syscalls return EINVAL when the number of
bytes to read exceeds INT32_MAX:

a449c6a3b8/bsd/kern/sys_generic.c (L355)

rdar://68751407

Differential revision: https://reviews.llvm.org/D90201
2020-10-26 20:51:44 -07:00
Arthur Eubanks 42f76e193b Reland [AlwaysInliner] Pass callee AAResults to InlineFunction()
Test copied from noalias-calls.ll with small changes.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D89609
2020-10-26 20:40:46 -07:00
Arthur Eubanks e5766f25c6 Use uint64_t for branch weights instead of uint32_t
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32.
To be more consistent everywhere and remove lots of casts from uint64_t
to uint32_t, use i64 for branch_weights.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D88609
2020-10-26 20:24:04 -07:00
Arthur Eubanks 4af5ba1726 Revert "[AlwaysInliner] Pass callee AAResults to InlineFunction()"
This reverts commit 504fbec7a6.

Test failure.
2020-10-26 20:23:38 -07:00
Bing1 Yu 2c08f1b4b6 [CostModel][X86] teach TTI calculate cost of chain of vector inserts/extracts more precisely and correctly:In each 128-lane, if there is at least one index is demanded and not all indices are demanded...
In each 128-lane, if there is at least one index is demanded and not all
indices are demanded and this 128-lane is not the first 128-lane of the
legalized-vector, then this 128-lane needs a extracti128;
If in each 128-lane, there is at least one index is demanded, this 128-lane
needs a inserti128.

The following cases will help you build a better understanding:
Assume we insert several elements into a v8i32 vector in avx2,
Case#1: inserting into 1th index needs vpinsrd + inserti128
Case#2: inserting into 5th index needs extracti128 + vpinsrd +
inserti128
Case#3: inserting into 4,5,6,7 index needs 4*vpinsrd + inserti128.

Reviewed By: pengfei, RKSimon

Differential Revision: https://reviews.llvm.org/D89767
2020-10-27 11:21:13 +08:00
Arthur Eubanks 504fbec7a6 [AlwaysInliner] Pass callee AAResults to InlineFunction()
Test copied from noalias-calls.ll with small changes.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D89609
2020-10-26 20:10:09 -07:00
Arthur Eubanks 3dd1c72458 Port -objc-arc-expand to NPM
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D90182
2020-10-26 20:05:10 -07:00
Arthur Eubanks 90c0b0d3d6 Port -objc-arc-apelim to NPM
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D90181
2020-10-26 20:01:46 -07:00
Chen Zheng 00e573cadb [LSR] fix typo in comments and rename for a new added hook. 2020-10-26 22:29:22 -04:00
Duncan P. N. Exon Smith ebb4ea1d53 IR: Simplify two loops walking ConstantDataSequential, NFC
Follow-up to b2b7cf39d5.

Differential Revision: https://reviews.llvm.org/D90198
2020-10-26 21:55:48 -04:00
Carl Ritson 7a880ab388 [AMDGPU] Move WQM Pass after MI Scheduler
Exec mask manipulation inserted by SIWholeQuadMode barriers to
instruction scheduling.  Move the entire pass after the machine
instruction scheduler and make changes so pass is correct for
non-SSA operation.  These changes should leave the pass still
usable pre-scheduler, although tests have be updated to reflect
post-scheduler results.

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D88081
2020-10-27 10:25:53 +09:00
TaWeiTu 0efbfa38ae [NPM] Port -slsr to NPM
`-separate-const-offset-from-gep` has not yet be ported, so some tests are not updated.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D90149
2020-10-27 09:21:40 +08:00
Duncan P. N. Exon Smith 52821f6a71 IR: Add a comment at missing std::make_unique calls from b2b7cf39d5, NFC 2020-10-26 21:18:34 -04:00
Gaurav Jain 17cdba61d4 [NFC] Use [MC]Register in RegAllocPBQP & RegisterCoalescer
Differential Revision: https://reviews.llvm.org/D90008
2020-10-26 17:13:32 -07:00
Amy Kwan 803cc3aff2 [PowerPC] Implement Set Boolean Condition Instructions
This patch implements the set boolean condition instructions introduced in
POWER10.

The set boolean condition instructions (set[n]bc[r]) are used during
the following situations:
- sign/zero/any extending i1 to an i32 or i64,
- reg+reg, reg+imm or floating point comparisons being sign/zero extended to i32 or i64,
- spilling CR bits (using the setnbc instruction)

Differential Revision: https://reviews.llvm.org/D87705
2020-10-26 18:42:51 -05:00
Adrian Prantl 5b3bf8b453 [DebugInfo] Expose Fortran array debug info attributes through DIBuilder.
The support of a few debug info attributes specifically for Fortran
arrays have been added to LLVM recently, but there's no way to take
advantage of them through DIBuilder. This patch extends
DIBuilder::createArrayType to enable the settings of those attributes.

Patch by Chih-Ping Chen!

Differential Revision: https://reviews.llvm.org/D89817
2020-10-26 16:23:36 -07:00
Rahman Lavaee 0b2f4cdf2b Explicitly check for entry basic block, rather than relying on MachineBasicBlock::pred_empty.
Sometimes in unoptimized code, we have dangling unreachable basic blocks with no predecessors. Basic block sections should be emitted for those as well. Without this patch, the included test fails with a fatal error in `AsmPrinter::emitBasicBlockEnd`.

Reviewed By: tmsriram

Differential Revision: https://reviews.llvm.org/D89423
2020-10-26 16:15:56 -07:00
Stanislav Mekhanoshin d176e13ca5 Fixed release build after D89170 2020-10-26 16:00:57 -07:00
Duncan P. N. Exon Smith b2b7cf39d5 IR: Clarify ownership of ConstantDataSequentials, NFC
Change `ConstantDataSequential::Next` to a
`unique_ptr<ConstantDataSequential>` and update `CDSConstants` to a
`StringMap<unique_ptr<ConstantDataSequential>>`, making the ownership
more obvious.

Differential Revision: https://reviews.llvm.org/D90083
2020-10-26 18:47:25 -04:00
Amy Huang 515973222e [CodeView] Emit static data members as S_CONSTANTs.
We used to only emit static const data members in CodeView as
S_CONSTANTS when they were used; this patch makes it so they are always emitted.

I changed CodeViewDebug.cpp to find the static const members from the
class debug info instead of creating DIGlobalVariables in the IR
whenever a static const data member is used.

Bug: https://bugs.llvm.org/show_bug.cgi?id=47580

Differential Revision: https://reviews.llvm.org/D89072
2020-10-26 15:30:35 -07:00
Stanislav Mekhanoshin 038d884a50 [AMDGPU] Use flat scratch instructions where available
The support is disabled by default. So far there is instruction
selection, spilling, and frame elimination. It also changes SP
from unswizzled to swizzled as used by flat scratch instructions,
so it cannot be mixed with MUBUF stack access.

At the very least missing:

- GlobalISel;
- Some optimizations in frame elimination in between vector
  and scalar ALU;
- It shall finally allow to always materialize frame index
  as an SGPR, but that is not implemented and frame elimination
  cannot handle it yet;
- Unaligned and/or multidword flat scratch shall work, but it
  is legalized now for MUBUF;
- Operand folding cannot optimize FI like with MUBUF yet;
- It will need scaling the value of the SP/FP in the DWARF
  expression to recover the unswizzled scratch address;

Differential Revision: https://reviews.llvm.org/D89170
2020-10-26 14:40:42 -07:00
Sriraman Tallam ad1b9daa4b Prepend "__uniq" to symbol names hash with -funique-internal-linkage-names.
Prepend the module name hash with a fixed string ".__uniq." which helps tools
that consume sampled profiles and attribute it to functions to understand
that this symbol belongs to a unique internal linkage type symbol.

Symbols with suffixes can result from various optimizations in the compiler.
Function Multiversioning, function splitting, parameter constant propogation,
unique internal linkage names.

External tools like sampled profile aggregators combine profiles from multiple
runs of a binary. They use various heuristics with symbols that have suffixes
to try and attribute the profile to the right function instance. For instance
multi-versioned symbols like foo.avx, foo.sse4.2, etc even though different
should be attributed to the same source function if a single function is
versioned, using attribute target_clones (supported in GCC but yet to land in
LLVM). Similarly, functions that are split (split part having a .cold suffix)
could have profiles for both the original and split symbols but would be
aggregated and attributed to the original function that was split.

Unique internal linkage functions however have different source instances and
the aggregator must not put them together but attribute it to the appropriate
function instance. To be sure that we are dealing with a symbol of a unique
internal linkage function, we would like to prepend the hash with a known
string ".__uniq." which these tools can check to understand the suffix type.

Differential Revision: https://reviews.llvm.org/D89617
2020-10-26 14:24:28 -07:00
Duncan P. N. Exon Smith d4c667c9af Avoid unnecessary uses of `MDNode::getTemporary`, NFC
This is a long-delayed follow-up to
5e5b85098d.

`TempMDNode` includes a bunch of machinery for RAUW, and should only be
used when necessary. RAUW wasn't being used in any of these cases... it
was just a placeholder for a self-reference.

Where the real node was using `MDNode::getDistinct`, just replace the
temporary argument with `nullptr`.

Where the real node was using `MDNode::get`, the `replaceOperandWith`
call was "promoting" the node to a distinct one implicitly due to
self-reference detection in `MDNode::handleChangedOperand`. The
`TempMDNode` was serving a purpose by delaying uniquing, but it's way
simpler to just call `MDNode::getDistinct` in the first place.

Note that using a self-reference at all in these places is a hold-over
from before `distinct` metadata existed. It was an old trick to create
distinct nodes. It would be intrusive to change, including bitcode
upgrades, etc., and it's harmless so I'm not sure there's much value in
removing it from existing schemas. After this commit it still has a tiny
memory cost (in the extra metadata operand) but no more overhead in
construction.

Differential Revision: https://reviews.llvm.org/D90079
2020-10-26 17:03:25 -04:00
Sanjay Patel 5a6e66ec72 [InstCombine] add folds for icmp+ctpop
https://alive2.llvm.org/ce/z/XjFPQJ

  define void @src(i64 %value) {
    %t0 = call i64 @llvm.ctpop.i64(i64 %value)
    %gt = icmp ugt i64 %t0, 63
    %lt = icmp ult i64 %t0, 64
    call void @use(i1 %gt, i1 %lt)
    ret void
  }

  define void @tgt(i64 %value) {
    %eq = icmp eq i64 %value, -1
    %ne = icmp ne i64 %value, -1
    call void @use(i1 %eq, i1 %ne)
    ret void
  }

  declare i64 @llvm.ctpop.i64(i64) #1
  declare void @use(i1, i1)
2020-10-26 16:48:56 -04:00
Sanjay Patel 437d7551c5 [InstCombine] reduce code duplication in icmp intrinsic folds; NFC 2020-10-26 16:48:56 -04:00
Nick Desaulniers 0b11d018cc [BitCode] decode nossp fn attr
I missed this in https://reviews.llvm.org/D87956.

Reviewed By: void

Differential Revision: https://reviews.llvm.org/D90177
2020-10-26 13:06:54 -07:00
Stanislav Mekhanoshin 00928a1956 Fix SROA with a PHI mergig values from a same block
This fixes the bug 47945. It is legal to have a PHI with values
from from the same block, but values must stay the same. In this
case it is illegal to merge different values.

Differential Revision: https://reviews.llvm.org/D89978
2020-10-26 12:58:27 -07:00
Evgeny Leviant a28388f95b [ARM][SchedModels] Move IsLDMBaseRegInListPred to ARMSchedule.td. NFC
This predicate is not specific to cortex-a57 and can be used in other processor
models as well.
2020-10-26 22:31:41 +03:00
Stanislav Mekhanoshin ad8131bb03 [AMDGPU] Fix VC warning about singed/unsigned comparison. NFC.
This is the warning reported in https://reviews.llvm.org/D89599
2020-10-26 11:55:57 -07:00
Joe Ellis bf60bb26ec [SVE] Fix TypeSize warning in llvm::getGEPInductionOperand
We do not need to use the implicit cast here. We can instead can rely on
a comparison between two TypeSize objects instead. This algorithm will
work fine with scalable vectors.

Reviewed By: DavidTruby

Differential Revision: https://reviews.llvm.org/D90146
2020-10-26 17:40:32 +00:00
Joe Ellis 0f83505593 [SVE][InstCombine] Fix TypeSize warning in canReplaceGEPIdxWithZero
The warning would fire when calling canReplaceGEPIdxWithZero on a GEP
whose source element type is a scalable vector. The size of scalable
vector types is not known, so this optimization cannot be performed.

This patch fixes the issue by:

- bailing out early in this routine if the GEP instruction's source
  element type is a scalable vector.

- making use of getFixedSize -- this removes the dependency on the
  deprecated interface.

Reviewed By: fpetrogalli

Differential Revision: https://reviews.llvm.org/D89968
2020-10-26 17:40:26 +00:00
Joe Ellis 467e5cf40f [SVE][AArch64] Fix TypeSize warning in loop vectorization legality
The warning would fire when calling isDereferenceableAndAlignedInLoop
with a scalable load. Calling isDereferenceableAndAlignedInLoop with a
scalable load would result in the use of the now deprecated implicit
cast of TypeSize to uint64_t through the overloaded operator.

This patch fixes this issue by:

- no longer considering vector loads as candidates in
  canVectorizeWithIfConvert. This doesn't make sense in the context of
  identifying scalar loads to vectorize.

- making use of getFixedSize inside isDereferenceableAndAlignedInLoop --
  this removes the dependency on the deprecated interface, and will
  trigger an assertion error if the function is ever called with a
  scalable type.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D89798
2020-10-26 17:40:04 +00:00
Evgeny Leviant e74f66125e [ARM][SchedModels] Convert IsLdstsoScaledNotOptimalPred to MCSchedPredicate
Differential revision: https://reviews.llvm.org/D90150
2020-10-26 20:22:41 +03:00
Evgeny Leviant a877bda397 Fix issue in cortex-a57 sched model
Differential revision: https://reviews.llvm.org/D90152
2020-10-26 20:16:40 +03:00
Benjamin Kramer b777d30496 [AMDGPU] Avoid unused variable warning in Release builds. NFC.
SIRegisterInfo.cpp:480:19: error: unused variable 'SOffset'
2020-10-26 18:11:57 +01:00
Peter Waller 5b742a0c10 [SVE][CodeGen][DAGCombiner] Fix TypeSize warning in redundant store elimination
The modified code in visitSTORE was missing a scalable vector check, and still
using the now deprecated implicit cast of TypeSize to uint64_t through the
overloaded operator. This patch fixes these issues.

This brings the logic in line with the comment on the context line immediately
above the added precondition.

Add a test in sve-redundant-store.ll that the warning is not triggered.

Differential Revision: https://reviews.llvm.org/D89701
2020-10-26 16:37:48 +00:00
Peter Waller 6536d6040f Revert "[SVE][CodeGen][DAGCombiner] Fix TypeSize warning in redundant store elimination"
This reverts commit 4604441386.

Reverting because it was not the intended version of the patch, which
follows this patch.
2020-10-26 16:37:00 +00:00
Peter Waller 4604441386 [SVE][CodeGen][DAGCombiner] Fix TypeSize warning in redundant store elimination
The modified code in visitSTORE was missing a scalable vector check, and still
using the now deprecated implicit cast of TypeSize to uint64_t through the
overloaded operator. This patch fixes these issues.

This brings the logic in line with the comment on the context line immediately
above the added precondition.

Add a test in Redundantstores.ll that the warning is not triggered.
2020-10-26 16:23:42 +00:00
Kazushi (Jam) Marukawa 9d0db405b5 [VE] Add vector shift instructions
Add VSLL/VSLD/VSRL/VSLA/VSLAX/VSRA/VSRAX/VSFA instructionss.  Add
additonal AsmParser for VSLD special operand.  Also add regression
tests.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D90143
2020-10-27 00:30:27 +09:00
Kazushi (Jam) Marukawa 83cb423c6e [VE] Add vector logical instructions
Add VAND/VOR/VXOE/VEQV/VLDZ/VPCNT/VBRV/VSEQ instrucitons and regression
tests.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D90141
2020-10-27 00:29:33 +09:00
Kazushi (Jam) Marukawa cfefef50c1 [VE] Support atomic store
Support atomic store instructions and add a regression test.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D90137
2020-10-27 00:28:11 +09:00