Commit Graph

394978 Commits

Author SHA1 Message Date
Kerry McLaughlin e484e1ae03 [SVE] Fix casts to <FixedVectorType> in truncateToMinimalBitwidths
Fixes more casts to `<FixedVectorType>` for the cases where the
instruction is an Insert/ExtractElementInst.

For fixed-width vectors, this part of truncateToMinimalBitwidths is tested by
AArch64/type-shrinkage-insertelt.ll. I attempted to write a test case for this part
of truncateToMinimalBitwidths which uses scalable vectors, but was unable to add
one. The tests in type-shrinkage-insertelt.ll rely, for instance, on scalarization
to create extractelement instructions, which is not possible for scalable vectors.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D106163
2021-07-26 13:44:51 +01:00
Alexey Bataev d7cb2a0796 Revert "[SLP]Fix costs calculations."
This reverts commit a053afed49 to fix
buildbots.
2021-07-26 05:42:34 -07:00
Caroline Concatto bf28111ebd [AArch64][SVE] Remove vector_splice from AddedComplexity pattern
The pattern for vector_splice with an Index greater than or equal to
zero was misplaced in the AddedComplexity = 1 pattern in the AArch64
tablegen file. This patch fixes it by removing vector_splice pattern
from inside AddedComplexity = 1.
2021-07-26 13:35:51 +01:00
Tres Popp 539437e288 [mlir] split type conversion to two lines for GCC's sake 2021-07-26 14:15:47 +02:00
Alexey Bataev a053afed49 [SLP]Fix costs calculations.
Need to fix several cost-related problems. The final type may be defined
incorrectly because of too early definition (we may end up with the wider
type), the CommonCost should not be redefined in ExtractElements
cost related calculations and the shuffle of the final insertelements
vectors should be calculated as a cost of single vector permutations
+ costs of two vector permutations for other n-1 incoming vectors.

Differential Revision: https://reviews.llvm.org/D106578
2021-07-26 04:37:22 -07:00
Paul Walker 8a8d01d58c [NFC] Change VFShape so it contains an ElementCount rather than separate VF and IsScalable properties.
Differential Revision: https://reviews.llvm.org/D106750
2021-07-26 12:25:46 +01:00
Philipp Krones 46c0366877 [Inliner] Make the CallPenalty configurable
Tests with multiple benchmarks, like Embench [1], showed that the
CallPenalty magic number has the most influence on inlining decisions
when optimizing for size.

On the other hand, there was no good default value for this parameter.
Some benchmarks profited strongly from a reduced call penalty. One
example is the picojpeg benchmark compiled for RISC-V, which got 6%
smaller with a CallPenalty of 10 instead of 12. Other benchmarks
increased in size, like matmult.

This commit makes the compromise of turning the magic number constant of
CallPenalty into a configurable value. This introduces the flag
`--inline-call-penalty`. With that flag users can fine tune the inliner
to their needs.

The CallPenalty constant was also used for loops. This commit replaces
the CallPenalty constant with a new LoopPenalty constant that is now
used instead.

This is a slimmed down version of https://reviews.llvm.org/D30899

[1]: https://github.com/embench/embench-iot

Differential Revision: https://reviews.llvm.org/D105976
2021-07-26 12:07:49 +01:00
Florian Hahn d995d63767 [VPlan] Use stored value from recipes for interleave groups.
Instead of getting the VPValue for the stored IR values through the
current plan, use the stored value of the recipes directly.

This way, the correct VPValues are used if the store recipes have been
modified in the VPlan and the IR value is not correct any longer. This
can happen, e.g. due to D105008.
2021-07-26 12:05:23 +01:00
Dylan Fleming 20b0fa91c9 [SVE] Add support for folding for select + masked loads
Add folds to instcombine to support the removal of the select instruction when the masked_load is guaranteed to zero the same lanes, i.e. select(mask, mload(,,mask,0), 0) -> mload(,,mask,0).

Patch originally authored by @paulwalker-arm

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D106376
2021-07-26 11:58:41 +01:00
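As a rough illustration of the fold described in 20b0fa91c9, the following Python sketch models a masked load and a select on plain lists and checks that the outer select is redundant when the passthru value is zero. The helper names `masked_load` and `select` are hypothetical stand-ins, not LLVM APIs, and the model ignores memory faults and poison values.

```python
def masked_load(memory, mask, passthru):
    """Return memory[i] for active lanes and passthru[i] for inactive lanes."""
    return [m if active else p for m, active, p in zip(memory, mask, passthru)]


def select(mask, if_true, if_false):
    """Lane-wise select: if_true[i] where mask[i] is set, else if_false[i]."""
    return [t if active else f for active, t, f in zip(mask, if_true, if_false)]


# Hypothetical sample data, chosen only for illustration.
memory = [10, 20, 30, 40]
mask = [True, False, True, False]
zeros = [0, 0, 0, 0]

# mload(,,mask,0): inactive lanes are already zero.
loaded = masked_load(memory, mask, zeros)

# select(mask, mload(,,mask,0), 0): zeroes the same lanes again.
selected = select(mask, loaded, zeros)

print(loaded)              # [10, 0, 30, 0]
print(selected == loaded)  # True
```

Because the load already writes zero into every inactive lane, the select changes nothing in this model, which is the property the fold relies on.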
Caroline Concatto 0bfc26e3a4 [SVE][AArch64] Improve code generation for vector_splice for Imm > 0
This patch implements vector_splice in tablegen for all cases when the
Immediate is positive and lower than the known minimum value of
a scalable vector.
Vector_splice can be implemented using SVE instruction EXT.
For instance :
    @llvm.experimental.vector.splice(Vector_1, Vector_2, Imm)
    @llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1) ==> <B, C, D, E>
        EXT  Vector_1, Vector_2, Imm              // Vector_1 = B, C, D + Vector_2 = E

Depends on D105633

Differential Revision: https://reviews.llvm.org/D106273
2021-07-26 11:45:46 +01:00
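The splice semantics used in the example from 0bfc26e3a4 can be sketched on fixed-length Python lists. This is only a model of the intrinsic's result, not of the EXT instruction or the tablegen pattern; the function name `vector_splice` and the index range check (-N <= Imm < N for N-element inputs) are assumptions made for this illustration.

```python
def vector_splice(v1, v2, imm):
    """Model vector splice on two equal-length lists.

    For imm >= 0 the result is the N-element window of v1 ++ v2 that starts
    at index imm. For imm < 0 it is the last -imm elements of v1 followed by
    the leading elements of v2.
    """
    n = len(v1)
    if len(v2) != n:
        raise ValueError("both vectors must have the same length")
    if not -n <= imm < n:
        raise ValueError("index out of range for this model")
    concat = list(v1) + list(v2)
    start = imm if imm >= 0 else n + imm
    return concat[start:start + n]


v1 = ["A", "B", "C", "D"]
v2 = ["E", "F", "G", "H"]
print(vector_splice(v1, v2, 1))  # ['B', 'C', 'D', 'E']
```

With Imm = 1 the window drops the first element of the first vector and takes one element from the second, matching the `<B, C, D, E>` result in the commit message.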
David Sherwood b2a5f0029f Fix test failures caused by 0aff1798b5 2021-07-26 11:40:26 +01:00
Caroline Concatto 73e4e9cd00 [AArch64][SVE] Improve code generation for vector_splice for Imm == -1
This patch implements vector_splice in tablegen for:
  a) when the immediate is equal to -1 (Imm == -1) and uses:
       INSR  +  LASTB
For instance :
@llvm.experimental.vector.splice(Vector_1, Vector_2, -1)
@llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, -1) ==> <D, E, F, G>
    LASTB  RegLast, Vector_1                 // RegLast = D
    INSR   Res, (Vector_2 >> 1), RegLast     // Res = D + E, F, G

Differential Revision: https://reviews.llvm.org/D105633
2021-07-26 11:25:01 +01:00
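The two-instruction sequence from 73e4e9cd00 can be modelled in a few lines of Python. This is a sketch under simplifying assumptions: `lastb` assumes an all-active predicate, so it simply returns the last element, and `insr` is modelled as shifting the vector up by one lane and placing the scalar in lane 0. The function names are hypothetical and only mirror the instruction mnemonics.

```python
def lastb(vector):
    """Model LASTB with an all-active predicate: return the last element."""
    return vector[-1]


def insr(vector, scalar):
    """Model INSR: shift every lane up by one and insert scalar into lane 0."""
    return [scalar] + list(vector[:-1])


def splice_minus_one(v1, v2):
    """Splice with Imm == -1: last element of v1, then leading elements of v2."""
    return insr(v2, lastb(v1))


v1 = ["A", "B", "C", "D"]
v2 = ["E", "F", "G", "H"]
print(splice_minus_one(v1, v2))  # ['D', 'E', 'F', 'G']
```

Note that in this model the shifted operand is the second vector, while the scalar comes from the first.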
Simon Pilgrim c8472db0a8 [X86][AVX] Prefer vinsertf128 to vperm2f128 on AVX1 targets
Splatting the lower xmm with vinsertf128 is at least as quick as vperm2f128, and a lot faster on some AMD targets.

First step towards PR50053
2021-07-26 11:11:56 +01:00
Simon Pilgrim f64e251560 [X86][SSE] Don't scrub address math from interleaved shuffle tests 2021-07-26 11:03:31 +01:00
Sam McCall e9274af718 Revert "[clangd] Avoid range-loop init-list lifetime subtleties."
This reverts commit 253b8145de.

This doesn't actually fix anything - I should stop guessing.
See https://github.com/clangd/clangd/issues/800 for update
2021-07-26 11:38:47 +02:00
Cullen Rhodes e6ff9179ce [AArch64][AsmParser] NFC: Parser.getTok().getLoc() -> getLoc()
Reviewed By: tmatheson

Differential Revision: https://reviews.llvm.org/D106635
2021-07-26 09:36:34 +00:00
David Sherwood 0aff1798b5 [Analysis] Add simple cost model for strict (in-order) reductions
I have added a new FastMathFlags parameter to getArithmeticReductionCost
to indicate what type of reduction we are performing:

  1. Tree-wise. This is the typical fast-math reduction that involves
  continually splitting a vector up into halves and adding each
  half together until we get a scalar result. This is the default
  behaviour for integers, whereas for floating point we only do this
  if reassociation is allowed.
  2. Ordered. This now allows us to estimate the cost of performing
  a strict vector reduction by treating it as a series of scalar
  operations in lane order. This is the case when FP reassociation
  is not permitted. For scalable vectors this is more difficult
  because at compile time we do not know how many lanes there are,
  and so we use the worst case maximum vscale value.

I have also fixed getTypeBasedIntrinsicInstrCost to pass in the
FastMathFlags, which meant fixing up some X86 tests where we always
assumed the vector.reduce.fadd/mul intrinsics were 'fast'.

New tests have been added here:

  Analysis/CostModel/AArch64/reduce-fadd.ll
  Analysis/CostModel/AArch64/sve-intrinsics.ll
  Transforms/LoopVectorize/AArch64/strict-fadd-cost.ll
  Transforms/LoopVectorize/AArch64/sve-strict-fadd-cost.ll

Differential Revision: https://reviews.llvm.org/D105432
2021-07-26 10:26:06 +01:00
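The difference between the two reduction kinds described in 0aff1798b5 can be seen with a small Python sketch. It is not the LLVM cost model; it only shows why an ordered floating-point reduction cannot be replaced by a tree-wise one without reassociation, and why the ordered form needs one scalar step per lane. The function names and the sample values are assumptions made for this illustration.

```python
def ordered_reduce(start, lanes):
    """Strict reduction: one scalar add per lane, in lane order."""
    acc = start
    for x in lanes:
        acc = acc + x
    return acc


def tree_reduce(lanes):
    """Tree-wise reduction: repeatedly add the low half to the high half."""
    vals = list(lanes)
    if not vals or len(vals) & (len(vals) - 1):
        raise ValueError("this sketch expects a power-of-two lane count")
    while len(vals) > 1:
        half = len(vals) // 2
        vals = [lo + hi for lo, hi in zip(vals[:half], vals[half:])]
    return vals[0]


# Hypothetical lanes where rounding makes the evaluation order visible.
lanes = [1e100, 1.0, -1e100, 1.0]

print(ordered_reduce(0.0, lanes))  # 1.0
print(tree_reduce(lanes))          # 2.0
```

The ordered version performs four dependent additions for four lanes, whereas the tree version finishes in two halving steps, which is the shape of the cost difference the commit message describes.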
Fraser Cormack f924a3d474 [SelectionDAG] Support scalable-vector splats in yet more cases
This patch extends support for (scalable-vector) splats in the
DAGCombiner via the `ISD::matchBinaryPredicate` function, which enables a
variety of simple combines of constants.

Users of this function may now have to distinguish between
`BUILD_VECTOR` and `SPLAT_VECTOR` vector operands. The way of dealing
with this in-tree follows the approach added for
`ISD::matchUnaryPredicate` implemented in D94501.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D106575
2021-07-26 10:15:08 +01:00
Kadir Cetinkaya 0a3c7960cb Revert "Revert D106562 "[clangd] Get rid of arg adjusters in CommandMangler""
This reverts commit 2aa0cf19e7.
Get rid of reference to the temporary.
2021-07-26 11:13:22 +02:00
Jon Chesterfield 2a613a7790 [libomptarget] Build amdgpu plugin without hsa
Default to building the amdgpu plugin to use dlopen when hsa is
not found instead of disabling it.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106600
2021-07-26 09:54:51 +01:00
Jon Chesterfield 93fe84d32f [libomptarget][nfc] Squash unused variable warning
Suppress only current warning on openmp-clang-x86_64-linux-debian

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106777
2021-07-26 09:54:31 +01:00
Vladislav Vinogradov eb6c63cb0b [mlir] Fix RankedTensorType::walkImmediateSubElements method
Add 'encoding' attribute visitor.
Without it ASM printer doesn't use attribute aliases for 'encoding'.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D105554
2021-07-26 11:49:25 +03:00
Guillaume Chatelet 47afd43eaa [libc] fix LibcUnitTestMain when building with shared libraries 2021-07-26 08:43:45 +00:00
Lang Hames cdcc354768 [ORC][ORC-RT] Add initial Objective-C and Swift support to MachOPlatform.
This allows ORC to execute code containing Objective-C and Swift classes and
methods (provided that the language runtime is loaded into the executor).
2021-07-26 18:02:01 +10:00
Marcel Koester 0425332015 [mlir] Added new RegionBranchTerminatorOpInterface and adapted uses of hasTrait<ReturnLike>.
This CL adds a new RegionBranchTerminatorOpInterface to query information about operands that can be
passed to successor regions. Similar to the BranchOpInterface, it allows the involved
operands to be freely defined. However, in contrast to the BranchOpInterface, it expects an additional region
number to distinguish between various use cases which might require different operands passed to
different regions.

Moreover, we added new utility functions (namely getMutableRegionBranchSuccessorOperands and
getRegionBranchSuccessorOperands) to query (mutable) operand ranges for operations equipped with the
ReturnLike trait and/or implementing the newly added interface.  This simplifies reasoning about
terminators in the scope of the nested regions.

We also adjusted the SCF.ConditionOp to benefit from the newly added capabilities.

Differential Revision: https://reviews.llvm.org/D105018
2021-07-26 06:39:31 +02:00
Michael Kruse ae6b400002 [Preprocessor] Implement -fminimize-whitespace.
This patch adds the -fminimize-whitespace flag with the following effects:

 * If combined with -E, removes as much non-line-breaking whitespace as
   possible.

 * If combined with -E -P, removes as much whitespace as possible,
   including line-breaks.

The motivation is to reduce the amount of insignificant changes in the
preprocessed output with source files where only whitespace has been
changed (add/remove comments, clang-format, etc.) which is in particular
useful with ccache.

A patch that uses this flag has been proposed to ccache as well:
https://github.com/ccache/ccache/pull/815, which will use
-fminimize-whitespace when clang-13 has been detected, and additionally
uses -P in "unify_mode". ccache already had a unify_mode in an older
version which was removed because of problems that using the
preprocessor itself does not have (such as the custom tokenizer not
recognizing C++11 raw strings).

This patch slightly reorganizes which part is responsible for adding
newlines that are required for semantics. It is now either
startNewLineIfNeeded() or MoveToLine() but never both; this avoids the
ShouldUpdateCurrentLine workaround and avoids redundant lines being
inserted in some cases. It also fixes a mandatory newline not inserted
after a _Pragma("...") that is expanded into a #pragma.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D104601
2021-07-25 23:30:57 -05:00
Yuanfang Chen 1558bb80c0 [Object] make SourceMgr available to MCContext during inline asm symbols collection

Fixes PR51210.
2021-07-25 21:23:03 -07:00
Esme-Yi 0d3e4d9d4d [Debug-Info][llvm-dwarfdump] Don't use DW_FORM_data4/8 to encode the constants for DW_AT_data_member_location.

Summary: In DWARF v3, DW_FORM_data4/8 in
DW_AT_data_member_location are interpreted as location
list pointers. Interpreting constants as pointers is
not expected, so we use DW_FORM_udata to encode the
constants.

Reviewed By: probinson

Differential Revision: https://reviews.llvm.org/D105687
2021-07-26 03:47:02 +00:00
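DW_FORM_udata, the form chosen in 0d3e4d9d4d, stores its value as an unsigned LEB128 number. The following Python sketch shows that encoding; the function name `encode_uleb128` is hypothetical, and the snippet illustrates the byte format only, not how LLVM emits the attribute.

```python
def encode_uleb128(value):
    """Encode a non-negative integer as unsigned LEB128 bytes."""
    if value < 0:
        raise ValueError("ULEB128 encodes unsigned values only")
    out = bytearray()
    while True:
        byte = value & 0x7F   # take the low seven bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow: set the high bit
        else:
            out.append(byte)         # final byte: high bit clear
            return bytes(out)


# A small member offset fits in a single byte.
print(encode_uleb128(8).hex())       # 08
print(encode_uleb128(624485).hex())  # e58e26
```

Small member offsets therefore take one byte, and the form is read as a constant rather than as a section offset.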
Mehdi Amini 3211eadfe0 Revert "Build libSupport with -Werror=global-constructors (NFC)"
This reverts commit 579cc9ad2e.
This breaks on Windows.
2021-07-26 03:08:26 +00:00
Mehdi Amini 579cc9ad2e Build libSupport with -Werror=global-constructors (NFC)
Ensure that libSupport does not carry any static global initializer.
libSupport can be embedded in use cases where we don't want to load all
cl::opt unless we want to parse the command line.
ManagedStatic can be used to enable lazy-initialization of globals.
2021-07-26 03:04:31 +00:00
Esme-Yi 2eb7e5f0ed [yaml2obj] Do not write the string table if there is no string entry.
Summary: yaml2obj shouldn't create the string table that isn't needed
         - doing so wastes time and disk space.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D106420
2021-07-26 02:37:49 +00:00
Dave Airlie 9451403c5f [OPENCL] opencl-c.h: add initial CL 3.0 conditionals for atomic operations.
This adds the optional wrappers around the atomic operations. However, this isn't yet sufficient for CL 3.0 without generic address space. I've got one more patch to add all those APIs, but this is an easier-to-review precursor.

Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D106111
2021-07-26 11:06:33 +10:00
Mehdi Amini df7d9c8cb0 Revert "Build libSupport with -Werror=global-constructors (NFC)"
This reverts commit 5eb2e9aa64.
This broke MacOS builds, needs to have a safer check guarding the flag
addition.
2021-07-26 00:55:36 +00:00
Mehdi Amini 5eb2e9aa64 Build libSupport with -Werror=global-constructors (NFC)
Ensure that libSupport does not carry any static global initializer.
libSupport can be embedded in use cases where we don't want to load all
cl::opt unless we want to parse the command line.
ManagedStatic can be used to enable lazy-initialization of globals.
2021-07-26 00:21:09 +00:00
Mehdi Amini 7d9a2c714c Remove the NotUnderValgrind caching flag
The motivation for this caching wasn't clear, remove it in an effort to
simplify the code and make libSupport free of global dynamic constructor.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D106206
2021-07-26 00:21:09 +00:00
Roman Lebedev c2dacb1cd3 [SimplifyCFG] Fold branch to common dest: if branch is unpredictable, prefer to speculate
This is consistent with the two other usages of prof md in this pass.
2021-07-26 02:57:19 +03:00
Roman Lebedev 59a5964e03 [SimplifyCFG] Don't speculatively execute BB[s] if they are predictably not taken
Same as D106650, but for `FoldTwoEntryPHINode()`

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D106717
2021-07-26 02:55:15 +03:00
Roman Lebedev e58ce35f7b [SimplifyCFG] Don't speculatively execute BB if it's predictably not taken
If the branch isn't `unpredictable`, and it is predicted to *not* branch
to the block we are considering speculatively executing,
then it seems counter-productive to execute the code that is predicted not to be executed.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D106650
2021-07-26 02:55:14 +03:00
Roman Lebedev 48379f27d0 [NFC][SimplifyCFG] Add more negative tests for profmd-induced speculation avoidance 2021-07-26 02:55:08 +03:00
Fangrui Song e7a7ad134f [ELF] Support quoted symbols in symbol assignments
glibc/elf/tst-absolute-zero-lib.lds uses `"absolute" = 0;`
2021-07-25 16:26:37 -07:00
Nico Weber 75e7d1320c [lld/mac] Make comment style uniform in start-end.s test 2021-07-25 18:37:49 -04:00
Nico Weber 80caa1eb4a [lld/mac] Add support for segment$start$ and segment$end$ symbols
These symbols are somewhat interesting in that they create non-existing
segments, which as far as I know is the only way to create segments
that don't contain any sections.

Final part of PR50760. Like D106629, but for segments instead
of sections. I'm not aware of anything that needs this in practice.

Differential Revision: https://reviews.llvm.org/D106767
2021-07-25 18:25:13 -04:00
Nico Weber afdeb432f0 [lld/mac] Move output segment rename logic into OutputSegment
Fixes the output segment name if both -rename_section and
-rename_segment are used and the post-section-rename segment
name is the same as the pre-segment-rename segment name to
match ld64's behavior.

The motivation is that segment$start$ can create section-less segments,
and this makes a corner case in the interaction between segment$start$ and
-rename_segment in the upcoming segment$start patch.

Differential Revision: https://reviews.llvm.org/D106766
2021-07-25 18:20:09 -04:00
Nico Weber 6bf7d2d9c9 [lld/mac] Reland: Add tests for the interaction between -rename_section and -rename_segment
No behavior change.

Differential Revision: https://reviews.llvm.org/D106765
2021-07-25 18:16:33 -04:00
Jon Chesterfield dd0b463dd9 [libomptarget][amdgpu] More robust handling of failure to init HSA
If hsa_init fails, subsequent calls into hsa are not safe, except for
hsa_init itself, but we don't retry on failure.

This patch:
- deletes a print that called into hsa to ask why it can't call into hsa
- drops a merge conflict block next to that print
- reliably initializes number of devices to zero
- skips the plugin destructor contents if the constructor failed to init hsa

Tested by making hsa_init return an error, and by forcing use of the dynamic
library, which was then deleted from disk. Before this patch, both cases segfaulted.
After it, both give a friendly message about offloading being unavailable.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106774
2021-07-25 23:15:58 +01:00
Nico Weber 14bb6e4d70 Revert "[lld/mac] Add tests for the interaction between -rename_section and -rename_segment"
This reverts commit a6eb34624d.
The test fails, I screwed something up.
2021-07-25 18:11:36 -04:00
Nico Weber a6eb34624d [lld/mac] Add tests for the interaction between -rename_section and -rename_segment
No behavior change.

Differential Revision: https://reviews.llvm.org/D106765
2021-07-25 18:03:25 -04:00
Stefan Gränitz e814b28eeb [docs] Update release notes to mention lli JIT engine switch 2021-07-25 23:58:43 +02:00
Nico Weber b1777b04dc Revert "[VPlan] Add recipe for first-order rec phis, make splicing explicit."
Makes clang crash: https://reviews.llvm.org/D105008#2903350
This reverts commit d2a73fb44e.

Also revert a minor formatting follow-up:
This reverts commit 82834a6732.
2021-07-25 17:39:28 -04:00
Jon Chesterfield e3251f2ec4 Revert "[libomptarget] Build amdgpu plugin without hsa"
Inaccurate error handling around hsa_init

This reverts commit e30b3b23a4.
2021-07-25 21:03:51 +01:00