Commit Graph

410901 Commits

Author SHA1 Message Date
Benoit Jacob 499703e9c0 Enable ReassociatingReshapeOpConversion with "non-identity" layouts.
Enable ReassociatingReshapeOpConversion with "non-identity" layouts.

This removes an early-return in this function, which seems unnecessary and is
preventing some memref.collapse_shape from converting to LLVM (see included lit test).

It seems unnecessary because the return message says "only empty layout map is supported"
but there actually is code in this function to deal with non-empty layout maps. Maybe
it refers to an earlier state of implementation and is just out of date?

Though, there is another concern about this early return: the condition that it actually
checks, `{src,dst}MemrefType.getLayout().isIdentity()`, is not quite the same as what the
return message says, "only empty layout map is supported". Stepping through this
`getLayout().isIdentity()` code in GDB, I found that it evaluates to `.getAffineMap().isIdentity()`
which does (AffineMap.cpp:271):

```
  if (getNumDims() != getNumResults())
    return false;
```

This seems that it would always return false for memrefs of rank greater than 1 ?

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114808
2022-01-13 17:46:20 +00:00
Simon Pilgrim fced2744d3 Fix MSVC "not all control paths return a value" warnings. NFC. 2022-01-13 17:44:10 +00:00
Simon Pilgrim 08212dbc44 [X86] Add xop/avx2 shifts to X86TargetLowering::isBinOp
Allows shuffle combining through per-element shift nodes

This exposed a number of issues with shuffle combining with target intrinsics that are lowered to nodes later during legalization - in particular shuffle combining and SimplifyDemandedVectorElts were being called after canonicalizeShuffleWithBinOps, meaning that shuffles didn't have a chance to be combined away before the shuffle(binop(x,y)) -> binop(shuffle(x),shuffle(y)) fold.
2022-01-13 17:44:10 +00:00
LLVM GN Syncbot e2c78f99c4 [gn build] Port 67151d029b 2022-01-13 17:34:38 +00:00
Stanislav Mekhanoshin fc6af7e188 [AMDGPU] Fix error handling in asm constraint syntax
I believe this is unexploitable because in either case the result
will be 'couldn't allocate register for constraint' error message,
but error code checking is clearly wrong.

Differential Revision: https://reviews.llvm.org/D117189
2022-01-13 09:33:50 -08:00
Arthur O'Dwyer 67151d029b [libc++] [ranges] Implement P2415R2 owning_view.
"What is a view?"
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2415r2.html
https://github.com/cplusplus/draft/pull/5010/files

This was a late-breaking (Oct 2021) change to C++20.
The only thing missing from this patch is that we're supposed
to bump the feature-test macro from
    #define __cpp_lib_ranges 202106L
to
    #define __cpp_lib_ranges 202110L
but we can't do that because we don't implement all of 202106 Ranges yet.

Differential Revision: https://reviews.llvm.org/D116894
2022-01-13 12:29:41 -05:00
Juergen Ributzka 3025c3eded Replace PlatformKind with PlatformType.
The PlatformKind/PlatformType enums contain the same information, which requires
them to be kept in-sync. This commit changes over to PlatformType as the sole
source of truth, which allows the removal of the redundant PlatformKind.

The majority of the changes were in LLD and TextAPI.

Reviewed By: cishida

Differential Revision: https://reviews.llvm.org/D117163
2022-01-13 09:23:49 -08:00
Louis Dionne c7aa8b2962 [libc++][NFC] Remove duplication of distance_apriori_sentinel 2022-01-13 12:23:33 -05:00
Craig Topper 1e04923d21 [MachineValueType] Don't allow MVT::getVectorNumElements() to be called for scalable vectors.
Migrate the one caller that failed lit tests to use
MVT::getVectorMinNumElements directly.
2022-01-13 09:16:25 -08:00
Simon Pilgrim 55029f017d [X86] canonicalizeShuffleWithBinOps - add X86ISD::PSHUFHW/PSHUFLW handling 2022-01-13 17:08:59 +00:00
Matt Arsenault 59994c25f9 AMDGPU: Select workitem ID intrinsics to 0 with req_work_group_size
Shockingly we weren't doing this already. We should probably have this
be done earlier in the IR too, but it's still helpful to have the
lowering guarantee it so that we can modify the ABI implicit inputs
based on it.
2022-01-13 12:08:18 -05:00
Matt Arsenault a6f49423c1 AMDGPU: Optimize outgoing workitem ID based on reqd_work_group_size
If we know we we aren't using a component from the kernel, we can save
a few bit packing instructions.

We're still enabling the VGPR input to the kernel though.
2022-01-13 12:08:18 -05:00
Andrzej Warzynski c719a8596d [flang] Relax the Bash version check
As per https://github.com/flang-compiler/f18-llvm-project/issues/1344,
the `flang` bash script works fine with 4.4.19 and requiring
4.4.23 is too restrictive. Rather than keep updating the patch level,
this patch removes this particular check (so that it will only check the
major and minor versions instead).

As this is both rather straightforward and urgent, I'm merging this
without a review.
2022-01-13 17:04:02 +00:00
Joseph Huber 4746e38f67 [Libomptarget] Fix multiply defined symbol during linking
This patch adds the `weak` identifier to the openmp device environment
variable. The changes introduced in https://reviews.llvm.org/D117211
result in multiply defined symbols. Because the symbol is potentially
included multiple times for each offloading file we will get symbol
colisions, and because it needs to have external visiblity it should be
weak.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D117231
2022-01-13 11:57:33 -05:00
Aaron Ballman bf7d9970ba Support the *_WIDTH macros in limits.h and stdint.h
This completes the implementation of
WG14 N2412 (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2412.pdf),
which standardizes C on a twos complement representation for integer
types. The only work that remained there was to define the correct
macros in the standard headers, which this patch does.
2022-01-13 11:46:34 -05:00
Florian Hahn 7b9f5cbfa7
[LV] Extend check lines for pr34681.ll to cover foldable select. 2022-01-13 16:42:47 +00:00
Elizabeth Andrews 4eaf5846d0 [clang] Fix function pointer address space
Functions pointers should be created with program address space. This
patch introduces program address space in TargetInfo. Targets with
non-default (default is 0) address space for functions should explicitly
set this value. This patch fixes a crash on lvalue reference to function
pointer (in device code) when using oneAPI DPC++ compiler.

Differential Revision: https://reviews.llvm.org/D111566
2022-01-13 08:06:19 -08:00
John Brawn 1fa4778d03 [CMake] Output the error message when get_errc_messages fails
This makes it easier figure out the cause when it fails, and is what
check_z3_version does (the other place we use try_run).
2022-01-13 16:05:15 +00:00
David Sherwood ba471ba8d2 Revert "[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants"
This reverts commit 31009f0b5a.

It seems to be causing SVE VLA buildbot failures and has introduced a
genuine regression. Reverting for now.
2022-01-13 15:59:43 +00:00
Sanjay Patel fe17ce0fa6 [PowerPC] add RUN lines for both endians to test; NFC
The load narrowing transform works for both targets,
so we might as well test both with simple examples
like this.
2022-01-13 10:49:23 -05:00
Jan Svoboda ccd7e7830f Revert "[clang][lex] Keep references to `DirectoryLookup` objects up-to-date"
This reverts commit 8503c688. This patch causes some issues with `#include_next`: https://github.com/llvm/llvm-project/issues/53161
2022-01-13 16:29:44 +01:00
Simon Pilgrim 01494c6a73 [X86] Add tests showing failure to merge shuffles through avx2 shift binops 2022-01-13 15:25:13 +00:00
Simon Pilgrim ec5b63ba96 [X86] Add tests showing failure to merge shuffles through xop shift binops 2022-01-13 15:14:20 +00:00
Arthur O'Dwyer 42185ad870 [libc++] Add tests verifying alphabetical order for several things.
These things are header #includes, CMakeLists.txt, and module.modulemap.

Differential Revision: https://reviews.llvm.org/D116958
2022-01-13 09:58:56 -05:00
Erich Keane b699e8b11a Add another assert to cpu-dispatch emission to help track down a tough
to repro error.

As mentioned yesterday, I've got a problem that I can only reproduce on
Godbolt (none of the build configs on my local machine!), so this is at
least somewhat usable until I figure out a cause.
2022-01-13 06:54:08 -08:00
Simon Pilgrim 0c6c588a9b [DAG] Add ISD::ROTL/ROTR to TargetLoweringBase::isBinOp
Allows shuffle combining through rotation nodes
2022-01-13 14:32:14 +00:00
Simon Pilgrim 08edc8a74b [X86] Add tests showing failure to merge shuffles through rotation binops 2022-01-13 14:32:14 +00:00
Denys Shabalin a8a2ee6331 [mlir] Introduce C API for PDL dialect types
This change introduces C API helper functions to work with PDL types.
Modification closely follow the format of the https://reviews.llvm.org/D116546.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D117221
2022-01-13 15:29:01 +01:00
Denys Shabalin edcac733dc [mlir] Fix reference to out of date CMake function
Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D117222
2022-01-13 15:26:36 +01:00
David Spickett cf7bfd6d05 [lldb][AArch64] Remove armv8.3-a flag from tagged memory read test
This was left over from when I had used some pointer authentication
instructions to sign the pointer. Then I realised that simply setting
the top byte is enough to prove the ABI plugin is being called.

Top byte ignore is a feature of the armv8-a architecure and doesn't
need any extra compiler flags.
2022-01-13 14:25:18 +00:00
Eugene Zhulenev 764e52f0d4 [DebugInfo][InstrRef] Short-circuit unnecessary preferred location map construction
Reviewed By: cota

Differential Revision: https://reviews.llvm.org/D117162
2022-01-13 06:24:52 -08:00
Sam McCall fc7a9f36a9 [clangd] Ignore cvr-qualifiers in selection.
The AST doesn't track their locations, and the default behavior of attributing
them to the lexically-enclosing node is sloppy and often inaccurate.

Also add a couple of passing test cases for declarators that weren't obvious.

Differential Revision: https://reviews.llvm.org/D117185
2022-01-13 15:08:50 +01:00
Jon Chesterfield 4395608939 [openmp] Mark used variables as retain as well
D97446 changed the behaviour of 'used'. Compensate.

Reviewed By: ronlieb

Differential Revision: https://reviews.llvm.org/D117211
2022-01-13 13:57:32 +00:00
Nikita Popov aba7c3c033 [ConstantFold] Check uniform value in ConstantFoldLoadFromConst()
This case is automatically handled if ConstantFoldLoadFromConstPtr()
is used. Make sure that ConstantFoldLoadFromConst() also handles it.
2022-01-13 14:40:19 +01:00
Petar Avramovic 235886e174 AMDGPU/GlobalISel: Fix custom legalizatation for fceil 2022-01-13 14:29:30 +01:00
Petar Avramovic 1919d2c931 AMDGPU/GlobalISel: Regenerate fceil test (NFC) 2022-01-13 14:29:29 +01:00
Sander de Smalen b92102a6d7 [AArch64] Add native CPU detection for Neoverse-V1.
Map Main ID part number 0xd40 to neoverse-v1, as described in the
Neoverse-V1 Technical Reference Manual:

https://developer.arm.com/documentation/101427/0101/Register-descriptions/AArch64-system-registers/MIDR-EL1--Main-ID-Register--EL1

Differential Revision: https://reviews.llvm.org/D117207
2022-01-13 12:58:54 +00:00
Sam McCall 2b2dbe6126 [clangd] Selection: Prune gtest TEST()s earlier
When searching for AST nodes that may overlap the selection, mayHit() was only
attempting to prune nodes whose begin/end are both in the main file.

While failing to prune never gives wrong results, it hurts performance.
In GTest unit-tests, `TEST()` macros at the top level declare classes.
These were never pruned and we traversed *every* such class for any selection.

We fix this by reasoning about what tokens such a node might claim.
They must lie within its ultimate macro expansion range, so if this doesn't
overlap with the selection, we can prune the node.

Differential Revision: https://reviews.llvm.org/D116978
2022-01-13 13:58:42 +01:00
Kirill Bobyrev a9bf32763d [clangd] Fix build after D115243
The api for loadIndex changed but was not updated everywhere due to
differences in the build configuration.
2022-01-13 13:44:19 +01:00
Simon Pilgrim 57a551a8df [X86][AVX] lowerShuffleAsLanePermuteAndShuffle - don't split element rotate patterns
Partial element rotate patterns (e.g. for element insertion on Issue #53124) were being split if every lane wasn't crossing, but really there's a good repeated mask hiding in there.
2022-01-13 11:59:08 +00:00
Evgeny Mandrikov 971bd6f834 Fix build failure with MSVC in C++20 mode
Without this patch when using CMAKE_CXX_STANDARD=20
and MSVC 19.30.30705.0 compilation fails with

clang\lib\Tooling\Syntax\Tree.cpp(347): error C2666: 'clang::syntax::Tree::ChildIteratorBase<clang::syntax::Tree::ChildIterator,clang::syntax::Node>::operator ==': 4 overloads have similar conversions
clang\lib\Tooling\Syntax\Tree.cpp(392): error C2666: 'clang::syntax::Tree::ChildIteratorBase<clang::syntax::Tree::ChildIterator,clang::syntax::Node>::operator ==': 4 overloads have similar conversions

Note that removed comment that
"iterator_facade_base requires == to be a member"
was made obsolete by change https://reviews.llvm.org/D78938

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D116904
2022-01-13 12:55:16 +01:00
David Green 61888d97f6 [AArch64] Basic demand elements for some intrinsics
A lot of neon intrinsics work lane-wise, meaning that non-demanded
elements in and not demanded out. This teaches that to
AArch64TTIImpl::simplifyDemandedVectorEltsIntrinsic for some simple
single-input truncate intrinsics, which can help remove unnecessary
instructions.

Differential Revision: https://reviews.llvm.org/D117097
2022-01-13 11:53:12 +00:00
Simon Pilgrim 4f19bb6f28 [X86][AVX] Add v8f32/v8i32 01289abc test case
Blend Insertion + Element Rotation pattern similar to Issue #53124
2022-01-13 11:37:49 +00:00
Javier Setoain 7c56458616 [mlir] Fix scalable type translation in splat element attr
LLVM Dialect Constant Op translations assume that if the attribute is a
vector, it's a fixed length one, generating an invalid translation for
constant scalable vector initializations.

Differential Revision: https://reviews.llvm.org/D117125
2022-01-13 11:14:41 +00:00
Alex Bradbury 36a5491832 [llvm-objdump][test] Add RISC-V objdump test case
This test case captures the current state of support for printing branch
targets.

Differential Revision: https://reviews.llvm.org/D116676
2022-01-13 11:13:51 +00:00
Florian Hahn 3f2fb767e3
[VPlan] Make IV operand explicit for VPWidenCanonicalIVRecipe (NFC).
This makes the def-use relationship between VPCanonicalIVPHIRecipe and
VPWidenCanonicalIVRecipe explicit. Needed for D117140.
2022-01-13 11:13:05 +00:00
Simon Pilgrim a5507d2e25 Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFC. 2022-01-13 11:10:50 +00:00
Simon Pilgrim 4f414af6a7 Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFC. 2022-01-13 11:10:50 +00:00
Simon Pilgrim 37ebec68a8 [MIPS] Mips16DAGToDAGISel::selectAddr - Use cast<> instead of dyn_cast<> to avoid dereference of nullptr
The pointer is always dereferenced immediately below, so assert the cast is correct instead of returning nullptr
2022-01-13 11:10:49 +00:00
Hans Wennborg 2bc57d85eb Don't override __attribute__((no_stack_protector)) by inlining (PR52886)
Since 26c6a3e736, LLVM's inliner will "upgrade" the caller's stack protector
attribute based on the callee. This lead to surprising results with Clang's
no_stack_protector attribute added in 4fbf84c173 (D46300). Consider the
following code compiled with clang -fstack-protector-strong -Os
(https://godbolt.org/z/7s3rW7a1q).

  extern void h(int* p);

  inline __attribute__((always_inline)) int g() {
    return 0;
  }

  int __attribute__((__no_stack_protector__)) f() {
    int a[1];
    h(a);
    return g();
  }

LLVM will inline g() into f(), and f() would get a stack protector, against the
users explicit wishes, potentially breaking the program e.g. if h() changes the
value of the stack cookie. That's a miscompile.

More recently, bc044a88ee (D91816) addressed this problem by preventing
inlining when the stack protector is disabled in the caller and enabled in the
callee or vice versa. However, the problem remained if the callee is marked
always_inline as in the example above. This affected users, see e.g.
http://crbug.com/1274129 and http://llvm.org/pr52886.

One way to fix this would be to prevent inlining also in the always_inline
case. Despite the name, always_inline does not guarantee inlining, so this
would be legal but potentially surprising to users.

However, I think the better fix is to not enable the stack protector in a
caller based on the callee. The motivation for the old behaviour is unclear, it
seems counter-intuitive, and causes real problems as we've seen.

This commit implements that fix, which means in the example above, g() gets
inlined into f() (also without always_inline), and f() is emitted without stack
protector. I think that matches most developers' expectations, and that's also
what GCC does.

Another effect of this change is that a no_stack_protector function can now be
inlined into a stack protected function, e.g. (https://godbolt.org/z/hafP6W856):

  extern void h(int* p);

  inline int __attribute__((__no_stack_protector__)) __attribute__((always_inline)) g() {
    return 0;
  }

  int f() {
    int a[1];
    h(a);
    return g();
  }

I think that's fine. Such code would be unusual since no_stack_protector is
normally applied to a program entry point which sets up the stack canary. And
even if such code exists, inlining doesn't change the semantics: there is still
no stack cookie setup/check around entry/exit of the g() code region, but there
may be in the surrounding context, as there was before inlining. This also
matches GCC.

See also the discussion at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94722

Differential revision: https://reviews.llvm.org/D116589
2022-01-13 12:04:49 +01:00