Summary:
This patch fix the following issues with visitExtractElementInst:
1. Restrict VectorUtils::findScalarElement to fixed-length vector.
For scalable type, the number of elements in shuffle mask is
unknown at compile-time.
2. Fix out-of-range calculation for fixed-length vector.
3. Skip scalable type when analysis rely on fixed number of elements.
4. Add unit tests to check functionality of extractelement for scalable type.
Reviewers: sdesmalen, efriedma, spatel, nikic
Reviewed By: efriedma
Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78267
See PR45812 for motivation.
No explicit test since I couldn't figure out how to get the
current disk drive in lower case into a form in lit where I could
mkdir it and cd to it. But the change does have test coverage in
that I can remove the case normalization in lit, and tests failed
on several bots (and for me locally if in a pwd with a lower-case
drive) without that normalization prior to this change.
Differential Revision: https://reviews.llvm.org/D79531
I discovered that the limit on possible builtins managed by this
ObjCOrBuiltin variable is too low when combining large targets, since
aux-targets are appended to the targets list. A runtime assert exists
for this, however this patch creates a static-assert as well.
The logic for said static-assert is to make sure we have the room for
the aux-target and target to both be the largest list, which makes sure
we have room for all possible combinations.
I also incremented the number of bits by 1, since I discovered this
currently broken. The current bit-count was 36, so this doesn't
increase any size.
Summary:
This patch fixes the following issues in visitInsertElementInst:
1. Bail out for scalable type when analysis requires fixed size number of vector elements.
2. Use cast<FixedVectorType> to get vector number of elements. This ensure assertion
on scalable vector type.
3. For scalable type, avoid folding a chain of insertelement into splat:
insertelt(insertelt(insertelt(insertelt X, %k, 0), %k, 1), %k, 2) ...
->
shufflevector(insertelt(X, %k, 0), undef, zero)
The length of scalable vector is unknown at compile-time, therefore we don't know if
given insertelement sequence is valid for splat.
Reviewers: sdesmalen, efriedma, spatel, nikic
Reviewed By: sdesmalen, efriedma
Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78895
This is a wrapper around vector of NamedAttributes that keeps track of whether sorted and does some minimal effort to remain sorted (doing more, e.g., appending attributes in sorted order, could be done in follow up). It contains whether sorted and if a DictionaryAttr is queried, it caches the returned DictionaryAttr along with whether sorted.
Change MutableDictionaryAttr to always return a non-null Attribute even when empty (reserve null cases for errors). To this end change the getter to take a context as input so that the empty DictionaryAttr could be queried. Also create one instance of the empty dictionary attribute that could be reused without needing to lock context etc.
Update infer type op interface to use DictionaryAttr and use NamedAttrList to avoid incurring multiple conversion costs.
Fix bug in sorting helper function.
Differential Revision: https://reviews.llvm.org/D79463
These were testing byval private kernel arguments, which doesn't make
any sense and has never been used. There didn't seem to be any tests
for real value struct arguments, which are.
The original patch (rG86dfbc676ebe) exposed an existing bug:
we could wrongly cast a constant expression to BinaryOperator
because the pattern matching allows that. This adds a check
for that case, and there's a reduced test case to verify no
crashing.
Original commit message:
This builds on the or-reduction bailout that was added with D67841.
We still do not have IR-level load combining, although that could
be a target-specific enhancement for -vector-combiner.
The heuristic is narrowly defined to catch the motivating case from
PR39538:
https://bugs.llvm.org/show_bug.cgi?id=39538
...while preserving existing functionality.
That is, there's an unmodified test of pure load/zext/store that is
not seen in this patch at llvm/test/Transforms/SLPVectorizer/X86/cast.ll.
That's the reason for the logic difference to require the 'or'
instructions. The chances that vectorization would actually help a
memory-bound sequence like that seem small, but it looks nicer with:
vpmovzxwd (%rsi), %xmm0
vmovdqu %xmm0, (%rdi)
rather than:
movzwl (%rsi), %eax
movl %eax, (%rdi)
...
In the motivating test, we avoid creating a vector mess that is
unrecoverable in the backend, and SDAG forms the expected bswap
instructions after load combining:
movzbl (%rdi), %eax
vmovd %eax, %xmm0
movzbl 1(%rdi), %eax
vmovd %eax, %xmm1
movzbl 2(%rdi), %eax
vpinsrb $4, 4(%rdi), %xmm0, %xmm0
vpinsrb $8, 8(%rdi), %xmm0, %xmm0
vpinsrb $12, 12(%rdi), %xmm0, %xmm0
vmovd %eax, %xmm2
movzbl 3(%rdi), %eax
vpinsrb $1, 5(%rdi), %xmm1, %xmm1
vpinsrb $2, 9(%rdi), %xmm1, %xmm1
vpinsrb $3, 13(%rdi), %xmm1, %xmm1
vpslld $24, %xmm0, %xmm0
vpmovzxbd %xmm1, %xmm1 # xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero
vpslld $16, %xmm1, %xmm1
vpor %xmm0, %xmm1, %xmm0
vpinsrb $1, 6(%rdi), %xmm2, %xmm1
vmovd %eax, %xmm2
vpinsrb $2, 10(%rdi), %xmm1, %xmm1
vpinsrb $3, 14(%rdi), %xmm1, %xmm1
vpinsrb $1, 7(%rdi), %xmm2, %xmm2
vpinsrb $2, 11(%rdi), %xmm2, %xmm2
vpmovzxbd %xmm1, %xmm1 # xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero
vpinsrb $3, 15(%rdi), %xmm2, %xmm2
vpslld $8, %xmm1, %xmm1
vpmovzxbd %xmm2, %xmm2 # xmm2 = xmm2[0],zero,zero,zero,xmm2[1],zero,zero,zero,xmm2[2],zero,zero,zero,xmm2[3],zero,zero,zero
vpor %xmm2, %xmm1, %xmm1
vpor %xmm1, %xmm0, %xmm0
vmovdqu %xmm0, (%rsi)
movl (%rdi), %eax
movl 4(%rdi), %ecx
movl 8(%rdi), %edx
movbel %eax, (%rsi)
movbel %ecx, 4(%rsi)
movl 12(%rdi), %ecx
movbel %edx, 8(%rsi)
movbel %ecx, 12(%rsi)
Differential Revision: https://reviews.llvm.org/D78997
Summary:
This helps detect some missed BFI updates during CodeGenPrepare.
This is debug build only and disabled behind a flag.
Fix a missed update in CodeGenPrepare::dupRetToEnableTailCallOpts().
Reviewers: davidxl
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77417
Summary:
These warnings were present when building llvm-libc in release mode.
```
workspace/llvm-project/libc/utils/benchmarks/LibcMemoryBenchmarkTest.cpp:50:34: warning: 'None' is deprecated: Use Align() or Align(1) instead [-Wdeprecated-declarations]
Conf.AddressAlignment = Align::None();
workspace/llvm-project/libc/utils/testutils/FDReaderUnix.cpp:19:7: warning: unused variable 'err' [-Wunused-variable]
int err = ::pipe(pipefd);
```
For test-utils it seems in general we should use `report_fatal_error` instead of asserts as these are turned off when building in release mode.
https://llvm.org/docs/CodingStandards.html#assert-liberally
Reviewers: abrachet, sivachandra
Reviewed By: abrachet, sivachandra
Subscribers: tschuett, libc-commits
Tags: #libc-project
Differential Revision: https://reviews.llvm.org/D79469
Summary:
Python2 has been removed from cygwin, this means anyone running the dump_format_style.py in a cygwin shell could pick up python3 instead
In Python3 all strings are unicode as the file is opened in binary mode we need to encode the contents string or we'll face the following error
```
Traceback (most recent call last):
File "./dump_format_style.py", line 228, in <module>
output.write(contents)
TypeError: a bytes-like object is required, not 'str'
```
Reviewed By: krasimir
Subscribers: cfe-commits
Tags: #clang, #clang-format
Differential Revision: https://reviews.llvm.org/D79326
Summary:
I thought it would be simpler to understand on how the unique names would
look like when an example is present. So I added some examples.
I found 2 such places to add examples
-Common blocks
-Module scope global data
Reviewers: schweitz, kiranchandramohan, sscalpone, jeanPerier, jdoerfert, DavidTruby
Reviewed By: schweitz, jeanPerier
Subscribers: llvm-commits, flang-commits
Tags: #flang, #llvm
Differential Revision: https://reviews.llvm.org/D79442
For empty directories (except the first one) we've been adding a file
with the same name as the directory to the result VFS mapping.
Differential Revision: https://reviews.llvm.org/D79551
vulkan-runtime-wrappers does not need MLIRSPIRVSerialization,
which is used by the ConvertGpuLaunchFuncToVulkanLaunchFunc pass
under the hood.
Differential Revision: https://reviews.llvm.org/D79577
The ostream operator<< is currently broken for std::complex with
specified field widths.
This patch a partial revert of c3478eff7a (reviewed as D71214),
restoring the correct behavior.
Differential Revision: https://reviews.llvm.org/D78816
Summary:
Avoid relocating DW_AT_call_pc, e.g. when a call PC is equal to the
function's low_pc as is the case in the test:
```
__Z5func1v:
0000000100007f94 b __Z5func2v
```
rdar://62952440
Reviewers: friss, JDevlieghere
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79536
clock_gettime is documented to be available when _POSIX_TIMERS is
defined. Add a check for this.
Differential Revision: https://reviews.llvm.org/D79305
All entry points into ProcessGDBRemote that connect to the debug server
should connect to the replay server instead during reproducer replay.
This patch adds the necessary logic for ConnectRemote, which is
accessible from the SB API. This fixes active replay for
TestRecognizeBreakpoint.py as described in D78588.
When peeling out the epilogue we need to ignore illegal phis coming from stages
greater than the producer stage. Otherwise we end up with circular phi
dependencies.
Differential Revision: https://reviews.llvm.org/D79581
These tests fail due to a couple of changes to `move_iterator` for C++20:
* `move_iterator<I>::operator++(int)` returns `void` in C++20 if `I` doesn't model `forward_iterator`.
* `move_iterator<I>::reference` is calculated in C++20, so `I` must actually have an `operator*() const`.
Differential Revision: https://reviews.llvm.org/D79343
Summary:
1. Created a new common completion for the registers of the current context;
2. Apply this new common completion to the commands register read/write;
3. Unit test.
Reviewers: teemperor, JDevlieghere
Reviewed By: teemperor
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D79490
This patch adds deduction guides to <memory> to allow deducing
construction of shared_ptrs from unique_ptrs, and from weak_ptrs
and vice versa, as specified by C++17.
Differential Revision: https://reviews.llvm.org/D69603
This patch adds `_VSTD::` to some calls to `make_pair` inside the
implementations of searchers, to prevent things exploding if there is
a make_pair in an associated namespace of a user-defined type.
https://godbolt.org/z/xAFG98
Differential Revision: https://reviews.llvm.org/D72640
We currently rely on the TypeSystem implementation to initialize this value
with 0 in the GetChildCompilerTypeAtIndex call below. Let's just initialize
this variable like the rest.
Summary:
Remove incorrect usage of getNumElements() from visitCallInst(). The
number of elements was being used to construct a DemandedElts bitfield.
This operation does not make sense for scalable vectors. Cast to
FixedVectorType
Identified by test case Clang :: CodeGen/aarch64-sve-intrinsics/acle_sve_mla.c
Reviewers: rengolin, efriedma, sdesmalen, c-rhodes, david-arm
Reviewed By: david-arm
Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79524
Summary:
Properly forward TrackOrigins and Recover user options to the MSan pass under the new pass manager.
This makes the number of check-msan failures when ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER is TRUE go from 52 to 2.
Based on https://reviews.llvm.org/D77249.
Reviewers: nemanjai, vitalybuka, leonardchan
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79445
This patch adds various builtins under their corresponding feature macros:
Defined under __ARM_FEATURE_SVE2_AES:
- svaesd
- svaese
- svaesimc
- svaesmc
- svpmullb_pair
- svpmullt_pair
Defined under __ARM_FEATURE_SVE2_SHA3:
- svrax1
Defined under __ARM_FEATURE_SVE2_SM4:
- svsm4e
- svsm4ekey
Defined under __ARM_FEATURE_SVE2_BITPERM:
- svbdep
- svbext
- svbgrp