forked from OSchip/llvm-project
c32af4447f
Summary: These are in some sense the inverse of vmovl[bt]q: they take a vector of n wide elements and truncate each to half its width. So they only write half a vector's worth of output data, and therefore they also take an 'inactive' parameter to provide the other half of the data in the output vector. So vmovnb overwrites the even lanes of 'inactive' with the narrowed values from the main input, and vmovnt overwrites the odd lanes.

LLVM had existing codegen which generates these MVE instructions in response to IR that takes two vectors of wide elements, or two vectors of narrow ones. But in this case, we have one vector of each. So my clang codegen strategy is to narrow the input vector of wide elements by simply reinterpreting it as the output type, and then we have two narrow vectors and can represent the operation as a vector shuffle that interleaves lanes from both of them.

Even so, not all the cases I needed ended up being selected as a single MVE instruction, so I've added a couple more patterns that spot combinations of the 'MVEvmovn' and 'ARMvrev32' SDNodes which can be generated as a VMOVN instruction with operands swapped.

This commit adds the unpredicated forms only.

Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74337
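The lane behaviour described in the summary can be illustrated with a scalar C model. This is only a sketch, not the clang or LLVM implementation: it assumes 16-bit wide lanes narrowed to 8 bits, a little-endian lane layout for the "reinterpret" step, and every function name here (`model_vmovnb`, `shuffle_vmovnb`, and so on) is hypothetical. It shows that the direct "truncate and overwrite even/odd lanes" view and the "reinterpret, then interleave two narrow vectors" view give the same result.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WIDE_LANES 8                    /* 8 x 16-bit lanes in a 128-bit vector */
#define NARROW_LANES (2 * WIDE_LANES)   /* 16 x 8-bit lanes in the result */

/* Direct view of vmovnb: truncated values go to the even lanes,
 * odd lanes are kept from 'inactive'. */
static void model_vmovnb(uint8_t out[NARROW_LANES],
                         const uint8_t inactive[NARROW_LANES],
                         const uint16_t wide[WIDE_LANES]) {
    for (int i = 0; i < WIDE_LANES; i++) {
        out[2 * i] = (uint8_t)wide[i];
        out[2 * i + 1] = inactive[2 * i + 1];
    }
}

/* Direct view of vmovnt: truncated values go to the odd lanes,
 * even lanes are kept from 'inactive'. */
static void model_vmovnt(uint8_t out[NARROW_LANES],
                         const uint8_t inactive[NARROW_LANES],
                         const uint16_t wide[WIDE_LANES]) {
    for (int i = 0; i < WIDE_LANES; i++) {
        out[2 * i] = inactive[2 * i];
        out[2 * i + 1] = (uint8_t)wide[i];
    }
}

/* Model of reinterpreting the wide vector as narrow lanes, assuming a
 * little-endian layout: the low byte of wide lane i lands in narrow lane 2*i. */
static void reinterpret_wide(uint8_t narrow[NARROW_LANES],
                             const uint16_t wide[WIDE_LANES]) {
    for (int i = 0; i < WIDE_LANES; i++) {
        narrow[2 * i] = (uint8_t)(wide[i] & 0xFF);
        narrow[2 * i + 1] = (uint8_t)(wide[i] >> 8);
    }
}

/* Shuffle view of vmovnb: interleave the even lanes of the reinterpreted
 * vector with the odd lanes of 'inactive'. */
static void shuffle_vmovnb(uint8_t out[NARROW_LANES],
                           const uint8_t inactive[NARROW_LANES],
                           const uint16_t wide[WIDE_LANES]) {
    uint8_t narrow[NARROW_LANES];
    reinterpret_wide(narrow, wide);
    for (int i = 0; i < WIDE_LANES; i++) {
        out[2 * i] = narrow[2 * i];
        out[2 * i + 1] = inactive[2 * i + 1];
    }
}

/* Shuffle view of vmovnt: the even lanes of the reinterpreted vector move
 * into the odd result lanes; even result lanes come from 'inactive'. */
static void shuffle_vmovnt(uint8_t out[NARROW_LANES],
                           const uint8_t inactive[NARROW_LANES],
                           const uint16_t wide[WIDE_LANES]) {
    uint8_t narrow[NARROW_LANES];
    reinterpret_wide(narrow, wide);
    for (int i = 0; i < WIDE_LANES; i++) {
        out[2 * i] = inactive[2 * i];
        out[2 * i + 1] = narrow[2 * i];
    }
}

/* Returns 1 if the direct and shuffle views agree for both variants. */
static int models_agree(const uint8_t inactive[NARROW_LANES],
                        const uint16_t wide[WIDE_LANES]) {
    uint8_t a[NARROW_LANES], b[NARROW_LANES];
    model_vmovnb(a, inactive, wide);
    shuffle_vmovnb(b, inactive, wide);
    if (memcmp(a, b, sizeof a) != 0) return 0;
    model_vmovnt(a, inactive, wide);
    shuffle_vmovnt(b, inactive, wide);
    return memcmp(a, b, sizeof a) == 0;
}
```

Calling `models_agree` with any pair of inputs should return 1 under the assumptions above; on a big-endian lane layout the reinterpret step would need a different model.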
||
---|---|---|
ABITest | ||
CIndex | ||
ClangVisualizers | ||
TableGen | ||
TestUtils | ||
VtableTest | ||
analyzer | ||
check_cfc | ||
hmaptool | ||
perf-training | ||
valgrind | ||
CaptureCmd | ||
ClangDataFormat.py | ||
CmpDriver | ||
FindSpecRefs | ||
FuzzTest | ||
bash-autocomplete.sh | ||
builtin-defines.c | ||
clangdiag.py | ||
convert_arm_neon.py | ||
creduce-clang-crash.py | ||
find-unused-diagnostics.sh | ||
make-ast-dump-check.sh | ||
modfuzz.py | ||
token-delta.py |