393949 Commits

Author SHA1 Message Date
Nikita Popov 1b61d837b9 [Inline] Add test for PR50589 (NFC) 2021-07-18 18:38:06 +02:00
Nikita Popov 59c33a0bc8 [Cloning] Remove unused parameter from CloneAndPruneFunctionInto() (NFC) 2021-07-18 18:38:06 +02:00
Simon Pilgrim 3a1b38049a [X86] Add i32 (shl (sr[la] exact sel(X,Y), C1), C2) test
Shows failure to fold sel(sra(X,C1),sra(Y,C1)) -> sra(sel(X,Y),C1) (and to retain the flags)
2021-07-18 16:48:57 +01:00
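The fold named in this message can be illustrated outside LLVM. The following is a minimal sketch in plain C++, not the DAG combine itself; the function names are hypothetical, and it assumes `>>` on a negative `int32_t` is an arithmetic shift (guaranteed from C++20, and the usual behaviour before that).

```cpp
#include <cassert>
#include <cstdint>

// Unfolded form: shift both inputs, then select one of the results.
std::int32_t select_of_sra(bool cond, std::int32_t x, std::int32_t y, unsigned c1) {
    assert(c1 < 32);
    std::int32_t sx = x >> c1;
    std::int32_t sy = y >> c1;
    return cond ? sx : sy;
}

// Folded form: select first, then shift once.
std::int32_t sra_of_select(bool cond, std::int32_t x, std::int32_t y, unsigned c1) {
    assert(c1 < 32);
    std::int32_t picked = cond ? x : y;
    return picked >> c1;
}
```

Both functions return the same value for every input, which is why a single shift after the select is enough.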
Kazu Hirata 958437de52 [Analysis] Remove getLoopPackage (NFC)
The last use was removed on Apr 28, 2014 in commit
c5a3139ebd.
2021-07-18 08:16:29 -07:00
Simon Pilgrim fcb710a7ad [NVPTX] Add select(cc,binop(),binop()) fast-math tests
As discussed on D106058 - we're not propagating the common flags to the merged binop
2021-07-18 15:30:24 +01:00
Deep Majumder d825309352 [analyzer] Handle std::make_unique
Differential Revision: https://reviews.llvm.org/D103750
2021-07-18 19:54:28 +05:30
Valentin Churavy a56fe117e0 Revert "[Orc] Add verylazy example for C-bindings"
Broke ASAN buildbot, will reland with fixes

This reverts commit b5a6ad8c89.
2021-07-18 16:21:37 +02:00
Simon Pilgrim 1a6a8443c2 [DAG] Move select(cc, binop(), binop()) folds into DAGCombiner::foldSelectOfBinops. NFCI.
I'm going to extend the functionality started in D106058, so move the folds into their own method to reduce the amount of code in DAGCombiner::visitSELECT.
2021-07-18 14:54:41 +01:00
Shilei Tian 4357cfc792 [OpenMP][Offloading] Add -g when compiling deviceRTLs in debug mode
Currently, when we compile the project in debug mode, `-g` is not added to the
compilation flags. The bc files generated in different modes are of different sizes.
When using GPU debuggers like `cuda-gdb`, a debug version of the bc lib is
expected to provide more info.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D106229
2021-07-18 09:34:54 -04:00
Simon Pilgrim 51a12d2ff0 [X86][SSE] matchShuffleWithPACK - avoid poison pollution from bitcasting multiple elements together.
D106053 exposed that we've not been taking into account that by bitcasting smaller elements together and then performing a ComputeKnownBits on the result we'd be allowing a poison element to influence other neighbouring elements being used in the pack. Instead we now peek through any existing bitcast to ensure that the source type already matches the source width of the pack node we're trying to match.

This has also been a chance to stop matchShuffleWithPACK creating unused nodes on the fly which could affect oneuse tests during shuffle lowering/combining.

The only regression we're seeing is due to being unable to peek through a bitcast as it's on the other side of an extract_subvector - which should go away once we finally allow shuffle combining across different vector widths (by making matchShuffleWithPACK use const SelectionDAG& we've gotten closer to this - see PR45974).
2021-07-18 14:25:28 +01:00
Simon Pilgrim 367ec7755f [Orc] Remove unnecessary <string> include dependency from Orc headers. NFC.
At most these use the StringRef/Twine wrappers and don't have any implicit uses of std::string.

Move the include down to any cpp implementation where std::string is actually used.
2021-07-18 12:31:13 +01:00
Sanjay Patel 0e15de2d0c [InstCombine] fold reassociative FP add into start value of fadd reduction
This pattern is visible in unrolled and vectorized loops.
Although the backend seems to be able to reassociate to
ideal form in the examples I looked at, we might as well
do that in IR for efficiency.
2021-07-18 06:26:20 -04:00
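The idea behind this fold can be sketched in plain C++. This is a scalar model under stated assumptions, not the InstCombine code: `reduce_fadd` is a hypothetical stand-in for a sequential fadd reduction, and the "before" form is assumed to start from zero and add an extra operand afterwards.

```cpp
#include <cassert>
#include <vector>

// Sequential model of an fadd reduction: start + v[0] + v[1] + ...
double reduce_fadd(double start, const std::vector<double>& v) {
    double acc = start;
    for (double x : v) acc += x;
    return acc;
}

// Before: reduce from a zero start value, then add the extra operand.
double add_after_reduce(const std::vector<double>& v, double y) {
    return reduce_fadd(0.0, v) + y;
}

// After: the extra operand becomes the start value of the reduction.
double add_as_start(const std::vector<double>& v, double y) {
    return reduce_fadd(y, v);
}

// Self-check on exactly representable values, where both orders agree exactly.
void check_fold() {
    std::vector<double> v = {1.0, 2.0, 3.0, 4.0};
    assert(add_after_reduce(v, 10.0) == add_as_start(v, 10.0));
}
```

With arbitrary inputs the two forms can differ by rounding, because the additions happen in a different order. That is why the fold is only valid when reassociation is permitted.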
Sanjay Patel 0590502265 [InstCombine][test] add tests for fadd reductions; NFC 2021-07-18 06:26:20 -04:00
Valentin Churavy b5a6ad8c89 [Orc] Add verylazy example for C-bindings
Still WIP, based on the Kaleidoscope/BuildingAJIT/Chapter4.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D104799
2021-07-18 12:07:16 +02:00
Valentin Churavy 0c164ea9e6 [MLIR][CAPI] On MINGW don't link against libMLIR
Cross-compiling MLIR with MINGW failed because adding libMLIR to the libraries to link against would lead to duplicated symbols.

```
[09:28:14] ninja: job failed: : && /opt/bin/i686-w64-mingw32-libgfortran4-cxx03/i686-w64-mingw32-g++ --sysroot=/opt/i686-w64-mingw32/i686-w64-mingw32/sys-root/  -remap -D__USING_SJLJ_EXCEPTIONS__ -D__CRT__NO_INLINE -fno-gnu-unique -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wno-comment  -O2 -DNDEBUG   -shared -o bin/libMLIRPublicAPI.dll -Wl,--out-implib,lib/libMLIRPublicAPI.dll.a -Wl,--major-image-version,0,--minor-image-version,0 tools/mlir/lib/CAPI/IR/CMakeFiles/obj.MLIRCAPIIR.dir/AffineExpr.cpp.obj tools/mlir/lib/CAPI/IR/CMakeFiles/obj.MLIRCAPIIR.dir/AffineMap.cpp.obj tools/mlir/lib/CAPI/IR/CMakeFiles/obj.MLIRCAPIIR.dir/BuiltinAttributes.cpp.obj tools/mlir/lib/CAPI/IR/CMakeFiles/obj.MLIRCAPIIR.dir/BuiltinTypes.cpp.obj tools/mlir/lib/CAPI/IR/CMakeFiles/obj.MLIRCAPIIR.dir/Diagnostics.cpp.obj tools/mlir/lib/CAPI/IR/CMakeFiles/obj.MLIRCAPIIR.dir/IntegerSet.cpp.obj tools/mlir/lib/CAPI/IR/CMakeFiles/obj.MLIRCAPIIR.dir/IR.cpp.obj tools/mlir/lib/CAPI/IR/CMakeFiles/obj.MLIRCAPIIR.dir/Pass.cpp.obj tools/mlir/lib/CAPI/IR/CMakeFiles/obj.MLIRCAPIIR.dir/Support.cpp.obj tools/mlir/lib/CAPI/Registration/CMakeFiles/obj.MLIRCAPIRegistration.dir/Registration.cpp.obj tools/mlir/lib/CAPI/Dialect/CMakeFiles/obj.MLIRCAPILinalg.dir/Linalg.cpp.obj tools/mlir/lib/CAPI/Dialect/CMakeFiles/obj.MLIRCAPISCF.dir/SCF.cpp.obj tools/mlir/lib/CAPI/Dialect/CMakeFiles/obj.MLIRCAPIShape.dir/Shape.cpp.obj tools/mlir/lib/CAPI/Dialect/CMakeFiles/obj.MLIRCAPIStandard.dir/Standard.cpp.obj tools/mlir/lib/CAPI/Dialect/CMakeFiles/obj.MLIRCAPITensor.dir/Tensor.cpp.obj tools/mlir/lib/CAPI/Transforms/CMakeFiles/obj.MLIRCAPITransforms.dir/Passes.cpp.obj  lib/libMLIR.dll.a  lib/libMLIRIR.a  lib/libMLIRParser.a  lib/libMLIRSupport.a  lib/libMLIRPass.a  lib/libMLIRCAPIIR.a  lib/libMLIRAffine.a  lib/libMLIRAffineEDSC.a  
lib/libMLIRAffineTransforms.a  lib/libMLIRAffineUtils.a  lib/libMLIRArmNeon.a  lib/libMLIRArmSVE.a  lib/libMLIRAsync.a  lib/libMLIRAsyncTransforms.a  lib/libMLIRAVX512.a  lib/libMLIRComplex.a  lib/libMLIRGPU.a  lib/libMLIRLinalgAnalysis.a  lib/libMLIRLinalgEDSC.a  lib/libMLIRLinalg.a  lib/libMLIRLinalgTransforms.a  lib/libMLIRLinalgUtils.a  lib/libMLIRLLVMIRTransforms.a  lib/libMLIRLLVMIR.a  lib/libMLIRLLVMAVX512.a  lib/libMLIRLLVMArmNeon.a  lib/libMLIRLLVMArmSVE.a  lib/libMLIRNVVMIR.a  lib/libMLIRROCDLIR.a  lib/libMLIROpenACC.a  lib/libMLIROpenMP.a  lib/libMLIRPDL.a  lib/libMLIRPDLInterp.a  lib/libMLIRQuant.a  lib/libMLIRSCF.a  lib/libMLIRSCFTransforms.a  lib/libMLIRSDBM.a  lib/libMLIRShape.a  lib/libMLIRShapeOpsTransforms.a  lib/libMLIRSPIRV.a  lib/libMLIRSPIRVModuleCombiner.a  lib/libMLIRSPIRVConversion.a  lib/libMLIRSPIRVTransforms.a  lib/libMLIRSPIRVUtils.a  lib/libMLIRStandard.a  lib/libMLIRStandardOpsTransforms.a  lib/libMLIRTensor.a  lib/libMLIRTensorTransforms.a  lib/libMLIRTosa.a  lib/libMLIRTosaTransforms.a  lib/libMLIRVector.a  lib/libMLIRCAPIIR.a  lib/libMLIRLinalg.a  lib/libMLIRCAPIIR.a  lib/libMLIRSCF.a  lib/libMLIRCAPIIR.a  lib/libMLIRShape.a  lib/libMLIRCAPIIR.a  lib/libMLIRStandard.a  lib/libMLIRCAPIIR.a  lib/libMLIRTensor.a  lib/libMLIRTransforms.a  lib/libMLIRAsync.a  lib/libMLIRAffineUtils.a  lib/libMLIRLinalgAnalysis.a  lib/libMLIRLinalgEDSC.a  lib/libMLIRVectorToSCF.a  lib/libMLIRVectorToLLVM.a  lib/libMLIRArmNeonToLLVM.a  lib/libMLIRArmNeon.a  lib/libMLIRLLVMArmNeon.a  lib/libMLIRAVX512ToLLVM.a  lib/libMLIRAVX512.a  lib/libMLIRLLVMAVX512.a  lib/libMLIRArmSVEToLLVM.a  lib/libMLIRArmSVE.a  lib/libMLIRLLVMArmSVE.a  lib/libMLIRStandardToLLVM.a  lib/libMLIRTargetLLVMIRModuleTranslation.a  lib/libMLIRLLVMIRTransforms.a  lib/libMLIRLLVMIR.a  lib/libMLIROpenMP.a  lib/libMLIRTranslation.a  lib/libMLIRSPIRVConversion.a  lib/libMLIRSPIRV.a  lib/libMLIRParser.a  lib/libMLIRTransforms.a  lib/libMLIRVector.a  lib/libMLIRAffineEDSC.a  lib/libMLIRLinalg.a  
lib/libMLIRCopyOpInterface.a  lib/libMLIRTosa.a  lib/libMLIRQuant.a  lib/libMLIRTransformUtils.a  lib/libMLIRLoopAnalysis.a  lib/libMLIRPresburger.a  lib/libMLIRRewrite.a  lib/libMLIRPDLToPDLInterp.a  lib/libMLIRPass.a  lib/libMLIRAnalysis.a  lib/libMLIRAffine.a  lib/libMLIRSCF.a  lib/libMLIRLoopLikeInterface.a  lib/libMLIRPDLInterp.a  lib/libMLIRPDL.a  lib/libMLIRInferTypeOpInterface.a  lib/libMLIRStandard.a  lib/libMLIRTensor.a  lib/libMLIREDSC.a  lib/libMLIRCastInterfaces.a  lib/libMLIRVectorInterfaces.a  lib/libMLIRSideEffectInterfaces.a  lib/libMLIRDialect.a  lib/libMLIRViewLikeInterface.a  lib/libMLIRCallInterfaces.a  lib/libMLIRControlFlowInterfaces.a  lib/libMLIRIR.a  lib/libMLIRSupport.a  lib/libLLVM.dll.a  -lkernel32 -luser32 -lgdi32 -lwinspool -lshell32 -lole32 -loleaut32 -luuid -lcomdlg32 -ladvapi32 && :
[09:28:14] lib/libMLIRAffine.a(AffineOps.cpp.obj):AffineOps.cpp:(.text+0x1d600): multiple definition of `mlir::AffineDialect::initialize()'
[09:28:14] lib/libMLIR.dll.a(d008729.o):(.text+0x0): first defined here
[09:28:14] lib/libMLIRArmSVE.a(ArmSVEDialect.cpp.obj):ArmSVEDialect.cpp:(.text+0x5be0): multiple definition of `mlir::arm_sve::ArmSVEDialect::initialize()'
[09:28:14] lib/libMLIR.dll.a(d039020.o):(.text+0x0): first defined here
[09:28:14] lib/libMLIRAsync.a(Async.cpp.obj):Async.cpp:(.text+0xc0d0): multiple definition of `mlir::async::AsyncDialect::initialize()'
[09:28:14] lib/libMLIR.dll.a(d023173.o):(.text+0x0): first defined here
...
```

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D106169
2021-07-18 12:06:12 +02:00
Deep Majumder 0cd98bef1b [analyzer] Handle std::swap for std::unique_ptr
This patch handles the `std::swap` function specialization
for `std::unique_ptr`. It is implemented to be very similar to
how the `swap` method is handled.

Differential Revision: https://reviews.llvm.org/D104300
2021-07-18 14:38:55 +05:30
Craig Topper 00c1cc867f [RISCV] Add more i32 srem/sdiv with power of 2 constant tests. NFC
Add a small power of 2 srem test to match the existing sdiv test. Add a
larger power of 2 test to both.

The larger constant test shows materialization of a constant
for an AND in the RV64 code. We should be using W shift instructions
to match the RV32 code.
2021-07-18 00:21:14 -07:00
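For readers unfamiliar with what these tests exercise, here is a minimal sketch of the standard shift-based expansion of signed division and remainder by a power of two. It is written in plain C++ with hypothetical function names, it is not the exact instruction sequence the RISC-V backend emits, and it assumes `>>` on a negative `int32_t` is an arithmetic shift.

```cpp
#include <cassert>
#include <cstdint>

// Signed division by 2^k, rounding toward zero, using only shifts and adds.
std::int32_t sdiv_pow2(std::int32_t x, unsigned k) {
    assert(k >= 1 && k <= 30);
    // All ones if x is negative, zero otherwise.
    std::int32_t sign = x >> 31;
    // Bias is 2^k - 1 for negative x and 0 otherwise, so the final
    // arithmetic shift rounds toward zero instead of toward minus infinity.
    std::uint32_t bias = static_cast<std::uint32_t>(sign) >> (32 - k);
    return (x + static_cast<std::int32_t>(bias)) >> k;
}

// Signed remainder by 2^k, derived from the quotient: x - (x / 2^k) * 2^k.
std::int32_t srem_pow2(std::int32_t x, unsigned k) {
    std::int32_t q = sdiv_pow2(x, k);
    return x - q * (std::int32_t{1} << k);
}
```

The results match the C++ `/` and `%` operators, for example `sdiv_pow2(-7, 1)` is `-3` and `srem_pow2(-7, 1)` is `-1`.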
David Blaikie dac582ad3a DebugInfo: Name class templates with default arguments consistently (both direct naming, and as a template argument for a function template)
It's noteworthy that GCC has the same bug here, which is a bit
surprising. In both Clang and GCC, the bug only affects function template
arguments that are themselves templates with default template arguments
(f1<t1<int[, missing_default_here]>>). Probably because function name
matching isn't generally necessary - whereas type matching is necessary
for DWARF consumers to associate declarations and definitions across
translation units, so the bug's been addressed there already - but
continued to exist for function templates since it's fairly benign
there.

I came across this while working on a change that could reconstitute
these pretty printed names based on the rest of the DWARF, reducing the
size of the DWARF by not having to encode all the template parameters in
the name string. That reconstitution code can't tell the difference
between a defaulted argument or not, so couldn't create the current
buggy-ish output.

Making the names more consistent between direct and indirect references,
and between function and class templates seems all to the good.

(I fixed the function template version of this a few years back in
9fdd09a4cc - clearly I should've looked
more closely and generalized the code better so it only had to be fixed
once - well, doing that here now)
2021-07-17 23:58:15 -07:00
Amara Emerson 4c55cdb00a [GlobalISel] Fix known bits for G_BSWAP and G_BITREVERSE not doing anything.
llvm::KnownBits::byteSwap() and reverse() don't modify in-place, so
we weren't actually computing anything. This was causing a miscompile on an
arm64 stage2 bootstrap clang build.
2021-07-17 23:07:16 -07:00
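The class of bug described here is easy to reproduce in isolation. The sketch below uses a hypothetical `ToyKnownBits` struct, not `llvm::KnownBits`, whose `byteSwap()` returns a new value instead of modifying the object, as the message says the real methods do.

```cpp
#include <cassert>
#include <cstdint>

// Reverse the byte order of a 32-bit value.
std::uint32_t bswap32(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Toy model: a set bit in Zero/One means that bit is known to be 0/1.
struct ToyKnownBits {
    std::uint32_t Zero;
    std::uint32_t One;

    // Returns the swapped value; the object itself is left unchanged.
    ToyKnownBits byteSwap() const {
        return ToyKnownBits{bswap32(Zero), bswap32(One)};
    }
};

// Buggy pattern: the returned value is discarded, so nothing is computed.
ToyKnownBits swap_buggy(ToyKnownBits known) {
    known.byteSwap();
    return known;
}

// Fixed pattern: assign the result back.
ToyKnownBits swap_fixed(ToyKnownBits known) {
    assert((known.Zero & known.One) == 0); // a bit cannot be both 0 and 1
    known = known.byteSwap();
    return known;
}
```

In C++17 and later, marking such value-returning methods `[[nodiscard]]` makes compilers warn about the discarded result.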
David Carlier 657eb94324 [Sanitizers] FutexWake fix typo for FreeBSD code path. 2021-07-18 07:02:21 +01:00
Jon Roelofs 5cd63e9ec2 [AArch64][GlobalISel] Legalize bswap <2 x i16>
Differential revision: https://reviews.llvm.org/D105935
2021-07-17 15:31:15 -07:00
Nikita Popov ffe94738ed [ExecutionEngine] Fix GEP type
Fix bug introduced in 2c68ecccc9:
the GEP type was off-by-ptr. Apparently I didn't run the MLIR
tests.
2021-07-17 23:45:00 +02:00
David Green 5acddf5b09 [ARM] Lower non-extended small gathers via truncated gathers.
Corollary to 1113e06821, this allows us to
match gathers that don't produce a full vector width result. They use an
extended gather which is truncated back to the original type.
2021-07-17 22:38:31 +01:00
Eli Friedman e41e865b15 [AArch64] Prepare for changes to STEP_VECTOR.
Rewrite patterns to assume that the operand of STEP_VECTOR is a
constant. The old patterns will stop working when the operand is changed
from a Constant to a TargetConstant. (See D105673.)

Add test coverage for certain patterns that weren't exercised by
existing regression tests.

Differential Revision: https://reviews.llvm.org/D105847
2021-07-17 14:13:41 -07:00
Nikita Popov f164bc52b6 [IRBuilder] Deprecate CreateGEP() without element type
This API is incompatible with opaque pointers and deprecated in
favor of the version that accepts an explicit element type.

Also remove the separate overload for a single index, as this is
already covered by the ArrayRef overload.
2021-07-17 22:57:51 +02:00
Nikita Popov 2c68ecccc9 [OpaquePtr] Remove uses of CreateGEP() without element type
Remove uses of to-be-deprecated API. In cases where the correct
element type was not immediately obvious to me, fall back to
explicit getPointerElementType().
2021-07-17 22:56:27 +02:00
Nikita Popov f95d26006e [IRBuilder] Deprecate CreateInBoundsGEP() without element type
This API is incompatible with opaque pointers and deprecated in
favor of the version that accepts an explicit element type.
2021-07-17 21:27:16 +02:00
Nikita Popov 6225d0cc6e [OpaquePtr] Remove uses of CreateInBoundsGEP() without element type
Remove uses of to-be-deprecated API.

Unfortunately this one mostly just makes the use of
getPointerElementType() explicit, as the correct type to use
wasn't immediately available (deriving it from QualType is left
as an exercise to the reader).
2021-07-17 21:27:16 +02:00
Craig Topper d0f8047d37 [RISCV] Teach computeKnownBitsForTargetNode that VLENB will never be more than 65536/8. 2021-07-17 11:24:20 -07:00
Vy Nguyen f44fc35149 [libcxx] Updated test and seemingly incorrect comment from it.
Background: https://reviews.llvm.org/D82490#inline-1007741

Differential Revision: https://reviews.llvm.org/D106092
2021-07-17 13:46:28 -04:00
Jez Ng 428a7c1b38 [lld-macho] Have ICF operate on all sections at once
ICF previously operated only within a given OutputSection. We would
merge all CFStrings first, then merge all regular code sections in a
second phase. This worked fine since CFStrings would never reference
regular `__text` sections. However, I would like to expand ICF to merge
functions that reference unwind info. Unwind info references the LSDA
section, which can in turn reference the `__text` section, so we cannot
perform ICF in phases.

In order to have ICF operate on InputSections spanning multiple
OutputSections, we need a way to distinguish InputSections that are
destined for different OutputSections, so that we don't fold across
section boundaries. We achieve this by creating OutputSections early,
and setting `InputSection::parent` to point to them. This is what
LLD-ELF does. (This change should also make it easier to implement the
`section$start$` symbols.)

This diff also folds InputSections w/o checking their flags, which I
think is the right behavior -- if they are destined for the same
OutputSection, they will have the same flags in the output (even if
their input flags differ). I.e. the `parent` pointer check subsumes the
`flags` check. In practice this has nearly no effect (ICF did not become
any more effective on chromium_framework).

I've also updated ICF.cpp's block comment to better reflect its current
status.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D105641
2021-07-17 13:42:51 -04:00
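The role of the `parent` check can be shown with a heavily simplified model. This sketch is not lld's algorithm: real ICF also compares relocations and refines equivalence classes iteratively. The `ToyInputSection` type and `count_after_folding` function are hypothetical; sections count as foldable here only if they share a parent and have identical contents, and input flags are deliberately ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an input section.
struct ToyInputSection {
    int parent;        // index of the output section it is destined for
    unsigned flags;    // input flags, not consulted when folding
    std::string data;  // section contents
};

// Number of sections left after folding duplicates within each parent.
std::size_t count_after_folding(const std::vector<ToyInputSection>& sections) {
    std::set<std::pair<int, std::string>> unique;
    for (const ToyInputSection& s : sections) {
        // The key contains the parent, so identical contents destined for
        // different output sections are never folded together.
        unique.emplace(s.parent, s.data);
    }
    return unique.size();
}
```

Two sections with the same contents and the same parent fold even if their flags differ, while the same contents under different parents stay separate.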
Christopher Di Bella 182ba8ab1b [libcxx][ranges] makes `ranges::subrange` a borrowed range
Differential Revision: https://reviews.llvm.org/D106207
2021-07-17 17:25:56 +00:00
Shilei Tian d3454ee8d2 [AbstractAttributor] Fix two issues in folding __kmpc_is_spmd_exec_mode
This patch fixed two issues found when folding `__kmpc_is_spmd_exec_mode`:
1. When the reaching kernels are empty, it should not fold to generic mode.
2. When creating AA for the caller when updating information, the dependency
   should be required.

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D106209
2021-07-17 13:13:44 -04:00
Nikita Popov ca161e0c35 [IRBuilder] Deprecate CreateStructGEP() without element type
This API is incompatible with opaque pointers and deprecated in
favor of the version that accepts an explicit element type.
2021-07-17 18:48:22 +02:00
Nikita Popov 4ace6008f2 [OpaquePtr] Remove uses of CreateStructGEP() without element type
Remove uses of to-be-deprecated API.
2021-07-17 18:48:21 +02:00
ShihPo Hung be8159bfa5 [RISCV][RVV] Precommit a test case for D105684
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D105685
2021-07-18 00:43:17 +08:00
Nikita Popov 03e4351013 [IRBuilder] Deprecate CreateConstGEP1_32() without element type
This API is incompatible with opaque pointers and deprecated in
favor of the version that accepts an explicit element type.
2021-07-17 18:32:36 +02:00
Nikita Popov 6d3e7c783b [OpaquePtr] Remove uses of CreateConstGEP1_32() without element type
Remove uses of to-be-deprecated API. I've fallen back to calling
getPointerElementType() in some cases where the correct type wasn't
immediately obvious to me.
2021-07-17 18:32:36 +02:00
Simon Pilgrim 9277ce7932 [DebugInfo] Remove unnecessary <string> include dependency from DebugInfo headers. NFC.
At most these use the StringRef/Twine wrappers and don't have any implicit uses of std::string.

Move the include down to any cpp implementation where std::string is actually used.
2021-07-17 16:56:06 +01:00
Nikita Popov 5df48493f0 [IRBuilder] Deprecate CreateConstInBoundsGEP1_64() without element type
This API is incompatible with opaque pointers and deprecated in
favor of the version that accepts an explicit element type.
2021-07-17 17:07:48 +02:00
Nikita Popov 5071360eb1 [OpaquePtr] Remove uses of CGF.Builder.CreateConstInBoundsGEP1_64() without type
Remove uses of to-be-deprecated API.
2021-07-17 17:07:46 +02:00
Nikita Popov 32e2729e33 [IRBuilder] Deprecate CreateConstGEP1_64() without element type
This API is incompatible with opaque pointers and deprecated in
favor of the version that accepts an explicit element type.
2021-07-17 16:43:42 +02:00
Nikita Popov 357756ecf6 [OpaquePtr] Remove uses of CreateConstGEP1_64() without element type
Remove uses of to-be-deprecated API.
2021-07-17 16:43:20 +02:00
Nikita Popov 251a11fdcf [IRBuilder] Deprecate CreateConstInBoundsGEP2_64() without element type
This API is incompatible with opaque pointers and deprecated in
favor of the version that accepts an explicit element type.
2021-07-17 16:42:39 +02:00
Nikita Popov 4737eebc0d [OpaquePtr] Remove uses of CreateConstInBoundsGEP2_64() without type
Remove uses of to-be-deprecated API.
2021-07-17 16:42:10 +02:00
Nikita Popov 7db463ced5 [IRBuilder] Deprecate CreateConstGEP2_64() without element type
This API is incompatible with opaque pointers and deprecated in
favor of the version that accepts an explicit element type.
2021-07-17 16:41:51 +02:00
Kazu Hirata 1993b73755 [Analysis, CodeGen] Remove getHotSucc (NFC)
These functions seem to be unused for at least 5 years.
2021-07-17 07:31:36 -07:00
Nikita Popov 7e21ded88d [IR] Don't accept null type in ConstantExpr::getGetElementPtr()
This is the same change as D105653, but for the constant expression
version of the API.
2021-07-17 15:59:31 +02:00
Nikita Popov be5af50e7d [BPF] Use elementtype attribute for preserve.array/struct.index intrinsics
Use the elementtype attribute introduced in D105407 for the
llvm.preserve.array/struct.index intrinsics. It carries the
element type of the GEP these intrinsics effectively encode.

This patch:

 * Adds a verifier check that the attribute is required.
 * Adds it in the IRBuilder methods for these intrinsics.
 * Autoupgrades old bitcode without the attribute.
 * Updates the lowering code to use the attribute rather than
   the pointer element type.
 * Updates lots of tests to specify the attribute.
 * Adds -force-opaque-pointers to the intrinsic-array.ll test
   to demonstrate they work now.

https://reviews.llvm.org/D106184
2021-07-17 11:09:18 +02:00
Craig Topper 173332d175 [RISCV] Manually emit the best shift for VSCALE lowering to improve codegen.
We assume VLENB is a multiple of 8 and previously relied on shift
pairs being optimized to an AND+SHL/SHR and computeKnownBits
removing the AND. This doesn't happen if (vlenb >> 3) gets CSEd
to have multiple uses. This patch manually emits the best shift
to work around this.
2021-07-17 00:52:07 -07:00
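The single-shift idea can be sketched in plain C++. This is an illustration under stated assumptions, not the RISC-V lowering code: the function name is hypothetical, the multiplier is assumed to be a power of two given as its log2, and VLENB is assumed to be a multiple of 8 as the message states.

```cpp
#include <cassert>
#include <cstdint>

// Compute (vlenb >> 3) << log2Mul, i.e. vscale * 2^log2Mul, with one shift.
std::uint64_t vscale_times_pow2(std::uint64_t vlenb, unsigned log2Mul) {
    assert(vlenb % 8 == 0);  // low three bits are zero, so no bits are lost
    assert(log2Mul < 32);
    if (log2Mul < 3)
        return vlenb >> (3 - log2Mul);  // net right shift
    if (log2Mul > 3)
        return vlenb << (log2Mul - 3);  // net left shift
    return vlenb;                       // the two shifts cancel
}
```

Because the low three bits of `vlenb` are zero, merging the right shift by 3 and the left shift by `log2Mul` into one shift gives the same value without needing an AND to clear low bits.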