The prior code intermixed several concerns: the actual materialization of the offset, the choice of destination register, and whether to prune the ADDI. This version factors the first part out, and then reasons only about the latter two. My intention is to merge the adjustReg routine with the one from frame lowering, and then explore using the merged result to simplify frame setup and teardown.
This change is conceptually NFC, but since it results in slightly different vreg usage, the end result can change register allocation in minor ways.
Differential Revision: https://reviews.llvm.org/D138502
This patch makes sure the compiler uses R16/R17 on avrtiny (attiny10
etc) instead of R0/R1.
Some notes:
* For the NEGW and ROLB instructions, it adds an explicit zero
register. This is necessary because the zero register is different
on avrtiny (and InstrInfo Uses lines need a fixed register).
* Not entirely sure about putting all tests in features/avr-tiny.ll,
but it doesn't seem like the "target-cpu"="attiny10" attribute
works.
Updates: https://github.com/llvm/llvm-project/issues/53459
Differential Revision: https://reviews.llvm.org/D138582
This patch implements getArithmeticInstrCost for RISCV, adding cost
model support for integer and floating-point vector arithmetic instructions.
Differential Revision: https://reviews.llvm.org/D133552 (Original patch by jacquesguan. Subset by me with todos added.)
Use collectOffset to collect the scaled indices and the constant offset of a
GEP instead of custom code. This simplifies the logic in decomposeGEP and
allows handling all cases supported by the generic helper.
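As a rough illustration of the decomposition described above, the following standalone sketch splits a list of indices into a constant byte offset and per-variable scales. It does not use the LLVM API; GepIndex, Decomposition and decompose are hypothetical names introduced only for this example.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of one GEP index (not an LLVM type): either a constant
// or a named variable, together with the byte size it steps over.
struct GepIndex {
  bool isConstant;          // true when `constant` holds the index value
  std::int64_t constant;    // index value, only meaningful if isConstant
  std::string variable;     // variable name, only meaningful if !isConstant
  std::int64_t elementSize; // stride in bytes for this index
};

// Result of the decomposition: one constant offset plus a scale per variable.
struct Decomposition {
  std::int64_t constantOffset = 0;
  std::map<std::string, std::int64_t> variableScales;
};

// Fold constant indices into a single byte offset and accumulate the scale
// of every variable index, merging repeated uses of the same variable.
Decomposition decompose(const std::vector<GepIndex> &indices) {
  Decomposition result;
  for (const GepIndex &idx : indices) {
    if (idx.isConstant)
      result.constantOffset += idx.constant * idx.elementSize;
    else
      result.variableScales[idx.variable] += idx.elementSize;
  }
  return result;
}
```

For example, indices (2 x 4 bytes, i x 8 bytes, i x 4 bytes, 1 x 16 bytes) decompose into a constant offset of 24 and a scale of 12 for i.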
Handle rewriting of the dispatch operation with complex arguments or a
complex return value.
sret will be done in a separate patch.
Reviewed By: jeanPerier, PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D138820
This patch splits off the logic to transform the canonical IV to a
value for an induction with a different start and step. This
transformation only needs to be done once (independent of VF/UF) and
enables sinking of VPScalarIVStepsRecipe as a follow-up.
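The arithmetic behind this split can be sketched outside of VPlan as follows. The helper names derivedIV and laneValues are hypothetical, and the per-lane expansion is only an assumption about the kind of VF-dependent work that stays separate; it is not the recipe's actual implementation.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: map the canonical IV (starts at 0, steps by 1) to an
// induction with the given start and step. This does not depend on VF/UF.
std::int64_t derivedIV(std::int64_t canonicalIV, std::int64_t start,
                       std::int64_t step) {
  return start + canonicalIV * step;
}

// Hypothetical helper: expand a derived value into one scalar per lane.
// This part depends on VF, which is why it is kept apart from derivedIV.
std::vector<std::int64_t> laneValues(std::int64_t derived, std::int64_t step,
                                     unsigned vf) {
  std::vector<std::int64_t> lanes;
  for (unsigned lane = 0; lane < vf; ++lane)
    lanes.push_back(derived + static_cast<std::int64_t>(lane) * step);
  return lanes;
}
```

With start 10 and step 3, a canonical IV of 4 maps to 22, and the four lanes derived from it are 22, 25, 28 and 31.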
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D133758
The OpenMP offloading toolchain uses wrapper headers to implement some
standard features on the GPU. Currently there is no way to turn these
off without also disabling all the standard includes altogether. This
patch makes `-nogpuinc` apply to these wrapper headers so we can use a
sterile toolchain. This was causing problems when attempting to compile
a `libc` for the GPU using OpenMP.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D138598
Update the call conversion pattern to support the fir.dispatch
operation as well. The first operand of the fir.dispatch op is always the
polymorphic object. The pass_arg_pos attribute needs to be shifted when
the result is added as an argument.
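The index shift can be illustrated with a small standalone model. DispatchCall and addResultAsArgument are hypothetical names, and inserting the result at the front of the argument list is an assumption made for this sketch rather than something stated above.

```cpp
#include <string>
#include <vector>

// Hypothetical model of a dispatch call (not the FIR operation itself).
struct DispatchCall {
  std::string object;            // the polymorphic object (first operand)
  std::vector<std::string> args; // call arguments
  int passArgPos;                // index into args, or -1 when absent
};

// Assumption: the result buffer becomes the first argument. Every existing
// argument then moves one slot to the right, so passArgPos must follow.
DispatchCall addResultAsArgument(DispatchCall call,
                                 const std::string &resultBuffer) {
  call.args.insert(call.args.begin(), resultBuffer);
  if (call.passArgPos >= 0)
    ++call.passArgPos;
  return call;
}
```

After the rewrite, args[passArgPos] still names the same argument as before, which is the invariant the shift preserves.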
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D138799
This revision fixes a bug in the vector.extract folding, which did not
handle the "dim-1" broadcasting case in vector.broadcast.
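A minimal sketch of the index mapping involved, assuming right-aligned broadcasting where a source dimension of size 1 is stretched. sourcePosition is a hypothetical helper, not the MLIR folder, and it assumes the result position has at least as many dimensions as the source shape.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: map a position in the broadcast result back to the
// position in the broadcast source. Assumes resultPos.size() >= srcShape.size().
std::vector<std::int64_t>
sourcePosition(const std::vector<std::int64_t> &srcShape,
               const std::vector<std::int64_t> &resultPos) {
  // Leading result dimensions that the source lacks are simply dropped.
  std::size_t rankDiff = resultPos.size() - srcShape.size();
  std::vector<std::int64_t> srcPos;
  for (std::size_t i = 0; i < srcShape.size(); ++i) {
    // A source dimension of size 1 is stretched, so every result index along
    // it reads element 0. This is the "dim-1" case.
    srcPos.push_back(srcShape[i] == 1 ? 0 : resultPos[rankDiff + i]);
  }
  return srcPos;
}
```

For a source of shape 1x4 broadcast to a rank-3 result, position (2, 3, 1) maps to source position (0, 1); without the dim-1 rule it would wrongly read index 3 along a dimension of size 1.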
Differential Revision: https://reviews.llvm.org/D138804
If left unchecked, the SLPVectorizer can move loads/stores below a stackrestore. The move can cause issues if the loads/stores have pointer operands from `alloca`s that are reset by the stackrestore. This patch adds the dependency check.
The check is conservative, in that it does not check if the pointer operands of the loads/stores are actually from `alloca`s that may be reset. We did not observe any SPECCPU2017 performance degradation so this simple fix seems sufficient.
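The conservative rule can be sketched on a toy instruction list. Kind, isMemoryAccess and maySinkTo are hypothetical names used only for this illustration; the real check lives in the SLPVectorizer's dependency computation and is not reproduced here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical instruction kinds for a toy basic block.
enum class Kind { Load, Store, StackSave, StackRestore, Other };

bool isMemoryAccess(Kind k) { return k == Kind::Load || k == Kind::Store; }

// Returns true if the instruction at `from` may be sunk down to position `to`
// (from < to). Conservative: any load/store is blocked by a stackrestore on
// the way, without checking whether its pointer comes from a reset alloca.
bool maySinkTo(const std::vector<Kind> &block, std::size_t from,
               std::size_t to) {
  if (!isMemoryAccess(block[from]))
    return true;
  for (std::size_t i = from + 1; i <= to && i < block.size(); ++i)
    if (block[i] == Kind::StackRestore)
      return false;
  return true;
}
```

In a block of the form load, other, stackrestore, store, the load may sink by one position but not past the stackrestore, regardless of where its pointer operand comes from.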
The test could have been added to `llvm/test/Transforms/SLPVectorizer/X86/stacksave-dependence.ll`, but that test has not been updated to use opaque pointers. I am not inclined to add tests that still use typed pointers, or to refactor `llvm/test/Transforms/SLPVectorizer/X86/stacksave-dependence.ll` to use opaque pointers in this patch. If desired, I will open a different patch to refactor and consolidate the tests.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D138585