forked from mindspore-Ecosystem/mindspore
commit 50ed6ee1d5
@@ -24,7 +24,7 @@
 - auto: Sets the expert-recommended mixed precision level for each device, e.g. "O2" on GPU and "O3" on Ascend. This option may not suit every scenario; users are advised to set `amp_level` according to their specific network model.

 "O2" is recommended on GPU, "O3" is recommended on Ascend.

-The BatchNorm precision policy can be changed by setting `keep_batchnorm_fp32` through `kwargs`; `keep_batchnorm_fp32` must be a bool. The loss scaling policy can be changed by setting `loss_scale_manager` through `kwargs`; `loss_scale_manager` must be a subclass of :class:`mindspore.LossScaleManager`,
+The BatchNorm precision policy can be changed by setting `keep_batchnorm_fp32` through `kwargs`; `keep_batchnorm_fp32` must be a bool. The loss scaling policy can be changed by setting `loss_scale_manager` through `kwargs`; `loss_scale_manager` must be a subclass of :class:`mindspore.amp.LossScaleManager`,

 See `mindspore.build_train_network` for details about `amp_level`.

 - **boost_level** (str) - Option for the argument `level` in `mindspore.boost`; the training level for boost mode. Supports ["O0", "O1", "O2"]. Default: "O0".
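For context, a minimal sketch of how these `kwargs` are passed to `Model` (the network, loss and optimizer shown are illustrative placeholders, not part of this commit):

>>> import mindspore as ms
>>> from mindspore import nn
>>> net = nn.Dense(16, 10)                     # any user-defined network
>>> loss = nn.SoftmaxCrossEntropyWithLogits()  # any loss function
>>> opt = nn.Momentum(net.trainable_params(), learning_rate=0.01, momentum=0.9)
>>> model = ms.Model(net, loss_fn=loss, optimizer=opt, amp_level="O2",
...                  keep_batchnorm_fp32=True,
...                  loss_scale_manager=ms.amp.FixedLossScaleManager(1024.0, drop_overflow_update=False))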
@@ -17,7 +17,7 @@ mindspore.nn.ASGD
     \mu_{t} = \frac{1}{\max(1, t - t0)}

     \end{gather*}

-:math:`\lambda` represents the decay term, :math:`\mu` and :math:`\eta` are tracked to update :math:`ax` and :math:`w`, :math:`t0` is the point at which averaging starts, :math:`\alpha` is the coefficient for updating :math:`\eta`, :math:`ax` is the averaged parameter value, :math:`t` is the current step, :math:`g` represents `gradients`, and :math:`w` represents`params`.
+:math:`\lambda` represents the decay term, :math:`\mu` and :math:`\eta` are tracked to update :math:`ax` and :math:`w`, :math:`t0` is the point at which averaging starts, :math:`\alpha` is the coefficient for updating :math:`\eta`, :math:`ax` is the averaged parameter value, :math:`t` is the current step, :math:`g` represents `gradients`, and :math:`w` represents `params`.

 .. note::

     If parameters are not grouped, the `weight_decay` in the optimizer is applied to parameters whose names contain neither "beta" nor "gamma". Users may group parameters to change the weight decay strategy; when parameters are grouped, each group can set its own `weight_decay`, and if it is not set, the `weight_decay` in the optimizer is applied.
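As an illustration of the averaging step defined by :math:`\mu_{t}` above, a minimal NumPy sketch (not the MindSpore kernel):

import numpy as np

def asgd_average(ax, w, t, t0):
    """One averaging update: ax tracks w with weight mu_t = 1 / max(1, t - t0)."""
    mu = 1.0 / max(1, t - t0)
    return ax + mu * (w - ax)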
@@ -40,7 +40,7 @@ mindspore.nn.ASGD
 - **t0** (float) - The point at which averaging starts. Default: 1e6.

 - **weight_decay** (Union[float, int, Cell]) - Weight decay (L2 penalty). Default: 0.0.

 .. include:: mindspore.nn.optim_arg_dynamic_wd.rst

 Inputs:

 - **gradients** (tuple[Tensor]) - The gradients of `params`, with the same shape as `params`.
@@ -1 +1 @@
-- **loss_scale** (float) - The gradient scaling factor, which must be greater than 0. If `loss_scale` is an integer, it is converted to a float. In general, use the default value. Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update` attribute of `FixedLossScaleManager` is set to False does this value need to be the same as the `loss_scale` in `FixedLossScaleManager`. Refer to :class:`mindspore.FixedLossScaleManager` for more details. Default: 1.0.
+- **loss_scale** (float) - The gradient scaling factor, which must be greater than 0. If `loss_scale` is an integer, it is converted to a float. In general, use the default value. Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update` attribute of `FixedLossScaleManager` is set to False does this value need to be the same as the `loss_scale` in `FixedLossScaleManager`. Refer to :class:`mindspore.amp.FixedLossScaleManager` for more details. Default: 1.0.
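A sketch of the matching requirement described above (the network, loss and scale value are illustrative):

>>> import mindspore as ms
>>> from mindspore import nn
>>> net = nn.Dense(16, 10)
>>> scale = 1024.0
>>> manager = ms.amp.FixedLossScaleManager(scale, drop_overflow_update=False)
>>> opt = nn.Momentum(net.trainable_params(), learning_rate=0.01, momentum=0.9,
...                   loss_scale=scale)  # must equal the manager's loss_scale
>>> model = ms.Model(net, loss_fn=nn.MAELoss(), optimizer=opt, loss_scale_manager=manager)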
@@ -1,7 +1,7 @@
 mindspore.ops.AlltoAll
 ======================

-.. py:class:: mindspore.ops.AlltoAll(split_count, split_dim, concat_dim, group='hccl_world_group')
+.. py:class:: mindspore.ops.AlltoAll(split_count, split_dim, concat_dim, group=GlobalComm.WORLD_COMM_GROUP)

 AlltoAll is a collective communication operation.
@@ -24,3 +24,4 @@ mindspore.ops.CropAndResize
 Raises:

 - **TypeError** - If `method` is not a str.

 - **TypeError** - If `extrapolation_value` is not a float.

+- **ValueError** - If `method` is not 'bilinear', 'nearest' or 'bilinear_v2'.
@@ -16,4 +16,4 @@ mindspore.ops.EqualCount
 Raises:

 - **TypeError** - If `x` or `y` is not a Tensor.

-- **ValueError** - If the shape of `x` is not equal to the shape of`y`.
+- **ValueError** - If the shape of `x` is not equal to the shape of `y`.
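For context, typical usage of EqualCount (illustrative values):

>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import ops, Tensor
>>> x = Tensor(np.array([1, 2, 3]), ms.int32)
>>> y = Tensor(np.array([1, 2, 4]), ms.int32)
>>> print(ops.EqualCount()(x, y))  # counts positions where x equals y
[2]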
@@ -12,7 +12,7 @@ mindspore.ops.SparseTensorDenseMatmul
 Inputs:

 - **indices** (Tensor) - A 2-D Tensor that represents the positions of elements in the sparse tensor. Supports int32 and int64; every element value must be non-negative. The shape is :math:`(n, 2)`.

 - **values** (Tensor) - A 1-D Tensor that represents the values corresponding to the positions in `indices`. Supports float16, float32, float64, int32, int64, complex64 and complex128. The shape should be :math:`(n,)`.

-- **sparse_shape** (tuple(int)) - Specifies the shape of the sparse tensor; it consists of two positive integers, indicating that the shape of the sparse tensor is :math:`(N, C)`.
+- **sparse_shape** (tuple(int) or Tensor) - Specifies the shape of the sparse tensor; it consists of two positive integers, indicating that the shape of the sparse tensor is :math:`(N, C)`.

 - **dense** (Tensor) - A 2-D Tensor whose dtype is the same as `values`.

 If `adjoint_st` is False and `adjoint_dt` is False, the shape must be :math:`(C, M)`.
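A minimal sketch of calling the primitive with the inputs listed above (illustrative values):

>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import ops, Tensor
>>> indices = Tensor(np.array([[0, 1], [1, 2]]), ms.int32)
>>> values = Tensor(np.array([1.0, 2.0]), ms.float32)
>>> sparse_shape = (3, 4)                        # the sparse tensor is (N, C) = (3, 4)
>>> dense = Tensor(np.ones((4, 2)), ms.float32)  # (C, M), since both adjoints are False
>>> out = ops.SparseTensorDenseMatmul()(indices, values, sparse_shape, dense)
>>> print(out.shape)
(3, 2)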
@@ -5,7 +5,7 @@ mindspore.ops.SquaredDifference
 Subtracts the second input Tensor from the first input Tensor element-wise and returns the square of the difference.

-Inputs `x` and` y` comply with the implicit type conversion rules to make the data types consistent. The inputs must be two Tensors, or one Tensor and one Scalar. When the inputs are two Tensors, their data types cannot both be bool, and their shapes must be broadcastable. When the inputs are one Tensor and one Scalar, the Scalar can only be a constant.
+Inputs `x` and `y` comply with the implicit type conversion rules to make the data types consistent. The inputs must be two Tensors, or one Tensor and one Scalar. When the inputs are two Tensors, their data types cannot both be bool, and their shapes must be broadcastable. When the inputs are one Tensor and one Scalar, the Scalar can only be a constant.

 .. math::

     out_{i} = (x_{i} - y_{i}) * (x_{i} - y_{i}) = (x_{i} - y_{i})^2
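The formula in action, as a short sketch:

>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import ops, Tensor
>>> x = Tensor(np.array([1.0, 2.0, 3.0]), ms.float32)
>>> y = Tensor(np.array([2.0, 4.0, 6.0]), ms.float32)
>>> print(ops.SquaredDifference()(x, y))  # (x - y)^2 element-wise
[1. 4. 9.]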
@@ -5,7 +5,7 @@ mindspore.ops.assign
 Assigns a value to the network parameter.

-`variable` and`value` comply with the implicit type conversion rules to make the data types consistent. If they have different data types, the lower-precision data type is converted to the relatively highest-precision data type.
+`variable` and `value` comply with the implicit type conversion rules to make the data types consistent. If they have different data types, the lower-precision data type is converted to the relatively highest-precision data type.

 Args:

 - **variable** (Parameter) - The network parameter, of shape :math:`(N, *)`, where :math:`*` means any number of additional dimensions; its rank should be less than 8.
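Typical usage, sketched with illustrative values:

>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import ops, Tensor, Parameter
>>> variable = Parameter(Tensor(np.array([1.0]), ms.float32), name="var")
>>> out = ops.assign(variable, Tensor(np.array([2.0]), ms.float32))
>>> print(variable.asnumpy())  # the parameter now holds the assigned value
[2.]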
@@ -1,7 +1,7 @@
 mindspore.ops.count_nonzero
 ============================

-.. py:function:: mindspore.ops.count_nonzero(x, axis=(), keep_dims=False, dtype=mindspore.int32)
+.. py:function:: mindspore.ops.count_nonzero(x, axis=(), keep_dims=False, dtype=mstype.int32)

 Counts the number of nonzero elements along the given axes of the input Tensor.
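For context, a short usage sketch (illustrative input):

>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import ops, Tensor
>>> x = Tensor(np.array([[0, 1, 0], [1, 1, 0]]), ms.int32)
>>> print(ops.count_nonzero(x))          # over all axes
3
>>> print(ops.count_nonzero(x, axis=1))  # per row
[1 2]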
@@ -1,7 +1,7 @@
 mindspore.ops.custom_info_register
 ==================================

-.. py:class:: mindspore.ops.custom_info_register(*reg_info)
+.. py:function:: mindspore.ops.custom_info_register(*reg_info)

 A decorator used to bind registration information to the `func` argument of :class:`mindspore.ops.Custom`.
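A sketch of the decorator pattern, assuming the usual `CustomRegOp` builder; the op body is a placeholder:

>>> from mindspore.ops import CustomRegOp, DataType, custom_info_register
>>> reg_info = CustomRegOp() \
...     .input(0, "x") \
...     .output(0, "y") \
...     .dtype_format(DataType.F32_Default, DataType.F32_Default) \
...     .target("CPU") \
...     .get_op_info()
>>> @custom_info_register(reg_info)
... def my_kernel(x):
...     pass  # placeholder body; later passed as `func` to ops.Custom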
@@ -131,7 +131,7 @@ class Adagrad(Optimizer):
 loss_scale (float): Value for the loss scale. It must be greater than 0.0. In general, use the default value.
 Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update` in
 `FixedLossScaleManager` is set to False, then this value needs to be the same as the `loss_scale` in
-`FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+`FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
 Default: 1.0.
 weight_decay (Union[float, int, Cell]): Weight decay (L2 penalty). Default: 0.0.
@@ -113,7 +113,7 @@ class Adadelta(Optimizer):
 loss_scale (float): Value for the loss scale. It must be greater than 0.0. In general, use the default value.
 Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update` in
 `FixedLossScaleManager` is set to False, then this value needs to be the same as the `loss_scale` in
-`FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+`FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
 Default: 1.0.
 weight_decay (Union[float, int, Cell]): Weight decay (L2 penalty). Default: 0.0.
@@ -232,7 +232,7 @@ class AdaFactor(Optimizer):
 loss_scale (float): A floating point value for the loss scale. Should be greater than 0. In general, use the
 default value. Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update` in
 `FixedLossScaleManager` is set to False, then this value needs to be the same as the `loss_scale` in
-`FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+`FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
 Default: 1.0.

 Inputs:
@@ -441,7 +441,7 @@ class Adam(Optimizer):
 loss_scale (float): A floating point value for the loss scale. Should be greater than 0. In general, use the
 default value. Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update` in
 `FixedLossScaleManager` is set to False, then this value needs to be the same as the `loss_scale` in
-`FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+`FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
 Default: 1.0.

 Inputs:
@@ -902,7 +902,7 @@ class AdamOffload(Optimizer):
 loss_scale (float): A floating point value for the loss scale. Should be greater than 0. In general, use the
 default value. Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update` in
 `FixedLossScaleManager` is set to False, then this value needs to be the same as the `loss_scale` in
-`FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+`FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
 Default: 1.0.

 Inputs:
@@ -136,7 +136,7 @@ class AdaMax(Optimizer):
 loss_scale (float): A floating point value for the loss scale. Should be greater than 0. In general, use the
 default value. Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update` in
 `FixedLossScaleManager` is set to False, then this value needs to be the same as the `loss_scale` in
-`FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+`FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
 Default: 1.0.

 Inputs:
@@ -191,7 +191,7 @@ class FTRL(Optimizer):
 loss_scale (float): Value for the loss scale. It must be greater than 0.0. In general, use the default value.
 Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update` in
 `FixedLossScaleManager` is set to False, then this value needs to be the same as the `loss_scale` in
-`FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+`FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
 Default: 1.0.
 weight_decay (Union[float, int, Cell]): Weight decay (L2 penalty). Default: 0.0.
@@ -285,7 +285,7 @@ class LazyAdam(Optimizer):
 loss_scale (float): A floating point value for the loss scale. Should be equal to or greater than 1. In general,
 use the default value. Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update`
 in `FixedLossScaleManager` is set to False, then this value needs to be the same as the `loss_scale` in
-`FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+`FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
 Default: 1.0.

 Inputs:
@@ -145,7 +145,7 @@ class Momentum(Optimizer):
 loss_scale (float): A floating point value for the loss scale. It must be greater than 0.0. In general, use the
 default value. Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update` in
 `FixedLossScaleManager` is set to False, then this value needs to be the same as the `loss_scale` in
-`FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+`FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
 Default: 1.0.
 use_nesterov (bool): Enable Nesterov momentum. Default: False.
@@ -125,7 +125,7 @@ class Optimizer(Cell):
 type of `loss_scale` input is int, it will be converted to float. In general, use the default value. Only
 when `FixedLossScaleManager` is used for training and the `drop_overflow_update` in
 `FixedLossScaleManager` is set to False, this value needs to be the same as the `loss_scale` in
-`FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+`FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
 Default: 1.0.

 Raises:
@@ -131,7 +131,7 @@ class ProximalAdagrad(Optimizer):
 loss_scale (float): Value for the loss scale. It must be greater than 0.0. In general, use the default value.
 Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update` in
 `FixedLossScaleManager` is set to False, then this value needs to be the same as the `loss_scale` in
-`FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+`FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
 Default: 1.0.
 weight_decay (Union[float, int, Cell]): Weight decay (L2 penalty). Default: 0.0.
@@ -145,7 +145,7 @@ class RMSProp(Optimizer):
 loss_scale (float): A floating point value for the loss scale. Should be greater than 0. In general, use the
 default value. Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update` in
 `FixedLossScaleManager` is set to False, then this value needs to be the same as the `loss_scale` in
-`FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+`FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
 Default: 1.0.
 weight_decay (Union[float, int, Cell]): Weight decay (L2 penalty). Default: 0.0.
@@ -110,7 +110,7 @@ class SGD(Optimizer):
 loss_scale (float): A floating point value for the loss scale, which must be larger than 0.0. In general, use
 the default value. Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update` in
 `FixedLossScaleManager` is set to False, then this value needs to be the same as the `loss_scale` in
-`FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+`FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
 Default: 1.0.

 Inputs:
@@ -2518,7 +2518,7 @@ class TransformerDecoder(Cell):
 'relu6', 'tanh', 'gelu', 'fast_gelu', 'elu', 'sigmoid', 'prelu', 'leakyrelu', 'hswish',
 'hsigmoid', 'logsigmoid' and so on. Default: gelu.
 lambda_func(function): A function can determine the fusion index,
 pipeline stages and recompute attribute. If the
 user wants to determine the pipeline stage and gradient aggregation fusion, the user can pass a
 function that accepts `network`, `layer_id`, `offset`, `parallel_config`, `layers`. The `network(Cell)`
 represents the transformer block, `layer_id(int)` means the layer index for the current module, counts
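A sketch of a `lambda_func` with the documented signature; the stage assignment shown is illustrative, not the library default:

def set_stage(network, layer_id, offset, parallel_config, layers):
    # Spread `layers` blocks evenly across the configured pipeline stages.
    per_stage = max(1, layers // parallel_config.pipeline_stage)
    network.pipeline_stage = (layer_id + offset) // per_stage
    network.recompute()  # optionally mark this block for recomputation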
@@ -349,7 +349,7 @@ def multinomial(inputs, num_sample, replacement=True, seed=None):
 seed (int, optional): Seed is used as entropy source for the random number engines to generate
 pseudo-random numbers, must be non-negative. Default: None.

-Outputs:
+Returns:
 Tensor, has the same rows with input. The number of sampled indices of each row is `num_samples`.
 The dtype is float32.
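For context, a brief usage sketch (illustrative probabilities and seed):

>>> import mindspore as ms
>>> from mindspore import ops, Tensor
>>> x = Tensor([0.1, 0.2, 0.7], ms.float32)
>>> out = ops.multinomial(x, 2, replacement=True, seed=10)
>>> print(out.shape)  # two indices sampled from the 1-D distribution
(2,)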
@@ -6895,7 +6895,7 @@ class ExtractVolumePatches(Primitive):
 Supported Platforms:
 ``Ascend`` ``CPU``

-Example:
+Examples:
 >>> kernel_size = (1, 1, 2, 2, 2)
 >>> strides = (1, 1, 1, 1, 1)
 >>> padding = "VALID"
@@ -686,7 +686,7 @@ class NeighborExchange(Primitive):
 Supported Platforms:
 ``Ascend``

-Example:
+Examples:
 >>> # This example should be run with 2 devices. Refer to the tutorial > Distributed Training on mindspore.cn
 >>> import os
 >>> import mindspore as ms
@@ -762,7 +762,7 @@ class AlltoAll(PrimitiveWithInfer):
 Supported Platforms:
 ``Ascend``

-Example:
+Examples:
 >>> # This example should be run with 8 devices. Refer to the tutorial > Distributed Training on mindspore.cn
 >>> import os
 >>> import mindspore as ms
@@ -2055,7 +2055,7 @@ class Rsqrt(Primitive):
 Inputs:
 - **x** (Tensor) - The input of Rsqrt. Its rank must be in [0, 7] inclusive and
 each element must be a non-negative number.

 Outputs:
 Tensor, has the same type and shape as `x`.
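A short sketch of Rsqrt on a small input (illustrative values):

>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import ops, Tensor
>>> x = Tensor(np.array([[4.0, 16.0]]), ms.float32)
>>> print(ops.Rsqrt()(x))  # element-wise 1 / sqrt(x)
[[0.5  0.25]]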
@@ -2092,7 +2092,6 @@ class Sqrt(Primitive):
     out_{i} = \sqrt{x_{i}}

 Inputs:
 - **x** (Tensor) - The input tensor with a dtype of Number, its rank must be in [0, 7] inclusive.
@@ -623,7 +623,7 @@ class SparseTensorDenseMatmul(Primitive):
 Support int32, int64, each element value should be a non-negative int number. The shape is :math:`(n, 2)`.
 - **values** (Tensor) - A 1-D Tensor, represents the value corresponding to the position in the `indices`.
 Support float16, float32, float64, int32, int64, complex64, complex128. The shape should be :math:`(n,)`.
-- **sparse_shape** (tuple(int)) or (Tensor) - A positive int tuple or tensor which specifies the shape of
+- **sparse_shape** (tuple(int) or (Tensor)) - A positive int tuple or tensor which specifies the shape of
 sparse tensor, and only constant value is allowed when sparse_shape is a tensor, should have 2 elements,
 represent sparse tensor shape is :math:`(N, C)`.
 - **dense** (Tensor) - A 2-D Tensor, the dtype is same as `values`.
@@ -281,8 +281,8 @@ def build_train_network(network, optimizer, loss_fn=None, level='O0', boost_leve
 keep_batchnorm_fp32 (bool): Keep Batchnorm run in `float32` when the network is set to cast to `float16` . If
 set, the `level` setting will take no effect on this property.
 loss_scale_manager (Union[None, LossScaleManager]): If not None, must be subclass of
-:class:`mindspore.LossScaleManager` for scaling the loss. If set, the `level` setting will take no effect
-on this property.
+:class:`mindspore.amp.LossScaleManager` for scaling the loss. If set, the `level` setting will
+take no effect on this property.

 Raises:
 ValueError: If device is CPU, property `loss_scale_manager` is not `None` or `FixedLossScaleManager`
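A sketch of invoking `build_train_network` with these kwargs (the network and loss are illustrative placeholders):

>>> from mindspore import amp, nn
>>> net = nn.Dense(16, 10)
>>> loss = nn.MAELoss()
>>> opt = nn.Momentum(net.trainable_params(), learning_rate=0.01, momentum=0.9)
>>> train_net = amp.build_train_network(net, opt, loss_fn=loss, level="O3",
...                                     loss_scale_manager=amp.DynamicLossScaleManager())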
@@ -29,7 +29,7 @@ class LossScaleManager:
 `get_update_cell` is used to get the instance of :class:`mindspore.nn.Cell` that is used to update the loss scale,
 the instance will be called during the training. Currently, the `get_update_cell` is mostly used.

-For example, :class:`mindspore.FixedLossScaleManager` and :class:`mindspore.DynamicLossScaleManager`.
+For example, :class:`mindspore.amp.FixedLossScaleManager` and :class:`mindspore.amp.DynamicLossScaleManager`.
 """
 def get_loss_scale(self):
     """Get the value of loss scale, which is the amplification factor of the gradients."""
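For context, how the two accessors are typically used (a sketch):

>>> from mindspore import amp
>>> manager = amp.DynamicLossScaleManager()
>>> scale = manager.get_loss_scale()         # current gradient amplification factor
>>> update_cell = manager.get_update_cell()  # nn.Cell invoked during training to adjust the scale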
@@ -140,7 +140,7 @@ class Model:
 "O2" is recommended on GPU, "O3" is recommended on Ascend.
 The BatchNorm strategy can be changed by `keep_batchnorm_fp32` settings in `kwargs`. `keep_batchnorm_fp32`
 must be a bool. The loss scale strategy can be changed by `loss_scale_manager` setting in `kwargs`.
-`loss_scale_manager` should be a subclass of :class:`mindspore.LossScaleManager`.
+`loss_scale_manager` should be a subclass of :class:`mindspore.amp.LossScaleManager`.
 The more detailed explanation of `amp_level` setting can be found at `mindspore.build_train_network`.

 boost_level (str): Option for argument `level` in `mindspore.boost`, level for boost mode