forked from mindspore-Ecosystem/mindspore
commit 50ed6ee1d5
@@ -24,7 +24,7 @@
   - auto: Set the mixed-precision level recommended for each device, e.g. "O2" on GPU and "O3" on Ascend. This option may not fit every scenario, so users are advised to set `amp_level` manually for their specific network.

     "O2" is recommended on GPU, "O3" is recommended on Ascend.
-    The BatchNorm strategy can be changed via `keep_batchnorm_fp32` in `kwargs`; `keep_batchnorm_fp32` must be a bool. The loss-scale strategy can be changed via `loss_scale_manager` in `kwargs`; `loss_scale_manager` must be a subclass of :class:`mindspore.LossScaleManager`.
+    The BatchNorm strategy can be changed via `keep_batchnorm_fp32` in `kwargs`; `keep_batchnorm_fp32` must be a bool. The loss-scale strategy can be changed via `loss_scale_manager` in `kwargs`; `loss_scale_manager` must be a subclass of :class:`mindspore.amp.LossScaleManager`.
     See `mindspore.build_train_network` for a more detailed explanation of `amp_level`.

 - **boost_level** (str) - Option for `mindspore.boost`, the training level for boost mode. Supports ["O0", "O1", "O2"]. Default: "O0".
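A minimal usage sketch of the `kwargs` described in this hunk, assuming a GPU/Ascend setup; `LeNet5` is a placeholder for any user-defined network:

>>> import mindspore as ms
>>> from mindspore import nn
>>> net = LeNet5()                                   # placeholder network
>>> loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True)
>>> opt = nn.Momentum(net.trainable_params(), learning_rate=0.01, momentum=0.9)
>>> # amp_level selects the mixed-precision level; keep_batchnorm_fp32 and
>>> # loss_scale_manager override the BatchNorm and loss-scale strategies.
>>> model = ms.Model(net, loss_fn=loss, optimizer=opt, amp_level="O2",
...                  keep_batchnorm_fp32=True,
...                  loss_scale_manager=ms.amp.FixedLossScaleManager(1024.0))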
@@ -17,7 +17,7 @@ mindspore.nn.ASGD
         \mu_{t} = \frac{1}{\max(1, t - t0)}
     \end{gather*}

-    :math:`\lambda` denotes the decay term, :math:`\mu` and :math:`\eta` are tracked to update :math:`ax` and :math:`w`, :math:`t0` is the point at which averaging starts, :math:`\alpha` is the coefficient for updating :math:`\eta`, :math:`ax` is the averaged parameter value, :math:`t` is the current step, :math:`g` denotes `gradients`, and :math:`w` denotes`params`.
+    :math:`\lambda` denotes the decay term, :math:`\mu` and :math:`\eta` are tracked to update :math:`ax` and :math:`w`, :math:`t0` is the point at which averaging starts, :math:`\alpha` is the coefficient for updating :math:`\eta`, :math:`ax` is the averaged parameter value, :math:`t` is the current step, :math:`g` denotes `gradients`, and :math:`w` denotes `params`.

     .. note::
         If parameters are not grouped, the `weight_decay` in the optimizer is applied to parameters whose names do not contain "beta" or "gamma". Parameters can be grouped to change the weight-decay strategy; when grouped, each group can set its own `weight_decay`, and if it is not set, the optimizer's `weight_decay` is applied. A grouping example is sketched after the next hunk.
@@ -40,7 +40,7 @@ mindspore.nn.ASGD
     - **t0** (float) - The point at which averaging starts. Default: 1e6.
     - **weight_decay** (Union[float, int, Cell]) - Weight decay (L2 penalty). Default: 0.0.

-    .. include:: mindspore.nn.optim_arg_dynamic_wd.rst
+    .. include:: mindspore.nn.optim_arg_dynamic_wd.rst

 Inputs:
     - **gradients** (tuple[Tensor]) - Gradients of `params`, with the same shape as `params`.
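A short sketch of constructing `mindspore.nn.ASGD`, including the parameter grouping mentioned in the note above; `net` is a placeholder for any user-defined Cell:

>>> from mindspore import nn
>>> conv_params = list(filter(lambda x: 'conv' in x.name, net.trainable_params()))
>>> other_params = list(filter(lambda x: 'conv' not in x.name, net.trainable_params()))
>>> group_params = [{'params': conv_params, 'weight_decay': 0.01},   # per-group weight decay
...                 {'params': other_params}]                        # falls back to the optimizer's weight_decay
>>> optim = nn.ASGD(group_params, learning_rate=0.1, lambd=1e-4, alpha=0.75, t0=1e6, weight_decay=0.0)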
@@ -1 +1 @@
-- **loss_scale** (float) - Gradient scaling factor, must be greater than 0. If `loss_scale` is an integer, it is converted to a float. In general, use the default value. Only when `FixedLossScaleManager` is used for training and its `drop_overflow_update` attribute is set to False does this value need to be the same as the `loss_scale` in `FixedLossScaleManager`. Refer to :class:`mindspore.FixedLossScaleManager` for more details. Default: 1.0.
+- **loss_scale** (float) - Gradient scaling factor, must be greater than 0. If `loss_scale` is an integer, it is converted to a float. In general, use the default value. Only when `FixedLossScaleManager` is used for training and its `drop_overflow_update` attribute is set to False does this value need to be the same as the `loss_scale` in `FixedLossScaleManager`. Refer to :class:`mindspore.amp.FixedLossScaleManager` for more details. Default: 1.0.
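A sketch of the constraint described above, assuming GPU/Ascend training; `net` and `loss` are placeholders:

>>> import mindspore as ms
>>> from mindspore import nn
>>> # With drop_overflow_update=False, the optimizer's loss_scale must equal the manager's.
>>> manager = ms.amp.FixedLossScaleManager(loss_scale=1024.0, drop_overflow_update=False)
>>> opt = nn.Momentum(net.trainable_params(), learning_rate=0.01, momentum=0.9, loss_scale=1024.0)
>>> model = ms.Model(net, loss_fn=loss, optimizer=opt, loss_scale_manager=manager)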
@@ -1,7 +1,7 @@
 mindspore.ops.AlltoAll
 ======================

-.. py:class:: mindspore.ops.AlltoAll(split_count, split_dim, concat_dim, group='hccl_world_group')
+.. py:class:: mindspore.ops.AlltoAll(split_count, split_dim, concat_dim, group=GlobalComm.WORLD_COMM_GROUP)

     AlltoAll is a collective communication operation.
@@ -24,3 +24,4 @@ mindspore.ops.CropAndResize
 Raises:
     - **TypeError** - If `method` is not a str.
+    - **TypeError** - If `extrapolation_value` is not a float.
     - **ValueError** - If `method` is not 'bilinear', 'nearest' or 'bilinear_v2'.
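A rough sketch of calling the operator whose exceptions are listed above; shapes and box values are illustrative only:

>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import ops, Tensor
>>> crop_and_resize = ops.CropAndResize(method="bilinear", extrapolation_value=0.0)
>>> image = Tensor(np.random.uniform(size=(1, 64, 64, 3)).astype(np.float32))
>>> boxes = Tensor(np.array([[0.1, 0.2, 0.5, 0.6]]).astype(np.float32))   # normalized [y1, x1, y2, x2]
>>> box_index = Tensor(np.array([0]).astype(np.int32))                    # image index for each box
>>> output = crop_and_resize(image, boxes, box_index, (24, 24))
>>> print(output.shape)
(1, 24, 24, 3)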
@@ -16,4 +16,4 @@ mindspore.ops.EqualCount

 Raises:
     - **TypeError** - If `x` or `y` is not a Tensor.
-    - **ValueError** - If the shape of `x` is not equal to the shape of`y`.
+    - **ValueError** - If the shape of `x` is not equal to the shape of `y`.
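A small sketch of `EqualCount`, whose shape constraint the corrected line above describes:

>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import ops, Tensor
>>> x = Tensor(np.array([1, 2, 3]), ms.int32)
>>> y = Tensor(np.array([1, 2, 4]), ms.int32)
>>> print(ops.EqualCount()(x, y))   # two positions match
[2]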
@@ -12,7 +12,7 @@ mindspore.ops.SparseTensorDenseMatmul
 Inputs:
     - **indices** (Tensor) - A 2-D Tensor giving the positions of the elements in the sparse tensor. Supports int32 and int64; every element must be non-negative. The shape is :math:`(n, 2)`.
     - **values** (Tensor) - A 1-D Tensor holding the values at the positions given by `indices`. Supports float16, float32, float64, int32, int64, complex64 and complex128. The shape should be :math:`(n,)`.
-    - **sparse_shape** (tuple(int)) - Specifies the shape of the sparse tensor; consists of two positive integers, indicating that the shape of the sparse tensor is :math:`(N, C)`.
+    - **sparse_shape** (tuple(int) or Tensor) - Specifies the shape of the sparse tensor; consists of two positive integers, indicating that the shape of the sparse tensor is :math:`(N, C)`.
     - **dense** (Tensor) - A 2-D Tensor with the same dtype as `values`.

       If `adjoint_st` is False and `adjoint_dt` is False, its shape must be :math:`(C, M)`.
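A minimal sketch of the input layout described above, with small illustrative values:

>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import ops, Tensor
>>> indices = Tensor([[0, 1], [1, 2]], ms.int32)        # (n, 2) positions in the sparse tensor
>>> values = Tensor([1.0, 2.0], ms.float32)             # (n,) values at those positions
>>> sparse_shape = (3, 4)                               # sparse tensor shape (N, C)
>>> dense = Tensor(np.ones((4, 2)).astype(np.float32))  # dense matrix of shape (C, M)
>>> out = ops.SparseTensorDenseMatmul()(indices, values, sparse_shape, dense)
>>> print(out.shape)                                    # result shape is (N, M)
(3, 2)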
@@ -5,7 +5,7 @@ mindspore.ops.SquaredDifference

 Subtracts the second input Tensor from the first input Tensor element-wise and returns the square of each difference.

-Inputs of `x` and` y` comply with the implicit type conversion rules to make the data types consistent. The inputs must be two Tensors, or one Tensor and one Scalar. When the inputs are two Tensors, their dtypes cannot both be bool and their shapes must be broadcastable. When the inputs are one Tensor and one Scalar, the Scalar can only be a constant.
+Inputs of `x` and `y` comply with the implicit type conversion rules to make the data types consistent. The inputs must be two Tensors, or one Tensor and one Scalar. When the inputs are two Tensors, their dtypes cannot both be bool and their shapes must be broadcastable. When the inputs are one Tensor and one Scalar, the Scalar can only be a constant.

 .. math::
     out_{i} = (x_{i} - y_{i}) * (x_{i} - y_{i}) = (x_{i} - y_{i})^2
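A short sketch of the element-wise formula above:

>>> import mindspore as ms
>>> from mindspore import ops, Tensor
>>> x = Tensor([1.0, 2.0, 3.0], ms.float32)
>>> y = Tensor([2.0, 4.0, 6.0], ms.float32)
>>> print(ops.SquaredDifference()(x, y))   # (x - y)^2
[1. 4. 9.]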
@@ -5,7 +5,7 @@ mindspore.ops.assign

 Assigns a value to the network parameter.

-`variable` and`value` comply with the implicit type conversion rules to make the data types consistent. If they have different data types, the lower-precision data type is converted to the relatively highest-precision data type.
+`variable` and `value` comply with the implicit type conversion rules to make the data types consistent. If they have different data types, the lower-precision data type is converted to the relatively highest-precision data type.

 Args:
     - **variable** (Parameter) - The network parameter, of shape :math:`(N, *)`, where :math:`*` means any number of additional dimensions; its rank should be less than 8.
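A minimal sketch of assigning a new value to a Parameter with `mindspore.ops.assign`:

>>> import mindspore as ms
>>> from mindspore import ops, Parameter, Tensor
>>> variable = Parameter(Tensor([1.0], ms.float32), name="var")
>>> ops.assign(variable, Tensor([2.0], ms.float32))
>>> print(variable.asnumpy())
[2.]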
@@ -1,7 +1,7 @@
 mindspore.ops.count_nonzero
 ============================

-.. py:function:: mindspore.ops.count_nonzero(x, axis=(), keep_dims=False, dtype=mindspore.int32)
+.. py:function:: mindspore.ops.count_nonzero(x, axis=(), keep_dims=False, dtype=mstype.int32)

     Counts the number of non-zero elements along the given axis of the input Tensor.
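A small sketch of the function signature shown in this hunk:

>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import ops, Tensor
>>> x = Tensor(np.array([[0, 1, 0], [1, 1, 0]]).astype(np.float32))
>>> print(ops.count_nonzero(x))                           # over all axes
3
>>> print(ops.count_nonzero(x, axis=1, keep_dims=True))   # per row, keeping the reduced axis
[[1]
 [2]]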
@@ -1,7 +1,7 @@
 mindspore.ops.custom_info_register
 ==================================

-.. py:class:: mindspore.ops.custom_info_register(*reg_info)
+.. py:function:: mindspore.ops.custom_info_register(*reg_info)

     A decorator that binds the registration information to the `func` argument of :class:`mindspore.ops.Custom`.
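A rough sketch of how the decorator is commonly paired with `CustomRegOp`; the decorated body below is a stand-in, not a working kernel:

>>> from mindspore.ops import custom_info_register, CustomRegOp, DataType
>>> # Registration info: one float32 input and one float32 output on CPU.
>>> reg_info = CustomRegOp() \
...     .input(0, "x") \
...     .output(0, "y") \
...     .dtype_format(DataType.F32_Default, DataType.F32_Default) \
...     .target("CPU") \
...     .get_op_info()
>>> @custom_info_register(reg_info)
... def my_func(x):
...     # placeholder body; the real implementation is supplied to ops.Custom as `func`
...     return x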
@@ -131,7 +131,7 @@ class Adagrad(Optimizer):
         loss_scale (float): Value for the loss scale. It must be greater than 0.0. In general, use the default value.
             Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update` in
             `FixedLossScaleManager` is set to False, then this value needs to be the same as the `loss_scale` in
-            `FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+            `FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
            Default: 1.0.
        weight_decay (Union[float, int, Cell]): Weight decay (L2 penalty). Default: 0.0.
@@ -113,7 +113,7 @@ class Adadelta(Optimizer):
         loss_scale (float): Value for the loss scale. It must be greater than 0.0. In general, use the default value.
             Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update` in
             `FixedLossScaleManager` is set to False, then this value needs to be the same as the `loss_scale` in
-            `FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+            `FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
            Default: 1.0.
        weight_decay (Union[float, int, Cell]): Weight decay (L2 penalty). Default: 0.0.
@@ -232,7 +232,7 @@ class AdaFactor(Optimizer):
         loss_scale (float): A floating point value for the loss scale. Should be greater than 0. In general, use the
             default value. Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update` in
             `FixedLossScaleManager` is set to False, then this value needs to be the same as the `loss_scale` in
-            `FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+            `FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
            Default: 1.0.

    Inputs:
@@ -441,7 +441,7 @@ class Adam(Optimizer):
         loss_scale (float): A floating point value for the loss scale. Should be greater than 0. In general, use the
             default value. Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update` in
             `FixedLossScaleManager` is set to False, then this value needs to be the same as the `loss_scale` in
-            `FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+            `FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
            Default: 1.0.

    Inputs:
@@ -902,7 +902,7 @@ class AdamOffload(Optimizer):
         loss_scale (float): A floating point value for the loss scale. Should be greater than 0. In general, use the
             default value. Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update` in
             `FixedLossScaleManager` is set to False, then this value needs to be the same as the `loss_scale` in
-            `FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+            `FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
            Default: 1.0.

    Inputs:
@@ -136,7 +136,7 @@ class AdaMax(Optimizer):
         loss_scale (float): A floating point value for the loss scale. Should be greater than 0. In general, use the
             default value. Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update` in
             `FixedLossScaleManager` is set to False, then this value needs to be the same as the `loss_scale` in
-            `FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+            `FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
            Default: 1.0.

    Inputs:
@@ -191,7 +191,7 @@ class FTRL(Optimizer):
         loss_scale (float): Value for the loss scale. It must be greater than 0.0. In general, use the default value.
             Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update` in
             `FixedLossScaleManager` is set to False, then this value needs to be the same as the `loss_scale` in
-            `FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+            `FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
            Default: 1.0.
        weight_decay (Union[float, int, Cell]): Weight decay (L2 penalty). Default: 0.0.
@@ -285,7 +285,7 @@ class LazyAdam(Optimizer):
         loss_scale (float): A floating point value for the loss scale. Should be equal to or greater than 1. In general,
             use the default value. Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update`
             in `FixedLossScaleManager` is set to False, then this value needs to be the same as the `loss_scale` in
-            `FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+            `FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
            Default: 1.0.

    Inputs:
@@ -145,7 +145,7 @@ class Momentum(Optimizer):
         loss_scale (float): A floating point value for the loss scale. It must be greater than 0.0. In general, use the
             default value. Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update` in
             `FixedLossScaleManager` is set to False, then this value needs to be the same as the `loss_scale` in
-            `FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+            `FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
            Default: 1.0.
        use_nesterov (bool): Enable Nesterov momentum. Default: False.
@@ -125,7 +125,7 @@ class Optimizer(Cell):
         type of `loss_scale` input is int, it will be converted to float. In general, use the default value. Only
             when `FixedLossScaleManager` is used for training and the `drop_overflow_update` in
             `FixedLossScaleManager` is set to False, this value needs to be the same as the `loss_scale` in
-            `FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+            `FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
            Default: 1.0.

    Raises:
@@ -131,7 +131,7 @@ class ProximalAdagrad(Optimizer):
         loss_scale (float): Value for the loss scale. It must be greater than 0.0. In general, use the default value.
             Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update` in
             `FixedLossScaleManager` is set to False, then this value needs to be the same as the `loss_scale` in
-            `FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+            `FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
            Default: 1.0.
        weight_decay (Union[float, int, Cell]): Weight decay (L2 penalty). Default: 0.0.
@@ -145,7 +145,7 @@ class RMSProp(Optimizer):
         loss_scale (float): A floating point value for the loss scale. Should be greater than 0. In general, use the
             default value. Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update` in
             `FixedLossScaleManager` is set to False, then this value needs to be the same as the `loss_scale` in
-            `FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+            `FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
            Default: 1.0.
        weight_decay (Union[float, int, Cell]): Weight decay (L2 penalty). Default: 0.0.
@@ -110,7 +110,7 @@ class SGD(Optimizer):
         loss_scale (float): A floating point value for the loss scale, which must be larger than 0.0. In general, use
             the default value. Only when `FixedLossScaleManager` is used for training and the `drop_overflow_update` in
             `FixedLossScaleManager` is set to False, then this value needs to be the same as the `loss_scale` in
-            `FixedLossScaleManager`. Refer to class :class:`mindspore.FixedLossScaleManager` for more details.
+            `FixedLossScaleManager`. Refer to class :class:`mindspore.amp.FixedLossScaleManager` for more details.
            Default: 1.0.

    Inputs:
@@ -2518,7 +2518,7 @@ class TransformerDecoder(Cell):
             'relu6', 'tanh', 'gelu', 'fast_gelu', 'elu', 'sigmoid', 'prelu', 'leakyrelu', 'hswish',
             'hsigmoid', 'logsigmoid' and so on. Default: gelu.
         lambda_func(function): A function can determine the fusion index,
-            pipeline stages and recompute attribute. If the
+            pipeline stages and recompute attribute. If the
            user wants to determine the pipeline stage and gradient aggregation fusion, the user can pass a
            function that accepts `network`, `layer_id`, `offset`, `parallel_config`, `layers`. The `network(Cell)`
            represents the transformer block, `layer_id(int)` means the layer index for the current module, counts
@@ -349,7 +349,7 @@ def multinomial(inputs, num_sample, replacement=True, seed=None):
         seed (int, optional): Seed is used as entropy source for the random number engines to generate
             pseudo-random numbers, must be non-negative. Default: None.

-    Outputs:
+    Returns:
        Tensor, has the same rows with input. The number of sampled indices of each row is `num_samples`.
        The dtype is float32.
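A short sketch of drawing samples with the signature shown in the hunk header; values are illustrative:

>>> import mindspore as ms
>>> from mindspore import ops, Tensor
>>> x = Tensor([0.0, 9.0, 4.0, 0.0], ms.float32)   # unnormalized weights for 4 categories
>>> output = ops.multinomial(x, 2, True)           # draw 2 samples with replacement
>>> print(output.shape)
(2,)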
@@ -6895,7 +6895,7 @@ class ExtractVolumePatches(Primitive):
     Supported Platforms:
         ``Ascend`` ``CPU``

-    Example:
+    Examples:
        >>> kernel_size = (1, 1, 2, 2, 2)
        >>> strides = (1, 1, 1, 1, 1)
        >>> padding = "VALID"
@@ -686,7 +686,7 @@ class NeighborExchange(Primitive):
     Supported Platforms:
         ``Ascend``

-    Example:
+    Examples:
        >>> # This example should be run with 2 devices. Refer to the tutorial > Distributed Training on mindspore.cn
        >>> import os
        >>> import mindspore as ms
@@ -762,7 +762,7 @@ class AlltoAll(PrimitiveWithInfer):
     Supported Platforms:
         ``Ascend``

-    Example:
+    Examples:
        >>> # This example should be run with 8 devices. Refer to the tutorial > Distributed Training on mindspore.cn
        >>> import os
        >>> import mindspore as ms
@@ -2055,7 +2055,7 @@ class Rsqrt(Primitive):

     Inputs:
         - **x** (Tensor) - The input of Rsqrt. Its rank must be in [0, 7] inclusive and
-          each element must be a non-negative number.
+          each element must be a non-negative number.

     Outputs:
         Tensor, has the same type and shape as `x`.
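A small sketch of `Rsqrt` on a tensor satisfying the non-negative constraint above:

>>> import mindspore as ms
>>> from mindspore import ops, Tensor
>>> x = Tensor([[4.0, 4.0], [1.0, 16.0]], ms.float32)
>>> print(ops.Rsqrt()(x))   # element-wise 1 / sqrt(x) -> [[0.5, 0.5], [1.0, 0.25]]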
@@ -2092,7 +2092,6 @@ class Sqrt(Primitive):

         out_{i} = \sqrt{x_{i}}

-
     Inputs:
         - **x** (Tensor) - The input tensor with a dtype of Number, its rank must be in [0, 7] inclusive.
@@ -623,7 +623,7 @@ class SparseTensorDenseMatmul(Primitive):
           Support int32, int64, each element value should be a non-negative int number. The shape is :math:`(n, 2)`.
         - **values** (Tensor) - A 1-D Tensor, represents the value corresponding to the position in the `indices`.
           Support float16, float32, float64, int32, int64, complex64, complex128. The shape should be :math:`(n,)`.
-        - **sparse_shape** (tuple(int)) or (Tensor) - A positive int tuple or tensor which specifies the shape of
+        - **sparse_shape** (tuple(int) or (Tensor)) - A positive int tuple or tensor which specifies the shape of
          sparse tensor, and only constant value is allowed when sparse_shape is a tensor, should have 2 elements,
          represent sparse tensor shape is :math:`(N, C)`.
        - **dense** (Tensor) - A 2-D Tensor, the dtype is same as `values`.
@@ -281,8 +281,8 @@ def build_train_network(network, optimizer, loss_fn=None, level='O0', boost_leve
         keep_batchnorm_fp32 (bool): Keep Batchnorm run in `float32` when the network is set to cast to `float16` . If
             set, the `level` setting will take no effect on this property.
         loss_scale_manager (Union[None, LossScaleManager]): If not None, must be subclass of
-            :class:`mindspore.LossScaleManager` for scaling the loss. If set, the `level` setting will take no effect
-            on this property.
+            :class:`mindspore.amp.LossScaleManager` for scaling the loss. If set, the `level` setting will
+            take no effect on this property.

     Raises:
         ValueError: If device is CPU, property `loss_scale_manager` is not `None` or `FixedLossScaleManager`
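A usage sketch of `build_train_network` with the two kwargs documented above, assuming a GPU or Ascend device; `net` is a placeholder Cell:

>>> from mindspore import nn, amp
>>> loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True)
>>> opt = nn.Momentum(net.trainable_params(), learning_rate=0.01, momentum=0.9)
>>> train_net = amp.build_train_network(net, opt, loss, level="O2",
...                                     keep_batchnorm_fp32=True,
...                                     loss_scale_manager=amp.DynamicLossScaleManager())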
@@ -29,7 +29,7 @@ class LossScaleManager:
     `get_update_cell` is used to get the instance of :class:`mindspore.nn.Cell` that is used to update the loss scale,
     the instance will be called during the training. Currently, the `get_update_cell` is mostly used.

-    For example, :class:`mindspore.FixedLossScaleManager` and :class:`mindspore.DynamicLossScaleManager`.
+    For example, :class:`mindspore.amp.FixedLossScaleManager` and :class:`mindspore.amp.DynamicLossScaleManager`.
     """
     def get_loss_scale(self):
         """Get the value of loss scale, which is the amplification factor of the gradients."""
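A brief sketch of the two interfaces named above, using `DynamicLossScaleManager` as the concrete subclass:

>>> from mindspore import amp
>>> manager = amp.DynamicLossScaleManager(init_loss_scale=2**16, scale_factor=2, scale_window=2000)
>>> scale = manager.get_loss_scale()         # current amplification factor for the gradients
>>> update_cell = manager.get_update_cell()  # nn.Cell used by the training wrapper to adjust the scale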
@@ -140,7 +140,7 @@ class Model:
             "O2" is recommended on GPU, "O3" is recommended on Ascend.
             The BatchNorm strategy can be changed by `keep_batchnorm_fp32` settings in `kwargs`. `keep_batchnorm_fp32`
             must be a bool. The loss scale strategy can be changed by `loss_scale_manager` setting in `kwargs`.
-            `loss_scale_manager` should be a subclass of :class:`mindspore.LossScaleManager`.
+            `loss_scale_manager` should be a subclass of :class:`mindspore.amp.LossScaleManager`.
             The more detailed explanation of `amp_level` setting can be found at `mindspore.build_train_network`.

         boost_level (str): Option for argument `level` in `mindspore.boost`, level for boost mode