!28609 Fix CN docs
Merge pull request !28609 from wanyiming/code_docs_cnapis
Commit 293143d3ee
@@ -62,12 +62,12 @@ mindspore.common.initializer

.. py:class:: mindspore.common.initializer.HeUniform(negative_slope=0, mode="fan_in", nonlinearity="leaky_relu")

Generates an array sampled from the HeKaiming uniform distribution U(-boundary, boundary) to initialize a Tensor.

The bounds of the HeKaiming uniform distribution:

Generates an array sampled from the HeKaiming uniform distribution U(-boundary, boundary) to initialize a Tensor, where:

.. math::
    boundary = \sqrt{\frac{6}{(1 + a^2) \times \text{fan_in}}}

    boundary = \text{gain} \times \sqrt{\frac{3}{fan\_mode}}

where gain is an optional scaling factor, and fan_mode is the number of input or output units of the weight Tensor, depending on whether mode is "fan_in" or "fan_out".

**Parameters:**
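For reference, the documented bound can be sketched in plain NumPy; the helper name, the example shape and the leaky_relu gain formula below are illustrative assumptions, not part of this change::

    import numpy as np

    def he_uniform_sample(shape, fan_mode, negative_slope=0.0):
        """Draw from U(-boundary, boundary) with boundary = gain * sqrt(3 / fan_mode)."""
        gain = np.sqrt(2.0 / (1.0 + negative_slope ** 2))  # assumed leaky_relu gain
        boundary = gain * np.sqrt(3.0 / fan_mode)
        return np.random.uniform(-boundary, boundary, size=shape)

    # fan_in = 64 for a (out=128, in=64) weight; with negative_slope=0 this equals sqrt(6 / fan_in)
    w = he_uniform_sample((128, 64), fan_mode=64)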
@@ -87,9 +87,10 @@ mindspore.common.initializer

Generates an array sampled from the HeKaiming normal distribution N(0, sigma^2) to initialize a Tensor, where:

.. math::
    sigma = \frac{gain}{\sqrt{N}}

    sigma = \frac{gain}{\sqrt{fan\_mode}}

where gain is an optional scaling factor. If mode is "fan_in", N is the number of input units of the weight Tensor; if mode is "fan_out", N is the number of output units of the weight Tensor.

where gain is an optional scaling factor. If mode is "fan_in", fan_mode is the number of input units of the weight Tensor; if mode is "fan_out",
fan_mode is the number of output units of the weight Tensor.

For details of the HeUniform algorithm, see https://arxiv.org/abs/1502.01852.
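Similarly, a minimal sketch of the documented sigma (names and values below are illustrative only)::

    import numpy as np

    def he_normal_sample(shape, fan_mode, negative_slope=0.0):
        """Draw from N(0, sigma^2) with sigma = gain / sqrt(fan_mode)."""
        gain = np.sqrt(2.0 / (1.0 + negative_slope ** 2))  # assumed leaky_relu gain
        sigma = gain / np.sqrt(fan_mode)
        return np.random.normal(0.0, sigma, size=shape)

    w = he_normal_sample((128, 64), fan_mode=64)  # mode="fan_in" -> fan_mode is the 64 input units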
@@ -113,9 +114,7 @@ mindspore.common.initializer

.. math::
    boundary = gain * \sqrt{\frac{6}{n_{in} + n_{out}}}

- :math:`gain` is an optional scaling factor.
- :math:`n_{in}` is the number of input units of the weight Tensor.
- :math:`n_{out}` is the number of output units of the weight Tensor.

:math:`gain` is an optional scaling factor, :math:`n_{in}` is the number of input units of the weight Tensor, and :math:`n_{out}` is the number of output units of the weight Tensor.

For details of the XavierUniform algorithm, see http://proceedings.mlr.press/v9/glorot10a.html.
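A minimal sketch of the Xavier bound as documented (helper name and shapes are illustrative)::

    import numpy as np

    def xavier_uniform_sample(shape, n_in, n_out, gain=1.0):
        """Draw from U(-boundary, boundary) with boundary = gain * sqrt(6 / (n_in + n_out))."""
        boundary = gain * np.sqrt(6.0 / (n_in + n_out))
        return np.random.uniform(-boundary, boundary, size=shape)

    w = xavier_uniform_sample((128, 64), n_in=64, n_out=128)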
@@ -187,7 +186,7 @@ mindspore.common.initializer

**Returns:**

Tensor, returns a tensor object.

Tensor.

**Raises:**
@@ -5,10 +5,10 @@ mindspore.nn.CosineDecayLR

Calculates the learning rate based on the cosine decay function.

For the current step, the formula for computing decayed_learning_rate[current_step] is:

For the current step, the formula for computing the learning rate is:

.. math::
    decayed\_learning\_rate[current\_step] = &min\_lr + 0.5 * (max\_lr - min\_lr) *\\
    decayed\_learning\_rate = &min\_lr + 0.5 * (max\_lr - min\_lr) *\\
    &(1 + cos(\frac{current\_step}{decay\_steps}\pi))

@@ -16,7 +16,7 @@ mindspore.nn.CosineDecayLR

- **min_lr** (float): The minimum value of the learning rate.
- **max_lr** (float): The maximum value of the learning rate.
- **decay_steps** (int): A value used to calculate the decayed learning rate.
- **decay_steps** (int): The number of steps to decay over.

**Inputs:**

@@ -24,7 +24,7 @@ mindspore.nn.CosineDecayLR

**Outputs:**

Tensor. The learning rate value for the current step, with shape :math:`()`.

Scalar Tensor. The learning rate value for the current step.

**Raises:**
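As a quick check of the formula above, a plain-Python sketch of the per-step value (function name and numbers are illustrative)::

    import math

    def cosine_decay_at_step(min_lr, max_lr, decay_steps, current_step):
        """min_lr + 0.5 * (max_lr - min_lr) * (1 + cos(current_step / decay_steps * pi))"""
        return min_lr + 0.5 * (max_lr - min_lr) * (
            1.0 + math.cos(current_step / decay_steps * math.pi))

    print(cosine_decay_at_step(0.01, 0.1, decay_steps=4, current_step=2))  # about 0.055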
@@ -5,10 +5,10 @@ mindspore.nn.ExponentialDecayLR

Calculates the learning rate based on the exponential decay function.

For the current step, the formula for computing decayed_learning_rate[current_step] is:

For the current step, the formula for computing the learning rate is:

.. math::
    decayed\_learning\_rate[current\_step] = learning\_rate * decay\_rate^{p}

    decayed\_learning\_rate = learning\_rate * decay\_rate^{p}

where:

@@ -24,7 +24,7 @@ mindspore.nn.ExponentialDecayLR

- **learning_rate** (float): The initial value of the learning rate.
- **decay_rate** (float): The decay rate.
- **decay_steps** (int): A value used to calculate the decayed learning rate.
- **decay_steps** (int): The number of steps to decay over.
- **is_stair** (bool): If True, the learning rate is decayed once every `decay_steps` steps. Default: False.

**Inputs:**

@@ -33,7 +33,7 @@ mindspore.nn.ExponentialDecayLR

**Outputs:**

Tensor. The learning rate value for the current step, with shape :math:`()`.

Scalar Tensor. The learning rate value for the current step.

**Raises:**
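A plain-Python sketch of the rule above; defining p as current_step / decay_steps (floored when is_stair is True) follows the part of the docstring not shown in this hunk and is an assumption here::

    import math

    def exponential_decay_at_step(learning_rate, decay_rate, decay_steps, current_step, is_stair=False):
        """learning_rate * decay_rate ** p, with p = current_step / decay_steps."""
        p = current_step / decay_steps
        if is_stair:
            p = math.floor(p)  # decay once every decay_steps steps
        return learning_rate * decay_rate ** p

    print(exponential_decay_at_step(0.1, 0.9, decay_steps=4, current_step=2))  # about 0.0949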
@@ -5,10 +5,10 @@ mindspore.nn.InverseDecayLR

Calculates the learning rate based on the inverse-time decay function.

For the current step, the formula for computing decayed_learning_rate[current_step] is:

For the current step, the formula for computing the learning rate is:

.. math::
    decayed\_learning\_rate[current\_step] = learning\_rate / (1 + decay\_rate * p)

    decayed\_learning\_rate = learning\_rate / (1 + decay\_rate * p)

where:

@@ -24,7 +24,7 @@ mindspore.nn.InverseDecayLR

- **learning_rate** (float) - The initial value of the learning rate.
- **decay_rate** (float) - The decay rate.
- **decay_steps** (int) - A value used to calculate the decayed learning rate.
- **decay_steps** (int) - The number of steps to decay over.
- **is_stair** (bool) - If True, the learning rate is decayed once every `decay_steps` steps. Default: False.

**Inputs:**

@@ -33,7 +33,7 @@ mindspore.nn.InverseDecayLR

**Outputs:**

Tensor. The learning rate value for the current step, with shape :math:`()`.

Scalar Tensor. The learning rate value for the current step.

**Raises:**
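The inverse-time rule sketched the same way (p defined as above, an assumption)::

    import math

    def inverse_decay_at_step(learning_rate, decay_rate, decay_steps, current_step, is_stair=False):
        """learning_rate / (1 + decay_rate * p), with p = current_step / decay_steps."""
        p = current_step / decay_steps
        if is_stair:
            p = math.floor(p)  # decay once every decay_steps steps
        return learning_rate / (1.0 + decay_rate * p)

    print(inverse_decay_at_step(0.1, 0.5, decay_steps=4, current_step=2))  # about 0.08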
@@ -5,10 +5,10 @@ mindspore.nn.NaturalExpDecayLR

Calculates the learning rate based on the natural exponential decay function.

For the current step, the formula for computing decayed_learning_rate[current_step] is:

For the current step, the formula for computing the learning rate is:

.. math::
    decayed\_learning\_rate[current\_step] = learning\_rate * e^{-decay\_rate * p}

    decayed\_learning\_rate = learning\_rate * e^{-decay\_rate * p}

where:

@@ -24,7 +24,7 @@ mindspore.nn.NaturalExpDecayLR

- **learning_rate** (float) - The initial value of the learning rate.
- **decay_rate** (float) - The decay rate.
- **decay_steps** (int) - A value used to calculate the decayed learning rate.
- **decay_steps** (int) - The number of steps to decay over.
- **is_stair** (bool) - If True, the learning rate is decayed once every `decay_steps` steps. Default: False.

**Inputs:**

@@ -33,7 +33,7 @@ mindspore.nn.NaturalExpDecayLR

**Outputs:**

Tensor. The learning rate value for the current step, with shape :math:`()`.

Scalar Tensor. The learning rate value for the current step.

**Raises:**
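And the natural exponential rule, with the same assumed definition of p::

    import math

    def natural_exp_decay_at_step(learning_rate, decay_rate, decay_steps, current_step, is_stair=False):
        """learning_rate * e^(-decay_rate * p), with p = current_step / decay_steps."""
        p = current_step / decay_steps
        if is_stair:
            p = math.floor(p)
        return learning_rate * math.exp(-decay_rate * p)

    print(natural_exp_decay_at_step(0.1, 2.0, decay_steps=4, current_step=4))  # about 0.0135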
@@ -5,10 +5,10 @@ mindspore.nn.PolynomialDecayLR

Calculates the learning rate based on the polynomial decay function.

For the current step, the formula for computing decayed_learning_rate[current_step] is:

For the current step, the formula for computing the learning rate is:

.. math::
    decayed\_learning\_rate[current\_step] = &(learning\_rate - end\_learning\_rate) *\\
    decayed\_learning\_rate = &(learning\_rate - end\_learning\_rate) *\\
    &(1 - tmp\_step / tmp\_decay\_steps)^{power}\\
    &+ end\_learning\_rate

@@ -26,8 +26,8 @@ mindspore.nn.PolynomialDecayLR

- **learning_rate** (float) - The initial value of the learning rate.
- **end_learning_rate** (float) - The final value of the learning rate.
- **decay_steps** (int) - A value used to calculate the decayed learning rate.
- **power** (float) - A value used to calculate the decayed learning rate. This parameter must be greater than 0.
- **decay_steps** (int) - The number of steps to decay over.
- **power** (float) - The power of the polynomial. It must be greater than 0.
- **update_decay_steps** (bool) - If True, the learning rate is decayed once every `decay_steps` steps. Default: False.

**Inputs:**

@@ -36,7 +36,7 @@ mindspore.nn.PolynomialDecayLR

**Outputs:**

Tensor. The learning rate value for the current step, with shape :math:`()`.

Scalar Tensor. The learning rate value for the current step.

**Raises:**
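A sketch of the polynomial rule; treating tmp_step as current_step clipped to decay_steps, and tmp_decay_steps as decay_steps, is an assumption for the default update_decay_steps=False case::

    def polynomial_decay_at_step(learning_rate, end_learning_rate, decay_steps, power, current_step):
        """(lr - end_lr) * (1 - tmp_step / decay_steps) ** power + end_lr"""
        tmp_step = min(current_step, decay_steps)  # assumed clipping
        return (learning_rate - end_learning_rate) * (
            1 - tmp_step / decay_steps) ** power + end_learning_rate

    print(polynomial_decay_at_step(0.1, 0.01, decay_steps=4, power=2.0, current_step=2))  # about 0.0325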
@@ -5,10 +5,10 @@ mindspore.nn.WarmUpLR

Learning rate warm-up.

For the current step, the formula for computing warmup_learning_rate[current_step] is:

For the current step, the formula for computing the learning rate is:

.. math::
    warmup\_learning\_rate[current\_step] = learning\_rate * tmp\_step / warmup\_steps

    warmup\_learning\_rate = learning\_rate * tmp\_step / warmup\_steps

where:

@@ -26,7 +26,7 @@ mindspore.nn.WarmUpLR

**Outputs:**

Tensor. The learning rate value for the current step, with shape :math:`()`.

Scalar Tensor. The learning rate value for the current step.

**Raises:**
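A sketch of the warm-up rule; capping tmp_step at warmup_steps is an assumption::

    def warmup_at_step(learning_rate, warmup_steps, current_step):
        """learning_rate * tmp_step / warmup_steps"""
        tmp_step = min(current_step, warmup_steps)  # assumed cap at the target rate
        return learning_rate * tmp_step / warmup_steps

    print(warmup_at_step(0.1, warmup_steps=5, current_step=2))  # about 0.04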
@@ -3,7 +3,7 @@ mindspore.nn.cosine_decay_lr

.. py:class:: mindspore.nn.cosine_decay_lr(min_lr, max_lr, total_step, step_per_epoch, decay_epoch)

Calculates the learning rate based on the cosine decay function.

Calculates the learning rate based on the cosine decay function. The learning rate for each step is stored in a list.

For the i-th step, the formula for computing decayed_learning_rate[i] is:

@@ -19,12 +19,20 @@ mindspore.nn.cosine_decay_lr

- **max_lr** (float) - The maximum value of the learning rate.
- **total_step** (int) - The total number of steps.
- **step_per_epoch** (int) - The number of steps per epoch.
- **decay_epoch** (int) - A value used to calculate the decayed learning rate.
- **decay_epoch** (int) - The number of epochs to decay over.

**Returns:**

list[float]. The size of the list is `total_step`.

**Raises:**

- **TypeError:** `min_lr` or `max_lr` is not a float.
- **TypeError:** `total_step`, `step_per_epoch` or `decay_epoch` is not an int.
- **ValueError:** `max_lr` is not greater than 0 or `min_lr` is less than 0.
- **ValueError:** `total_step`, `step_per_epoch` or `decay_epoch` is less than 0.
- **ValueError:** `min_lr` is greater than or equal to `max_lr`.

**Examples:**

>>> import mindspore.nn as nn
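Unlike the Cell-based schedules above, this function returns one value per step. A pure-Python sketch of that behaviour, assuming the epoch index is floor(i / step_per_epoch)::

    import math

    def cosine_decay_lr_list(min_lr, max_lr, total_step, step_per_epoch, decay_epoch):
        lrs = []
        for i in range(total_step):
            current_epoch = i // step_per_epoch  # assumed epoch-wise progress
            lrs.append(min_lr + 0.5 * (max_lr - min_lr) *
                       (1 + math.cos(current_epoch / decay_epoch * math.pi)))
        return lrs

    print(cosine_decay_lr_list(0.01, 0.1, total_step=6, step_per_epoch=2, decay_epoch=2))
    # approximately [0.1, 0.1, 0.055, 0.055, 0.01, 0.01]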
@@ -3,7 +3,7 @@ mindspore.nn.exponential_decay_lr

.. py:class:: mindspore.nn.exponential_decay_lr(learning_rate, decay_rate, total_step, step_per_epoch, decay_epoch, is_stair=False)

Calculates the learning rate based on the exponential decay function.

Calculates the learning rate based on the exponential decay function. The learning rate for each step is stored in a list.

For the i-th step, the formula for computing decayed_learning_rate[i] is:

@@ -18,13 +18,20 @@ mindspore.nn.exponential_decay_lr

- **decay_rate** (float) - The decay rate.
- **total_step** (int) - The total number of steps.
- **step_per_epoch** (int) - The number of steps per epoch.
- **decay_epoch** (int) - A value used to calculate the decayed learning rate.
- **decay_epoch** (int) - The number of epochs to decay over.
- **is_stair** (bool) - If True, the learning rate is decayed once every `decay_epoch` epochs. Default: False.

**Returns:**

list[float]. The size of the list is `total_step`.

**Raises:**

- **TypeError:** `total_step`, `step_per_epoch` or `decay_epoch` is not an int.
- **TypeError:** `is_stair` is not a bool.
- **TypeError:** `learning_rate` or `decay_rate` is not a float.
- **ValueError:** `learning_rate` or `decay_rate` is less than or equal to 0.

**Examples:**

>>> import mindspore.nn as nn
@@ -3,7 +3,7 @@ mindspore.nn.inverse_decay_lr

.. py:class:: mindspore.nn.inverse_decay_lr(learning_rate, decay_rate, total_step, step_per_epoch, decay_epoch, is_stair=False)

Calculates the learning rate based on the inverse-time decay function.

Calculates the learning rate based on the inverse-time decay function. The learning rate for each step is stored in a list.

For the i-th step, the formula for computing decayed_learning_rate[i] is:

@@ -18,13 +18,20 @@ mindspore.nn.inverse_decay_lr

- **decay_rate** (float) - The decay rate.
- **total_step** (int) - The total number of steps.
- **step_per_epoch** (int) - The number of steps per epoch.
- **decay_epoch** (int) - A value used to calculate the decayed learning rate.
- **decay_epoch** (int) - The number of epochs to decay over.
- **is_stair** (bool) - If True, the learning rate is decayed once every `decay_epoch` epochs. Default: False.

**Returns:**

list[float]. The size of the list is `total_step`.

**Raises:**

- **TypeError:** `total_step`, `step_per_epoch` or `decay_epoch` is not an int.
- **TypeError:** `is_stair` is not a bool.
- **TypeError:** `learning_rate` or `decay_rate` is not a float.
- **ValueError:** `learning_rate` or `decay_rate` is less than or equal to 0.

**Examples:**

>>> import mindspore.nn as nn
@@ -3,7 +3,7 @@ mindspore.nn.natural_exp_decay_lr

.. py:class:: mindspore.nn.natural_exp_decay_lr(learning_rate, decay_rate, total_step, step_per_epoch, decay_epoch, is_stair=False)

Calculates the learning rate based on the natural exponential decay function.

Calculates the learning rate based on the natural exponential decay function. The learning rate for each step is stored in a list.

For the i-th step, the formula for computing decayed_learning_rate[i] is:

@@ -18,13 +18,20 @@ mindspore.nn.natural_exp_decay_lr

- **decay_rate** (float) - The decay rate.
- **total_step** (int) - The total number of steps.
- **step_per_epoch** (int) - The number of steps per epoch.
- **decay_epoch** (int) - A value used to calculate the decayed learning rate.
- **decay_epoch** (int) - The number of epochs to decay over.
- **is_stair** (bool) - If True, the learning rate is decayed once every `decay_epoch` epochs. Default: False.

**Returns:**

list[float]. The size of the list is `total_step`.

**Raises:**

- **TypeError:** `total_step`, `step_per_epoch` or `decay_epoch` is not an int.
- **TypeError:** `is_stair` is not a bool.
- **TypeError:** `learning_rate` or `decay_rate` is not a float.
- **ValueError:** `learning_rate` or `decay_rate` is less than or equal to 0.

**Examples:**

>>> import mindspore.nn as nn
@@ -3,7 +3,7 @@ mindspore.nn.piecewise_constant_lr

.. py:class:: mindspore.nn.piecewise_constant_lr(milestone, learning_rates)

Gets the piecewise constant learning rate.

Gets the piecewise constant learning rate. The learning rate for each step is stored in a list.

Calculates the learning rate from the given `milestone` and `learning_rates`. Let the value of `milestone` be :math:`(M_1, M_2, ..., M_t, ..., M_N)` and the value of `learning_rates` be :math:`(x_1, x_2, ..., x_t, ..., x_N)`. N is the length of `milestone`.
Let `y` be the output learning rate. Then for the i-th step, the formula for computing y[i] is:

@@ -13,12 +13,18 @@ mindspore.nn.piecewise_constant_lr

**Parameters:**

- **milestone** (Union[list[int], tuple[int]]) - A list of milestones. This list is a monotonically increasing list, and every element must be greater than 0.
- **milestone** (Union[list[int], tuple[int]]) - A list of milestones. This list is a monotonically increasing list, and every element must be greater than 0.
- **learning_rates** (Union[list[float], tuple[float]]) - A list of learning rates.

**Returns:**

list[float]. The size of the list is :math:`M_N`.

list[float]. The size of the list is :math:`MN`.

**Raises:**

- **TypeError:** `milestone` or `learning_rates` is neither a tuple nor a list.
- **ValueError:** The lengths of `milestone` and `learning_rates` are not equal.
- **ValueError:** The values in `milestone` are not monotonically increasing.

**Examples:**

>>> import mindspore.nn as nn
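The documented behaviour (one entry per step, up to the last milestone) can be sketched as follows; the helper name is illustrative::

    def piecewise_constant_lr_list(milestone, learning_rates):
        lrs, last_item = [], 0
        for m, lr in zip(milestone, learning_rates):
            lrs.extend([lr] * (m - last_item))  # steps in [last_item, m) use this rate
            last_item = m
        return lrs

    print(piecewise_constant_lr_list([2, 5, 10], [0.1, 0.05, 0.01]))
    # [0.1, 0.1, 0.05, 0.05, 0.05, 0.01, 0.01, 0.01, 0.01, 0.01]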
@@ -3,7 +3,7 @@ mindspore.nn.polynomial_decay_lr

.. py:class:: mindspore.nn.polynomial_decay_lr(learning_rate, end_learning_rate, total_step, step_per_epoch, decay_epoch, power, update_decay_epoch=False)

Calculates the learning rate based on the polynomial decay function.

Calculates the learning rate based on the polynomial decay function. The learning rate for each step is stored in a list.

For the i-th step, the formula for computing decayed_learning_rate[i] is:

@@ -33,14 +33,21 @@ mindspore.nn.polynomial_decay_lr

- **end_learning_rate** (float) - The final value of the learning rate.
- **total_step** (int) - The total number of steps.
- **step_per_epoch** (int) - The number of steps per epoch.
- **decay_epoch** (int) - A value used to calculate the decayed learning rate.
- **power** (float) - A value used to calculate the decayed learning rate. This parameter must be greater than 0.
- **decay_epoch** (int) - The number of epochs to decay over.
- **power** (float) - The power of the polynomial. It must be greater than 0.
- **update_decay_epoch** (bool) - If True, `decay_epoch` is updated. Default: False.

**Returns:**

list[float]. The size of the list is `total_step`.

**Raises:**

- **TypeError:** `learning_rate`, `end_learning_rate` or `power` is not a float.
- **TypeError:** `total_step`, `step_per_epoch` or `decay_epoch` is not an int.
- **TypeError:** `update_decay_epoch` is not a bool.
- **ValueError:** `learning_rate` or `power` is less than or equal to 0.

**Examples:**

>>> import mindspore.nn as nn
@@ -3,7 +3,7 @@ mindspore.nn.warmup_lr

.. py:function:: mindspore.nn.warmup_lr(learning_rate, total_step, step_per_epoch, warmup_epoch)

Warm-up learning rate.

Warm-up learning rate. The learning rate for each step is stored in a list.

For the i-th step, the formula for computing warmup_learning_rate[i] is:

@@ -23,6 +23,12 @@ mindspore.nn.warmup_lr

list[float]. The size of the list is `total_step`.

**Raises:**

- **TypeError:** `learning_rate` is not a float.
- **TypeError:** `total_step`, `step_per_epoch` or `decay_epoch` is not an int.
- **ValueError:** `learning_rate` is less than 0.

**Examples:**

>>> import mindspore.nn as nn
@@ -228,14 +228,13 @@ def _calculate_in_and_out(arr):

class XavierUniform(Initializer):
    r"""
    Generates an array with values sampled from Xavier uniform distribution
    :math:`{U}(-\text{boundary}, \text{boundary})` in order to initialize a tensor, where:
    :math:`{U}(-\text{boundary}, \text{boundary})` in order to initialize a tensor, where

    .. math::
        boundary = gain * \sqrt{\frac{6}{n_{in} + n_{out}}}

    - where :math:`gain` is an optional scaling factor.
    - where :math:`n_{in}` is the number of input units in the weight tensor.
    - where :math:`n_{out}` is the number of output units in the weight tensor.

    where :math:`gain` is an optional scaling factor, :math:`n_{in}` is the number of input units in the weight tensor,
    :math:`n_{out}` is the number of output units in the weight tensor.

    For details of XavierUniform algorithm, please check
    `<http://proceedings.mlr.press/v9/glorot10a.html>`_.
@@ -270,9 +269,10 @@ class HeUniform(Initializer):

    :math:`{U}(-\text{boundary}, \text{boundary})` in order to initialize a tensor, where

    .. math::
        boundary = \sqrt{\frac{6}{(1 + a^2) \times \text{fan_in}}}

        boundary = \text{gain} \times \sqrt{\frac{3}{fan\_mode}}

    which is the bound of the HeUniform distribution.

    where :math:`gain` is an optional scaling factor. If :math:`fan_mode` is 'fan_in', it is the number of input units
    of the weight tensor. If :math:`fan_mode` is 'fan_out', it is the number of output units of the weight tensor.

    For details of HeUniform algorithm, please check
    `<https://arxiv.org/abs/1502.01852>`_.
@@ -316,10 +316,10 @@ class HeNormal(Initializer):

    :math:`{N}(0, \text{sigma}^2)` in order to initialize a tensor, where

    .. math::
        sigma = \frac{gain}{\sqrt{N}}

        sigma = \frac{gain}{\sqrt{fan\_mode}}

    where :math:`gain` is an optional scaling factor. :math:`N` is the number of input units of the weight tensor,
    if `mode` is 'fan_in'. If `mode` is 'fan_out', it is the number of output units.

    where :math:`gain` is an optional scaling factor. :math:`fan_mode` is the number of input or output units of
    the weight tensor, depending on whether `mode` is 'fan_in' or 'fan_out'.

    For details of HeUniform algorithm, please check `<https://arxiv.org/abs/1502.01852>`_.
@@ -634,7 +634,6 @@ class Normal(Initializer):

        sigma (float): The standard deviation of Normal distribution. Default: 0.01.
        mean (float): The mean of Normal distribution. Default: 0.0.

    Examples:
        >>> import mindspore
        >>> from mindspore.common.initializer import initializer, Normal
@@ -20,7 +20,7 @@ from mindspore._checkparam import Validator as validator

def piecewise_constant_lr(milestone, learning_rates):
    r"""
    Get piecewise constant learning rate.

    Get piecewise constant learning rate. The learning rate for each step will be stored in a list.

    Calculate learning rate by the given `milestone` and `learning_rates`. Let the value of `milestone` be
    :math:`(M_1, M_2, ..., M_t, ..., M_N)` and the value of `learning_rates` be :math:`(x_1, x_2, ..., x_t, ..., x_N)`.

@@ -38,6 +38,11 @@ def piecewise_constant_lr(milestone, learning_rates):

    Returns:
        list[float]. The size of list is :math:`M_N`.

    Raises:
        TypeError: If `milestone` or `learning_rates` is neither a tuple nor a list.
        ValueError: If the length of `milestone` and `learning_rates` is not the same.
        ValueError: If the value in `milestone` is not monotonically decreasing.

    Examples:
        >>> import mindspore.nn as nn
        >>>
@@ -83,7 +88,8 @@ def _check_inputs(learning_rate, decay_rate, total_step, step_per_epoch, decay_e

def exponential_decay_lr(learning_rate, decay_rate, total_step, step_per_epoch, decay_epoch, is_stair=False):
    r"""
    Calculates learning rate based on exponential decay function.

    Calculates learning rate based on exponential decay function. The learning rate for each step will
    be stored in a list.

    For the i-th step, the formula of computing decayed_learning_rate[i] is:

@@ -97,12 +103,18 @@ def exponential_decay_lr(learning_rate, decay_rate, total_step, step_per_epoch,

        decay_rate (float): The decay rate.
        total_step (int): The total number of steps.
        step_per_epoch (int): The number of steps per epoch.
        decay_epoch (int): A value used to calculate decayed learning rate.
        decay_epoch (int): Number of epochs to decay over.
        is_stair (bool): If true, learning rate is decayed once every `decay_epoch` times. Default: False.

    Returns:
        list[float]. The size of list is `total_step`.

    Raises:
        TypeError: If `total_step` or `step_per_epoch` or `decay_epoch` is not an int.
        TypeError: If `is_stair` is not a bool.
        TypeError: If `learning_rate` or `decay_rate` is not a float.
        ValueError: If `learning_rate` or `decay_rate` is less than or equal to 0.

    Examples:
        >>> import mindspore.nn as nn
        >>>
@@ -128,7 +140,8 @@ def exponential_decay_lr(learning_rate, decay_rate, total_step, step_per_epoch,

def natural_exp_decay_lr(learning_rate, decay_rate, total_step, step_per_epoch, decay_epoch, is_stair=False):
    r"""
    Calculates learning rate based on natural exponential decay function.

    Calculates learning rate based on natural exponential decay function. The learning rate for each step will be
    stored in a list.

    For the i-th step, the formula of computing decayed_learning_rate[i] is:

@@ -142,12 +155,18 @@ def natural_exp_decay_lr(learning_rate, decay_rate, total_step, step_per_epoch,

        decay_rate (float): The decay rate.
        total_step (int): The total number of steps.
        step_per_epoch (int): The number of steps per epoch.
        decay_epoch (int): A value used to calculate decayed learning rate.
        decay_epoch (int): Number of epochs to decay over.
        is_stair (bool): If true, learning rate is decayed once every `decay_epoch` times. Default: False.

    Returns:
        list[float]. The size of list is `total_step`.

    Raises:
        TypeError: If `total_step` or `step_per_epoch` or `decay_epoch` is not an int.
        TypeError: If `is_stair` is not a bool.
        TypeError: If `learning_rate` or `decay_rate` is not a float.
        ValueError: If `learning_rate` or `decay_rate` is less than or equal to 0.

    Examples:
        >>> import mindspore.nn as nn
        >>>
@@ -174,7 +193,8 @@ def natural_exp_decay_lr(learning_rate, decay_rate, total_step, step_per_epoch,

def inverse_decay_lr(learning_rate, decay_rate, total_step, step_per_epoch, decay_epoch, is_stair=False):
    r"""
    Calculates learning rate based on inverse-time decay function.

    Calculates learning rate based on inverse-time decay function. The learning rate for each step
    will be stored in a list.

    For the i-th step, the formula of computing decayed_learning_rate[i] is:

@@ -188,12 +208,18 @@ def inverse_decay_lr(learning_rate, decay_rate, total_step, step_per_epoch, deca

        decay_rate (float): The decay rate.
        total_step (int): The total number of steps.
        step_per_epoch (int): The number of steps per epoch.
        decay_epoch (int): A value used to calculate decayed learning rate.
        decay_epoch (int): Number of epochs to decay over.
        is_stair (bool): If true, learning rate is decayed once every `decay_epoch` times. Default: False.

    Returns:
        list[float]. The size of list is `total_step`.

    Raises:
        TypeError: If `total_step` or `step_per_epoch` or `decay_epoch` is not an int.
        TypeError: If `is_stair` is not a bool.
        TypeError: If `learning_rate` or `decay_rate` is not a float.
        ValueError: If `learning_rate` or `decay_rate` is less than or equal to 0.

    Examples:
        >>> import mindspore.nn as nn
        >>>
@@ -219,7 +245,7 @@ def inverse_decay_lr(learning_rate, decay_rate, total_step, step_per_epoch, deca

def cosine_decay_lr(min_lr, max_lr, total_step, step_per_epoch, decay_epoch):
    r"""
    Calculates learning rate based on cosine decay function.

    Calculates learning rate based on cosine decay function. The learning rate for each step will be stored in a list.

    For the i-th step, the formula of computing decayed_learning_rate[i] is:

@@ -234,11 +260,18 @@ def cosine_decay_lr(min_lr, max_lr, total_step, step_per_epoch, decay_epoch):

        max_lr (float): The maximum value of learning rate.
        total_step (int): The total number of steps.
        step_per_epoch (int): The number of steps per epoch.
        decay_epoch (int): A value used to calculate decayed learning rate.
        decay_epoch (int): Number of epochs to decay over.

    Returns:
        list[float]. The size of list is `total_step`.

    Raises:
        TypeError: If `min_lr` or `max_lr` is not a float.
        TypeError: If `total_step` or `step_per_epoch` or `decay_epoch` is not an int.
        ValueError: If `max_lr` is not greater than 0 or `min_lr` is less than 0.
        ValueError: If `total_step` or `step_per_epoch` or `decay_epoch` is less than 0.
        ValueError: If `min_lr` is greater than or equal to `max_lr`.

    Examples:
        >>> import mindspore.nn as nn
        >>>
@@ -274,7 +307,8 @@ def cosine_decay_lr(min_lr, max_lr, total_step, step_per_epoch, decay_epoch):

def polynomial_decay_lr(learning_rate, end_learning_rate, total_step, step_per_epoch, decay_epoch, power,
                        update_decay_epoch=False):
    r"""
    Calculates learning rate based on polynomial decay function.

    Calculates learning rate based on polynomial decay function. The learning rate for each step
    will be stored in a list.

    For the i-th step, the formula of computing decayed_learning_rate[i] is:

@@ -303,10 +337,16 @@ def polynomial_decay_lr(learning_rate, end_learning_rate, total_step, step_per_e

        end_learning_rate (float): The end value of learning rate.
        total_step (int): The total number of steps.
        step_per_epoch (int): The number of steps per epoch.
        decay_epoch (int): A value used to calculate decayed learning rate.
        power (float): A value used to calculate decayed learning rate. This parameter must be greater than 0.
        decay_epoch (int): Number of epochs to decay over.
        power (float): The power of polynomial. It must be greater than 0.
        update_decay_epoch (bool): If true, update `decay_epoch`. Default: False.

    Raises:
        TypeError: If `learning_rate` or `end_learning_rate` or `power` is not a float.
        TypeError: If `total_step` or `step_per_epoch` or `decay_epoch` is not an int.
        TypeError: If `update_decay_epoch` is not a bool.
        ValueError: If `learning_rate` or `power` is not greater than 0.

    Returns:
        list[float]. The size of list is `total_step`.
|
|||
|
||||
def warmup_lr(learning_rate, total_step, step_per_epoch, warmup_epoch):
|
||||
r"""
|
||||
Gets learning rate warming up.
|
||||
Gets learning rate warming up. The learning rate for each step will be stored in a list.
|
||||
|
||||
For the i-th step, the formula of computing warmup_learning_rate[i] is:
|
||||
|
||||
|
@ -370,6 +410,11 @@ def warmup_lr(learning_rate, total_step, step_per_epoch, warmup_epoch):
|
|||
Returns:
|
||||
list[float]. The size of list is `total_step`.
|
||||
|
||||
Raises:
|
||||
TypeError: If `learning_rate` is not a float.
|
||||
TypeError: If `total_step` or `step_per_epoch` or `decay_epoch` is not an int.
|
||||
ValueError: If `learning_rate` is less than 0.
|
||||
|
||||
Examples:
|
||||
>>> import mindspore.nn as nn
|
||||
>>>
|
||||
|
|
|
@@ -58,10 +58,10 @@ class ExponentialDecayLR(LearningRateSchedule):

    r"""
    Calculates learning rate based on exponential decay function.

    For current step, the formula of computing decayed_learning_rate[current_step] is:

    For current step, the formula of computing decayed learning rate is:

    .. math::
        decayed\_learning\_rate[current\_step] = learning\_rate * decay\_rate^{p}

        decayed\_learning\_rate = learning\_rate * decay\_rate^{p}

    Where:

@@ -76,7 +76,7 @@ class ExponentialDecayLR(LearningRateSchedule):

    Args:
        learning_rate (float): The initial value of learning rate.
        decay_rate (float): The decay rate.
        decay_steps (int): A value used to calculate decayed learning rate.
        decay_steps (int): Number of steps to decay over.
        is_stair (bool): If true, learning rate is decayed once every `decay_steps` time. Default: False.

    Inputs:
@@ -128,10 +128,10 @@ class NaturalExpDecayLR(LearningRateSchedule):

    r"""
    Calculates learning rate based on natural exponential decay function.

    For current step, the formula of computing decayed_learning_rate[current_step] is:

    For current step, the formula of computing decayed learning rate is:

    .. math::
        decayed\_learning\_rate[current\_step] = learning\_rate * e^{-decay\_rate * p}

        decayed\_learning\_rate = learning\_rate * e^{-decay\_rate * p}

    Where:

@@ -146,7 +146,7 @@ class NaturalExpDecayLR(LearningRateSchedule):

    Args:
        learning_rate (float): The initial value of learning rate.
        decay_rate (float): The decay rate.
        decay_steps (int): A value used to calculate decayed learning rate.
        decay_steps (int): Number of steps to decay over.
        is_stair (bool): If true, learning rate is decayed once every `decay_steps` time. Default: False.

    Inputs:
@@ -199,10 +199,10 @@ class InverseDecayLR(LearningRateSchedule):

    r"""
    Calculates learning rate based on inverse-time decay function.

    For current step, the formula of computing decayed_learning_rate[current_step] is:

    For current step, the formula of computing decayed learning rate is:

    .. math::
        decayed\_learning\_rate[current\_step] = learning\_rate / (1 + decay\_rate * p)

        decayed\_learning\_rate = learning\_rate / (1 + decay\_rate * p)

    Where:

@@ -217,7 +217,7 @@ class InverseDecayLR(LearningRateSchedule):

    Args:
        learning_rate (float): The initial value of learning rate.
        decay_rate (float): The decay rate.
        decay_steps (int): A value used to calculate decayed learning rate.
        decay_steps (int): Number of steps to decay over.
        is_stair (bool): If true, learning rate decay once every `decay_steps` times. Default: False.

    Inputs:
@@ -268,17 +268,17 @@ class CosineDecayLR(LearningRateSchedule):

    r"""
    Calculates learning rate based on cosine decay function.

    For current step, the formula of computing decayed_learning_rate[current_step] is:

    For current step, the formula of computing decayed learning rate is:

    .. math::
        decayed\_learning\_rate[current\_step] = min\_lr + 0.5 * (max\_lr - min\_lr) *
        decayed\_learning\_rate = min\_lr + 0.5 * (max\_lr - min\_lr) *
        (1 + cos(\frac{current\_step}{decay\_steps}\pi))

    Args:
        min_lr (float): The minimum value of learning rate.
        max_lr (float): The maximum value of learning rate.
        decay_steps (int): A value used to calculate decayed learning rate.
        decay_steps (int): Number of steps to decay over.

    Inputs:
        - **global_step** (Tensor) - The current step number.
@@ -338,10 +338,10 @@ class PolynomialDecayLR(LearningRateSchedule):

    r"""
    Calculates learning rate based on polynomial decay function.

    For current step, the formula of computing decayed_learning_rate[current_step] is:

    For current step, the formula of computing decayed learning rate is:

    .. math::
        decayed\_learning\_rate[current\_step] = (learning\_rate - end\_learning\_rate) *
        decayed\_learning\_rate = (learning\_rate - end\_learning\_rate) *
        (1 - tmp\_step / tmp\_decay\_steps)^{power} + end\_learning\_rate

    Where:

@@ -357,8 +357,8 @@ class PolynomialDecayLR(LearningRateSchedule):

    Args:
        learning_rate (float): The initial value of learning rate.
        end_learning_rate (float): The end value of learning rate.
        decay_steps (int): A value used to calculate decayed learning rate.
        power (float): A value used to calculate decayed learning rate. This parameter must be greater than 0.
        decay_steps (int): Number of steps to decay over.
        power (float): The power of polynomial. It must be greater than 0.
        update_decay_steps (bool): If true, learning rate is decayed once every `decay_steps` time. Default: False.

    Inputs:
@@ -432,10 +432,10 @@ class WarmUpLR(LearningRateSchedule):

    r"""
    Gets learning rate warming up.

    For current step, the formula of computing warmup_learning_rate[i] is:

    For current step, the formula of computing warmup learning rate is:

    .. math::
        warmup\_learning\_rate[current\_step] = learning\_rate * tmp\_step / warmup\_steps

        warmup\_learning\_rate = learning\_rate * tmp\_step / warmup\_steps

    Where: