forked from mindspore-Ecosystem/mindspore
!4388 Third round of enhancement of API comment & README_CN
Merge pull request !4388 from Simson/enhancement-API
This commit is contained in: commit 15496ff5a4
@@ -1,7 +1,9 @@
 ![MindSpore Logo](docs/MindSpore-logo.png "MindSpore logo")
 ============================================================
 
-- [What Is MindSpore?](#what-is-mindspore)
+[查看中文](./README_CN.md)
+
+- [What Is MindSpore](#what-is-mindspore)
 - [Automatic Differentiation](#automatic-differentiation)
 - [Automatic Parallel](#automatic-parallel)
 - [Installation](#installation)
@@ -0,0 +1,220 @@
![MindSpore Logo](docs/MindSpore-logo.png "MindSpore logo")
============================================================

[View English](./README.md)

- [What Is MindSpore](#what-is-mindspore)
- [Automatic Differentiation](#automatic-differentiation)
- [Automatic Parallel](#automatic-parallel)
- [Installation](#installation)
- [Binaries](#binaries)
- [From Source](#from-source)
- [Docker Image](#docker-image)
- [Quickstart](#quickstart)
- [Docs](#docs)
- [Community](#community)
- [Governance](#governance)
- [Communication](#communication)
- [Contributing](#contributing)
- [Release Notes](#release-notes)
- [License](#license)

## What Is MindSpore

MindSpore is a new open-source deep learning training/inference framework designed for device, edge, and cloud scenarios.
MindSpore offers developer-friendly design and efficient execution, aiming to improve the development experience of data scientists and algorithm engineers, and provides native support for the Ascend AI processor together with software-hardware co-optimization.

At the same time, MindSpore, as a global AI open-source community, aims to further advance the development and enrichment of the AI software/hardware application ecosystem.

<img src="docs/MindSpore-architecture.png" alt="MindSpore Architecture" width="600"/>

For more details, please check out our [Architecture Guide](https://www.mindspore.cn/docs/zh-CN/master/architecture.html).

### Automatic Differentiation

There are three automatic differentiation techniques in mainstream deep learning frameworks today:

- **Conversion based on a static compute graph**: the network is converted into a static data-flow graph at compile time, and the chain rule is applied to that graph to implement automatic differentiation.
- **Conversion based on a dynamic compute graph**: the execution trace of the network during forward computation is recorded through operator overloading, and the chain rule is applied to the dynamically generated data-flow graph to implement automatic differentiation.
- **Conversion based on source-code transformation**: evolved from functional programming frameworks, this technique performs automatic differentiation on the intermediate representation (the form a program takes during compilation) by just-in-time (JIT) compilation, and supports complex control-flow scenarios, higher-order functions, and closures.

TensorFlow adopted static compute graphs in its early days, whereas PyTorch uses dynamic compute graphs. Static graphs can leverage static compilation techniques to optimize network performance, but building or debugging a network with them is cumbersome. Dynamic graphs are very convenient to use, but it is hard to push their performance to the limit.

MindSpore takes a different path: automatic differentiation based on source-code transformation. On the one hand, it supports automatic differentiation of automatic control flow, so building a model is just as convenient as with PyTorch. On the other hand, MindSpore can perform static compilation optimization on the neural network to achieve better performance.

<img src="docs/Automatic-differentiation.png" alt="Automatic Differentiation" width="600"/>

The implementation of MindSpore automatic differentiation can be understood as symbolic differentiation of the program itself. MindSpore IR is a functional intermediate representation that has an intuitive correspondence with composite functions in basic algebra: a composite-function formula is composed of arbitrary differentiable basic functions, and each primitive operation in MindSpore IR corresponds to a basic function in basic algebra, which allows more complex control flow to be built on top of it.
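A minimal sketch of what source-transformation autodiff looks like from the user side, assuming the 0.6-era `GradOperation` primitive in `mindspore.ops.composite` (its exact signature differs across releases):

```python
import numpy as np
import mindspore.context as context
import mindspore.nn as nn
from mindspore import Tensor
from mindspore.ops import composite as C

context.set_context(mode=context.GRAPH_MODE, device_target="CPU")

class Square(nn.Cell):
    def construct(self, x):
        return x * x

class GradNet(nn.Cell):
    def __init__(self, net):
        super(GradNet, self).__init__()
        self.net = net
        # 0.x releases take a name as the first argument; newer releases drop it.
        self.grad = C.GradOperation('grad')

    def construct(self, x):
        return self.grad(self.net)(x)

x = Tensor(np.array([1.0, 2.0, 3.0]).astype(np.float32))
print(GradNet(Square())(x))  # gradient of x*x: [2. 4. 6.]
```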
### Automatic Parallel

The goal of MindSpore automatic parallel is to build a training mode that combines data parallelism, model parallelism, and hybrid parallelism. It automatically selects the model-partitioning strategy with the lowest cost to achieve automatic distributed parallel training.

<img src="docs/Automatic-parallel.png" alt="Automatic Parallel" width="600"/>

At present, MindSpore uses a fine-grained parallel strategy of operator slicing, that is, each operator in the graph is sliced into a cluster to complete the parallel computation. The slicing strategies involved can be very complicated, but as a Python developer you do not need to care about the underlying implementation, as long as the top-level API computation remains valid; a minimal configuration sketch follows this paragraph.
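For reference, enabling automatic parallel is usually a one-time context setup before the network is built; the sketch below assumes the string form of `parallel_mode` and the `mindspore.communication.management` helpers:

```python
from mindspore import context
from mindspore.communication.management import init, get_group_size

# Initialize the communication backend (HCCL on Ascend, NCCL on GPU), then let
# MindSpore search a low-cost operator-slicing strategy automatically.
init()
context.set_auto_parallel_context(parallel_mode="auto_parallel",
                                  device_num=get_group_size())
```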
## Installation

### Binaries

MindSpore offers build options across multiple backends:

| Hardware Platform | Operating System | Status |
| :------------ | :-------------- | :--- |
| Ascend 910 | Ubuntu-x86 | ✔️ |
| | EulerOS-x86 | ✔️ |
| | EulerOS-aarch64 | ✔️ |
| GPU CUDA 10.1 | Ubuntu-x86 | ✔️ |
| CPU | Ubuntu-x86 | ✔️ |
| | Windows-x86 | ✔️ |

To install with the `pip` command, taking the `CPU` `Ubuntu-x86` build as an example:

1. Download and install the whl package from the [MindSpore download page](https://www.mindspore.cn/versions).

    ```
    pip install https://ms-release.obs.cn-north-4.myhuaweicloud.com/0.6.0-beta/MindSpore/cpu/ubuntu_x86/mindspore-0.6.0-cp37-cp37m-linux_x86_64.whl
    ```

2. Run the following command to verify the installation.

    ```python
    import numpy as np
    import mindspore.context as context
    import mindspore.nn as nn
    from mindspore import Tensor
    from mindspore.ops import operations as P

    context.set_context(mode=context.GRAPH_MODE, device_target="CPU")

    class Mul(nn.Cell):
        def __init__(self):
            super(Mul, self).__init__()
            self.mul = P.Mul()

        def construct(self, x, y):
            return self.mul(x, y)

    x = Tensor(np.array([1.0, 2.0, 3.0]).astype(np.float32))
    y = Tensor(np.array([4.0, 5.0, 6.0]).astype(np.float32))

    mul = Mul()
    print(mul(x, y))
    ```
    ```
    [ 4. 10. 18.]
    ```

### From Source

See [MindSpore Installation](https://www.mindspore.cn/install).

### Docker Image

MindSpore Docker images are hosted on [Docker Hub](https://hub.docker.com/r/mindspore).
The current support for containerized builds is as follows:

| Hardware Platform | Docker Image Repository | Tag | Description |
| :----- | :------------------------ | :----------------------- | :--------------------------------------- |
| CPU | `mindspore/mindspore-cpu` | `x.y.z` | Production environment with the MindSpore `x.y.z` CPU release pre-installed. |
| | | `devel` | Development environment for building MindSpore from source (`CPU` backend). For installation details, see https://www.mindspore.cn/install. |
| | | `runtime` | Runtime environment for installing a MindSpore binary package (`CPU` backend). |
| GPU | `mindspore/mindspore-gpu` | `x.y.z` | Production environment with the MindSpore `x.y.z` GPU release pre-installed. |
| | | `devel` | Development environment for building MindSpore from source (`GPU CUDA10.1` backend). For installation details, see https://www.mindspore.cn/install. |
| | | `runtime` | Runtime environment for installing a MindSpore binary package (`GPU CUDA10.1` backend). |
| Ascend | <center>—</center> | <center>—</center> | Coming soon. |

> **NOTICE:** It is not recommended to install the whl package directly after building it from source in the GPU `devel` Docker image. We strongly recommend transferring the whl package into the GPU `runtime` Docker image and installing it there.

* CPU

    For the `CPU` backend, you can pull and run the latest stable image directly:

    ```
    docker pull mindspore/mindspore-cpu:0.6.0-beta
    docker run -it mindspore/mindspore-cpu:0.6.0-beta /bin/bash
    ```

* GPU

    For the `GPU` backend, please make sure `nvidia-container-toolkit` has been installed in advance. The following is an installation guide for `Ubuntu` users:

    ```
    DISTRIBUTION=$(. /etc/os-release; echo $ID$VERSION_ID)
    curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | apt-key add -
    curl -s -L https://nvidia.github.io/nvidia-docker/$DISTRIBUTION/nvidia-docker.list | tee /etc/apt/sources.list.d/nvidia-docker.list

    sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit nvidia-docker2
    sudo systemctl restart docker
    ```

    Then pull and run the latest stable image:

    ```
    docker pull mindspore/mindspore-gpu:0.6.0-beta
    docker run -it --runtime=nvidia --privileged=true mindspore/mindspore-gpu:0.6.0-beta /bin/bash
    ```

    To test whether the Docker image works, run the following Python code and check the output:

    ```python
    import numpy as np
    import mindspore.context as context
    from mindspore import Tensor
    from mindspore.ops import functional as F

    context.set_context(device_target="GPU")

    x = Tensor(np.ones([1, 3, 3, 4]).astype(np.float32))
    y = Tensor(np.ones([1, 3, 3, 4]).astype(np.float32))
    print(F.tensor_add(x, y))
    ```
    ```
    [[[ 2. 2. 2. 2.],
      [ 2. 2. 2. 2.],
      [ 2. 2. 2. 2.]],

     [[ 2. 2. 2. 2.],
      [ 2. 2. 2. 2.],
      [ 2. 2. 2. 2.]],

     [[ 2. 2. 2. 2.],
      [ 2. 2. 2. 2.],
      [ 2. 2. 2. 2.]]]
    ```

If you want to learn more about the build process of MindSpore Docker images, please check the [docker](docker/README.md) repo for details.

## Quickstart

See the [Quickstart](https://www.mindspore.cn/tutorial/zh-CN/master/quick_start/quick_start.html) to implement image classification.

## Docs

For more details about installation guides, tutorials, and APIs, see the [User Documentation](https://gitee.com/mindspore/docs).

## Community

### Governance

See how MindSpore practices [open governance](https://gitee.com/mindspore/community/blob/master/governance.md).

### Communication

- [MindSpore Slack](https://join.slack.com/t/mindspore/shared_invite/zt-dgk65rli-3ex4xvS4wHX7UDmsQmfu8w) - communication platform for developers.
- `#mindspore` IRC channel (meeting records only)
- Video conferencing: TBD
- Mailing list: <https://mailweb.mindspore.cn/postorius/lists>

## Contributing

Contributions are welcome. For more details, see our [Contributor Wiki](CONTRIBUTING.md).

## Release Notes

See [RELEASE](RELEASE.md) for release notes.

## License

[Apache License 2.0](LICENSE)
@@ -150,7 +150,7 @@ TensorPtr TensorPy::MakeTensor(const py::array &input, const TypePtr &type_ptr)
 // Get tensor shape.
 std::vector<int> shape(buf.shape.begin(), buf.shape.end());
 if (data_type == buf_type) {
-// Use memory copy if input data type is same as the required type.
+// Use memory copy if input data type is the same as the required type.
 return std::make_shared<Tensor>(data_type, shape, buf.ptr, buf.size * buf.itemsize);
 }
 // Create tensor with data type converted.

@@ -546,9 +546,11 @@ def set_context(**kwargs):
 
 Note:
 Attribute name is required for setting attributes.
+It is not recommended to change the mode after the net is initialized, because the implementations of some
+operations are different in graph mode and pynative mode. Default: PYNATIVE_MODE.
 
 Args:
-mode (int): Running in GRAPH_MODE(0) or PYNATIVE_MODE(1). Default: PYNATIVE_MODE.
+mode (int): Running in GRAPH_MODE(0) or PYNATIVE_MODE(1).
 device_target (str): The target device to run, support "Ascend", "GPU", "CPU". Default: "Ascend".
 device_id (int): Id of target device, the value must be in [0, device_num_per_host-1],
 while device_num_per_host should no more than 4096. Default: 0.
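For reference, a usage sketch matching the documented arguments (pick the mode once, up front, as the note above advises):

```python
from mindspore import context

# Choose the execution mode before building any network; switching afterwards is
# discouraged because graph mode and pynative mode implement some operations differently.
context.set_context(mode=context.GRAPH_MODE, device_target="Ascend", device_id=0)
```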
@@ -148,7 +148,7 @@ class Cell:
 
 def update_cell_type(self, cell_type):
 """
-Update the current cell type mainly identify if quantization aware training network.
+The current cell type is updated when a quantization aware training network is encountered.
 
 After being invoked, it can set the cell type to 'cell_type'.
 """

@@ -936,7 +936,7 @@ class GraphKernel(Cell):
 Base class for GraphKernel.
 
 A `GraphKernel` a composite of basic primitives and can be compiled into a fused kernel automatically when
-context.set_context(enable_graph_kernel=True).
+enable_graph_kernel in context is set to True.
 
 Examples:
 >>> class Relu(GraphKernel):

@@ -661,7 +661,7 @@ class LogSoftmax(GraphKernel):
 Log Softmax activation function.
 
 Applies the Log Softmax function to the input tensor on the specified axis.
-Suppose a slice along the given aixs :math:`x` then for each element :math:`x_i`
+Suppose a slice in the given axis :math:`x` then for each element :math:`x_i`
 the Log Softmax function is shown as follows:
 
 .. math::

@@ -987,10 +987,10 @@ class LayerNorm(Cell):
 Applies Layer Normalization over a mini-batch of inputs.
 
 Layer normalization is widely used in recurrent neural networks. It applies
-normalization over a mini-batch of inputs for each single training case as described
+normalization on a mini-batch of inputs for each single training case as described
 in the paper `Layer Normalization <https://arxiv.org/pdf/1607.06450.pdf>`_. Unlike batch
 normalization, layer normalization performs exactly the same computation at training and
-testing times. It can be described using the following formula. It is applied across all channels
+testing time. It can be described using the following formula. It is applied across all channels
 and pixel but only one batch size.
 
 .. math::

@@ -1139,9 +1139,9 @@ class LambNextMV(GraphKernel):
 Outputs:
 Tuple of 2 Tensor.
 
-- **add3** (Tensor) - The shape is same as the shape after broadcasting, and the data type is
+- **add3** (Tensor) - The shape is the same as the shape after broadcasting, and the data type is
 the one with high precision or high digits among the inputs.
-- **realdiv4** (Tensor) - The shape is same as the shape after broadcasting, and the data type is
+- **realdiv4** (Tensor) - The shape is the same as the shape after broadcasting, and the data type is
 the one with high precision or high digits among the inputs.
 
 Examples:

@@ -55,7 +55,7 @@ class Softmax(Cell):
 .. math::
 \text{softmax}(x_{i}) = \frac{\exp(x_i)}{\sum_{j=0}^{n-1}\exp(x_j)},
 
-where :math:`x_{i}` is the :math:`i`-th slice along the given dim of the input Tensor.
+where :math:`x_{i}` is the :math:`i`-th slice in the given dimension of the input Tensor.
 
 Args:
 axis (Union[int, tuple[int]]): The axis to apply Softmax operation, -1 means the last dimension. Default: -1.

@@ -87,11 +87,11 @@ class LogSoftmax(Cell):
 
 Applies the LogSoftmax function to n-dimensional input tensor.
 
-The input is transformed with Softmax function and then with log function to lie in range[-inf,0).
+The input is transformed by the Softmax function and then by the log function to lie in range[-inf,0).
 
 Logsoftmax is defined as:
 :math:`\text{logsoftmax}(x_i) = \log \left(\frac{\exp(x_i)}{\sum_{j=0}^{n-1} \exp(x_j)}\right)`,
-where :math:`x_{i}` is the :math:`i`-th slice along the given dim of the input Tensor.
+where :math:`x_{i}` is the :math:`i`-th slice in the given dimension of the input Tensor.
 
 Args:
 axis (int): The axis to apply LogSoftmax operation, -1 means the last dimension. Default: -1.
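For reference, a small usage sketch of the two activations documented above (printed values are approximate):

```python
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor

x = Tensor(np.array([1.0, 2.0, 3.0]).astype(np.float32))
softmax = nn.Softmax(axis=-1)         # probabilities along the last axis
log_softmax = nn.LogSoftmax(axis=-1)  # equals log(softmax(x)), computed stably
print(softmax(x))       # roughly [0.09 0.24 0.67]
print(log_softmax(x))   # roughly [-2.41 -1.41 -0.41]
```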
@@ -123,7 +123,7 @@ class ELU(Cell):
 Exponential Linear Uint activation function.
 
 Applies the exponential linear unit function element-wise.
-The activation function defined as:
+The activation function is defined as:
 
 .. math::
 E_{i} =

@@ -162,7 +162,7 @@ class ReLU(Cell):
 
 Applies the rectified linear unit function element-wise. It returns
 element-wise :math:`\max(0, x)`, specially, the neurons with the negative output
-will suppressed and the active neurons will stay the same.
+will be suppressed and the active neurons will stay the same.
 
 Inputs:
 - **input_data** (Tensor) - The input of ReLU.

@@ -197,7 +197,7 @@ class ReLU6(Cell):
 - **input_data** (Tensor) - The input of ReLU6.
 
 Outputs:
-Tensor, which has the same type with `input_data`.
+Tensor, which has the same type as `input_data`.
 
 Examples:
 >>> input_x = Tensor(np.array([-1, -2, 0, 2, 1]), mindspore.float16)

@@ -234,7 +234,7 @@ class LeakyReLU(Cell):
 - **input_x** (Tensor) - The input of LeakyReLU.
 
 Outputs:
-Tensor, has the same type and shape with the `input_x`.
+Tensor, has the same type and shape as the `input_x`.
 
 Examples:
 >>> input_x = Tensor(np.array([[-1.0, 4.0, -8.0], [2.0, -5.0, 9.0]]), mindspore.float32)

@@ -365,7 +365,7 @@ class PReLU(Cell):
 PReLU is defined as: :math:`prelu(x_i)= \max(0, x_i) + w * \min(0, x_i)`, where :math:`x_i`
 is an element of an channel of the input.
 
-Here :math:`w` is an learnable parameter with default initial value 0.25.
+Here :math:`w` is a learnable parameter with a default initial value 0.25.
 Parameter :math:`w` has dimensionality of the argument channel. If called without argument
 channel, a single parameter :math:`w` will be shared across all channels.
 

@@ -413,7 +413,7 @@ class PReLU(Cell):
 
 class HSwish(Cell):
 r"""
-rHard swish activation function.
+Hard swish activation function.
 
 Applies hswish-type activation element-wise. The input is a Tensor with any valid shape.
 

@@ -422,7 +422,7 @@ class HSwish(Cell):
 .. math::
 \text{hswish}(x_{i}) = x_{i} * \frac{ReLU6(x_{i} + 3)}{6},
 
-where :math:`x_{i}` is the :math:`i`-th slice along the given dim of the input Tensor.
+where :math:`x_{i}` is the :math:`i`-th slice in the given dimension of the input Tensor.
 
 Inputs:
 - **input_data** (Tensor) - The input of HSwish.

@@ -456,7 +456,7 @@ class HSigmoid(Cell):
 .. math::
 \text{hsigmoid}(x_{i}) = max(0, min(1, \frac{x_{i} + 3}{6})),
 
-where :math:`x_{i}` is the :math:`i`-th slice along the given dim of the input Tensor.
+where :math:`x_{i}` is the :math:`i`-th slice in the given dimension of the input Tensor.
 
 Inputs:
 - **input_data** (Tensor) - The input of HSigmoid.

@@ -65,7 +65,7 @@ class Dropout(Cell):
 dtype (:class:`mindspore.dtype`): Data type of input. Default: mindspore.float32.
 
 Raises:
-ValueError: If keep_prob is not in range (0, 1).
+ValueError: If `keep_prob` is not in range (0, 1).
 
 Inputs:
 - **input** (Tensor) - An N-D Tensor.
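For reference, a sketch of the `keep_prob` semantics documented above; dropout only takes effect while the cell is in training mode:

```python
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor

net = nn.Dropout(keep_prob=0.8)  # each element survives with probability 0.8
net.set_train()                  # dropout is a no-op outside training mode
x = Tensor(np.ones([2, 4]).astype(np.float32))
print(net(x))  # surviving elements are rescaled by 1/keep_prob = 1.25
```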
@@ -373,8 +373,8 @@ class OneHot(Cell):
 axis is created at dimension `axis`.
 
 Args:
-axis (int): Features x depth if axis == -1, depth x features
-if axis == 0. Default: -1.
+axis (int): Features x depth if axis is -1, depth x features
+if axis is 0. Default: -1.
 depth (int): A scalar defining the depth of the one hot dimension. Default: 1.
 on_value (float): A scalar defining the value to fill in output[i][j]
 when indices[j] = i. Default: 1.0.
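For reference, a sketch of the `axis`/`depth`/`on_value` arguments in use (index tensor in, one-hot tensor out):

```python
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor

onehot = nn.OneHot(axis=-1, depth=4, on_value=1.0, off_value=0.0)
indices = Tensor(np.array([1, 3]).astype(np.int32))
print(onehot(indices))
# [[0. 1. 0. 0.]
#  [0. 0. 0. 1.]]
```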
@@ -492,18 +492,18 @@ class Unfold(Cell):
 The input tensor must be a 4-D tensor and the data format is NCHW.
 
 Args:
-ksizes (Union[tuple[int], list[int]]): The size of sliding window, should be a tuple or list of int,
+ksizes (Union[tuple[int], list[int]]): The size of sliding window, should be a tuple or a list of integers,
 and the format is [1, ksize_row, ksize_col, 1].
 strides (Union[tuple[int], list[int]]): Distance between the centers of the two consecutive patches,
 should be a tuple or list of int, and the format is [1, stride_row, stride_col, 1].
-rates (Union[tuple[int], list[int]]): In each extracted patch, the gap between the corresponding dim
-pixel positions, should be a tuple or list of int, and the format is [1, rate_row, rate_col, 1].
+rates (Union[tuple[int], list[int]]): In each extracted patch, the gap between the corresponding dimension
+pixel positions, should be a tuple or a list of integers, and the format is [1, rate_row, rate_col, 1].
 padding (str): The type of padding algorithm, is a string whose value is "same" or "valid",
 not case sensitive. Default: "valid".
 
 - same: Means that the patch can take the part beyond the original image, and this part is filled with 0.
 
-- valid: Means that the patch area taken must be completely contained in the original image.
+- valid: Means that the taken patch area must be completely covered by the original image.
 
 Inputs:
 - **input_x** (Tensor) - A 4-D tensor whose shape is [in_batch, in_depth, in_row, in_col] and

@@ -511,7 +511,7 @@ class Unfold(Cell):
 
 Outputs:
 Tensor, a 4-D tensor whose data type is same as 'input_x',
-and the shape is [out_batch, out_depth, out_row, out_col], the out_batch is same as the in_batch.
+and the shape is [out_batch, out_depth, out_row, out_col], the out_batch is the same as the in_batch.
 
 Examples:
 >>> net = Unfold(ksizes=[1, 2, 2, 1], strides=[1, 1, 1, 1], rates=[1, 1, 1, 1])

@@ -556,11 +556,11 @@ class MatrixDiag(Cell):
 Returns a batched diagonal tensor with a given batched diagonal values.
 
 Inputs:
-- **x** (Tensor) - The diagonal values. It can be of the following data types:
-float32, float16, int32, int8, uint8.
+- **x** (Tensor) - The diagonal values. It can be one of the following data types:
+float32, float16, int32, int8, and uint8.
 
 Outputs:
-Tensor, same type as input `x`. The shape should be x.shape + (x.shape[-1], ).
+Tensor, has the same type as input `x`. The shape should be x.shape + (x.shape[-1], ).
 
 Examples:
 >>> x = Tensor(np.array([1, -1]), mstype.float32)

@@ -587,11 +587,11 @@ class MatrixDiagPart(Cell):
 Returns the batched diagonal part of a batched tensor.
 
 Inputs:
-- **x** (Tensor) - The batched tensor. It can be of the following data types:
-float32, float16, int32, int8, uint8.
+- **x** (Tensor) - The batched tensor. It can be one of the following data types:
+float32, float16, int32, int8, and uint8.
 
 Outputs:
-Tensor, same type as input `x`. The shape should be x.shape[:-2] + [min(x.shape[-2:])].
+Tensor, has the same type as input `x`. The shape should be x.shape[:-2] + [min(x.shape[-2:])].
 
 Examples:
 >>> x = Tensor([[[-1, 0], [0, 1]], [[-1, 0], [0, 1]], [[-1, 0], [0, 1]]], mindspore.float32)

@@ -617,12 +617,12 @@ class MatrixSetDiag(Cell):
 Modify the batched diagonal part of a batched tensor.
 
 Inputs:
-- **x** (Tensor) - The batched tensor. It can be of the following data types:
-float32, float16, int32, int8, uint8.
+- **x** (Tensor) - The batched tensor. It can be one of the following data types:
+float32, float16, int32, int8, and uint8.
 - **diagonal** (Tensor) - The diagonal values.
 
 Outputs:
-Tensor, same type as input `x`. The shape same as `x`.
+Tensor, has the same type and shape as input `x`.
 
 Examples:
 >>> x = Tensor([[[-1, 0], [0, 1]], [[-1, 0], [0, 1]], [[-1, 0], [0, 1]]], mindspore.float32)

@@ -72,7 +72,7 @@ class SequentialCell(Cell):
 args (list, OrderedDict): List of subclass of Cell.
 
 Raises:
-TypeError: If arg is not of type list or OrderedDict.
+TypeError: If the type of the argument is not list or OrderedDict.
 
 Inputs:
 - **input** (Tensor) - Tensor with shape according to the first Cell in the sequence.

@@ -131,7 +131,7 @@ class Conv2d(_Conv):
 Args:
 in_channels (int): The number of input channel :math:`C_{in}`.
 out_channels (int): The number of output channel :math:`C_{out}`.
-kernel_size (Union[int, tuple[int]]): The data type is int or tuple with 2 integers. Specifies the height
+kernel_size (Union[int, tuple[int]]): The data type is int or a tuple of 2 integers. Specifies the height
 and width of the 2D convolution window. Single int means the value is for both the height and the width of
 the kernel. A tuple of 2 ints means the first value is for the height and the other is for the
 width of the kernel.

@@ -147,7 +147,7 @@ class Conv2d(_Conv):
 last extra padding will be done from the bottom and the right side. If this mode is set, `padding`
 must be 0.
 
-- valid: Adopts the way of discarding. The possibly largest height and width of output will be returned
+- valid: Adopts the way of discarding. The largest possible height and width of output will be returned
 without padding. Extra pixels will be discarded. If this mode is set, `padding`
 must be 0.
 

@@ -158,7 +158,7 @@ class Conv2d(_Conv):
 the padding of top, bottom, left and right is the same, equal to padding. If `padding` is a tuple
 with four integers, the padding of top, bottom, left and right will be equal to padding[0],
 padding[1], padding[2], and padding[3] accordingly. Default: 0.
-dilation (Union[int, tuple[int]]): The data type is int or tuple with 2 integers. Specifies the dilation rate
+dilation (Union[int, tuple[int]]): The data type is int or a tuple of 2 integers. Specifies the dilation rate
 to use for dilated convolution. If set to be :math:`k > 1`, there will
 be :math:`k - 1` pixels skipped for each sampling location. Its value should
 be greater or equal to 1 and bounded by the height and width of the
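For reference, a minimal sketch of the `kernel_size`/`stride`/`pad_mode` arguments documented above:

```python
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor

# 3x3 kernel with 'same' padding keeps the 32x32 spatial size.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3,
                 stride=1, pad_mode='same')
x = Tensor(np.ones([1, 3, 32, 32]).astype(np.float32))  # NCHW layout
print(conv(x).shape)  # (1, 16, 32, 32)
```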
@@ -451,7 +451,7 @@ class Conv2dTranspose(_Conv):
 Args:
 in_channels (int): The number of channels in the input space.
 out_channels (int): The number of channels in the output space.
-kernel_size (Union[int, tuple]): int or tuple with 2 integers, which specifies the height
+kernel_size (Union[int, tuple]): int or a tuple of 2 integers, which specifies the height
 and width of the 2D convolution window. Single int means the value is for both the height and the width of
 the kernel. A tuple of 2 ints means the first value is for the height and the other is for the
 width of the kernel.

@@ -825,7 +825,7 @@ class DepthwiseConv2d(Cell):
 Args:
 in_channels (int): The number of input channel :math:`C_{in}`.
 out_channels (int): The number of output channel :math:`C_{out}`.
-kernel_size (Union[int, tuple[int]]): The data type is int or tuple with 2 integers. Specifies the height
+kernel_size (Union[int, tuple[int]]): The data type is int or a tuple of 2 integers. Specifies the height
 and width of the 2D convolution window. Single int means the value is for both the height and the width of
 the kernel. A tuple of 2 ints means the first value is for the height and the other is for the
 width of the kernel.

@@ -841,7 +841,7 @@ class DepthwiseConv2d(Cell):
 last extra padding will be done from the bottom and the right side. If this mode is set, `padding`
 must be 0.
 
-- valid: Adopts the way of discarding. The possibly largest height and width of output will be returned
+- valid: Adopts the way of discarding. The largest possible height and width of output will be returned
 without padding. Extra pixels will be discarded. If this mode is set, `padding`
 must be 0.
 

@@ -849,16 +849,16 @@ class DepthwiseConv2d(Cell):
 Tensor borders. `padding` should be greater than or equal to 0.
 
 padding (int): Implicit paddings on both sides of the input. Default: 0.
-dilation (Union[int, tuple[int]]): The data type is int or tuple with 2 integers. Specifies the dilation rate
+dilation (Union[int, tuple[int]]): The data type is int or a tuple of 2 integers. Specifies the dilation rate
 to use for dilated convolution. If set to be :math:`k > 1`, there will
 be :math:`k - 1` pixels skipped for each sampling location. Its value should
-be greater or equal to 1 and bounded by the height and width of the
+be greater than or equal to 1 and bounded by the height and width of the
 input. Default: 1.
 group (int): Split filter into groups, `in_ channels` and `out_channels` should be
 divisible by the number of groups. Default: 1.
 has_bias (bool): Specifies whether the layer uses a bias vector. Default: False.
 weight_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the convolution kernel.
-It can be a Tensor, a string, an Initializer or a numbers.Number. When a string is specified,
+It can be a Tensor, a string, an Initializer or a number. When a string is specified,
 values from 'TruncatedNormal', 'Normal', 'Uniform', 'HeUniform' and 'XavierUniform' distributions as well
 as constant 'One' and 'Zero' distributions are possible. Alias 'xavier_uniform', 'he_uniform', 'ones'
 and 'zeros' are acceptable. Uppercase and lowercase are both acceptable. Refer to the values of

@@ -36,7 +36,7 @@ class Embedding(Cell):
 the corresponding word embeddings.
 
 Note:
-When 'use_one_hot' is set to True, the input should be of type mindspore.int32.
+When 'use_one_hot' is set to True, the type of the input should be mindspore.int32.
 
 Args:
 vocab_size (int): Size of the dictionary of embeddings.

@@ -48,9 +48,9 @@ class Embedding(Cell):
 dtype (:class:`mindspore.dtype`): Data type of input. Default: mindspore.float32.
 
 Inputs:
-- **input** (Tensor) - Tensor of shape :math:`(\text{batch_size}, \text{input_length})`. The element of
-the Tensor should be integer and not larger than vocab_size. else the corresponding embedding vector is zero
-if larger than vocab_size.
+- **input** (Tensor) - Tensor of shape :math:`(\text{batch_size}, \text{input_length})`. The elements of
+the Tensor should be integer and not larger than vocab_size. Otherwise the corresponding embedding vector will
+be zero.
 
 Outputs:
 Tensor of shape :math:`(\text{batch_size}, \text{input_length}, \text{embedding_size})`.

@@ -253,7 +253,7 @@ class MSSSIM(Cell):
 Args:
 max_val (Union[int, float]): The dynamic range of the pixel values (255 for 8-bit grayscale images).
 Default: 1.0.
-power_factors (Union[tuple, list]): Iterable of weights for each of the scales.
+power_factors (Union[tuple, list]): Iterable of weights for each scale.
 Default: (0.0448, 0.2856, 0.3001, 0.2363, 0.1333). Default values obtained by Wang et al.
 filter_size (int): The size of the Gaussian filter. Default: 11.
 filter_sigma (float): The standard deviation of Gaussian kernel. Default: 1.5.

@@ -35,7 +35,7 @@ class LSTM(Cell):
 Applies a LSTM to the input.
 
 There are two pipelines connecting two consecutive cells in a LSTM model; one is cell state pipeline
-and another is hidden state pipeline. Denote two consecutive time nodes as :math:`t-1` and :math:`t`.
+and the other is hidden state pipeline. Denote two consecutive time nodes as :math:`t-1` and :math:`t`.
 Given an input :math:`x_t` at time :math:`t`, an hidden state :math:`h_{t-1}` and an cell
 state :math:`c_{t-1}` of the layer at time :math:`{t-1}`, the cell state and hidden state at
 time :math:`t` is computed using an gating mechanism. Input gate :math:`i_t` is designed to protect the cell

@@ -68,18 +68,17 @@ class LSTM(Cell):
 input_size (int): Number of features of input.
 hidden_size (int): Number of features of hidden layer.
 num_layers (int): Number of layers of stacked LSTM . Default: 1.
-has_bias (bool): Specifies whether has bias `b_ih` and `b_hh`. Default: True.
+has_bias (bool): Whether the cell has bias `b_ih` and `b_hh`. Default: True.
 batch_first (bool): Specifies whether the first dimension of input is batch_size. Default: False.
 dropout (float, int): If not 0, append `Dropout` layer on the outputs of each
 LSTM layer except the last layer. Default 0. The range of dropout is [0.0, 1.0].
-bidirectional (bool): Specifies whether this is a bidirectional LSTM. If set True,
-number of directions will be 2 otherwise number of directions is 1. Default: False.
+bidirectional (bool): Specifies whether it is a bidirectional LSTM. Default: False.
 
 Inputs:
 - **input** (Tensor) - Tensor of shape (seq_len, batch_size, `input_size`).
 - **hx** (tuple) - A tuple of two Tensors (h_0, c_0) both of data type mindspore.float32 or
 mindspore.float16 and shape (num_directions * `num_layers`, batch_size, `hidden_size`).
-Data type of `hx` should be the same of `input`.
+Data type of `hx` should be the same as `input`.
 
 Outputs:
 Tuple, a tuple constains (`output`, (`h_n`, `c_n`)).
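For reference, a shape-oriented sketch of the documented inputs (`input`, `(h_0, c_0)`) and outputs; backend support for `nn.LSTM` varies by release:

```python
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor

seq_len, batch_size, input_size, hidden_size = 5, 2, 10, 16
lstm = nn.LSTM(input_size=input_size, hidden_size=hidden_size,
               num_layers=1, has_bias=True, batch_first=False, bidirectional=False)

x = Tensor(np.ones([seq_len, batch_size, input_size]).astype(np.float32))
# num_directions * num_layers = 1
h0 = Tensor(np.zeros([1, batch_size, hidden_size]).astype(np.float32))
c0 = Tensor(np.zeros([1, batch_size, hidden_size]).astype(np.float32))
output, (hn, cn) = lstm(x, (h0, c0))
print(output.shape)  # (5, 2, 16)
```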
@@ -205,7 +204,7 @@ class LSTMCell(Cell):
 Applies a LSTM layer to the input.
 
 There are two pipelines connecting two consecutive cells in a LSTM model; one is cell state pipeline
-and another is hidden state pipeline. Denote two consecutive time nodes as :math:`t-1` and :math:`t`.
+and the other is hidden state pipeline. Denote two consecutive time nodes as :math:`t-1` and :math:`t`.
 Given an input :math:`x_t` at time :math:`t`, an hidden state :math:`h_{t-1}` and an cell
 state :math:`c_{t-1}` of the layer at time :math:`{t-1}`, the cell state and hidden state at
 time :math:`t` is computed using an gating mechanism. Input gate :math:`i_t` is designed to protect the cell

@@ -238,7 +237,7 @@ class LSTMCell(Cell):
 input_size (int): Number of features of input.
 hidden_size (int): Number of features of hidden layer.
 layer_index (int): index of current layer of stacked LSTM . Default: 0.
-has_bias (bool): Specifies whether has bias `b_ih` and `b_hh`. Default: True.
+has_bias (bool): Whether the cell has bias `b_ih` and `b_hh`. Default: True.
 batch_first (bool): Specifies whether the first dimension of input is batch_size. Default: False.
 dropout (float, int): If not 0, append `Dropout` layer on the outputs of each
 LSTM layer except the last layer. Default 0. The range of dropout is [0.0, 1.0].

@@ -243,6 +243,10 @@ class BatchNorm1d(_BatchNorm):
 .. math::
 y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta
 
+Note:
+The implementation of BatchNorm is different in graph mode and pynative mode, therefore the mode is not
+recommended to be changed after the net was initialized.
+
 Args:
 num_features (int): `C` from an expected input of size (N, C).
 eps (float): A value added to the denominator for numerical stability. Default: 1e-5.

@@ -319,6 +323,10 @@ class BatchNorm2d(_BatchNorm):
 .. math::
 y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta
 
+Note:
+The implementation of BatchNorm is different in graph mode and pynative mode, therefore that mode cannot be
+changed after the net was initialized.
+
 Args:
 num_features (int): `C` from an expected input of size (N, C, H, W).
 eps (float): A value added to the denominator for numerical stability. Default: 1e-5.

@@ -384,8 +392,8 @@ class GlobalBatchNorm(_BatchNorm):
 r"""
 Global normalization layer over a N-dimension input.
 
-Global Normalization is cross device synchronized batch normalization. Batch Normalization implementation
-only normalize the data within each device. Global normalization will normalize the input within the group.
+Global Normalization is cross device synchronized batch normalization. The implementation of Batch Normalization
+only normalizes the data within each device. Global normalization will normalize the input within the group.
 It has been described in the paper `Batch Normalization: Accelerating Deep Network Training by
 Reducing Internal Covariate Shift <https://arxiv.org/abs/1502.03167>`_. It rescales and recenters the
 feature using a mini-batch of data and the learned parameters which can be described in the following formula.

@@ -467,10 +475,10 @@ class LayerNorm(Cell):
 Applies Layer Normalization over a mini-batch of inputs.
 
 Layer normalization is widely used in recurrent neural networks. It applies
-normalization over a mini-batch of inputs for each single training case as described
+normalization on a mini-batch of inputs for each single training case as described
 in the paper `Layer Normalization <https://arxiv.org/pdf/1607.06450.pdf>`_. Unlike batch
 normalization, layer normalization performs exactly the same computation at training and
-testing times. It can be described using the following formula. It is applied across all channels
+testing time. It can be described using the following formula. It is applied across all channels
 and pixel but only one batch size.
 
 .. math::
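For reference, a sketch of LayerNorm normalizing the trailing dimension of each sample independently of the batch size (argument names are assumed from current `nn.LayerNorm`; check the installed release):

```python
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor

x = Tensor(np.arange(12).reshape(3, 4).astype(np.float32))
# With the default begin_norm_axis/begin_params_axis of -1, the last dimension
# (size 4) is normalized per sample.
layer_norm = nn.LayerNorm(normalized_shape=(4,))
print(layer_norm(x).shape)  # (3, 4); each row has ~zero mean and unit variance
```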
@@ -545,7 +553,7 @@ class GroupNorm(Cell):
 Group Normalization over a mini-batch of inputs.
 
 Group normalization is widely used in recurrent neural networks. It applies
-normalization over a mini-batch of inputs for each single training case as described
+normalization on a mini-batch of inputs for each single training case as described
 in the paper `Group Normalization <https://arxiv.org/pdf/1803.08494.pdf>`_. Group normalization
 divides the channels into groups and computes within each group the mean and variance for normalization,
 and it performs very stable over a wide range of batch size. It can be described using the following formula.

@@ -557,7 +565,7 @@ class GroupNorm(Cell):
 num_groups (int): The number of groups to be divided along the channel dimension.
 num_channels (int): The number of channels per group.
 eps (float): A value added to the denominator for numerical stability. Default: 1e-5.
-affine (bool): A bool value, this layer will has learnable affine parameters when set to true. Default: True.
+affine (bool): A bool value, this layer will have learnable affine parameters when set to true. Default: True.
 gamma_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the gamma weight.
 The values of str refer to the function `initializer` including 'zeros', 'ones', 'xavier_uniform',
 'he_uniform', etc. Default: 'ones'.

@@ -61,7 +61,7 @@ class Conv2dBnAct(Cell):
 Args:
 in_channels (int): The number of input channel :math:`C_{in}`.
 out_channels (int): The number of output channel :math:`C_{out}`.
-kernel_size (Union[int, tuple]): The data type is int or tuple with 2 integers. Specifies the height
+kernel_size (Union[int, tuple]): The data type is int or a tuple of 2 integers. Specifies the height
 and width of the 2D convolution window. Single int means the value is for both height and width of
 the kernel. A tuple of 2 ints means the first value is for the height and the other is for the
 width of the kernel.

@@ -292,19 +292,19 @@ class BatchNormFoldCell(Cell):
 
 class FakeQuantWithMinMax(Cell):
 r"""
-Quantization aware op. This OP provide Fake quantization observer function on data with min and max.
+Quantization aware op. This OP provides the fake quantization observer function on data with min and max.
 
 Args:
 min_init (int, float): The dimension of channel or 1(layer). Default: -6.
 max_init (int, float): The dimension of channel or 1(layer). Default: 6.
-ema (bool): Exponential Moving Average algorithm update min and max. Default: False.
+ema (bool): The Exponential Moving Average algorithm updates min and max. Default: False.
 ema_decay (float): Exponential Moving Average algorithm parameter. Default: 0.999.
 per_channel (bool): Quantization granularity based on layer or on channel. Default: False.
 channel_axis (int): Quantization by channel axis. Default: 1.
 num_channels (int): declarate the min and max channel size, Default: 1.
-num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8.
-symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
-narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
+num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
+symmetric (bool): Whether the quantization algorithm is symmetric or not. Default: False.
+narrow_range (bool): Whether the quantization algorithm uses narrow range or not. Default: False.
 quant_delay (int): Quantization delay parameters according to the global step. Default: 0.
 
 Inputs:

@@ -431,7 +431,7 @@ class Conv2dBnFoldQuant(Cell):
 variance vector. Default: 'ones'.
 fake (bool): Whether Conv2dBnFoldQuant Cell adds FakeQuantWithMinMax op. Default: True.
 per_channel (bool): FakeQuantWithMinMax Parameters. Default: False.
-num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8.
+num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
 symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
 narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
 quant_delay (int): The Quantization delay parameters according to the global step. Default: 0.

@@ -614,7 +614,7 @@ class Conv2dBnWithoutFoldQuant(Cell):
 Default: 'normal'.
 bias_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the bias vector. Default: 'zeros'.
 per_channel (bool): FakeQuantWithMinMax Parameters. Default: False.
-num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8.
+num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
 symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
 narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
 quant_delay (int): Quantization delay parameters according to the global step. Default: 0.

@@ -736,7 +736,7 @@ class Conv2dQuant(Cell):
 Default: 'normal'.
 bias_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the bias vector. Default: 'zeros'.
 per_channel (bool): FakeQuantWithMinMax Parameters. Default: False.
-num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8.
+num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
 symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
 narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
 quant_delay (int): Quantization delay parameters according to the global step. Default: 0.

@@ -845,7 +845,7 @@ class DenseQuant(Cell):
 has_bias (bool): Specifies whether the layer uses a bias vector. Default: True.
 activation (str): The regularization function applied to the output of the layer, eg. 'relu'. Default: None.
 per_channel (bool): FakeQuantWithMinMax Parameters. Default: False.
-num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8.
+num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
 symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
 narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
 quant_delay (int): Quantization delay parameters according to the global step. Default: 0.

@@ -947,15 +947,14 @@ class ActQuant(_QuantActivation):
 r"""
 Quantization aware training activation function.
 
-Add Fake Quant OP after activation. Not Recommand to used these cell for Fake Quant Op
-Will climp the max range of the activation and the relu6 do the same operation.
-This part is a more detailed overview of ReLU6 op.
+Add the fake quant op to the end of activation op, by which the output of activation op will be truncated.
+Please check `FakeQuantWithMinMax` for more details.
 
 Args:
 activation (Cell): Activation cell class.
 ema_decay (float): Exponential Moving Average algorithm parameter. Default: 0.999.
 per_channel (bool): Quantization granularity based on layer or on channel. Default: False.
-num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8.
+num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
 symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
 narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
 quant_delay (int): Quantization delay parameters according to the global steps. Default: 0.

@@ -1010,7 +1009,7 @@ class LeakyReLUQuant(_QuantActivation):
 activation (Cell): Activation cell class.
 ema_decay (float): Exponential Moving Average algorithm parameter. Default: 0.999.
 per_channel (bool): Quantization granularity based on layer or on channel. Default: False.
-num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8.
+num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
 symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
 narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
 quant_delay (int): Quantization delay parameters according to the global step. Default: 0.

@@ -1080,9 +1079,9 @@ class HSwishQuant(_QuantActivation):
 activation (Cell): Activation cell class.
 ema_decay (float): Exponential Moving Average algorithm parameter. Default: 0.999.
 per_channel (bool): Quantization granularity based on layer or on channel. Default: False.
-num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8.
-symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
-narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
+num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
+symmetric (bool): Whether the quantization algorithm is symmetric or not. Default: False.
+narrow_range (bool): Whether the quantization algorithm uses narrow range or not. Default: False.
 quant_delay (int): Quantization delay parameters according to the global step. Default: 0.
 
 Inputs:

@@ -1149,9 +1148,9 @@ class HSigmoidQuant(_QuantActivation):
 activation (Cell): Activation cell class.
 ema_decay (float): Exponential Moving Average algorithm parameter. Default: 0.999.
 per_channel (bool): Quantization granularity based on layer or on channel. Default: False.
-num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8.
-symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
-narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
+num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
+symmetric (bool): Whether the quantization algorithm is symmetric or not. Default: False.
+narrow_range (bool): Whether the quantization algorithm uses narrow range or not. Default: False.
 quant_delay (int): Quantization delay parameters according to the global step. Default: 0.
 
 Inputs:

@@ -1217,7 +1216,7 @@ class TensorAddQuant(Cell):
 Args:
 ema_decay (float): Exponential Moving Average algorithm parameter. Default: 0.999.
 per_channel (bool): Quantization granularity based on layer or on channel. Default: False.
-num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8.
+num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
 symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
 narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
 quant_delay (int): Quantization delay parameters according to the global step. Default: 0.

@@ -1269,7 +1268,7 @@ class MulQuant(Cell):
 Args:
 ema_decay (float): Exponential Moving Average algorithm parameter. Default: 0.999.
 per_channel (bool): Quantization granularity based on layer or on channel. Default: False.
-num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8.
+num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
 symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
 narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
 quant_delay (int): Quantization delay parameters according to the global step. Default: 0.

@@ -80,7 +80,7 @@ class L1Loss(_Loss):
 When argument reduction is 'sum', the sum of :math:`L(x, y)` will be returned. :math:`N` is the batch size.
 
 Args:
-reduction (str): Type of reduction to apply to loss. The optional values are "mean", "sum", "none".
+reduction (str): Type of reduction to be applied to loss. The optional values are "mean", "sum", and "none".
 Default: "mean".
 
 Inputs:

@@ -107,7 +107,7 @@ class L1Loss(_Loss):
 
 class MSELoss(_Loss):
 r"""
-MSELoss create a criterion to measures the mean squared error (squared L2-norm) between :math:`x` and :math:`y`
+MSELoss creates a criterion to measure the mean squared error (squared L2-norm) between :math:`x` and :math:`y`
 by element, where :math:`x` is the input and :math:`y` is the target.
 
 For simplicity, let :math:`x` and :math:`y` be 1-dimensional Tensor with length :math:`N`,

@@ -120,7 +120,7 @@ class MSELoss(_Loss):
 When argument reduction is 'sum', the sum of :math:`L(x, y)` will be returned. :math:`N` is the batch size.
 
 Args:
-reduction (str): Type of reduction to apply to loss. The optional values are "mean", "sum", "none".
+reduction (str): Type of reduction to be applied to loss. The optional values are "mean", "sum", and "none".
 Default: "mean".
 
 Inputs:
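For reference, a quick sketch of the `reduction` argument for the two losses documented above:

```python
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor

logits = Tensor(np.array([1.0, 2.0, 3.0]).astype(np.float32))
labels = Tensor(np.array([1.0, 2.0, 4.0]).astype(np.float32))

print(nn.L1Loss(reduction='mean')(logits, labels))  # mean(|x - y|)  -> 0.3333
print(nn.MSELoss(reduction='sum')(logits, labels))  # sum((x - y)^2) -> 1.0
```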
@@ -210,14 +210,14 @@ class SoftmaxCrossEntropyWithLogits(_Loss):
 
 Note:
 While the target classes are mutually exclusive, i.e., only one class is positive in the target, the predicted
-probabilities need not be exclusive. All that is required is that the predicted probability distribution
+probabilities do not need to be exclusive. It is only required that the predicted probability distribution
 of entry is a valid one.
 
 Args:
 is_grad (bool): Specifies whether calculate grad only. Default: True.
 sparse (bool): Specifies whether labels use sparse format or not. Default: False.
-reduction (Union[str, None]): Type of reduction to apply to loss. Support 'sum' or 'mean' If None,
-do not reduction. Default: None.
+reduction (Union[str, None]): Type of reduction to be applied to loss. Support 'sum' and 'mean'. If None,
+do not perform reduction. Default: None.
 smooth_factor (float): Label smoothing factor. It is a optional input which should be in range [0, 1].
 Default: 0.
 num_classes (int): The number of classes in the task. It is a optional input Default: 2.

@@ -225,7 +225,7 @@ class SoftmaxCrossEntropyWithLogits(_Loss):
 Inputs:
 - **logits** (Tensor) - Tensor of shape (N, C).
 - **labels** (Tensor) - Tensor of shape (N, ). If `sparse` is True, The type of
-`labels` is mindspore.int32. If `sparse` is False, the type of `labels` is same as the type of `logits`.
+`labels` is mindspore.int32. If `sparse` is False, the type of `labels` is the same as the type of `logits`.
 
 Outputs:
 Tensor, a tensor of the same shape as logits with the component-wise
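For reference, a sketch of the sparse-label case described above; the argument defaults follow this docstring and changed in later releases:

```python
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor

loss = nn.SoftmaxCrossEntropyWithLogits(is_grad=False, sparse=True, reduction='mean')
logits = Tensor(np.array([[2.0, 0.5, 0.1],
                          [0.1, 2.0, 0.5]]).astype(np.float32))  # shape (N, C)
labels = Tensor(np.array([0, 1]).astype(np.int32))               # class indices, shape (N,)
print(loss(logits, labels))  # scalar mean cross-entropy
```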
@@ -282,8 +282,8 @@ class SoftmaxCrossEntropyExpand(Cell):
 where :math:`x_i` is a 1D score Tensor, :math:`t_i` is the target class.
 
 Note:
-When argument sparse is set to True, the format of label is the index
-range from :math:`0` to :math:`C - 1` instead of one-hot vectors.
+When argument sparse is set to True, the format of the label is the index
+ranging from :math:`0` to :math:`C - 1` instead of one-hot vectors.
 
 Args:
 sparse(bool): Specifies whether labels use sparse format or not. Default: False.

@@ -69,7 +69,7 @@ def names():
 
 def get_metric_fn(name, *args, **kwargs):
 """
-Gets the metric method base on the input name.
+Gets the metric method based on the input name.
 
 Args:
 name (str): The name of metric method. Refer to the '__factory__'

@@ -82,7 +82,7 @@ class Metric(metaclass=ABCMeta):
 @abstractmethod
 def clear(self):
 """
-A interface describes the behavior of clearing the internal evaluation result.
+An interface describes the behavior of clearing the internal evaluation result.
 
 Note:
 All subclasses should override this interface.

@@ -92,7+92,7 @@ class Metric(metaclass=ABCMeta):
 @abstractmethod
 def eval(self):
 """
-A interface describes the behavior of computing the evaluation result.
+An interface describes the behavior of computing the evaluation result.
 
 Note:
 All subclasses should override this interface.

@@ -102,7 +102,7 @@ class Metric(metaclass=ABCMeta):
 @abstractmethod
 def update(self, *inputs):
 """
-A interface describes the behavior of updating the internal evaluation result.
+An interface describes the behavior of updating the internal evaluation result.
 
 Note:
 All subclasses should override this interface.
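For reference, a tiny custom metric sketching the three abstract methods above; the `_convert_data` helper is assumed to be provided by the `Metric` base class:

```python
from mindspore.nn.metrics import Metric

class MeanAbsoluteError(Metric):
    """Accumulates |y_pred - y| over update() calls; eval() returns the mean."""
    def __init__(self):
        super(MeanAbsoluteError, self).__init__()
        self.clear()

    def clear(self):
        self._abs_error_sum = 0.0
        self._samples_num = 0

    def update(self, *inputs):
        y_pred = self._convert_data(inputs[0])  # Tensor -> numpy array
        y = self._convert_data(inputs[1])
        self._abs_error_sum += abs(y_pred - y).sum()
        self._samples_num += y.size

    def eval(self):
        return self._abs_error_sum / self._samples_num
```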
@@ -36,8 +36,8 @@ def _update_run_op(beta1, beta2, eps, lr, weight_decay, param, m, v, gradient, d
 Update parameters.
 
 Args:
-beta1 (Tensor): The exponential decay rate for the 1st moment estimates. Should be in range (0.0, 1.0).
-beta2 (Tensor): The exponential decay rate for the 2nd moment estimates. Should be in range (0.0, 1.0).
+beta1 (Tensor): The exponential decay rate for the 1st moment estimations. Should be in range (0.0, 1.0).
+beta2 (Tensor): The exponential decay rate for the 2nd moment estimations. Should be in range (0.0, 1.0).
 eps (Tensor): Term added to the denominator to improve numerical stability. Should be greater than 0.
 lr (Tensor): Learning rate.
 weight_decay (Number): Weight decay. Should be equal to or greater than 0.

@@ -180,12 +180,12 @@ class Adam(Optimizer):
 the order will be followed in the optimizer. There are no other keys in the `dict` and the parameters
 which in the 'order_params' should be in one of group parameters.
 
-learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or graph for the learning rate.
-When the learning_rate is a Iterable or a Tensor with dimension of 1, use the dynamic learning rate, then
+learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning rate.
+When the learning_rate is an Iterable or a one-dimensional Tensor, use the dynamic learning rate, then
 the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule,
 use dynamic learning rate, the i-th learning rate will be calculated during the process of training
-according to the formula of LearningRateSchedule. When the learning_rate is a float or a Tensor with
-dimension of 0, use fixed learning rate. Other cases are not supported. The float learning rate should be
+according to the formula of LearningRateSchedule. When the learning_rate is a float or a zero-dimensional
+Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be
 equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float.
 Default: 1e-3.
 beta1 (float): The exponential decay rate for the 1st moment estimations. Should be in range (0.0, 1.0).

@@ -195,11 +195,11 @@ class Adam(Optimizer):
 eps (float): Term added to the denominator to improve numerical stability. Should be greater than 0. Default:
 1e-8.
 use_locking (bool): Whether to enable a lock to protect updating variable tensors.
-If True, updating of the var, m, and v tensors will be protected by a lock.
-If False, the result is unpredictable. Default: False.
+If true, updates of the var, m, and v tensors will be protected by a lock.
+If false, the result is unpredictable. Default: False.
 use_nesterov (bool): Whether to use Nesterov Accelerated Gradient (NAG) algorithm to update the gradients.
-If True, update the gradients using NAG.
-If False, update the gradients without using NAG. Default: False.
+If true, update the gradients using NAG.
+If false, update the gradients without using NAG. Default: False.
 weight_decay (float): Weight decay (L2 penalty). It should be equal to or greater than 0. Default: 0.0.
 loss_scale (float): A floating point value for the loss scale. Should be greater than 0. Default: 1.0.
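For reference, a short sketch of wiring `nn.Adam` into a training cell; the network and loss below are placeholders:

```python
import mindspore.nn as nn

net = nn.Dense(16, 10)  # placeholder network
loss_fn = nn.SoftmaxCrossEntropyWithLogits(sparse=True)
optimizer = nn.Adam(params=net.trainable_params(), learning_rate=1e-3,
                    beta1=0.9, beta2=0.999, eps=1e-8, weight_decay=0.0)

train_net = nn.TrainOneStepCell(nn.WithLossCell(net, loss_fn), optimizer)
train_net.set_train()
# each call train_net(data, label) runs one forward/backward/update step
```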
@ -304,12 +304,12 @@ class AdamWeightDecay(Optimizer):
|
|||
the order will be followed in the optimizer. There are no other keys in the `dict` and the parameters
|
||||
which in the 'order_params' should be in one of group parameters.
|
||||
|
||||
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or graph for the learning rate.
|
||||
When the learning_rate is a Iterable or a Tensor with dimension of 1, use the dynamic learning rate, then
|
||||
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning rate.
|
||||
When the learning_rate is an Iterable or a one-dimensional Tensor, use the dynamic learning rate, then
|
||||
the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule,
|
||||
use dynamic learning rate, the i-th learning rate will be calculated during the process of training
|
||||
according to the formula of LearningRateSchedule. When the learning_rate is a float or a Tensor with
|
||||
dimension of 0, use fixed learning rate. Other cases are not supported. The float learning rate should be
|
||||
according to the formula of LearningRateSchedule. When the learning_rate is a float or a zero-dimensional
|
||||
Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be
|
||||
equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float.
|
||||
Default: 1e-3.
|
||||
beta1 (float): The exponential decay rate for the 1st moment estimations. Default: 0.9.
|
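The group-parameter and 'order_params' behaviour described above can be illustrated with a hedged sketch; `net` and the 'conv'-based grouping rule are assumptions made only for this example.

```python
# Hedged sketch of grouped parameters with 'order_params'; `net` is assumed to exist.
from mindspore import nn

conv_params = [p for p in net.trainable_params() if 'conv' in p.name]
other_params = [p for p in net.trainable_params() if 'conv' not in p.name]

group_params = [
    {'params': conv_params, 'weight_decay': 0.01},   # this group gets its own weight decay
    {'params': other_params},                        # this group uses the optimizer defaults
    {'order_params': net.trainable_params()},        # fixes the parameter order in the optimizer
]
optimizer = nn.AdamWeightDecay(group_params, learning_rate=1e-3)
```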
||||
|
|
|
@ -114,12 +114,12 @@ class FTRL(Optimizer):
|
|||
than or equal to zero. Use fixed learning rate if lr_power is zero. Default: -0.5.
|
||||
l1 (float): l1 regularization strength, must be greater than or equal to zero. Default: 0.0.
|
||||
l2 (float): l2 regularization strength, must be greater than or equal to zero. Default: 0.0.
|
||||
use_locking (bool): If True use locks for update operation. Default: False.
|
||||
use_locking (bool): If True, use locks for the update operation. Default: False.
|
||||
loss_scale (float): Value for the loss scale. It should be equal to or greater than 1.0. Default: 1.0.
|
||||
weight_decay (float): Weight decay value to multiply weight, must be zero or positive value. Default: 0.0.
|
||||
|
||||
Inputs:
|
||||
- **grads** (tuple[Tensor]) - The gradients of `params` in optimizer, the shape is as same as the `params`
|
||||
- **grads** (tuple[Tensor]) - The gradients of `params` in the optimizer, the shape is the same as the `params`
|
||||
in optimizer.
|
||||
|
||||
Outputs:
|
||||
|
|
|
@ -39,8 +39,8 @@ def _update_run_op(beta1, beta2, eps, global_step, lr, weight_decay, param, m, v
|
|||
Update parameters.
|
||||
|
||||
Args:
|
||||
beta1 (Tensor): The exponential decay rate for the 1st moment estimates. Should be in range (0.0, 1.0).
|
||||
beta2 (Tensor): The exponential decay rate for the 2nd moment estimates. Should be in range (0.0, 1.0).
|
||||
beta1 (Tensor): The exponential decay rate for the 1st moment estimations. Should be in range (0.0, 1.0).
|
||||
beta2 (Tensor): The exponential decay rate for the 2nd moment estimations. Should be in range (0.0, 1.0).
|
||||
eps (Tensor): Term added to the denominator to improve numerical stability. Should be greater than 0.
|
||||
lr (Tensor): Learning rate.
|
||||
weight_decay (Number): Weight decay. Should be equal to or greater than 0.
|
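As a rough orientation for what these tensors parameterize, the AdamWeightDecay-style step can be sketched in NumPy as below; this is an illustration of the formula, not the fused kernel.

```python
# Rough NumPy illustration of the documented quantities; not the actual implementation.
import numpy as np

def adam_weight_decay_step(param, m, v, gradient, lr, beta1, beta2, eps, weight_decay):
    m = beta1 * m + (1.0 - beta1) * gradient              # 1st moment estimate
    v = beta2 * v + (1.0 - beta2) * gradient * gradient   # 2nd moment estimate
    update = m / (np.sqrt(v) + eps)                       # eps keeps the denominator stable
    update = update + weight_decay * param                # decoupled weight decay
    return param - lr * update, m, v
```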
||||
|
@ -122,8 +122,8 @@ def _update_run_op_graph_kernel(beta1, beta2, eps, global_step, lr, weight_decay
|
|||
Update parameters.
|
||||
|
||||
Args:
|
||||
beta1 (Tensor): The exponential decay rate for the 1st moment estimates. Should be in range (0.0, 1.0).
|
||||
beta2 (Tensor): The exponential decay rate for the 2nd moment estimates. Should be in range (0.0, 1.0).
|
||||
beta1 (Tensor): The exponential decay rate for the 1st moment estimations. Should be in range (0.0, 1.0).
|
||||
beta2 (Tensor): The exponential decay rate for the 2nd moment estimations. Should be in range (0.0, 1.0).
|
||||
eps (Tensor): Term added to the denominator to improve numerical stability. Should be greater than 0.
|
||||
lr (Tensor): Learning rate.
|
||||
weight_decay (Number): Weight decay. Should be equal to or greater than 0.
|
||||
|
@ -184,7 +184,7 @@ def _check_param_value(beta1, beta2, eps, prim_name):
|
|||
|
||||
class Lamb(Optimizer):
|
||||
"""
|
||||
Lamb Dynamic LR.
|
||||
Lamb Dynamic Learning Rate.
|
||||
|
||||
LAMB is an optimization algorithm employing a layerwise adaptive large batch
|
||||
optimization technique. Refer to the paper `LARGE BATCH OPTIMIZATION FOR DEEP LEARNING: TRAINING BERT IN 76
|
||||
|
@ -214,16 +214,16 @@ class Lamb(Optimizer):
|
|||
the order will be followed in optimizer. There are no other keys in the `dict` and the parameters which
|
||||
in the value of 'order_params' should be in one of group parameters.
|
||||
|
||||
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or graph for the learning rate.
|
||||
When the learning_rate is a Iterable or a Tensor with dimension of 1, use dynamic learning rate, then
|
||||
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning rate.
|
||||
When the learning_rate is an Iterable or a one-dimensional Tensor, use dynamic learning rate, then
|
||||
the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule,
|
||||
use dynamic learning rate, the i-th learning rate will be calculated during the process of training
|
||||
according to the formula of LearningRateSchedule. When the learning_rate is a float or a Tensor with
|
||||
dimension of 0, use fixed learning rate. Other cases are not supported. The float learning rate should be
|
||||
according to the formula of LearningRateSchedule. When the learning_rate is a float or a zero-dimensional
|
||||
Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be
|
||||
equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float.
|
||||
beta1 (float): The exponential decay rate for the 1st moment estimates. Default: 0.9.
|
||||
beta1 (float): The exponential decay rate for the 1st moment estimations. Default: 0.9.
|
||||
Should be in range (0.0, 1.0).
|
||||
beta2 (float): The exponential decay rate for the 2nd moment estimates. Default: 0.999.
|
||||
beta2 (float): The exponential decay rate for the 2nd moment estimations. Default: 0.999.
|
||||
Should be in range (0.0, 1.0).
|
||||
eps (float): Term added to the denominator to improve numerical stability. Default: 1e-6.
|
||||
Should be greater than 0.
|
||||
|
|
|
@ -58,12 +58,12 @@ class LARS(Optimizer):
|
|||
epsilon (float): Term added to the denominator to improve numerical stability. Default: 1e-05.
|
||||
coefficient (float): Trust coefficient for calculating the local learning rate. Default: 0.001.
|
||||
use_clip (bool): Whether to use clip operation for calculating the local learning rate. Default: False.
|
||||
lars_filter (Function): A function to determine whether apply lars algorithm. Default:
|
||||
lars_filter (Function): A function to determine whether apply the LARS algorithm. Default:
|
||||
lambda x: 'LayerNorm' not in x.name and 'bias' not in x.name.
|
||||
|
||||
Inputs:
|
||||
- **gradients** (tuple[Tensor]) - The gradients of `params` in optimizer, the shape is
|
||||
as same as the `params` in optimizer.
|
||||
- **gradients** (tuple[Tensor]) - The gradients of `params` in the optimizer, the shape is the same
|
||||
as the `params` in the optimizer.
|
||||
|
||||
Outputs:
|
||||
Union[Tensor[bool], tuple[Parameter]], it depends on the output of `optimizer`.
|
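A hedged sketch of how LARS wraps a base optimizer, using the default `lars_filter` quoted above; `net` is assumed to exist and the exact keyword set may vary by version.

```python
# Hedged sketch: LARS wraps an existing optimizer; `net` is an assumed network.
from mindspore import nn

base_opt = nn.Momentum(net.trainable_params(), learning_rate=0.1, momentum=0.9)
optimizer = nn.LARS(base_opt, epsilon=1e-05, coefficient=0.001,
                    lars_filter=lambda x: 'LayerNorm' not in x.name and 'bias' not in x.name)
```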
||||
|
|
|
@ -127,26 +127,26 @@ class LazyAdam(Optimizer):
|
|||
the order will be followed in optimizer. There are no other keys in the `dict` and the parameters which
|
||||
in the value of 'order_params' should be in one of group parameters.
|
||||
|
||||
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or graph for the learning rate.
|
||||
When the learning_rate is a Iterable or a Tensor with dimension of 1, use dynamic learning rate, then
|
||||
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning rate.
|
||||
When the learning_rate is an Iterable or a one-dimensional Tensor, use dynamic learning rate, then
|
||||
the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule,
|
||||
use dynamic learning rate, the i-th learning rate will be calculated during the process of training
|
||||
according to the formula of LearningRateSchedule. When the learning_rate is a float or a Tensor with
|
||||
dimension of 0, use fixed learning rate. Other cases are not supported. The float learning rate should be
|
||||
according to the formula of LearningRateSchedule. When the learning_rate is a float or a zero-dimensional
|
||||
Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be
|
||||
equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float.
|
||||
Default: 1e-3.
|
||||
beta1 (float): The exponential decay rate for the 1st moment estimates. Should be in range (0.0, 1.0). Default:
|
||||
0.9.
|
||||
beta2 (float): The exponential decay rate for the 2nd moment estimates. Should be in range (0.0, 1.0). Default:
|
||||
0.999.
|
||||
beta1 (float): The exponential decay rate for the 1st moment estimations. Should be in range (0.0, 1.0).
|
||||
Default: 0.9.
|
||||
beta2 (float): The exponential decay rate for the 2nd moment estimations. Should be in range (0.0, 1.0).
|
||||
Default: 0.999.
|
||||
eps (float): Term added to the denominator to improve numerical stability. Should be greater than 0. Default:
|
||||
1e-8.
|
||||
use_locking (bool): Whether to enable a lock to protect updating variable tensors.
|
||||
If True, updating of the var, m, and v tensors will be protected by a lock.
|
||||
If False, the result is unpredictable. Default: False.
|
||||
If true, updates of the var, m, and v tensors will be protected by a lock.
|
||||
If false, the result is unpredictable. Default: False.
|
||||
use_nesterov (bool): Whether to use Nesterov Accelerated Gradient (NAG) algorithm to update the gradients.
|
||||
If True, updates the gradients using NAG.
|
||||
If False, updates the gradients without using NAG. Default: False.
|
||||
If true, update the gradients using NAG.
|
||||
If false, update the gradients without using NAG. Default: False.
|
||||
weight_decay (float): Weight decay (L2 penalty). Default: 0.0.
|
||||
loss_scale (float): A floating point value for the loss scale. Should be equal to or greater than 1. Default:
|
||||
1.0.
|
||||
|
|
|
@ -83,12 +83,12 @@ class Momentum(Optimizer):
|
|||
the order will be followed in optimizer. There are no other keys in the `dict` and the parameters which
|
||||
in the value of 'order_params' should be in one of group parameters.
|
||||
|
||||
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or graph for the learning rate.
|
||||
When the learning_rate is a Iterable or a Tensor with dimension of 1, use dynamic learning rate, then
|
||||
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning rate.
|
||||
When the learning_rate is an Iterable or a one-dimensional Tensor, use dynamic learning rate, then
|
||||
the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule,
|
||||
use dynamic learning rate, the i-th learning rate will be calculated during the process of training
|
||||
according to the formula of LearningRateSchedule. When the learning_rate is a float or a Tensor with
|
||||
dimension of 0, use fixed learning rate. Other cases are not supported. The float learning rate should be
|
||||
according to the formula of LearningRateSchedule. When the learning_rate is a float or a zero-dimensional
|
||||
Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be
|
||||
equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float.
|
||||
momentum (float): Hyperparameter of type float, means momentum for the moving average.
|
||||
It should be at least 0.0.
|
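A minimal, hedged sketch of plugging Momentum into a training setup; `Net` is an assumed user-defined network, and import paths may differ slightly across versions.

```python
# Hedged training setup around Momentum; `Net` is a hypothetical user-defined Cell.
from mindspore import nn
from mindspore.train import Model

net = Net()
loss = nn.SoftmaxCrossEntropyWithLogits()
optimizer = nn.Momentum(net.trainable_params(), learning_rate=0.1, momentum=0.9)
model = Model(net, loss_fn=loss, optimizer=optimizer)
```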
||||
|
|
|
@ -40,8 +40,6 @@ class Optimizer(Cell):
|
|||
"""
|
||||
Base class for all optimizers.
|
||||
|
||||
This class defines the API to add Ops to train a model.
|
||||
|
||||
Note:
|
||||
This class defines the API to add Ops to train a model. Never use
|
||||
this class directly, but instead instantiate one of its subclasses.
|
||||
|
@ -55,12 +53,12 @@ class Optimizer(Cell):
|
|||
To improve parameter groups performance, the customized order of parameters can be supported.
|
||||
|
||||
Args:
|
||||
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or graph for the learning
|
||||
rate. When the learning_rate is a Iterable or a Tensor with dimension of 1, use dynamic learning rate, then
|
||||
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning
|
||||
rate. When the learning_rate is an Iterable or a one-dimensional Tensor, use dynamic learning rate, then
|
||||
the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule,
|
||||
use dynamic learning rate, the i-th learning rate will be calculated during the process of training
|
||||
according to the formula of LearningRateSchedule. When the learning_rate is a float or a Tensor with
|
||||
dimension of 0, use fixed learning rate. Other cases are not supported. The float learning rate should be
|
||||
according to the formula of LearningRateSchedule. When the learning_rate is a float or a zero-dimensional
|
||||
Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be
|
||||
equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float.
|
||||
parameters (Union[list[Parameter], list[dict]]): When the `parameters` is a list of `Parameter` which will be
|
||||
updated, the element in `parameters` should be class `Parameter`. When the `parameters` is a list of `dict`,
|
||||
|
@ -84,8 +82,8 @@ class Optimizer(Cell):
|
|||
type of `loss_scale` input is int, it will be converted to float. Default: 1.0.
|
||||
|
||||
Raises:
|
||||
ValueError: If the learning_rate is a Tensor, but the dims of tensor is greater than 1.
|
||||
TypeError: If the learning_rate is not any of the three types: float, Tensor, Iterable.
|
||||
ValueError: If the learning_rate is a Tensor, but the dimension of tensor is greater than 1.
|
||||
TypeError: If the learning_rate is not one of the three types: float, Tensor, or Iterable.
|
||||
"""
|
||||
|
||||
def __init__(self, learning_rate, parameters, weight_decay=0.0, loss_scale=1.0):
|
||||
|
@ -179,7 +177,7 @@ class Optimizer(Cell):
|
|||
An approach to reduce the overfitting of a deep learning neural network model.
|
||||
|
||||
Args:
|
||||
gradients (tuple[Tensor]): The gradients of `self.parameters`, and have the same shape with
|
||||
gradients (tuple[Tensor]): The gradients of `self.parameters`, and have the same shape as
|
||||
`self.parameters`.
|
||||
|
||||
Returns:
|
||||
|
@ -204,7 +202,7 @@ class Optimizer(Cell):
|
|||
network.
|
||||
|
||||
Args:
|
||||
gradients (tuple[Tensor]): The gradients of `self.parameters`, and have the same shape with
|
||||
gradients (tuple[Tensor]): The gradients of `self.parameters`, and have the same shape as
|
||||
`self.parameters`.
|
||||
|
||||
Returns:
|
||||
|
|
|
@ -87,22 +87,22 @@ class ProximalAdagrad(Optimizer):
|
|||
in the value of 'order_params' should be in one of group parameters.
|
||||
|
||||
accum (float): The starting value for accumulators, must be zero or positive values. Default: 0.1.
|
||||
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or graph for the learning rate.
|
||||
When the learning_rate is a Iterable or a Tensor with dimension of 1, use dynamic learning rate, then
|
||||
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning rate.
|
||||
When the learning_rate is an Iterable or a one-dimensional Tensor, use dynamic learning rate, then
|
||||
the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule,
|
||||
use dynamic learning rate, the i-th learning rate will be calculated during the process of training
|
||||
according to the formula of LearningRateSchedule. When the learning_rate is a float or a Tensor with
|
||||
dimension of 0, use fixed learning rate. Other cases are not supported. The float learning rate should be
|
||||
according to the formula of LearningRateSchedule. When the learning_rate is a float or a zero-dimensional
|
||||
Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be
|
||||
equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float.
|
||||
Default: 0.001.
|
||||
l1 (float): l1 regularization strength, must be greater than or equal to zero. Default: 0.0.
|
||||
l2 (float): l2 regularization strength, must be greater than or equal to zero. Default: 0.0.
|
||||
use_locking (bool): If True use locks for update operation. Default: False.
|
||||
use_locking (bool): If True, use locks for the update operation. Default: False.
|
||||
loss_scale (float): Value for the loss scale. It should be greater than 0.0. Default: 1.0.
|
||||
weight_decay (float): Weight decay value to multiply weight, must be zero or positive value. Default: 0.0.
|
||||
|
||||
Inputs:
|
||||
- **grads** (tuple[Tensor]) - The gradients of `params` in optimizer, the shape is as same as the `params`
|
||||
- **grads** (tuple[Tensor]) - The gradients of `params` in the optimizer, the shape is the same as the `params`
|
||||
in optimizer.
|
||||
|
||||
Outputs:
|
||||
|
|
|
@ -106,12 +106,12 @@ class RMSProp(Optimizer):
|
|||
the order will be followed in optimizer. There are no other keys in the `dict` and the parameters which
|
||||
in the value of 'order_params' should be in one of group parameters.
|
||||
|
||||
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or graph for the learning rate.
|
||||
When the learning_rate is a Iterable or a Tensor with dimension of 1, use dynamic learning rate, then
|
||||
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning rate.
|
||||
When the learning_rate is an Iterable or a one-dimensional Tensor, use dynamic learning rate, then
|
||||
the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule,
|
||||
use dynamic learning rate, the i-th learning rate will be calculated during the process of training
|
||||
according to the formula of LearningRateSchedule. When the learning_rate is a float or a Tensor with
|
||||
dimension of 0, use fixed learning rate. Other cases are not supported. The float learning rate should be
|
||||
according to the formula of LearningRateSchedule. When the learning_rate is a float or a zero-dimensional
|
||||
Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be
|
||||
equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float.
|
||||
Default: 0.1.
|
||||
decay (float): Decay rate. Should be equal to or greater than 0. Default: 0.9.
|
||||
|
|
|
@ -78,12 +78,12 @@ class SGD(Optimizer):
|
|||
the order will be followed in optimizer. There are no other keys in the `dict` and the parameters which
|
||||
in the value of 'order_params' should be in one of group parameters.
|
||||
|
||||
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or graph for the learning rate.
|
||||
When the learning_rate is a Iterable or a Tensor with dimension of 1, use dynamic learning rate, then
|
||||
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning rate.
|
||||
When the learning_rate is an Iterable or a one-dimensional Tensor, use dynamic learning rate, then
|
||||
the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule,
|
||||
use dynamic learning rate, the i-th learning rate will be calculated during the process of training
|
||||
according to the formula of LearningRateSchedule. When the learning_rate is a float or a Tensor with
|
||||
dimension of 0, use fixed learning rate. Other cases are not supported. The float learning rate should be
|
||||
according to the formula of LearningRateSchedule. When the learning_rate is a float or a zero-dimensional
|
||||
Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be
|
||||
equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float.
|
||||
Default: 0.1.
|
||||
momentum (float): A floating point value for the momentum. It should be at least 0.0. Default: 0.0.
|
||||
|
|
|
@ -138,9 +138,9 @@ class TrainOneStepCell(Cell):
|
|||
r"""
|
||||
Network training package class.
|
||||
|
||||
Wraps the network with an optimizer. The resulting Cell be trained with input *inputs.
|
||||
Backward graph will be created in the construct function to do parameter updating. Different
|
||||
parallel modes are available to run the training.
|
||||
Wraps the network with an optimizer. The resulting Cell is trained with input *inputs.
|
||||
The backward graph will be created in the construct function to update the parameter. Different
|
||||
parallel modes are available for training.
|
||||
|
||||
Args:
|
||||
network (Cell): The training network.
|
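A hedged sketch of the wrapping described above; `net`, `loss_fn`, `data`, and `label` are assumed to already exist.

```python
# Hedged sketch; `net`, `loss_fn`, `data`, and `label` are assumed to exist.
from mindspore import nn

loss_net = nn.WithLossCell(net, loss_fn)              # attach the loss to the network
optimizer = nn.Momentum(net.trainable_params(), learning_rate=0.1, momentum=0.9)
train_net = nn.TrainOneStepCell(loss_net, optimizer)  # backward graph is built in construct
train_net.set_train()
loss = train_net(data, label)                         # one optimization step
```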
||||
|
@ -231,14 +231,14 @@ class DataWrapper(Cell):
|
|||
|
||||
class GetNextSingleOp(Cell):
|
||||
"""
|
||||
Cell to run get next operation.
|
||||
Cell to run the GetNext operation.
|
||||
|
||||
Args:
|
||||
dataset_types (list[:class:`mindspore.dtype`]): The types of dataset.
|
||||
dataset_shapes (list[tuple[int]]): The shapes of dataset.
|
||||
queue_name (str): Queue name to fetch the data.
|
||||
|
||||
Detailed information, please refer to `ops.operations.GetNext`.
|
||||
For detailed information, refer to `ops.operations.GetNext`.
|
||||
"""
|
||||
|
||||
def __init__(self, dataset_types, dataset_shapes, queue_name):
|
||||
|
@ -360,7 +360,7 @@ class ParameterUpdate(Cell):
|
|||
param (Parameter): The parameter to be updated manually.
|
||||
|
||||
Raises:
|
||||
KeyError: If parameter with the specified name do not exist.
|
||||
KeyError: If parameter with the specified name does not exist.
|
||||
|
||||
Examples:
|
||||
>>> network = Net()
|
||||
|
|
|
@ -329,7 +329,7 @@ class DistributedGradReducer(Cell):
|
|||
|
||||
def construct(self, grads):
|
||||
"""
|
||||
In some circumstances, the data precision of grads could be mixed with float16 and float32. Thus, the
|
||||
Under certain circumstances, the data precision of grads could be mixed with float16 and float32. Thus, the
|
||||
result of AllReduce is unreliable. To solve the problem, grads should be cast to float32 before AllReduce,
|
||||
and cast back after the operation.
|
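The cast-before-AllReduce idea can be sketched as follows; this is illustrative only and assumes distributed communication has already been initialized.

```python
# Illustrative sketch of the cast-AllReduce-cast pattern described above;
# assumes distributed communication is already initialized.
import mindspore as ms
from mindspore.ops import functional as F
from mindspore.ops import operations as P

all_reduce = P.AllReduce()

def reduce_grad(grad):
    origin_dtype = F.dtype(grad)
    reduced = all_reduce(F.cast(grad, ms.float32))   # do the AllReduce in float32
    return F.cast(reduced, origin_dtype)             # cast back to the original precision
```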
||||
|
||||
|
|
|
@ -54,8 +54,8 @@ class DynamicLossScaleUpdateCell(Cell):
|
|||
Dynamic Loss scale update cell.
|
||||
|
||||
For loss scaling training, the initial loss scaling value will be set to be `loss_scale_value`.
|
||||
In every training step, the loss scaling value will be updated by loss scaling value/`scale_factor`
|
||||
when there is overflow. And it will be increased by loss scaling value * `scale_factor` if there is no
|
||||
In each training step, the loss scaling value will be updated by loss scaling value/`scale_factor`
|
||||
when there is an overflow. And it will be increased by loss scaling value * `scale_factor` if there is no
|
||||
overflow for a continuous `scale_window` steps. This cell is used for Graph mode training in which all
|
||||
logic will be executed on device side(Another training mode is normal(non-sink) mode in which some logic will be
|
||||
executed on host).
|
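The loss-scale update rule described above, sketched in plain Python; the argument names mirror the documented `scale_factor` and `scale_window`, and the floor of 1 on the scale is an assumption of the sketch.

```python
# Plain-Python sketch of the dynamic loss scale update; illustrative only.
def update_loss_scale(scale, overflow, good_steps, scale_factor=2, scale_window=1000):
    if overflow:
        scale = max(scale / scale_factor, 1)   # shrink the scale when an overflow occurs
        good_steps = 0
    else:
        good_steps += 1
        if good_steps >= scale_window:         # grow after scale_window clean steps in a row
            scale *= scale_factor
            good_steps = 0
    return scale, good_steps
```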
||||
|
@ -133,7 +133,7 @@ class FixedLossScaleUpdateCell(Cell):
|
|||
"""
|
||||
Static scale update cell, the loss scaling value will not be updated.
|
||||
|
||||
For usage please refer to `DynamicLossScaleUpdateCell`.
|
||||
For usage, refer to `DynamicLossScaleUpdateCell`.
|
||||
|
||||
Args:
|
||||
loss_scale_value (float): Init loss scale.
|
||||
|
|
|
@ -57,7 +57,7 @@ class _TupleGetItemTensor(base.TupleGetItemTensor_):
|
|||
data (tuple): A tuple of items.
|
||||
index (Tensor): The index in tensor.
|
||||
Outputs:
|
||||
Type, is same as the element type of data.
|
||||
Type, is the same as the element type of data.
|
||||
"""
|
||||
|
||||
def __init__(self, name):
|
||||
|
@ -81,7 +81,7 @@ def _tuple_getitem_by_number(data, number_index):
|
|||
number_index (Number): Index in scalar.
|
||||
|
||||
Outputs:
|
||||
Type, is same as the element type of data.
|
||||
Type, is the same as the element type of data.
|
||||
"""
|
||||
return F.tuple_getitem(data, number_index)
|
||||
|
||||
|
@ -96,7 +96,7 @@ def _tuple_getitem_by_slice(data, slice_index):
|
|||
slice_index (Slice): Index in slice.
|
||||
|
||||
Outputs:
|
||||
Tuple, element type is same as the element type of data.
|
||||
Tuple, element type is the same as the element type of data.
|
||||
"""
|
||||
return _tuple_slice(data, slice_index)
|
||||
|
||||
|
@ -111,7 +111,7 @@ def _tuple_getitem_by_tensor(data, tensor_index):
|
|||
tensor_index (Tensor): Index to select item.
|
||||
|
||||
Outputs:
|
||||
Type, is same as the element type of data.
|
||||
Type, is the same as the element type of data.
|
||||
"""
|
||||
return _tuple_get_item_tensor(data, tensor_index)
|
||||
|
||||
|
@ -126,7 +126,7 @@ def _list_getitem_by_number(data, number_index):
|
|||
number_index (Number): Index in scalar.
|
||||
|
||||
Outputs:
|
||||
Type is same as the element type of data.
|
||||
Type is the same as the element type of data.
|
||||
"""
|
||||
return F.list_getitem(data, number_index)
|
||||
|
||||
|
@ -186,7 +186,7 @@ def _tensor_getitem_by_slice(data, slice_index):
|
|||
slice_index (Slice): Index in slice.
|
||||
|
||||
Outputs:
|
||||
Tensor, element type is same as the element type of data.
|
||||
Tensor, element type is the same as the element type of data.
|
||||
"""
|
||||
return compile_utils.tensor_index_by_slice(data, slice_index)
|
||||
|
||||
|
@ -201,7 +201,7 @@ def _tensor_getitem_by_tensor(data, tensor_index):
|
|||
tensor_index (Tensor): An index expressed by tensor.
|
||||
|
||||
Outputs:
|
||||
Tensor, element type is same as the element type of data.
|
||||
Tensor, element type is the same as the element type of data.
|
||||
"""
|
||||
return compile_utils.tensor_index_by_tensor(data, tensor_index)
|
||||
|
||||
|
@ -216,7 +216,7 @@ def _tensor_getitem_by_tuple(data, tuple_index):
|
|||
tuple_index (tuple): Index in tuple.
|
||||
|
||||
Outputs:
|
||||
Tensor, element type is same as the element type of data.
|
||||
Tensor, element type is the same as the element type of data.
|
||||
"""
|
||||
return compile_utils.tensor_index_by_tuple(data, tuple_index)
|
||||
|
||||
|
|
|
@ -32,7 +32,7 @@ def _list_setitem_with_string(data, number_index, value):
|
|||
number_index (Number): Index of data.
|
||||
|
||||
Outputs:
|
||||
list, type is same as the element type of data.
|
||||
list, type is the same as the element type of data.
|
||||
"""
|
||||
return F.list_setitem(data, number_index, value)
|
||||
|
||||
|
@ -48,7 +48,7 @@ def _list_setitem_with_number(data, number_index, value):
|
|||
value (Number): Value given.
|
||||
|
||||
Outputs:
|
||||
list, type is same as the element type of data.
|
||||
list, type is the same as the element type of data.
|
||||
"""
|
||||
return F.list_setitem(data, number_index, value)
|
||||
|
||||
|
@ -64,7 +64,7 @@ def _list_setitem_with_Tensor(data, number_index, value):
|
|||
value (Tensor): Value given.
|
||||
|
||||
Outputs:
|
||||
list, type is same as the element type of data.
|
||||
list, type is the same as the element type of data.
|
||||
"""
|
||||
return F.list_setitem(data, number_index, value)
|
||||
|
||||
|
@ -80,7 +80,7 @@ def _list_setitem_with_List(data, number_index, value):
|
|||
value (list): Value given.
|
||||
|
||||
Outputs:
|
||||
list, type is same as the element type of data.
|
||||
list, type is the same as the element type of data.
|
||||
"""
|
||||
return F.list_setitem(data, number_index, value)
|
||||
|
||||
|
@ -96,7 +96,7 @@ def _list_setitem_with_Tuple(data, number_index, value):
|
|||
value (list): Value given.
|
||||
|
||||
Outputs:
|
||||
list, type is same as the element type of data.
|
||||
list, type is the same as the element type of data.
|
||||
"""
|
||||
return F.list_setitem(data, number_index, value)
|
||||
|
||||
|
|
|
@ -158,18 +158,18 @@ class ExtractImagePatches(PrimitiveWithInfer):
|
|||
The input tensor must be a 4-D tensor and the data format is NHWC.
|
||||
|
||||
Args:
|
||||
ksizes (Union[tuple[int], list[int]]): The size of sliding window, should be a tuple or list of int,
|
||||
ksizes (Union[tuple[int], list[int]]): The size of sliding window, should be a tuple or a list of integers,
|
||||
and the format is [1, ksize_row, ksize_col, 1].
|
||||
strides (Union[tuple[int], list[int]]): Distance between the centers of the two consecutive patches,
|
||||
should be a tuple or list of int, and the format is [1, stride_row, stride_col, 1].
|
||||
rates (Union[tuple[int], list[int]]): In each extracted patch, the gap between the corresponding dim
|
||||
pixel positions, should be a tuple or list of int, and the format is [1, rate_row, rate_col, 1].
|
||||
rates (Union[tuple[int], list[int]]): In each extracted patch, the gap between the corresponding dimension
|
||||
pixel positions, should be a tuple or a list of integers, and the format is [1, rate_row, rate_col, 1].
|
||||
padding (str): The type of padding algorithm, is a string whose value is "same" or "valid",
|
||||
not case sensitive. Default: "valid".
|
||||
|
||||
- same: Means that the patch can take the part beyond the original image, and this part is filled with 0.
|
||||
|
||||
- valid: Means that the patch area taken must be completely contained in the original image.
|
||||
- valid: Means that the taken patch area must be completely contained within the original image.
|
||||
|
||||
Inputs:
|
||||
- **input_x** (Tensor) - A 4-D tensor whose shape is [in_batch, in_row, in_col, in_depth] and
|
||||
|
@ -177,7 +177,7 @@ class ExtractImagePatches(PrimitiveWithInfer):
|
|||
|
||||
Outputs:
|
||||
Tensor, a 4-D tensor whose data type is same as 'input_x',
|
||||
and the shape is [out_batch, out_row, out_col, out_depth], the out_batch is same as the in_batch.
|
||||
and the shape is [out_batch, out_row, out_col, out_depth], the out_batch is the same as the in_batch.
|
||||
"""
|
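A hedged construction sketch following the argument formats quoted above; the window, stride, and rate values are illustrative.

```python
# Hedged sketch of constructing the operator with the documented argument formats.
from mindspore.ops import operations as P

extract = P.ExtractImagePatches(ksizes=[1, 2, 2, 1],
                                strides=[1, 1, 1, 1],
                                rates=[1, 1, 1, 1],
                                padding="valid")
# `extract` then takes a 4-D NHWC tensor as described in the Inputs section.
```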
||||
|
||||
@prim_attr_register
|
||||
|
@ -436,8 +436,8 @@ class MatrixDiag(PrimitiveWithInfer):
|
|||
Returns a batched diagonal tensor with a given batched diagonal values.
|
||||
|
||||
Inputs:
|
||||
- **x** (Tensor) - A tensor which to be element-wise multi by `assist`. It can be of the following data types:
|
||||
float32, float16, int32, int8, uint8.
|
||||
- **x** (Tensor) - A tensor to be multiplied element-wise by `assist`. It can be one of the following data
|
||||
types: float32, float16, int32, int8, and uint8.
|
||||
- **assist** (Tensor) - A eye tensor of the same type as `x`. It's rank must greater than or equal to 2 and
|
||||
it's last dimension must equal to the second to last dimension.
|
||||
|
||||
|
@ -490,7 +490,7 @@ class MatrixDiagPart(PrimitiveWithInfer):
|
|||
Returns the batched diagonal part of a batched tensor.
|
||||
|
||||
Inputs:
|
||||
- **x** (Tensor) - The batched tensor. It can be of the following data types:
|
||||
- **x** (Tensor) - The batched tensor. It can be one of the following data types:
|
||||
float32, float16, int32, int8, uint8.
|
||||
- **assist** (Tensor) - A eye tensor of the same type as `x`. With shape same as `x`.
|
||||
|
||||
|
@ -531,7 +531,7 @@ class MatrixSetDiag(PrimitiveWithInfer):
|
|||
Modify the batched diagonal part of a batched tensor.
|
||||
|
||||
Inputs:
|
||||
- **x** (Tensor) - The batched tensor. It can be of the following data types:
|
||||
- **x** (Tensor) - The batched tensor. It can be one of the following data types:
|
||||
float32, float16, int32, int8, uint8.
|
||||
- **assist** (Tensor) - A eye tensor of the same type as `x`. With shape same as `x`.
|
||||
- **diagonal** (Tensor) - The diagonal values.
|
||||
|
|
|
@ -178,8 +178,8 @@ class FakeQuantPerLayer(PrimitiveWithInfer):
|
|||
quant_delay (int): Quantization delay parameter. Before `quant_delay` training steps, the simulated
|
||||
quantization-aware function is not applied; after that many steps, the simulated quantization-aware
|
||||
function takes effect. Default: 0.
|
||||
symmetric (bool): Quantization algorithm use symmetric or not. Default: False.
|
||||
narrow_range (bool): Quantization algorithm use narrow range or not. Default: False.
|
||||
symmetric (bool): Whether the quantization algorithm is symmetric or not. Default: False.
|
||||
narrow_range (bool): Whether the quantization algorithm uses narrow range or not. Default: False.
|
||||
training (bool): Training the network or not. Default: True.
|
||||
|
||||
Inputs:
|
||||
|
@ -318,8 +318,8 @@ class FakeQuantPerChannel(PrimitiveWithInfer):
|
|||
quant_delay (int): Quantization delay parameter. Before `quant_delay` training steps, the weight data is not
|
||||
updated to simulate the quantize operation. After that many steps, the simulated quantize
|
||||
operation takes effect. Default: 0.
|
||||
symmetric (bool): Quantization algorithm use symmetric or not. Default: False.
|
||||
narrow_range (bool): Quantization algorithm use narrow range or not. Default: False.
|
||||
symmetric (bool): Whether the quantization algorithm is symmetric or not. Default: False.
|
||||
narrow_range (bool): Whether the quantization algorithm uses narrow range or not. Default: False.
|
||||
training (bool): Training the network or not. Default: True.
|
||||
channel_axis (int): Quantization by channel axis. Ascend backend only supports 0 or 1. Default: 1.
|
||||
|
||||
|
|
|
@ -3359,7 +3359,7 @@ class InplaceUpdate(PrimitiveWithInfer):
|
|||
indices (Union[int, tuple]): Indices into the left-most dimension of `x`.
|
||||
|
||||
Inputs:
|
||||
- **x** (Tensor) - A tensor which to be inplace updated. It can be of the following data types:
|
||||
- **x** (Tensor) - A tensor to be updated in place. It can be one of the following data types:
|
||||
float32, float16, int32.
|
||||
- **v** (Tensor) - A tensor of the same type as `x`. Same dimension size as `x` except
|
||||
the first dimension, which must be the same as the size of `indices`.
|
||||
|
@ -3474,7 +3474,7 @@ class TransShape(PrimitiveWithInfer):
|
|||
- **out_shape** (tuple[int]) - The shape of output data.
|
||||
|
||||
Outputs:
|
||||
Tensor, a tensor whose data type is same as 'input_x', and the shape is same as the `out_shape`.
|
||||
Tensor, a tensor whose data type is the same as 'input_x', and the shape is the same as the `out_shape`.
|
||||
"""
|
||||
@prim_attr_register
|
||||
def __init__(self):
|
||||
|
|
|
@ -31,7 +31,7 @@ class ScalarCast(PrimitiveWithInfer):
|
|||
- **input_y** (mindspore.dtype) - The type should cast to be. Only constant value is allowed.
|
||||
|
||||
Outputs:
|
||||
Scalar. The type is same as the python type corresponding to `input_y`.
|
||||
Scalar. The type is the same as the python type corresponding to `input_y`.
|
||||
|
||||
Examples:
|
||||
>>> scalar_cast = P.ScalarCast()
|
||||
|
|
|
@ -132,7 +132,7 @@ class TensorAdd(_MathBinaryOp):
|
|||
a bool when the first input is a tensor or a tensor whose data type is number or bool.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,
|
||||
Tensor, the shape is the same as the shape after broadcasting,
|
||||
and the data type is the one with high precision or high digits among the two inputs.
|
||||
|
||||
Examples:
|
||||
|
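A hedged sketch of the broadcasting behaviour referred to in the Outputs note above; the shapes are illustrative.

```python
# Hedged sketch of broadcasting with TensorAdd; shapes are illustrative.
import numpy as np
import mindspore as ms
from mindspore import Tensor
from mindspore.ops import operations as P

add = P.TensorAdd()
x = Tensor(np.ones((2, 3)), ms.float32)   # shape (2, 3)
y = Tensor(np.ones((1, 3)), ms.float32)   # shape (1, 3), broadcast against x
output = add(x, y)                        # output shape follows broadcasting: (2, 3)
```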
@ -1067,7 +1067,7 @@ class Sub(_MathBinaryOp):
|
|||
a bool when the first input is a tensor or a tensor whose data type is number or bool.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,
|
||||
Tensor, the shape is the same as the shape after broadcasting,
|
||||
and the data type is the one with high precision or high digits among the two inputs.
|
||||
|
||||
Examples:
|
||||
|
@ -1105,7 +1105,7 @@ class Mul(_MathBinaryOp):
|
|||
a bool when the first input is a tensor or a tensor whose data type is number or bool.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,
|
||||
Tensor, the shape is the same as the shape after broadcasting,
|
||||
and the data type is the one with high precision or high digits among the two inputs.
|
||||
|
||||
Examples:
|
||||
|
@ -1144,7 +1144,7 @@ class SquaredDifference(_MathBinaryOp):
|
|||
float16, float32, int32 or bool.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,
|
||||
Tensor, the shape is the same as the shape after broadcasting,
|
||||
and the data type is the one with high precision or high digits among the two inputs.
|
||||
|
||||
Examples:
|
||||
|
@ -1333,7 +1333,7 @@ class Pow(_MathBinaryOp):
|
|||
a bool when the first input is a tensor or a tensor whose data type is number or bool.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,
|
||||
Tensor, the shape is the same as the shape after broadcasting,
|
||||
and the data type is the one with high precision or high digits among the two inputs.
|
||||
|
||||
Examples:
|
||||
|
@ -1618,7 +1618,7 @@ class Minimum(_MathBinaryOp):
|
|||
a bool when the first input is a tensor or a tensor whose data type is number or bool.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,
|
||||
Tensor, the shape is the same as the shape after broadcasting,
|
||||
and the data type is the one with high precision or high digits among the two inputs.
|
||||
|
||||
Examples:
|
||||
|
@ -1656,7 +1656,7 @@ class Maximum(_MathBinaryOp):
|
|||
a bool when the first input is a tensor or a tensor whose data type is number or bool.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,
|
||||
Tensor, the shape is the same as the shape after broadcasting,
|
||||
and the data type is the one with high precision or high digits among the two inputs.
|
||||
|
||||
Examples:
|
||||
|
@ -1694,7 +1694,7 @@ class RealDiv(_MathBinaryOp):
|
|||
a bool when the first input is a tensor or a tensor whose data type is number or bool.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,
|
||||
Tensor, the shape is the same as the shape after broadcasting,
|
||||
and the data type is the one with high precision or high digits among the two inputs.
|
||||
|
||||
Examples:
|
||||
|
@ -1733,7 +1733,7 @@ class Div(_MathBinaryOp):
|
|||
is a number or a bool, the second input should be a tensor whose data type is number or bool.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,
|
||||
Tensor, the shape is the same as the shape after broadcasting,
|
||||
and the data type is the one with high precision or high digits among the two inputs.
|
||||
|
||||
Raises:
|
||||
|
@ -1772,7 +1772,7 @@ class DivNoNan(_MathBinaryOp):
|
|||
a bool when the first input is a tensor or a tensor whose data type is number or bool.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,
|
||||
Tensor, the shape is the same as the shape after broadcasting,
|
||||
and the data type is the one with high precision or high digits among the two inputs.
|
||||
|
||||
Raises:
|
||||
|
@ -1814,7 +1814,7 @@ class FloorDiv(_MathBinaryOp):
|
|||
a bool when the first input is a tensor or a tensor whose data type is number or bool.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,
|
||||
Tensor, the shape is the same as the shape after broadcasting,
|
||||
and the data type is the one with high precision or high digits among the two inputs.
|
||||
|
||||
Examples:
|
||||
|
@ -1844,7 +1844,7 @@ class TruncateDiv(_MathBinaryOp):
|
|||
a bool when the first input is a tensor or a tensor whose data type is number or bool.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,
|
||||
Tensor, the shape is the same as the shape after broadcasting,
|
||||
and the data type is the one with high precision or high digits among the two inputs.
|
||||
|
||||
Examples:
|
||||
|
@ -1873,7 +1873,7 @@ class TruncateMod(_MathBinaryOp):
|
|||
a bool when the first input is a tensor or a tensor whose data type is number or bool.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,
|
||||
Tensor, the shape is the same as the shape after broadcasting,
|
||||
and the data type is the one with high precision or high digits among the two inputs.
|
||||
|
||||
Examples:
|
||||
|
@ -1900,7 +1900,7 @@ class Mod(_MathBinaryOp):
|
|||
the second input should be a tensor whose data type is number.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,
|
||||
Tensor, the shape is the same as the shape after broadcasting,
|
||||
and the data type is the one with high precision or high digits among the two inputs.
|
||||
|
||||
Raises:
|
||||
|
@ -1967,7 +1967,7 @@ class FloorMod(_MathBinaryOp):
|
|||
a bool when the first input is a tensor or a tensor whose data type is number or bool.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,
|
||||
Tensor, the shape is the same as the shape after broadcasting,
|
||||
and the data type is the one with high precision or high digits among the two inputs.
|
||||
|
||||
Examples:
|
||||
|
@ -2025,7 +2025,7 @@ class Xdivy(_MathBinaryOp):
|
|||
a bool when the first input is a tensor or a tensor whose data type is float16, float32 or bool.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,
|
||||
Tensor, the shape is the same as the shape after broadcasting,
|
||||
and the data type is the one with high precision or high digits among the two inputs.
|
||||
|
||||
Examples:
|
||||
|
@ -2059,7 +2059,7 @@ class Xlogy(_MathBinaryOp):
|
|||
The value must be positive.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,
|
||||
Tensor, the shape is the same as the shape after broadcasting,
|
||||
and the data type is the one with high precision or high digits among the two inputs.
|
||||
|
||||
Examples:
|
||||
|
@ -2219,7 +2219,7 @@ class Equal(_LogicBinaryOp):
|
|||
a bool when the first input is a tensor or a tensor whose data type is number or bool.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,and the data type is bool.
|
||||
Tensor, the shape is the same as the shape after broadcasting, and the data type is bool.
|
||||
|
||||
Examples:
|
||||
>>> input_x = Tensor(np.array([1, 2, 3]), mindspore.float32)
|
||||
|
@ -2250,7 +2250,7 @@ class ApproximateEqual(_LogicBinaryOp):
|
|||
- **x2** (Tensor) - A tensor of the same type and shape as 'x1'.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape of 'x1', and the data type is bool.
|
||||
Tensor, the shape is the same as the shape of 'x1', and the data type is bool.
|
||||
|
||||
Examples:
|
||||
>>> x1 = Tensor(np.array([1, 2, 3]), mindspore.float32)
|
||||
|
@ -2328,7 +2328,7 @@ class NotEqual(_LogicBinaryOp):
|
|||
a bool when the first input is a tensor or a tensor whose data type is number or bool.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,and the data type is bool.
|
||||
Tensor, the shape is the same as the shape after broadcasting, and the data type is bool.
|
||||
|
||||
Examples:
|
||||
>>> input_x = Tensor(np.array([1, 2, 3]), mindspore.float32)
|
||||
|
@ -2364,7 +2364,7 @@ class Greater(_LogicBinaryOp):
|
|||
a bool when the first input is a tensor or a tensor whose data type is number or bool.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,and the data type is bool.
|
||||
Tensor, the shape is the same as the shape after broadcasting, and the data type is bool.
|
||||
|
||||
Examples:
|
||||
>>> input_x = Tensor(np.array([1, 2, 3]), mindspore.int32)
|
||||
|
@ -2399,7 +2399,7 @@ class GreaterEqual(_LogicBinaryOp):
|
|||
a bool when the first input is a tensor or a tensor whose data type is number or bool.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,and the data type is bool.
|
||||
Tensor, the shape is the same as the shape after broadcasting, and the data type is bool.
|
||||
|
||||
Examples:
|
||||
>>> input_x = Tensor(np.array([1, 2, 3]), mindspore.int32)
|
||||
|
@ -2434,7 +2434,7 @@ class Less(_LogicBinaryOp):
|
|||
a bool when the first input is a tensor or a tensor whose data type is number or bool.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,and the data type is bool.
|
||||
Tensor, the shape is the same as the shape after broadcasting, and the data type is bool.
|
||||
|
||||
Examples:
|
||||
>>> input_x = Tensor(np.array([1, 2, 3]), mindspore.int32)
|
||||
|
@ -2469,7 +2469,7 @@ class LessEqual(_LogicBinaryOp):
|
|||
a bool when the first input is a tensor or a tensor whose data type is number or bool.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,and the data type is bool.
|
||||
Tensor, the shape is the same as the shape after broadcasting, and the data type is bool.
|
||||
|
||||
Examples:
|
||||
>>> input_x = Tensor(np.array([1, 2, 3]), mindspore.int32)
|
||||
|
@ -2495,7 +2495,7 @@ class LogicalNot(PrimitiveWithInfer):
|
|||
- **input_x** (Tensor) - The input tensor whose dtype is bool.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the `input_x`, and the dtype is bool.
|
||||
Tensor, the shape is the same as the `input_x`, and the dtype is bool.
|
||||
|
||||
Examples:
|
||||
>>> input_x = Tensor(np.array([True, False, True]), mindspore.bool_)
|
||||
|
@ -2533,7 +2533,7 @@ class LogicalAnd(_LogicBinaryOp):
|
|||
a tensor whose data type is bool.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting, and the data type is bool.
|
||||
Tensor, the shape is the same as the shape after broadcasting, and the data type is bool.
|
||||
|
||||
Examples:
|
||||
>>> input_x = Tensor(np.array([True, False, True]), mindspore.bool_)
|
||||
|
@ -2563,7 +2563,7 @@ class LogicalOr(_LogicBinaryOp):
|
|||
a tensor whose data type is bool.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,and the data type is bool.
|
||||
Tensor, the shape is the same as the shape after broadcasting, and the data type is bool.
|
||||
|
||||
Examples:
|
||||
>>> input_x = Tensor(np.array([True, False, True]), mindspore.bool_)
|
||||
|
@ -3182,7 +3182,7 @@ class Atan2(_MathBinaryOp):
|
|||
- **input_y** (Tensor) - The input tensor.
|
||||
|
||||
Outputs:
|
||||
Tensor, the shape is same as the shape after broadcasting,and the data type is same as `input_x`.
|
||||
Tensor, the shape is the same as the shape after broadcasting, and the data type is the same as `input_x`.
|
||||
|
||||
Examples:
|
||||
>>> input_x = Tensor(np.array([[0, 1]]), mindspore.float32)
|
||||
|
|
|
@ -100,7 +100,7 @@ class Softmax(PrimitiveWithInfer):
|
|||
Softmax operation.
|
||||
|
||||
Applies the Softmax operation to the input tensor on the specified axis.
|
||||
Suppose a slice along the given aixs :math:`x` then for each element :math:`x_i`
|
||||
Suppose a slice in the given axis :math:`x`, then for each element :math:`x_i`
|
||||
the Softmax function is shown as follows:
|
||||
|
||||
.. math::
|
||||
|
@ -151,7 +151,7 @@ class LogSoftmax(PrimitiveWithInfer):
|
|||
Log Softmax activation function.
|
||||
|
||||
Applies the Log Softmax function to the input tensor on the specified axis.
|
||||
Suppose a slice along the given aixs :math:`x` then for each element :math:`x_i`
|
||||
Suppose a slice in the given axis :math:`x`, then for each element :math:`x_i`
|
||||
the Log Softmax function is shown as follows:
|
||||
|
||||
.. math::
|
||||
|
@ -429,7 +429,7 @@ class HSwish(PrimitiveWithInfer):
|
|||
.. math::
|
||||
\text{hswish}(x_{i}) = x_{i} * \frac{ReLU6(x_{i} + 3)}{6},
|
||||
|
||||
where :math:`x_{i}` is the :math:`i`-th slice along the given dim of the input Tensor.
|
||||
where :math:`x_{i}` is the :math:`i`-th slice in the given dimension of the input Tensor.
|
||||
|
||||
Inputs:
|
||||
- **input_data** (Tensor) - The input of HSwish, data type should be float16 or float32.
|
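A NumPy illustration of the hswish formula quoted above; it is not the operator itself.

```python
# NumPy illustration of hswish(x) = x * ReLU6(x + 3) / 6; illustrative only.
import numpy as np

def relu6(x):
    return np.minimum(np.maximum(x, 0.0), 6.0)

def hswish(x):
    return x * relu6(x + 3.0) / 6.0

print(hswish(np.array([-4.0, -1.0, 0.0, 2.0, 5.0], dtype=np.float32)))
```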
||||
|
@ -502,7 +502,7 @@ class HSigmoid(PrimitiveWithInfer):
|
|||
.. math::
|
||||
\text{hsigmoid}(x_{i}) = max(0, min(1, \frac{x_{i} + 3}{6})),
|
||||
|
||||
where :math:`x_{i}` is the :math:`i`-th slice along the given dim of the input Tensor.
|
||||
where :math:`x_{i}` is the :math:`i`-th slice in the given dimension of the input Tensor.
|
||||
|
||||
Inputs:
|
||||
- **input_data** (Tensor) - The input of HSigmoid, data type should be float16 or float32.
|
||||
|
@ -2234,7 +2234,7 @@ class DropoutDoMask(PrimitiveWithInfer):
|
|||
shape of `input_x` must be same as the value of `DropoutGenMask`'s input `shape`. If input wrong `mask`,
|
||||
the output of `DropoutDoMask` are unpredictable.
|
||||
- **keep_prob** (Tensor) - The keep rate, between 0 and 1, e.g. keep_prob = 0.9,
|
||||
means dropping out 10% of input units. The value of `keep_prob` is same as the input `keep_prob` of
|
||||
means dropping out 10% of input units. The value of `keep_prob` is the same as the input `keep_prob` of
|
||||
`DropoutGenMask`.
|
||||
|
||||
Outputs:
|
||||
|
@ -2674,9 +2674,9 @@ class Pad(PrimitiveWithInfer):
|
|||
|
||||
Args:
|
||||
paddings (tuple): The shape of parameter `paddings` is (N, 2). N is the rank of input data. All elements of
|
||||
paddings are int type. For `D` th dimension of input, paddings[D, 0] indicates how many sizes to be
|
||||
extended ahead of the `D` th dimension of the input tensor, and paddings[D, 1] indicates how many sizes to
|
||||
be extended behind of the `D` th dimension of the input tensor.
|
||||
paddings are int type. For the `D` th dimension of the input, paddings[D, 0] indicates how many values to be
|
||||
padded ahead of the input tensor in the `D` th dimension, and paddings[D, 1] indicates how many values to be
|
||||
padded behind the input tensor in the `D` th dimension.
|
||||
|
||||
Inputs:
|
||||
- **input_x** (Tensor) - The input tensor.
|
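A hedged sketch of the paddings layout described above; the shapes are illustrative.

```python
# Hedged sketch of the (N, 2) paddings layout; shapes are illustrative.
import numpy as np
import mindspore as ms
from mindspore import Tensor
from mindspore.ops import operations as P

pad = P.Pad(paddings=((1, 2), (2, 1)))   # dim 0: 1 before / 2 after; dim 1: 2 before / 1 after
x = Tensor(np.ones((2, 3)), ms.float32)
output = pad(x)                          # expected shape: (2 + 1 + 2, 3 + 2 + 1) = (5, 6)
```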
||||
|
@ -2733,9 +2733,9 @@ class MirrorPad(PrimitiveWithInfer):
|
|||
- **input_x** (Tensor) - The input tensor.
|
||||
- **paddings** (Tensor) - The paddings tensor. The value of `paddings` is a matrix(list),
|
||||
and its shape is (N, 2). N is the rank of input data. All elements of paddings
|
||||
are int type. For `D` th dimension of input, paddings[D, 0] indicates how many sizes to be
|
||||
extended ahead of the `D` th dimension of the input tensor, and paddings[D, 1] indicates
|
||||
how many sizes to be extended behind of the `D` th dimension of the input tensor.
|
||||
are int type. For the `D` th dimension of the input, paddings[D, 0] indicates how many values to be
|
||||
padded ahead of the input tensor in the `D` th dimension, and paddings[D, 1] indicates how many values to
|
||||
be padded behind the input tensor in the `D` th dimension.
|
||||
|
||||
Outputs:
|
||||
Tensor, the tensor after padding.
|
||||
|
@ -2880,11 +2880,11 @@ class Adam(PrimitiveWithInfer):
|
|||
|
||||
Args:
|
||||
use_locking (bool): Whether to enable a lock to protect updating variable tensors.
|
||||
If True, updating of the var, m, and v tensors will be protected by a lock.
|
||||
If False, the result is unpredictable. Default: False.
|
||||
If true, updates of the var, m, and v tensors will be protected by a lock.
|
||||
If false, the result is unpredictable. Default: False.
|
||||
use_nesterov (bool): Whether to use Nesterov Accelerated Gradient (NAG) algorithm to update the gradients.
|
||||
If True, updates the gradients using NAG.
|
||||
If False, updates the gradients without using NAG. Default: False.
|
||||
If true, update the gradients using NAG.
|
||||
If false, update the gradients without using NAG. Default: False.
|
||||
|
||||
Inputs:
|
||||
- **var** (Tensor) - Weights to be updated.
|
||||
|
@ -2894,8 +2894,8 @@ class Adam(PrimitiveWithInfer):
|
|||
- **beta1_power** (float) - :math:`beta_1^t` in the updating formula.
|
||||
- **beta2_power** (float) - :math:`beta_2^t` in the updating formula.
|
||||
- **lr** (float) - :math:`l` in the updating formula.
|
||||
- **beta1** (float) - The exponential decay rate for the 1st moment estimates.
|
||||
- **beta2** (float) - The exponential decay rate for the 2nd moment estimates.
|
||||
- **beta1** (float) - The exponential decay rate for the 1st moment estimations.
|
||||
- **beta2** (float) - The exponential decay rate for the 2nd moment estimations.
|
||||
- **epsilon** (float) - Term added to the denominator to improve numerical stability.
|
||||
- **gradient** (Tensor) - Gradients. Has the same type as `var`.
|
||||
|
||||
|
@ -2974,11 +2974,11 @@ class FusedSparseAdam(PrimitiveWithInfer):
|
|||
|
||||
Args:
|
||||
use_locking (bool): Whether to enable a lock to protect updating variable tensors.
|
||||
If True, updating of the var, m, and v tensors will be protected by a lock.
|
||||
If False, the result is unpredictable. Default: False.
|
||||
If true, updates of the var, m, and v tensors will be protected by a lock.
|
||||
If false, the result is unpredictable. Default: False.
|
||||
use_nesterov (bool): Whether to use Nesterov Accelerated Gradient (NAG) algorithm to update the gradients.
|
||||
If True, updates the gradients using NAG.
|
||||
If False, updates the gradients without using NAG. Default: False.
|
||||
If true, update the gradients using NAG.
|
||||
If false, update the gradients without using NAG. Default: False.
|
||||
|
||||
Inputs:
|
||||
- **var** (Parameter) - Parameters to be updated. With float32 data type.
|
||||
|
@ -2989,8 +2989,8 @@ class FusedSparseAdam(PrimitiveWithInfer):
|
|||
- **beta1_power** (Tensor) - :math:`beta_1^t` in the updating formula. With float32 data type.
|
||||
- **beta2_power** (Tensor) - :math:`beta_2^t` in the updating formula. With float32 data type.
|
||||
- **lr** (Tensor) - :math:`l` in the updating formula. With float32 data type.
|
||||
- **beta1** (Tensor) - The exponential decay rate for the 1st moment estimates. With float32 data type.
|
||||
- **beta2** (Tensor) - The exponential decay rate for the 2nd moment estimates. With float32 data type.
|
||||
- **beta1** (Tensor) - The exponential decay rate for the 1st moment estimations. With float32 data type.
|
||||
- **beta2** (Tensor) - The exponential decay rate for the 2nd moment estimations. With float32 data type.
|
||||
- **epsilon** (Tensor) - Term added to the denominator to improve numerical stability. With float32 data type.
|
||||
- **gradient** (Tensor) - Gradient value. With float32 data type.
|
||||
- **indices** (Tensor) - Gradient indices. With int32 data type.
|
||||
|
@ -3108,11 +3108,11 @@ class FusedSparseLazyAdam(PrimitiveWithInfer):
|
|||
|
||||
Args:
|
||||
use_locking (bool): Whether to enable a lock to protect updating variable tensors.
|
||||
If True, updating of the var, m, and v tensors will be protected by a lock.
|
||||
If False, the result is unpredictable. Default: False.
|
||||
If true, updates of the var, m, and v tensors will be protected by a lock.
|
||||
If false, the result is unpredictable. Default: False.
|
||||
use_nesterov (bool): Whether to use Nesterov Accelerated Gradient (NAG) algorithm to update the gradients.
|
||||
If True, updates the gradients using NAG.
|
||||
If False, updates the gradients without using NAG. Default: False.
|
||||
If true, update the gradients using NAG.
|
||||
If false, update the gradients without using NAG. Default: False.
|
||||
|
||||
Inputs:
|
||||
- **var** (Parameter) - Parameters to be updated. With float32 data type.
|
||||
|
@@ -3123,8 +3123,8 @@ class FusedSparseLazyAdam(PrimitiveWithInfer):
        - **beta1_power** (Tensor) - :math:`beta_1^t` in the updating formula. With float32 data type.
        - **beta2_power** (Tensor) - :math:`beta_2^t` in the updating formula. With float32 data type.
        - **lr** (Tensor) - :math:`l` in the updating formula. With float32 data type.
-       - **beta1** (Tensor) - The exponential decay rate for the 1st moment estimates. With float32 data type.
-       - **beta2** (Tensor) - The exponential decay rate for the 2nd moment estimates. With float32 data type.
+       - **beta1** (Tensor) - The exponential decay rate for the 1st moment estimations. With float32 data type.
+       - **beta2** (Tensor) - The exponential decay rate for the 2nd moment estimations. With float32 data type.
        - **epsilon** (Tensor) - Term added to the denominator to improve numerical stability. With float32 data type.
        - **gradient** (Tensor) - Gradient value. With float32 data type.
        - **indices** (Tensor) - Gradient indices. With int32 data type.
@@ -3227,7 +3227,7 @@ class FusedSparseFtrl(PrimitiveWithInfer):
        l2 (float): l2 regularization strength, must be greater than or equal to zero.
        lr_power (float): Learning rate power controls how the learning rate decreases during training,
            must be less than or equal to zero. Use fixed learning rate if `lr_power` is zero.
-       use_locking (bool): Use locks for update operation if True . Default: False.
+       use_locking (bool): Use locks for the updating operation if true. Default: False.

    Inputs:
        - **var** (Parameter) - The variable to be updated. The data type must be float32.
@@ -3320,7 +3320,7 @@ class FusedSparseProximalAdagrad(PrimitiveWithInfer):
        var = \frac{sign(\text{prox_v})}{1 + lr * l2} * \max(\left| \text{prox_v} \right| - lr * l1, 0)

    Args:
-       use_locking (bool): If True, updating of the var and accum tensors will be protected. Default: False.
+       use_locking (bool): If true, updates of the var and accum tensors will be protected. Default: False.

    Inputs:
        - **var** (Parameter) - Variable tensor to be updated. The data type must be float32.
@@ -3415,7 +3415,7 @@ class KLDivLoss(PrimitiveWithInfer):
        \end{cases}

    Args:
-       reduction (str): Specifies the reduction to apply to the output.
+       reduction (str): Specifies the reduction to be applied to the output.
            Its value should be one of 'none', 'mean', 'sum'. Default: 'mean'.

    Inputs:
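A short call sketch of the primitive with the `reduction` argument described above; the input values are made up, and both tensors are assumed to share the same shape and dtype.

```python
import numpy as np
from mindspore import Tensor
from mindspore.ops import operations as P

kldiv_loss = P.KLDivLoss(reduction='mean')        # one of 'none', 'mean', 'sum'
prediction = Tensor(np.array([0.2, 0.7, 0.1], np.float32))
target = Tensor(np.array([0.0, 1.0, 0.0], np.float32))
loss = kldiv_loss(prediction, target)             # scalar Tensor for 'mean' or 'sum'
```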
@@ -3487,7 +3487,7 @@ class BinaryCrossEntropy(PrimitiveWithInfer):
        \end{cases}

    Args:
-       reduction (str): Specifies the reduction to apply to the output.
+       reduction (str): Specifies the reduction to be applied to the output.
            Its value should be one of 'none', 'mean', 'sum'. Default: 'mean'.

    Inputs:
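The same pattern applies to `BinaryCrossEntropy`; a minimal sketch, assuming the third input is the optional element-wise weight described in the operator docstring.

```python
import numpy as np
from mindspore import Tensor
from mindspore.ops import operations as P

bce = P.BinaryCrossEntropy(reduction='mean')
prediction = Tensor(np.array([0.2, 0.7, 0.1], np.float32))   # probabilities in [0, 1]
target = Tensor(np.array([0.0, 1.0, 0.0], np.float32))
weight = Tensor(np.ones(3).astype(np.float32))               # optional element-wise weight
loss = bce(prediction, target, weight)
```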
@@ -3575,9 +3575,9 @@ class ApplyAdaMax(PrimitiveWithInfer):
            With float32 or float16 data type.
        - **lr** (Union[Number, Tensor]) - Learning rate, :math:`l` in the updating formula, should be scalar.
            With float32 or float16 data type.
-       - **beta1** (Union[Number, Tensor]) - The exponential decay rate for the 1st moment estimates,
+       - **beta1** (Union[Number, Tensor]) - The exponential decay rate for the 1st moment estimations,
            should be scalar. With float32 or float16 data type.
-       - **beta2** (Union[Number, Tensor]) - The exponential decay rate for the 2nd moment estimates,
+       - **beta2** (Union[Number, Tensor]) - The exponential decay rate for the 2nd moment estimations,
            should be scalar. With float32 or float16 data type.
        - **epsilon** (Union[Number, Tensor]) - A small value added for numerical stability, should be scalar.
            With float32 or float16 data type.
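A minimal sketch of how these scalar inputs are passed to `ApplyAdaMax`, assuming the full input order (var, m, v, beta1_power, lr, beta1, beta2, epsilon, grad) from the operator docstring; the shapes and values are illustrative.

```python
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor, Parameter
from mindspore.ops import operations as P

class AdaMaxNet(nn.Cell):
    """Holds var/m/v as Parameters; ApplyAdaMax updates them in place."""
    def __init__(self):
        super(AdaMaxNet, self).__init__()
        self.apply_ada_max = P.ApplyAdaMax()
        self.var = Parameter(Tensor(np.random.rand(2, 2).astype(np.float32)), name="var")
        self.m = Parameter(Tensor(np.random.rand(2, 2).astype(np.float32)), name="m")
        self.v = Parameter(Tensor(np.random.rand(2, 2).astype(np.float32)), name="v")

    def construct(self, beta1_power, lr, beta1, beta2, epsilon, grad):
        return self.apply_ada_max(self.var, self.m, self.v, beta1_power,
                                  lr, beta1, beta2, epsilon, grad)

net = AdaMaxNet()
scalar = lambda x: Tensor(np.array(x, np.float32))   # scalars may also be passed as Python Numbers
grad = Tensor(np.random.rand(2, 2).astype(np.float32))
net(scalar(0.9), scalar(0.001), scalar(0.9), scalar(0.999), scalar(1e-7), grad)
```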
@@ -3939,7 +3939,7 @@ class SparseApplyAdagrad(PrimitiveWithInfer):
    Args:
        lr (float): Learning rate.
        update_slots (bool): If `True`, `accum` will be updated. Default: True.
-       use_locking (bool): If True, updating of the var and accum tensors will be protected. Default: False.
+       use_locking (bool): If true, updates of the var and accum tensors will be protected. Default: False.

    Inputs:
        - **var** (Parameter) - Variable to be updated. The data type must be float16 or float32.
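A minimal sketch of `SparseApplyAdagrad` with the three arguments above; `grad` supplies updates only for the rows named in `indices`. Shapes and values are illustrative.

```python
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor, Parameter
from mindspore.ops import operations as P

class SparseAdagradNet(nn.Cell):
    def __init__(self):
        super(SparseAdagradNet, self).__init__()
        self.sparse_adagrad = P.SparseApplyAdagrad(lr=0.01, update_slots=True, use_locking=False)
        self.var = Parameter(Tensor(np.random.rand(3, 2).astype(np.float32)), name="var")
        self.accum = Parameter(Tensor(np.random.rand(3, 2).astype(np.float32)), name="accum")

    def construct(self, grad, indices):
        return self.sparse_adagrad(self.var, self.accum, grad, indices)

net = SparseAdagradNet()
grad = Tensor(np.random.rand(2, 2).astype(np.float32))
indices = Tensor(np.array([0, 2], np.int32))   # rows of var/accum to update
net(grad, indices)
```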
@@ -4099,7 +4099,7 @@ class ApplyProximalAdagrad(PrimitiveWithInfer):
        var = \frac{sign(\text{prox_v})}{1 + lr * l2} * \max(\left| \text{prox_v} \right| - lr * l1, 0)

    Args:
-       use_locking (bool): If True, updating of the var and accum tensors will be protected. Default: False.
+       use_locking (bool): If true, updates of the var and accum tensors will be protected. Default: False.

    Inputs:
        - **var** (Parameter) - Variable to be updated. The data type should be float16 or float32.
@@ -4195,7 +4195,7 @@ class SparseApplyProximalAdagrad(PrimitiveWithInfer):
        var = \frac{sign(\text{prox_v})}{1 + lr * l2} * \max(\left| \text{prox_v} \right| - lr * l1, 0)

    Args:
-       use_locking (bool): If True, updating of the var and accum tensors will be protected. Default: False.
+       use_locking (bool): If true, updates of the var and accum tensors will be protected. Default: False.

    Inputs:
        - **var** (Parameter) - Variable tensor to be updated. The data type must be float16 or float32.
@@ -4697,7 +4697,7 @@ class ApplyFtrl(PrimitiveWithInfer):
    Update relevant entries according to the FTRL scheme.

    Args:
-       use_locking (bool): Use locks for update operation if True . Default: False.
+       use_locking (bool): Use locks for the updating operation if true. Default: False.

    Inputs:
        - **var** (Parameter) - The variable to be updated. The data type should be float16 or float32.
@@ -4788,7 +4788,7 @@ class SparseApplyFtrl(PrimitiveWithInfer):
        l2 (float): l2 regularization strength, must be greater than or equal to zero.
        lr_power (float): Learning rate power controls how the learning rate decreases during training,
            must be less than or equal to zero. Use fixed learning rate if `lr_power` is zero.
-       use_locking (bool): Use locks for update operation if True . Default: False.
+       use_locking (bool): Use locks for the updating operation if true. Default: False.

    Inputs:
        - **var** (Parameter) - The variable to be updated. The data type must be float16 or float32.
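A minimal sketch showing how the FTRL hyper-parameters above are fixed at construction time while `grad` and `indices` arrive per step; the same input layout is assumed for `FusedSparseFtrl` earlier in this diff. Shapes and values are illustrative.

```python
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor, Parameter
from mindspore.ops import operations as P

class SparseFtrlNet(nn.Cell):
    def __init__(self):
        super(SparseFtrlNet, self).__init__()
        # lr_power must be <= 0; a value of 0 means a fixed learning rate.
        self.sparse_ftrl = P.SparseApplyFtrl(lr=0.01, l1=0.0, l2=0.0, lr_power=-0.5)
        self.var = Parameter(Tensor(np.random.rand(3, 1, 2).astype(np.float32)), name="var")
        self.accum = Parameter(Tensor(np.random.rand(3, 1, 2).astype(np.float32)), name="accum")
        self.linear = Parameter(Tensor(np.random.rand(3, 1, 2).astype(np.float32)), name="linear")

    def construct(self, grad, indices):
        return self.sparse_ftrl(self.var, self.accum, self.linear, grad, indices)

net = SparseFtrlNet()
grad = Tensor(np.random.rand(2, 1, 2).astype(np.float32))
indices = Tensor(np.array([0, 1], np.int32))   # rows of var/accum/linear to update
net(grad, indices)
```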
@@ -4967,8 +4967,8 @@ class ConfusionMulGrad(PrimitiveWithInfer):
        axis (Union[int, tuple[int], list[int]]): The dimensions to reduce.
            Default: (), reduce all dimensions. Only constant value is allowed.
        keep_dims (bool):
-           - If True, keep these reduced dimensions and the length is 1.
-           - If False, don't keep these dimensions. Default:False.
+           - If true, keep these reduced dimensions and the length is 1.
+           - If false, don't keep these dimensions. Default: False.

    Inputs:
        - **input_0** (Tensor) - The input Tensor.
@@ -5094,9 +5094,9 @@ class CTCLoss(PrimitiveWithInfer):
    Calculates the CTC (Connectionist Temporal Classification) loss. Also calculates the gradient.

    Args:
-       preprocess_collapse_repeated (bool): If True, repeated labels are collapsed prior to the CTC calculation.
+       preprocess_collapse_repeated (bool): If true, repeated labels are collapsed prior to the CTC calculation.
            Default: False.
-       ctc_merge_repeated (bool): If False, during CTC calculation, repeated non-blank labels will not be merged
+       ctc_merge_repeated (bool): If false, during CTC calculation, repeated non-blank labels will not be merged
            and are interpreted as individual labels. This is a simplified version of CTC.
            Default: True.
        ignore_longer_outputs_than_inputs (bool): If True, sequences with longer outputs than inputs will be ignored.
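A minimal sketch of a `CTCLoss` call with the flags above. The label inputs use the sparse encoding from the operator docstring (each `labels_indices` row is a [batch, time] pair); shapes and values are made up, and the blank label is assumed to be the last class.

```python
import numpy as np
from mindspore import Tensor
from mindspore.ops import operations as P

ctc_loss = P.CTCLoss(preprocess_collapse_repeated=False,
                     ctc_merge_repeated=True,
                     ignore_longer_outputs_than_inputs=False)

# Activations of shape (max_time, batch_size, num_classes): 2 time steps, batch of 2, 3 classes.
inputs = Tensor(np.random.rand(2, 2, 3).astype(np.float32))
# Sparse labels: labels_indices[i] = [batch, time], labels_values[i] = class id (< num_classes - 1).
labels_indices = Tensor(np.array([[0, 0], [1, 0]], np.int64))
labels_values = Tensor(np.array([1, 0], np.int32))
sequence_length = Tensor(np.array([2, 2], np.int32))
loss, gradient = ctc_loss(inputs, labels_indices, labels_values, sequence_length)
```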
@@ -5192,7 +5192,7 @@ class BasicLSTMCell(PrimitiveWithInfer):
        keep_prob (float): If not 1.0, append `Dropout` layer on the outputs of each
            LSTM layer except the last layer. Default: 1.0. The range of dropout is [0.0, 1.0].
        forget_bias (float): Add forget bias to forget gate biases in order to decrease former scale. Default: 1.0.
-       state_is_tuple (bool): If True, state is tensor tuple, containing h and c; If False, one tensor,
+       state_is_tuple (bool): If true, the state is a tuple of tensors containing h and c; if false, it is a single tensor
            that needs to be split first. Default: True.
        activation (str): Activation. Default: "tanh".
@@ -496,12 +496,11 @@ def convert_quant_network(network,
        per_channel (bool, list or tuple): Quantization granularity based on layer or on channel. If `True`
            then base on per channel otherwise base on per layer. The first element represent weights
            and second element represent data flow. Default: (False, False)
-       symmetric (bool, list or tuple): Quantization algorithm use symmetric or not. If `True` then base on
+       symmetric (bool, list or tuple): Whether the quantization algorithm is symmetric or not. If `True` then base on
            symmetric otherwise base on asymmetric. The first element represent weights and second
            element represent data flow. Default: (False, False)
-       narrow_range (bool, list or tuple): Quantization algorithm use narrow range or not. If `True` then base
-           on narrow range otherwise base on off narrow range. The first element represent weights and
-           second element represent data flow. Default: (False, False)
+       narrow_range (bool, list or tuple): Whether the quantization algorithm uses narrow range or not.
+           The first element represents weights and the second element represents data flow. Default: (False, False)

    Returns:
        Cell, network which has been changed to a quantization aware training network cell.
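To show how the two-element (weights, data flow) convention is used, a hedged sketch of converting a small fp32 cell; the `mindspore.train.quant` module path and the `nn.Conv2dBnAct` cell are taken from the quantization aware training tutorials of this period and may differ between releases.

```python
import mindspore.nn as nn
from mindspore.train.quant import quant

class TinyNet(nn.Cell):
    """A toy fp32 network built with the fused cells that QAT conversion expects."""
    def __init__(self):
        super(TinyNet, self).__init__()
        self.conv = nn.Conv2dBnAct(3, 6, kernel_size=5, activation='relu')

    def construct(self, x):
        return self.conv(x)

quant_net = quant.convert_quant_network(TinyNet(),
                                        per_channel=[True, False],   # (weights, data flow)
                                        symmetric=[True, False],
                                        narrow_range=[True, False])
```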
@@ -31,8 +31,8 @@ def cal_quantization_params(input_min,
        input_max (numpy.ndarray): The dimension of channel or 1.
        data_type (numpy type): Can be numpy int8 or numpy uint8.
        num_bits (int): Quantization bit number, supports 4 and 8 bit. Default: 8.
-       symmetric (bool): Quantization algorithm use symmetric or not. Default: False.
-       narrow_range (bool): Quantization algorithm use narrow range or not. Default: False.
+       symmetric (bool): Whether the quantization algorithm is symmetric or not. Default: False.
+       narrow_range (bool): Whether the quantization algorithm uses narrow range or not. Default: False.

    Returns:
        scale (numpy.ndarray): quantization param.
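For orientation, a standalone NumPy sketch of the standard scale and zero-point computation that these arguments control; it illustrates the arithmetic of uniform affine quantization and is not the library's internal implementation.

```python
import numpy as np

def quant_params_sketch(input_min, input_max, num_bits=8, symmetric=False, narrow_range=False):
    """Illustrative scale/zero-point computation for uniform affine quantization."""
    quant_min = 1 if narrow_range else 0
    quant_max = (1 << num_bits) - 1
    input_min = np.minimum(input_min, 0.0)            # the representable range must contain zero
    input_max = np.maximum(input_max, 0.0)
    if symmetric:
        bound = np.maximum(np.abs(input_min), np.abs(input_max))
        input_min, input_max = -bound, bound
    scale = (input_max - input_min) / (quant_max - quant_min)
    scale = np.maximum(scale, 1e-12)                  # guard against a degenerate all-zero range
    zero_point = np.round(quant_min - input_min / scale)
    return scale, zero_point

scale, zero_point = quant_params_sketch(np.array([-1.0]), np.array([2.0]))
```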