!4388 Third round of enhancement of API comment & README_CN

Merge pull request !4388 from Simson/enhancement-API
This commit is contained in:
mindspore-ci-bot 2020-08-15 11:29:24 +08:00 committed by Gitee
commit 15496ff5a4
42 changed files with 518 additions and 291 deletions

View File

@ -1,7 +1,9 @@
![MindSpore Logo](docs/MindSpore-logo.png "MindSpore logo") ![MindSpore Logo](docs/MindSpore-logo.png "MindSpore logo")
============================================================ ============================================================
- [What Is MindSpore?](#what-is-mindspore) [View Chinese](./README_CN.md)
- [What Is MindSpore](#what-is-mindspore)
- [Automatic Differentiation](#automatic-differentiation) - [Automatic Differentiation](#automatic-differentiation)
- [Automatic Parallel](#automatic-parallel) - [Automatic Parallel](#automatic-parallel)
- [Installation](#installation) - [Installation](#installation)

README_CN.md Normal file
View File

@ -0,0 +1,220 @@
![MindSpore Logo](docs/MindSpore-logo.png "MindSpore logo")
============================================================
[View English](./README.md)
- [What Is MindSpore](#what-is-mindspore)
- [Automatic Differentiation](#automatic-differentiation)
- [Automatic Parallel](#automatic-parallel)
- [Installation](#installation)
- [Binaries](#binaries)
- [From Source](#from-source)
- [Docker Image](#docker-image)
- [Quickstart](#quickstart)
- [Docs](#docs)
- [Community](#community)
- [Governance](#governance)
- [Communication](#communication)
- [Contributing](#contributing)
- [Release Notes](#release-notes)
- [License](#license)
## What Is MindSpore
MindSpore is a new open-source deep learning training/inference framework that can be used in device, edge and cloud scenarios.
MindSpore is designed to provide a friendly development experience and efficient execution for data scientists and algorithm engineers, with native support for the Ascend AI processor and software/hardware co-optimization.
At the same time, MindSpore, as a global AI open-source community, aims to further develop and enrich the AI software/hardware application ecosystem.
<img src="docs/MindSpore-architecture.png" alt="MindSpore Architecture" width="600"/>
For more details, please check out our [Overall Architecture](https://www.mindspore.cn/docs/zh-CN/master/architecture.html).
### Automatic Differentiation
There are currently three automatic differentiation techniques in mainstream deep learning frameworks:
- **Conversion based on static compute graphs**: the network is converted into a static data-flow graph at compile time, and the chain rule is applied to the data-flow graph to implement automatic differentiation.
- **Conversion based on dynamic compute graphs**: the operation trajectory of the network during forward execution is recorded via operator overloading, and the chain rule is applied to the dynamically generated data-flow graph to implement automatic differentiation.
- **Conversion based on source code**: this technique evolved from functional programming frameworks and performs automatic differentiation transformation on the intermediate expression (the form the program takes during compilation) through just-in-time (JIT) compilation, supporting complex control-flow scenarios, higher-order functions and closures.
TensorFlow adopted static compute graphs in its early days, whereas PyTorch uses dynamic compute graphs. Static graphs can leverage static compilation techniques to optimize network performance, but building or debugging a network is complicated. Dynamic graphs are convenient to use, but it is hard to push performance optimization to the limit.
MindSpore takes a different path: automatic differentiation based on source-code transformation. On the one hand, it supports automatic differentiation of automatic control flow, so building models is as convenient as in PyTorch. On the other hand, MindSpore can perform static compilation optimization on neural networks to achieve better performance.
<img src="docs/Automatic-differentiation.png" alt="Automatic Differentiation" width="600"/>
The implementation of MindSpore automatic differentiation can be understood as symbolic differentiation of the program itself. MindSpore IR is a functional intermediate representation, so it has an intuitive correspondence with composite functions in basic algebra: the formula of a composite function is composed of arbitrary differentiable basic functions. Each primitive operation in MindSpore IR corresponds to a basic function in basic algebra, from which more complex flow control can be built.
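As an illustration of source-transformation autodiff, the sketch below differentiates a small network with `GradOperation`. This is a hedged example: the constructor signature varies across releases, and some older versions (e.g. 0.6) expect a name string such as `C.GradOperation('grad')`.
```python
import numpy as np
import mindspore.context as context
import mindspore.nn as nn
from mindspore import Tensor
from mindspore.ops import composite as C

context.set_context(mode=context.GRAPH_MODE, device_target="CPU")

class Square(nn.Cell):
    def construct(self, x):
        return x * x

class GradNet(nn.Cell):
    """Wraps a network and returns the gradient of its output w.r.t. its input."""
    def __init__(self, net):
        super(GradNet, self).__init__()
        self.net = net
        self.grad_op = C.GradOperation()   # in some older releases: C.GradOperation('grad')

    def construct(self, x):
        return self.grad_op(self.net)(x)

x = Tensor(np.array([1.0, 2.0, 3.0]).astype(np.float32))
print(GradNet(Square())(x))   # expected: [2. 4. 6.]
```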
### Automatic Parallel
The goal of MindSpore automatic parallel is to build a training method that combines data parallelism, model parallelism and hybrid parallelism. It can automatically select the model splitting strategy with the least cost to achieve automatic distributed parallel training.
<img src="docs/Automatic-parallel.png" alt="Automatic Parallel" width="600"/>
At present, MindSpore uses a fine-grained parallel strategy of splitting operators, that is, each operator in the graph is split across the cluster to complete the parallel operation. The splitting strategy in this process may be very complicated, but as a Python developer you do not need to care about the underlying implementation, as long as the top-level API computation is efficient.
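A hedged sketch of how operator-level automatic parallelism is typically enabled. The exact launch procedure (e.g. calling `mindspore.communication.management.init()` and starting one process per device) depends on the backend and version.
```python
import mindspore.context as context

# Run in graph mode and let the framework search a low-cost operator splitting
# strategy across 8 devices; "auto_parallel" is one of the supported parallel_mode strings.
context.set_context(mode=context.GRAPH_MODE, device_target="Ascend")
context.set_auto_parallel_context(parallel_mode="auto_parallel", device_num=8)
```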
## Installation
### Binaries
MindSpore offers build options across multiple backends:
| Hardware Platform | Operating System | Status |
| :------------ | :-------------- | :--- |
| Ascend 910 | Ubuntu-x86 | ✔️ |
| | EulerOS-x86 | ✔️ |
| | EulerOS-aarch64 | ✔️ |
| GPU CUDA 10.1 | Ubuntu-x86 | ✔️ |
| CPU | Ubuntu-x86 | ✔️ |
| | Windows-x86 | ✔️ |
To install with the `pip` command, take the `CPU` and `Ubuntu-x86` build as an example:
1. Download the whl package from the [MindSpore download page](https://www.mindspore.cn/versions) and install it.
```
pip install https://ms-release.obs.cn-north-4.myhuaweicloud.com/0.6.0-beta/MindSpore/cpu/ubuntu_x86/mindspore-0.6.0-cp37-cp37m-linux_x86_64.whl
```
2. Run the following command to verify the installation.
```python
import numpy as np
import mindspore.context as context
import mindspore.nn as nn
from mindspore import Tensor
from mindspore.ops import operations as P
context.set_context(mode=context.GRAPH_MODE, device_target="CPU")
class Mul(nn.Cell):
    def __init__(self):
        super(Mul, self).__init__()
        self.mul = P.Mul()

    def construct(self, x, y):
        return self.mul(x, y)
x = Tensor(np.array([1.0, 2.0, 3.0]).astype(np.float32))
y = Tensor(np.array([4.0, 5.0, 6.0]).astype(np.float32))
mul = Mul()
print(mul(x, y))
```
```
[ 4. 10. 18.]
```
### From Source
[Installing MindSpore](https://www.mindspore.cn/install).
### Docker Image
MindSpore Docker images are hosted on [Docker Hub](https://hub.docker.com/r/mindspore).
The currently supported containerized build options are as follows:
| Hardware Platform | Docker Image Repository | Tag | Description |
| :----- | :------------------------ | :----------------------- | :--------------------------------------- |
| CPU | `mindspore/mindspore-cpu` | `x.y.z` | Production environment with MindSpore `x.y.z` CPU release pre-installed. |
| | | `devel` | Development environment for building MindSpore from source (`CPU` backend). For installation details, see https://www.mindspore.cn/install. |
| | | `runtime` | Runtime environment with the MindSpore binary package installed (`CPU` backend). |
| GPU | `mindspore/mindspore-gpu` | `x.y.z` | Production environment with MindSpore `x.y.z` GPU release pre-installed. |
| | | `devel` | Development environment for building MindSpore from source (`GPU CUDA10.1` backend). For installation details, see https://www.mindspore.cn/install. |
| | | `runtime` | Runtime environment with the MindSpore binary package installed (`GPU CUDA10.1` backend). |
| Ascend | <center>&mdash;</center> | <center>&mdash;</center> | Coming soon. |
> **NOTICE:** For the GPU `devel` Docker image, it is not recommended to install the whl package directly after building from source. We strongly recommend that you transfer and install the whl package inside the GPU `runtime` Docker image.
* CPU
For the `CPU` backend, you can directly pull and run the latest stable image with the following commands:
```
docker pull mindspore/mindspore-cpu:0.6.0-beta
docker run -it mindspore/mindspore-cpu:0.6.0-beta /bin/bash
```
* GPU
For the `GPU` backend, please make sure `nvidia-container-toolkit` has been installed in advance. The following is an installation guide for `Ubuntu` users:
```
DISTRIBUTION=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$DISTRIBUTION/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit nvidia-docker2
sudo systemctl restart docker
```
Then pull and run the latest stable image with the following commands:
```
docker pull mindspore/mindspore-gpu:0.6.0-beta
docker run -it --runtime=nvidia --privileged=true mindspore/mindspore-gpu:0.6.0-beta /bin/bash
```
To test whether the Docker image works, run the Python code below and check the output:
```python
import numpy as np
import mindspore.context as context
from mindspore import Tensor
from mindspore.ops import functional as F
context.set_context(device_target="GPU")
x = Tensor(np.ones([1,3,3,4]).astype(np.float32))
y = Tensor(np.ones([1,3,3,4]).astype(np.float32))
print(F.tensor_add(x, y))
```
```
[[[ 2. 2. 2. 2.],
[ 2. 2. 2. 2.],
[ 2. 2. 2. 2.]],
[[ 2. 2. 2. 2.],
[ 2. 2. 2. 2.],
[ 2. 2. 2. 2.]],
[[ 2. 2. 2. 2.],
[ 2. 2. 2. 2.],
[ 2. 2. 2. 2.]]]
```
If you want to learn more about how the MindSpore Docker images are built, please check out the [docker](docker/README.md) repo for details.
## Quickstart
See the [Quickstart](https://www.mindspore.cn/tutorial/zh-CN/master/quick_start/quick_start.html) to implement image classification.
## Docs
For more details about the installation guide, tutorials and APIs, please see the [User Documentation](https://gitee.com/mindspore/docs).
## Community
### Governance
Check out how MindSpore [Open Governance](https://gitee.com/mindspore/community/blob/master/governance.md) works.
### Communication
- [MindSpore Slack](https://join.slack.com/t/mindspore/shared_invite/zt-dgk65rli-3ex4xvS4wHX7UDmsQmfu8w) - communication platform for developers.
- `#mindspore` IRC channel (meeting records only)
- Video conferencing: TBD
- Mailing list: <https://mailweb.mindspore.cn/postorius/lists>
## Contributing
Welcome contributions. For more details, please see our [Contributor Wiki](CONTRIBUTING.md).
## Release Notes
For the release notes, see [RELEASE](RELEASE.md).
## License
[Apache License 2.0](LICENSE)

View File

@ -150,7 +150,7 @@ TensorPtr TensorPy::MakeTensor(const py::array &input, const TypePtr &type_ptr)
// Get tensor shape. // Get tensor shape.
std::vector<int> shape(buf.shape.begin(), buf.shape.end()); std::vector<int> shape(buf.shape.begin(), buf.shape.end());
if (data_type == buf_type) { if (data_type == buf_type) {
// Use memory copy if input data type is same as the required type. // Use memory copy if input data type is the same as the required type.
return std::make_shared<Tensor>(data_type, shape, buf.ptr, buf.size * buf.itemsize); return std::make_shared<Tensor>(data_type, shape, buf.ptr, buf.size * buf.itemsize);
} }
// Create tensor with data type converted. // Create tensor with data type converted.

View File

@ -546,9 +546,11 @@ def set_context(**kwargs):
Note: Note:
Attribute name is required for setting attributes. Attribute name is required for setting attributes.
The mode is not recommended to be changed after the net was initialized because the implementations of some
operations are different in graph mode and pynative mode. Default: PYNATIVE_MODE.
Args: Args:
mode (int): Running in GRAPH_MODE(0) or PYNATIVE_MODE(1). Default: PYNATIVE_MODE. mode (int): Running in GRAPH_MODE(0) or PYNATIVE_MODE(1).
device_target (str): The target device to run, support "Ascend", "GPU", "CPU". Default: "Ascend". device_target (str): The target device to run, support "Ascend", "GPU", "CPU". Default: "Ascend".
device_id (int): Id of target device, the value must be in [0, device_num_per_host-1], device_id (int): Id of target device, the value must be in [0, device_num_per_host-1],
while device_num_per_host should no more than 4096. Default: 0. while device_num_per_host should no more than 4096. Default: 0.
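For reference, a minimal sketch of how the documented arguments are typically combined; per the note above, the mode should be chosen before the network is initialized.
```python
import mindspore.context as context

# Select graph mode on Ascend device 0 before any Cell is built.
context.set_context(mode=context.GRAPH_MODE, device_target="Ascend", device_id=0)
```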

View File

@ -148,7 +148,7 @@ class Cell:
def update_cell_type(self, cell_type): def update_cell_type(self, cell_type):
""" """
Update the current cell type mainly identify if quantization aware training network. The current cell type is updated when a quantization aware training network is encountered.
After being invoked, it can set the cell type to 'cell_type'. After being invoked, it can set the cell type to 'cell_type'.
""" """
@ -936,7 +936,7 @@ class GraphKernel(Cell):
Base class for GraphKernel. Base class for GraphKernel.
A `GraphKernel` is a composite of basic primitives and can be compiled into a fused kernel automatically when A `GraphKernel` is a composite of basic primitives and can be compiled into a fused kernel automatically when
context.set_context(enable_graph_kernel=True). enable_graph_kernel in context is set to True.
Examples: Examples:
>>> class Relu(GraphKernel): >>> class Relu(GraphKernel):

View File

@ -661,7 +661,7 @@ class LogSoftmax(GraphKernel):
Log Softmax activation function. Log Softmax activation function.
Applies the Log Softmax function to the input tensor on the specified axis. Applies the Log Softmax function to the input tensor on the specified axis.
Suppose a slice along the given aixs :math:`x` then for each element :math:`x_i` Suppose a slice in the given axis :math:`x`, then for each element :math:`x_i`
the Log Softmax function is shown as follows: the Log Softmax function is shown as follows:
.. math:: .. math::
@ -987,10 +987,10 @@ class LayerNorm(Cell):
Applies Layer Normalization over a mini-batch of inputs. Applies Layer Normalization over a mini-batch of inputs.
Layer normalization is widely used in recurrent neural networks. It applies Layer normalization is widely used in recurrent neural networks. It applies
normalization over a mini-batch of inputs for each single training case as described normalization on a mini-batch of inputs for each single training case as described
in the paper `Layer Normalization <https://arxiv.org/pdf/1607.06450.pdf>`_. Unlike batch in the paper `Layer Normalization <https://arxiv.org/pdf/1607.06450.pdf>`_. Unlike batch
normalization, layer normalization performs exactly the same computation at training and normalization, layer normalization performs exactly the same computation at training and
testing times. It can be described using the following formula. It is applied across all channels testing time. It can be described using the following formula. It is applied across all channels
and pixel but only one batch size. and pixel but only one batch size.
.. math:: .. math::
@ -1139,9 +1139,9 @@ class LambNextMV(GraphKernel):
Outputs: Outputs:
Tuple of 2 Tensor. Tuple of 2 Tensor.
- **add3** (Tensor) - The shape is same as the shape after broadcasting, and the data type is - **add3** (Tensor) - The shape is the same as the shape after broadcasting, and the data type is
the one with high precision or high digits among the inputs. the one with high precision or high digits among the inputs.
- **realdiv4** (Tensor) - The shape is same as the shape after broadcasting, and the data type is - **realdiv4** (Tensor) - The shape is the same as the shape after broadcasting, and the data type is
the one with high precision or high digits among the inputs. the one with high precision or high digits among the inputs.
Examples: Examples:

View File

@ -55,7 +55,7 @@ class Softmax(Cell):
.. math:: .. math::
\text{softmax}(x_{i}) = \frac{\exp(x_i)}{\sum_{j=0}^{n-1}\exp(x_j)}, \text{softmax}(x_{i}) = \frac{\exp(x_i)}{\sum_{j=0}^{n-1}\exp(x_j)},
where :math:`x_{i}` is the :math:`i`-th slice along the given dim of the input Tensor. where :math:`x_{i}` is the :math:`i`-th slice in the given dimension of the input Tensor.
Args: Args:
axis (Union[int, tuple[int]]): The axis to apply Softmax operation, -1 means the last dimension. Default: -1. axis (Union[int, tuple[int]]): The axis to apply Softmax operation, -1 means the last dimension. Default: -1.
@ -87,11 +87,11 @@ class LogSoftmax(Cell):
Applies the LogSoftmax function to n-dimensional input tensor. Applies the LogSoftmax function to n-dimensional input tensor.
The input is transformed with Softmax function and then with log function to lie in range[-inf,0). The input is transformed by the Softmax function and then by the log function to lie in range[-inf,0).
Logsoftmax is defined as: Logsoftmax is defined as:
:math:`\text{logsoftmax}(x_i) = \log \left(\frac{\exp(x_i)}{\sum_{j=0}^{n-1} \exp(x_j)}\right)`, :math:`\text{logsoftmax}(x_i) = \log \left(\frac{\exp(x_i)}{\sum_{j=0}^{n-1} \exp(x_j)}\right)`,
where :math:`x_{i}` is the :math:`i`-th slice along the given dim of the input Tensor. where :math:`x_{i}` is the :math:`i`-th slice in the given dimension of the input Tensor.
Args: Args:
axis (int): The axis to apply LogSoftmax operation, -1 means the last dimension. Default: -1. axis (int): The axis to apply LogSoftmax operation, -1 means the last dimension. Default: -1.
@ -123,7 +123,7 @@ class ELU(Cell):
Exponential Linear Uint activation function. Exponential Linear Uint activation function.
Applies the exponential linear unit function element-wise. Applies the exponential linear unit function element-wise.
The activation function defined as: The activation function is defined as:
.. math:: .. math::
E_{i} = E_{i} =
@ -162,7 +162,7 @@ class ReLU(Cell):
Applies the rectified linear unit function element-wise. It returns Applies the rectified linear unit function element-wise. It returns
element-wise :math:`\max(0, x)`, specially, the neurons with the negative output element-wise :math:`\max(0, x)`, specially, the neurons with the negative output
will suppressed and the active neurons will stay the same. will be suppressed and the active neurons will stay the same.
Inputs: Inputs:
- **input_data** (Tensor) - The input of ReLU. - **input_data** (Tensor) - The input of ReLU.
@ -197,7 +197,7 @@ class ReLU6(Cell):
- **input_data** (Tensor) - The input of ReLU6. - **input_data** (Tensor) - The input of ReLU6.
Outputs: Outputs:
Tensor, which has the same type with `input_data`. Tensor, which has the same type as `input_data`.
Examples: Examples:
>>> input_x = Tensor(np.array([-1, -2, 0, 2, 1]), mindspore.float16) >>> input_x = Tensor(np.array([-1, -2, 0, 2, 1]), mindspore.float16)
@ -234,7 +234,7 @@ class LeakyReLU(Cell):
- **input_x** (Tensor) - The input of LeakyReLU. - **input_x** (Tensor) - The input of LeakyReLU.
Outputs: Outputs:
Tensor, has the same type and shape with the `input_x`. Tensor, has the same type and shape as the `input_x`.
Examples: Examples:
>>> input_x = Tensor(np.array([[-1.0, 4.0, -8.0], [2.0, -5.0, 9.0]]), mindspore.float32) >>> input_x = Tensor(np.array([[-1.0, 4.0, -8.0], [2.0, -5.0, 9.0]]), mindspore.float32)
@ -365,7 +365,7 @@ class PReLU(Cell):
PReLU is defined as: :math:`prelu(x_i)= \max(0, x_i) + w * \min(0, x_i)`, where :math:`x_i` PReLU is defined as: :math:`prelu(x_i)= \max(0, x_i) + w * \min(0, x_i)`, where :math:`x_i`
is an element of an channel of the input. is an element of an channel of the input.
Here :math:`w` is an learnable parameter with default initial value 0.25. Here :math:`w` is a learnable parameter with a default initial value 0.25.
Parameter :math:`w` has dimensionality of the argument channel. If called without argument Parameter :math:`w` has dimensionality of the argument channel. If called without argument
channel, a single parameter :math:`w` will be shared across all channels. channel, a single parameter :math:`w` will be shared across all channels.
@ -413,7 +413,7 @@ class PReLU(Cell):
class HSwish(Cell): class HSwish(Cell):
r""" r"""
rHard swish activation function. Hard swish activation function.
Applies hswish-type activation element-wise. The input is a Tensor with any valid shape. Applies hswish-type activation element-wise. The input is a Tensor with any valid shape.
@ -422,7 +422,7 @@ class HSwish(Cell):
.. math:: .. math::
\text{hswish}(x_{i}) = x_{i} * \frac{ReLU6(x_{i} + 3)}{6}, \text{hswish}(x_{i}) = x_{i} * \frac{ReLU6(x_{i} + 3)}{6},
where :math:`x_{i}` is the :math:`i`-th slice along the given dim of the input Tensor. where :math:`x_{i}` is the :math:`i`-th slice in the given dimension of the input Tensor.
Inputs: Inputs:
- **input_data** (Tensor) - The input of HSwish. - **input_data** (Tensor) - The input of HSwish.
@ -456,7 +456,7 @@ class HSigmoid(Cell):
.. math:: .. math::
\text{hsigmoid}(x_{i}) = max(0, min(1, \frac{x_{i} + 3}{6})), \text{hsigmoid}(x_{i}) = max(0, min(1, \frac{x_{i} + 3}{6})),
where :math:`x_{i}` is the :math:`i`-th slice along the given dim of the input Tensor. where :math:`x_{i}` is the :math:`i`-th slice in the given dimension of the input Tensor.
Inputs: Inputs:
- **input_data** (Tensor) - The input of HSigmoid. - **input_data** (Tensor) - The input of HSigmoid.
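A short illustrative sketch of the activation cells documented above (ReLU and HSigmoid); the output values follow the formulas given in the docstrings, and device support may vary by backend.
```python
import numpy as np
import mindspore
import mindspore.nn as nn
from mindspore import Tensor

input_x = Tensor(np.array([-1, -2, 0, 2, 1]), mindspore.float16)
# ReLU suppresses negative activations; HSigmoid maps inputs into [0, 1].
print(nn.ReLU()(input_x))      # [0. 0. 0. 2. 1.]
print(nn.HSigmoid()(input_x))  # approximately [0.333 0.167 0.5 0.833 0.667]
```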

View File

@ -65,7 +65,7 @@ class Dropout(Cell):
dtype (:class:`mindspore.dtype`): Data type of input. Default: mindspore.float32. dtype (:class:`mindspore.dtype`): Data type of input. Default: mindspore.float32.
Raises: Raises:
ValueError: If keep_prob is not in range (0, 1). ValueError: If `keep_prob` is not in range (0, 1).
Inputs: Inputs:
- **input** (Tensor) - An N-D Tensor. - **input** (Tensor) - An N-D Tensor.
@ -373,8 +373,8 @@ class OneHot(Cell):
axis is created at dimension `axis`. axis is created at dimension `axis`.
Args: Args:
axis (int): Features x depth if axis == -1, depth x features axis (int): Features x depth if axis is -1, depth x features
if axis == 0. Default: -1. if axis is 0. Default: -1.
depth (int): A scalar defining the depth of the one hot dimension. Default: 1. depth (int): A scalar defining the depth of the one hot dimension. Default: 1.
on_value (float): A scalar defining the value to fill in output[i][j] on_value (float): A scalar defining the value to fill in output[i][j]
when indices[j] = i. Default: 1.0. when indices[j] = i. Default: 1.0.
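An illustrative sketch of the `OneHot` cell with the arguments described above; the output gains a `depth`-sized axis at position `axis`.
```python
import numpy as np
import mindspore
import mindspore.nn as nn
from mindspore import Tensor

# Encode class indices as one-hot vectors along the last axis.
onehot = nn.OneHot(axis=-1, depth=3, on_value=1.0, off_value=0.0)
indices = Tensor(np.array([0, 2, 1]), mindspore.int32)
print(onehot(indices))   # [[1. 0. 0.], [0. 0. 1.], [0. 1. 0.]]
```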
@ -492,18 +492,18 @@ class Unfold(Cell):
The input tensor must be a 4-D tensor and the data format is NCHW. The input tensor must be a 4-D tensor and the data format is NCHW.
Args: Args:
ksizes (Union[tuple[int], list[int]]): The size of sliding window, should be a tuple or list of int, ksizes (Union[tuple[int], list[int]]): The size of sliding window, should be a tuple or a list of integers,
and the format is [1, ksize_row, ksize_col, 1]. and the format is [1, ksize_row, ksize_col, 1].
strides (Union[tuple[int], list[int]]): Distance between the centers of the two consecutive patches, strides (Union[tuple[int], list[int]]): Distance between the centers of the two consecutive patches,
should be a tuple or list of int, and the format is [1, stride_row, stride_col, 1]. should be a tuple or list of int, and the format is [1, stride_row, stride_col, 1].
rates (Union[tuple[int], list[int]]): In each extracted patch, the gap between the corresponding dim rates (Union[tuple[int], list[int]]): In each extracted patch, the gap between the corresponding dimension
pixel positions, should be a tuple or list of int, and the format is [1, rate_row, rate_col, 1]. pixel positions, should be a tuple or a list of integers, and the format is [1, rate_row, rate_col, 1].
padding (str): The type of padding algorithm, is a string whose value is "same" or "valid", padding (str): The type of padding algorithm, is a string whose value is "same" or "valid",
not case sensitive. Default: "valid". not case sensitive. Default: "valid".
- same: Means that the patch can take the part beyond the original image, and this part is filled with 0. - same: Means that the patch can take the part beyond the original image, and this part is filled with 0.
- valid: Means that the patch area taken must be completely contained in the original image. - valid: Means that the taken patch area must be completely covered in the original image.
Inputs: Inputs:
- **input_x** (Tensor) - A 4-D tensor whose shape is [in_batch, in_depth, in_row, in_col] and - **input_x** (Tensor) - A 4-D tensor whose shape is [in_batch, in_depth, in_row, in_col] and
@ -511,7 +511,7 @@ class Unfold(Cell):
Outputs: Outputs:
Tensor, a 4-D tensor whose data type is same as 'input_x', Tensor, a 4-D tensor whose data type is same as 'input_x',
and the shape is [out_batch, out_depth, out_row, out_col], the out_batch is same as the in_batch. and the shape is [out_batch, out_depth, out_row, out_col], the out_batch is the same as the in_batch.
Examples: Examples:
>>> net = Unfold(ksizes=[1, 2, 2, 1], strides=[1, 1, 1, 1], rates=[1, 1, 1, 1]) >>> net = Unfold(ksizes=[1, 2, 2, 1], strides=[1, 1, 1, 1], rates=[1, 1, 1, 1])
@ -556,11 +556,11 @@ class MatrixDiag(Cell):
Returns a batched diagonal tensor with a given batched diagonal values. Returns a batched diagonal tensor with a given batched diagonal values.
Inputs: Inputs:
- **x** (Tensor) - The diagonal values. It can be of the following data types: - **x** (Tensor) - The diagonal values. It can be one of the following data types:
float32, float16, int32, int8, uint8. float32, float16, int32, int8, and uint8.
Outputs: Outputs:
Tensor, same type as input `x`. The shape should be x.shape + (x.shape[-1], ). Tensor, has the same type as input `x`. The shape should be x.shape + (x.shape[-1], ).
Examples: Examples:
>>> x = Tensor(np.array([1, -1]), mstype.float32) >>> x = Tensor(np.array([1, -1]), mstype.float32)
@ -587,11 +587,11 @@ class MatrixDiagPart(Cell):
Returns the batched diagonal part of a batched tensor. Returns the batched diagonal part of a batched tensor.
Inputs: Inputs:
- **x** (Tensor) - The batched tensor. It can be of the following data types: - **x** (Tensor) - The batched tensor. It can be one of the following data types:
float32, float16, int32, int8, uint8. float32, float16, int32, int8, and uint8.
Outputs: Outputs:
Tensor, same type as input `x`. The shape should be x.shape[:-2] + [min(x.shape[-2:])]. Tensor, has the same type as input `x`. The shape should be x.shape[:-2] + [min(x.shape[-2:])].
Examples: Examples:
>>> x = Tensor([[[-1, 0], [0, 1]], [[-1, 0], [0, 1]], [[-1, 0], [0, 1]]], mindspore.float32) >>> x = Tensor([[[-1, 0], [0, 1]], [[-1, 0], [0, 1]], [[-1, 0], [0, 1]]], mindspore.float32)
@ -617,12 +617,12 @@ class MatrixSetDiag(Cell):
Modify the batched diagonal part of a batched tensor. Modify the batched diagonal part of a batched tensor.
Inputs: Inputs:
- **x** (Tensor) - The batched tensor. It can be of the following data types: - **x** (Tensor) - The batched tensor. It can be one of the following data types:
float32, float16, int32, int8, uint8. float32, float16, int32, int8, and uint8.
- **diagonal** (Tensor) - The diagonal values. - **diagonal** (Tensor) - The diagonal values.
Outputs: Outputs:
Tensor, same type as input `x`. The shape same as `x`. Tensor, has the same type and shape as input `x`.
Examples: Examples:
>>> x = Tensor([[[-1, 0], [0, 1]], [[-1, 0], [0, 1]], [[-1, 0], [0, 1]]], mindspore.float32) >>> x = Tensor([[[-1, 0], [0, 1]], [[-1, 0], [0, 1]], [[-1, 0], [0, 1]]], mindspore.float32)

View File

@ -72,7 +72,7 @@ class SequentialCell(Cell):
args (list, OrderedDict): List of subclass of Cell. args (list, OrderedDict): List of subclass of Cell.
Raises: Raises:
TypeError: If arg is not of type list or OrderedDict. TypeError: If the type of the argument is not list or OrderedDict.
Inputs: Inputs:
- **input** (Tensor) - Tensor with shape according to the first Cell in the sequence. - **input** (Tensor) - Tensor with shape according to the first Cell in the sequence.
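An illustrative sketch of `SequentialCell` built from a list of cells (the layer sizes are hypothetical):
```python
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor

# Cells run in the order they are given; the output of one feeds the next.
seq = nn.SequentialCell([nn.Dense(4, 8), nn.ReLU(), nn.Dense(8, 2)])
x = Tensor(np.ones([3, 4]).astype(np.float32))
print(seq(x).asnumpy().shape)   # (3, 2)
```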

View File

@ -131,7 +131,7 @@ class Conv2d(_Conv):
Args: Args:
in_channels (int): The number of input channel :math:`C_{in}`. in_channels (int): The number of input channel :math:`C_{in}`.
out_channels (int): The number of output channel :math:`C_{out}`. out_channels (int): The number of output channel :math:`C_{out}`.
kernel_size (Union[int, tuple[int]]): The data type is int or tuple with 2 integers. Specifies the height kernel_size (Union[int, tuple[int]]): The data type is int or a tuple of 2 integers. Specifies the height
and width of the 2D convolution window. Single int means the value is for both the height and the width of and width of the 2D convolution window. Single int means the value is for both the height and the width of
the kernel. A tuple of 2 ints means the first value is for the height and the other is for the the kernel. A tuple of 2 ints means the first value is for the height and the other is for the
width of the kernel. width of the kernel.
@ -147,7 +147,7 @@ class Conv2d(_Conv):
last extra padding will be done from the bottom and the right side. If this mode is set, `padding` last extra padding will be done from the bottom and the right side. If this mode is set, `padding`
must be 0. must be 0.
- valid: Adopts the way of discarding. The possibly largest height and width of output will be returned - valid: Adopts the way of discarding. The possible largest height and width of output will be returned
without padding. Extra pixels will be discarded. If this mode is set, `padding` without padding. Extra pixels will be discarded. If this mode is set, `padding`
must be 0. must be 0.
@ -158,7 +158,7 @@ class Conv2d(_Conv):
the padding of top, bottom, left and right is the same, equal to padding. If `padding` is a tuple the padding of top, bottom, left and right is the same, equal to padding. If `padding` is a tuple
with four integers, the padding of top, bottom, left and right will be equal to padding[0], with four integers, the padding of top, bottom, left and right will be equal to padding[0],
padding[1], padding[2], and padding[3] accordingly. Default: 0. padding[1], padding[2], and padding[3] accordingly. Default: 0.
dilation (Union[int, tuple[int]]): The data type is int or tuple with 2 integers. Specifies the dilation rate dilation (Union[int, tuple[int]]): The data type is int or a tuple of 2 integers. Specifies the dilation rate
to use for dilated convolution. If set to be :math:`k > 1`, there will to use for dilated convolution. If set to be :math:`k > 1`, there will
be :math:`k - 1` pixels skipped for each sampling location. Its value should be :math:`k - 1` pixels skipped for each sampling location. Its value should
be greater or equal to 1 and bounded by the height and width of the be greater or equal to 1 and bounded by the height and width of the
@ -451,7 +451,7 @@ class Conv2dTranspose(_Conv):
Args: Args:
in_channels (int): The number of channels in the input space. in_channels (int): The number of channels in the input space.
out_channels (int): The number of channels in the output space. out_channels (int): The number of channels in the output space.
kernel_size (Union[int, tuple]): int or tuple with 2 integers, which specifies the height kernel_size (Union[int, tuple]): int or a tuple of 2 integers, which specifies the height
and width of the 2D convolution window. Single int means the value is for both the height and the width of and width of the 2D convolution window. Single int means the value is for both the height and the width of
the kernel. A tuple of 2 ints means the first value is for the height and the other is for the the kernel. A tuple of 2 ints means the first value is for the height and the other is for the
width of the kernel. width of the kernel.
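For illustration, a hedged sketch of the `kernel_size` convention shared by the convolution cells above: a single int gives a square kernel, while a tuple of two ints gives height and width.
```python
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor

# A 3x5 kernel: the first value is the height, the second is the width.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=(3, 5), pad_mode="same")
x = Tensor(np.ones([1, 3, 32, 32]).astype(np.float32))
print(conv(x).asnumpy().shape)   # (1, 16, 32, 32) with "same" padding and stride 1
```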
@ -825,7 +825,7 @@ class DepthwiseConv2d(Cell):
Args: Args:
in_channels (int): The number of input channel :math:`C_{in}`. in_channels (int): The number of input channel :math:`C_{in}`.
out_channels (int): The number of output channel :math:`C_{out}`. out_channels (int): The number of output channel :math:`C_{out}`.
kernel_size (Union[int, tuple[int]]): The data type is int or tuple with 2 integers. Specifies the height kernel_size (Union[int, tuple[int]]): The data type is int or a tuple of 2 integers. Specifies the height
and width of the 2D convolution window. Single int means the value is for both the height and the width of and width of the 2D convolution window. Single int means the value is for both the height and the width of
the kernel. A tuple of 2 ints means the first value is for the height and the other is for the the kernel. A tuple of 2 ints means the first value is for the height and the other is for the
width of the kernel. width of the kernel.
@ -841,7 +841,7 @@ class DepthwiseConv2d(Cell):
last extra padding will be done from the bottom and the right side. If this mode is set, `padding` last extra padding will be done from the bottom and the right side. If this mode is set, `padding`
must be 0. must be 0.
- valid: Adopts the way of discarding. The possibly largest height and width of output will be returned - valid: Adopts the way of discarding. The possible largest height and width of output will be returned
without padding. Extra pixels will be discarded. If this mode is set, `padding` without padding. Extra pixels will be discarded. If this mode is set, `padding`
must be 0. must be 0.
@ -849,16 +849,16 @@ class DepthwiseConv2d(Cell):
Tensor borders. `padding` should be greater than or equal to 0. Tensor borders. `padding` should be greater than or equal to 0.
padding (int): Implicit paddings on both sides of the input. Default: 0. padding (int): Implicit paddings on both sides of the input. Default: 0.
dilation (Union[int, tuple[int]]): The data type is int or tuple with 2 integers. Specifies the dilation rate dilation (Union[int, tuple[int]]): The data type is int or a tuple of 2 integers. Specifies the dilation rate
to use for dilated convolution. If set to be :math:`k > 1`, there will to use for dilated convolution. If set to be :math:`k > 1`, there will
be :math:`k - 1` pixels skipped for each sampling location. Its value should be :math:`k - 1` pixels skipped for each sampling location. Its value should
be greater or equal to 1 and bounded by the height and width of the be greater than or equal to 1 and bounded by the height and width of the
input. Default: 1. input. Default: 1.
group (int): Split filter into groups, `in_ channels` and `out_channels` should be group (int): Split filter into groups, `in_ channels` and `out_channels` should be
divisible by the number of groups. Default: 1. divisible by the number of groups. Default: 1.
has_bias (bool): Specifies whether the layer uses a bias vector. Default: False. has_bias (bool): Specifies whether the layer uses a bias vector. Default: False.
weight_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the convolution kernel. weight_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the convolution kernel.
It can be a Tensor, a string, an Initializer or a numbers.Number. When a string is specified, It can be a Tensor, a string, an Initializer or a number. When a string is specified,
values from 'TruncatedNormal', 'Normal', 'Uniform', 'HeUniform' and 'XavierUniform' distributions as well values from 'TruncatedNormal', 'Normal', 'Uniform', 'HeUniform' and 'XavierUniform' distributions as well
as constant 'One' and 'Zero' distributions are possible. Alias 'xavier_uniform', 'he_uniform', 'ones' as constant 'One' and 'Zero' distributions are possible. Alias 'xavier_uniform', 'he_uniform', 'ones'
and 'zeros' are acceptable. Uppercase and lowercase are both acceptable. Refer to the values of and 'zeros' are acceptable. Uppercase and lowercase are both acceptable. Refer to the values of

View File

@ -36,7 +36,7 @@ class Embedding(Cell):
the corresponding word embeddings. the corresponding word embeddings.
Note: Note:
When 'use_one_hot' is set to True, the input should be of type mindspore.int32. When 'use_one_hot' is set to True, the type of the input should be mindspore.int32.
Args: Args:
vocab_size (int): Size of the dictionary of embeddings. vocab_size (int): Size of the dictionary of embeddings.
@ -48,9 +48,9 @@ class Embedding(Cell):
dtype (:class:`mindspore.dtype`): Data type of input. Default: mindspore.float32. dtype (:class:`mindspore.dtype`): Data type of input. Default: mindspore.float32.
Inputs: Inputs:
- **input** (Tensor) - Tensor of shape :math:`(\text{batch_size}, \text{input_length})`. The element of - **input** (Tensor) - Tensor of shape :math:`(\text{batch_size}, \text{input_length})`. The elements of
the Tensor should be integer and not larger than vocab_size. else the corresponding embedding vector is zero the Tensor should be integer and not larger than vocab_size. Otherwise the corresponding embedding vector will
if larger than vocab_size. be zero.
Outputs: Outputs:
Tensor of shape :math:`(\text{batch_size}, \text{input_length}, \text{embedding_size})`. Tensor of shape :math:`(\text{batch_size}, \text{input_length}, \text{embedding_size})`.
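An illustrative sketch of the `Embedding` cell with the input/output shapes described above (the sizes are hypothetical):
```python
import numpy as np
import mindspore
import mindspore.nn as nn
from mindspore import Tensor

# Map integer ids (each < vocab_size) to dense vectors of length embedding_size.
embedding = nn.Embedding(vocab_size=100, embedding_size=8)
ids = Tensor(np.array([[1, 5, 7], [2, 0, 99]]), mindspore.int32)
print(embedding(ids).asnumpy().shape)   # (batch_size, input_length, embedding_size) = (2, 3, 8)
```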

View File

@ -253,7 +253,7 @@ class MSSSIM(Cell):
Args: Args:
max_val (Union[int, float]): The dynamic range of the pixel values (255 for 8-bit grayscale images). max_val (Union[int, float]): The dynamic range of the pixel values (255 for 8-bit grayscale images).
Default: 1.0. Default: 1.0.
power_factors (Union[tuple, list]): Iterable of weights for each of the scales. power_factors (Union[tuple, list]): Iterable of weights for each scale.
Default: (0.0448, 0.2856, 0.3001, 0.2363, 0.1333). Default values obtained by Wang et al. Default: (0.0448, 0.2856, 0.3001, 0.2363, 0.1333). Default values obtained by Wang et al.
filter_size (int): The size of the Gaussian filter. Default: 11. filter_size (int): The size of the Gaussian filter. Default: 11.
filter_sigma (float): The standard deviation of Gaussian kernel. Default: 1.5. filter_sigma (float): The standard deviation of Gaussian kernel. Default: 1.5.

View File

@ -35,7 +35,7 @@ class LSTM(Cell):
Applies a LSTM to the input. Applies a LSTM to the input.
There are two pipelines connecting two consecutive cells in a LSTM model; one is cell state pipeline There are two pipelines connecting two consecutive cells in a LSTM model; one is cell state pipeline
and another is hidden state pipeline. Denote two consecutive time nodes as :math:`t-1` and :math:`t`. and the other is hidden state pipeline. Denote two consecutive time nodes as :math:`t-1` and :math:`t`.
Given an input :math:`x_t` at time :math:`t`, an hidden state :math:`h_{t-1}` and an cell Given an input :math:`x_t` at time :math:`t`, an hidden state :math:`h_{t-1}` and an cell
state :math:`c_{t-1}` of the layer at time :math:`{t-1}`, the cell state and hidden state at state :math:`c_{t-1}` of the layer at time :math:`{t-1}`, the cell state and hidden state at
time :math:`t` is computed using an gating mechanism. Input gate :math:`i_t` is designed to protect the cell time :math:`t` is computed using an gating mechanism. Input gate :math:`i_t` is designed to protect the cell
@ -68,18 +68,17 @@ class LSTM(Cell):
input_size (int): Number of features of input. input_size (int): Number of features of input.
hidden_size (int): Number of features of hidden layer. hidden_size (int): Number of features of hidden layer.
num_layers (int): Number of layers of stacked LSTM . Default: 1. num_layers (int): Number of layers of stacked LSTM . Default: 1.
has_bias (bool): Specifies whether has bias `b_ih` and `b_hh`. Default: True. has_bias (bool): Whether the cell has bias `b_ih` and `b_hh`. Default: True.
batch_first (bool): Specifies whether the first dimension of input is batch_size. Default: False. batch_first (bool): Specifies whether the first dimension of input is batch_size. Default: False.
dropout (float, int): If not 0, append `Dropout` layer on the outputs of each dropout (float, int): If not 0, append `Dropout` layer on the outputs of each
LSTM layer except the last layer. Default 0. The range of dropout is [0.0, 1.0]. LSTM layer except the last layer. Default 0. The range of dropout is [0.0, 1.0].
bidirectional (bool): Specifies whether this is a bidirectional LSTM. If set True, bidirectional (bool): Specifies whether it is a bidirectional LSTM. Default: False.
number of directions will be 2 otherwise number of directions is 1. Default: False.
Inputs: Inputs:
- **input** (Tensor) - Tensor of shape (seq_len, batch_size, `input_size`). - **input** (Tensor) - Tensor of shape (seq_len, batch_size, `input_size`).
- **hx** (tuple) - A tuple of two Tensors (h_0, c_0) both of data type mindspore.float32 or - **hx** (tuple) - A tuple of two Tensors (h_0, c_0) both of data type mindspore.float32 or
mindspore.float16 and shape (num_directions * `num_layers`, batch_size, `hidden_size`). mindspore.float16 and shape (num_directions * `num_layers`, batch_size, `hidden_size`).
Data type of `hx` should be the same of `input`. Data type of `hx` should be the same as `input`.
Outputs: Outputs:
Tuple, a tuple containing (`output`, (`h_n`, `c_n`)). Tuple, a tuple containing (`output`, (`h_n`, `c_n`)).
@ -205,7 +204,7 @@ class LSTMCell(Cell):
Applies a LSTM layer to the input. Applies a LSTM layer to the input.
There are two pipelines connecting two consecutive cells in a LSTM model; one is cell state pipeline There are two pipelines connecting two consecutive cells in a LSTM model; one is cell state pipeline
and another is hidden state pipeline. Denote two consecutive time nodes as :math:`t-1` and :math:`t`. and the other is hidden state pipeline. Denote two consecutive time nodes as :math:`t-1` and :math:`t`.
Given an input :math:`x_t` at time :math:`t`, an hidden state :math:`h_{t-1}` and an cell Given an input :math:`x_t` at time :math:`t`, an hidden state :math:`h_{t-1}` and an cell
state :math:`c_{t-1}` of the layer at time :math:`{t-1}`, the cell state and hidden state at state :math:`c_{t-1}` of the layer at time :math:`{t-1}`, the cell state and hidden state at
time :math:`t` is computed using an gating mechanism. Input gate :math:`i_t` is designed to protect the cell time :math:`t` is computed using an gating mechanism. Input gate :math:`i_t` is designed to protect the cell
@ -238,7 +237,7 @@ class LSTMCell(Cell):
input_size (int): Number of features of input. input_size (int): Number of features of input.
hidden_size (int): Number of features of hidden layer. hidden_size (int): Number of features of hidden layer.
layer_index (int): index of current layer of stacked LSTM . Default: 0. layer_index (int): index of current layer of stacked LSTM . Default: 0.
has_bias (bool): Specifies whether has bias `b_ih` and `b_hh`. Default: True. has_bias (bool): Whether the cell has bias `b_ih` and `b_hh`. Default: True.
batch_first (bool): Specifies whether the first dimension of input is batch_size. Default: False. batch_first (bool): Specifies whether the first dimension of input is batch_size. Default: False.
dropout (float, int): If not 0, append `Dropout` layer on the outputs of each dropout (float, int): If not 0, append `Dropout` layer on the outputs of each
LSTM layer except the last layer. Default 0. The range of dropout is [0.0, 1.0]. LSTM layer except the last layer. Default 0. The range of dropout is [0.0, 1.0].
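An illustrative sketch of the `LSTM` cell with the shapes described above; device support for LSTM varies by backend and version, so treat this as a hedged example.
```python
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor

# Single-layer, unidirectional LSTM; hx is the (h_0, c_0) pair.
net = nn.LSTM(input_size=10, hidden_size=16, num_layers=1, bidirectional=False)
x = Tensor(np.ones([5, 3, 10]).astype(np.float32))    # (seq_len, batch_size, input_size)
h0 = Tensor(np.zeros([1, 3, 16]).astype(np.float32))  # (num_directions * num_layers, batch_size, hidden_size)
c0 = Tensor(np.zeros([1, 3, 16]).astype(np.float32))
output, (hn, cn) = net(x, (h0, c0))
print(output.asnumpy().shape)                          # (seq_len, batch_size, num_directions * hidden_size)
```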

View File

@ -243,6 +243,10 @@ class BatchNorm1d(_BatchNorm):
.. math:: .. math::
y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta
Note:
The implementation of BatchNorm is different in graph mode and pynative mode, therefore the mode is not
recommended to be changed after the net was initialized.
Args: Args:
num_features (int): `C` from an expected input of size (N, C). num_features (int): `C` from an expected input of size (N, C).
eps (float): A value added to the denominator for numerical stability. Default: 1e-5. eps (float): A value added to the denominator for numerical stability. Default: 1e-5.
@ -319,6 +323,10 @@ class BatchNorm2d(_BatchNorm):
.. math:: .. math::
y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta
Note:
The implementation of BatchNorm is different in graph mode and pynative mode, therefore the mode cannot be
changed after the net was initialized.
Args: Args:
num_features (int): `C` from an expected input of size (N, C, H, W). num_features (int): `C` from an expected input of size (N, C, H, W).
eps (float): A value added to the denominator for numerical stability. Default: 1e-5. eps (float): A value added to the denominator for numerical stability. Default: 1e-5.
@ -384,8 +392,8 @@ class GlobalBatchNorm(_BatchNorm):
r""" r"""
Global normalization layer over a N-dimension input. Global normalization layer over a N-dimension input.
Global Normalization is cross device synchronized batch normalization. Batch Normalization implementation Global Normalization is cross device synchronized batch normalization. The implementation of Batch Normalization
only normalize the data within each device. Global normalization will normalize the input within the group. only normalizes the data within each device. Global normalization will normalize the input within the group.
It has been described in the paper `Batch Normalization: Accelerating Deep Network Training by It has been described in the paper `Batch Normalization: Accelerating Deep Network Training by
Reducing Internal Covariate Shift <https://arxiv.org/abs/1502.03167>`_. It rescales and recenters the Reducing Internal Covariate Shift <https://arxiv.org/abs/1502.03167>`_. It rescales and recenters the
feature using a mini-batch of data and the learned parameters which can be described in the following formula. feature using a mini-batch of data and the learned parameters which can be described in the following formula.
@ -467,10 +475,10 @@ class LayerNorm(Cell):
Applies Layer Normalization over a mini-batch of inputs. Applies Layer Normalization over a mini-batch of inputs.
Layer normalization is widely used in recurrent neural networks. It applies Layer normalization is widely used in recurrent neural networks. It applies
normalization over a mini-batch of inputs for each single training case as described normalization on a mini-batch of inputs for each single training case as described
in the paper `Layer Normalization <https://arxiv.org/pdf/1607.06450.pdf>`_. Unlike batch in the paper `Layer Normalization <https://arxiv.org/pdf/1607.06450.pdf>`_. Unlike batch
normalization, layer normalization performs exactly the same computation at training and normalization, layer normalization performs exactly the same computation at training and
testing times. It can be described using the following formula. It is applied across all channels testing time. It can be described using the following formula. It is applied across all channels
and pixel but only one batch size. and pixel but only one batch size.
.. math:: .. math::
@ -545,7 +553,7 @@ class GroupNorm(Cell):
Group Normalization over a mini-batch of inputs. Group Normalization over a mini-batch of inputs.
Group normalization is widely used in recurrent neural networks. It applies Group normalization is widely used in recurrent neural networks. It applies
normalization over a mini-batch of inputs for each single training case as described normalization on a mini-batch of inputs for each single training case as described
in the paper `Group Normalization <https://arxiv.org/pdf/1803.08494.pdf>`_. Group normalization in the paper `Group Normalization <https://arxiv.org/pdf/1803.08494.pdf>`_. Group normalization
divides the channels into groups and computes within each group the mean and variance for normalization, divides the channels into groups and computes within each group the mean and variance for normalization,
and it performs very stable over a wide range of batch size. It can be described using the following formula. and it performs very stable over a wide range of batch size. It can be described using the following formula.
@ -557,7 +565,7 @@ class GroupNorm(Cell):
num_groups (int): The number of groups to be divided along the channel dimension. num_groups (int): The number of groups to be divided along the channel dimension.
num_channels (int): The number of channels per group. num_channels (int): The number of channels per group.
eps (float): A value added to the denominator for numerical stability. Default: 1e-5. eps (float): A value added to the denominator for numerical stability. Default: 1e-5.
affine (bool): A bool value, this layer will has learnable affine parameters when set to true. Default: True. affine (bool): A bool value, this layer will have learnable affine parameters when set to true. Default: True.
gamma_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the gamma weight. gamma_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the gamma weight.
The values of str refer to the function `initializer` including 'zeros', 'ones', 'xavier_uniform', The values of str refer to the function `initializer` including 'zeros', 'ones', 'xavier_uniform',
'he_uniform', etc. Default: 'ones'. 'he_uniform', etc. Default: 'ones'.
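An illustrative sketch of `GroupNorm` with the arguments described above, assuming `num_channels` is the total number of input channels and must be divisible by `num_groups`.
```python
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor

# Normalize 4 channels in 2 groups of 2; affine=True keeps learnable gamma and beta.
group_norm = nn.GroupNorm(num_groups=2, num_channels=4)
x = Tensor(np.ones([1, 4, 8, 8]).astype(np.float32))
print(group_norm(x).asnumpy().shape)   # (1, 4, 8, 8)
```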

View File

@ -61,7 +61,7 @@ class Conv2dBnAct(Cell):
Args: Args:
in_channels (int): The number of input channel :math:`C_{in}`. in_channels (int): The number of input channel :math:`C_{in}`.
out_channels (int): The number of output channel :math:`C_{out}`. out_channels (int): The number of output channel :math:`C_{out}`.
kernel_size (Union[int, tuple]): The data type is int or tuple with 2 integers. Specifies the height kernel_size (Union[int, tuple]): The data type is int or a tuple of 2 integers. Specifies the height
and width of the 2D convolution window. Single int means the value is for both height and width of and width of the 2D convolution window. Single int means the value is for both height and width of
the kernel. A tuple of 2 ints means the first value is for the height and the other is for the the kernel. A tuple of 2 ints means the first value is for the height and the other is for the
width of the kernel. width of the kernel.
@ -292,19 +292,19 @@ class BatchNormFoldCell(Cell):
class FakeQuantWithMinMax(Cell): class FakeQuantWithMinMax(Cell):
r""" r"""
Quantization aware op. This OP provide Fake quantization observer function on data with min and max. Quantization aware op. This OP provides the fake quantization observer function on data with min and max.
Args: Args:
min_init (int, float): The dimension of channel or 1(layer). Default: -6. min_init (int, float): The dimension of channel or 1(layer). Default: -6.
max_init (int, float): The dimension of channel or 1(layer). Default: 6. max_init (int, float): The dimension of channel or 1(layer). Default: 6.
ema (bool): Exponential Moving Average algorithm update min and max. Default: False. ema (bool): The exponential Moving Average algorithm updates min and max. Default: False.
ema_decay (float): Exponential Moving Average algorithm parameter. Default: 0.999. ema_decay (float): Exponential Moving Average algorithm parameter. Default: 0.999.
per_channel (bool): Quantization granularity based on layer or on channel. Default: False. per_channel (bool): Quantization granularity based on layer or on channel. Default: False.
channel_axis (int): Quantization by channel axis. Default: 1. channel_axis (int): Quantization by channel axis. Default: 1.
num_channels (int): declares the min and max channel size. Default: 1. num_channels (int): declares the min and max channel size. Default: 1.
num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8. num_bits (int): The bit number of quantization, supporting 4 and 8bits. Default: 8.
symmetric (bool): The quantization algorithm is symmetric or not. Default: False. symmetric (bool): Whether the quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False. narrow_range (bool): Whether the quantization algorithm uses narrow range or not. Default: False.
quant_delay (int): Quantization delay parameters according to the global step. Default: 0. quant_delay (int): Quantization delay parameters according to the global step. Default: 0.
Inputs: Inputs:
@ -431,7 +431,7 @@ class Conv2dBnFoldQuant(Cell):
variance vector. Default: 'ones'. variance vector. Default: 'ones'.
fake (bool): Whether Conv2dBnFoldQuant Cell adds FakeQuantWithMinMax op. Default: True. fake (bool): Whether Conv2dBnFoldQuant Cell adds FakeQuantWithMinMax op. Default: True.
per_channel (bool): FakeQuantWithMinMax Parameters. Default: False. per_channel (bool): FakeQuantWithMinMax Parameters. Default: False.
num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8. num_bits (int): The bit number of quantization, supporting 4 and 8bits. Default: 8.
symmetric (bool): The quantization algorithm is symmetric or not. Default: False. symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False. narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
quant_delay (int): The Quantization delay parameters according to the global step. Default: 0. quant_delay (int): The Quantization delay parameters according to the global step. Default: 0.
@ -614,7 +614,7 @@ class Conv2dBnWithoutFoldQuant(Cell):
Default: 'normal'. Default: 'normal'.
bias_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the bias vector. Default: 'zeros'. bias_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the bias vector. Default: 'zeros'.
per_channel (bool): FakeQuantWithMinMax Parameters. Default: False. per_channel (bool): FakeQuantWithMinMax Parameters. Default: False.
num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8. num_bits (int): The bit number of quantization, supporting 4 and 8bits. Default: 8.
symmetric (bool): The quantization algorithm is symmetric or not. Default: False. symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False. narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
quant_delay (int): Quantization delay parameters according to the global step. Default: 0. quant_delay (int): Quantization delay parameters according to the global step. Default: 0.
@ -736,7 +736,7 @@ class Conv2dQuant(Cell):
Default: 'normal'. Default: 'normal'.
bias_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the bias vector. Default: 'zeros'. bias_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the bias vector. Default: 'zeros'.
per_channel (bool): FakeQuantWithMinMax Parameters. Default: False. per_channel (bool): FakeQuantWithMinMax Parameters. Default: False.
num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8. num_bits (int): The bit number of quantization, supporting 4 and 8bits. Default: 8.
symmetric (bool): The quantization algorithm is symmetric or not. Default: False. symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False. narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
quant_delay (int): Quantization delay parameters according to the global step. Default: 0. quant_delay (int): Quantization delay parameters according to the global step. Default: 0.
@ -845,7 +845,7 @@ class DenseQuant(Cell):
has_bias (bool): Specifies whether the layer uses a bias vector. Default: True. has_bias (bool): Specifies whether the layer uses a bias vector. Default: True.
activation (str): The regularization function applied to the output of the layer, eg. 'relu'. Default: None. activation (str): The regularization function applied to the output of the layer, eg. 'relu'. Default: None.
per_channel (bool): FakeQuantWithMinMax Parameters. Default: False. per_channel (bool): FakeQuantWithMinMax Parameters. Default: False.
num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8. num_bits (int): The bit number of quantization, supporting 4 and 8bits. Default: 8.
symmetric (bool): The quantization algorithm is symmetric or not. Default: False. symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False. narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
quant_delay (int): Quantization delay parameters according to the global step. Default: 0. quant_delay (int): Quantization delay parameters according to the global step. Default: 0.
@ -947,15 +947,14 @@ class ActQuant(_QuantActivation):
r""" r"""
Quantization aware training activation function. Quantization aware training activation function.
Add Fake Quant OP after activation. Not Recommand to used these cell for Fake Quant Op Add the fake quant op to the end of activation op, by which the output of activation op will be truncated.
Will climp the max range of the activation and the relu6 do the same operation. Please check `FakeQuantWithMinMax` for more details.
This part is a more detailed overview of the ReLU6 op.
Args:
activation (Cell): Activation cell class.
ema_decay (float): Exponential Moving Average algorithm parameter. Default: 0.999.
per_channel (bool): Quantization granularity based on layer or on channel. Default: False.
num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
symmetric (bool): Whether the quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): Whether the quantization algorithm uses narrow range or not. Default: False.
quant_delay (int): Quantization delay parameters according to the global step. Default: 0.
@@ -1010,7 +1009,7 @@ class LeakyReLUQuant(_QuantActivation):
activation (Cell): Activation cell class.
ema_decay (float): Exponential Moving Average algorithm parameter. Default: 0.999.
per_channel (bool): Quantization granularity based on layer or on channel. Default: False.
num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
symmetric (bool): Whether the quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): Whether the quantization algorithm uses narrow range or not. Default: False.
quant_delay (int): Quantization delay parameters according to the global step. Default: 0.
@@ -1080,9 +1079,9 @@ class HSwishQuant(_QuantActivation):
activation (Cell): Activation cell class.
ema_decay (float): Exponential Moving Average algorithm parameter. Default: 0.999.
per_channel (bool): Quantization granularity based on layer or on channel. Default: False.
num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
symmetric (bool): Whether the quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): Whether the quantization algorithm uses narrow range or not. Default: False.
quant_delay (int): Quantization delay parameters according to the global step. Default: 0.
Inputs:
@@ -1149,9 +1148,9 @@ class HSigmoidQuant(_QuantActivation):
activation (Cell): Activation cell class.
ema_decay (float): Exponential Moving Average algorithm parameter. Default: 0.999.
per_channel (bool): Quantization granularity based on layer or on channel. Default: False.
num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
symmetric (bool): Whether the quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): Whether the quantization algorithm uses narrow range or not. Default: False.
quant_delay (int): Quantization delay parameters according to the global step. Default: 0.
Inputs:
@@ -1217,7 +1216,7 @@ class TensorAddQuant(Cell):
Args:
ema_decay (float): Exponential Moving Average algorithm parameter. Default: 0.999.
per_channel (bool): Quantization granularity based on layer or on channel. Default: False.
num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
symmetric (bool): Whether the quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): Whether the quantization algorithm uses narrow range or not. Default: False.
quant_delay (int): Quantization delay parameters according to the global step. Default: 0.
@@ -1269,7 +1268,7 @@ class MulQuant(Cell):
Args:
ema_decay (float): Exponential Moving Average algorithm parameter. Default: 0.999.
per_channel (bool): Quantization granularity based on layer or on channel. Default: False.
num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
symmetric (bool): Whether the quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): Whether the quantization algorithm uses narrow range or not. Default: False.
quant_delay (int): Quantization delay parameters according to the global step. Default: 0.
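To make the recurring `num_bits`, `symmetric` and `narrow_range` arguments above concrete, here is a minimal NumPy sketch of the quantize-dequantize step these cells simulate; it is only an illustration of the general technique, not the MindSpore kernel, and the function name is made up:

```python
import numpy as np

def fake_quant(x, num_bits=8, symmetric=False, narrow_range=False):
    """Quantize x to num_bits integer codes and dequantize back (illustrative only)."""
    qmin = 1 if narrow_range else 0            # narrow_range drops the lowest code
    qmax = 2 ** num_bits - 1
    x_min, x_max = float(x.min()), float(x.max())
    if symmetric:                              # force a range symmetric around zero
        bound = max(abs(x_min), abs(x_max))
        x_min, x_max = -bound, bound
    scale = (x_max - x_min) / (qmax - qmin) or 1.0   # avoid division by zero for constant inputs
    zero_point = qmin - round(x_min / scale)
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale            # the values the float network actually sees

data = np.array([-1.5, -0.2, 0.0, 0.7, 2.3], dtype=np.float32)
print(fake_quant(data, num_bits=8, symmetric=True, narrow_range=True))
```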

@@ -80,7 +80,7 @@ class L1Loss(_Loss):
When argument reduction is 'sum', the sum of :math:`L(x, y)` will be returned. :math:`N` is the batch size.
Args:
reduction (str): Type of reduction to be applied to loss. The optional values are "mean", "sum", and "none".
Default: "mean".
Inputs:
@@ -107,7 +107,7 @@ class L1Loss(_Loss):
class MSELoss(_Loss):
r"""
MSELoss creates a criterion to measure the mean squared error (squared L2-norm) between :math:`x` and :math:`y`
by element, where :math:`x` is the input and :math:`y` is the target.
For simplicity, let :math:`x` and :math:`y` be 1-dimensional Tensor with length :math:`N`,
@@ -120,7 +120,7 @@ class MSELoss(_Loss):
When argument reduction is 'sum', the sum of :math:`L(x, y)` will be returned. :math:`N` is the batch size.
Args:
reduction (str): Type of reduction to be applied to loss. The optional values are "mean", "sum", and "none".
Default: "mean".
Inputs:
@@ -210,14 +210,14 @@ class SoftmaxCrossEntropyWithLogits(_Loss):
Note:
While the target classes are mutually exclusive, i.e., only one class is positive in the target, the predicted
probabilities do not need to be exclusive. It is only required that the predicted probability distribution
of entry is a valid one.
Args:
is_grad (bool): Specifies whether to calculate grad only. Default: True.
sparse (bool): Specifies whether labels use sparse format or not. Default: False.
reduction (Union[str, None]): Type of reduction to be applied to loss. Supports 'sum' and 'mean'. If None,
do not perform reduction. Default: None.
smooth_factor (float): Label smoothing factor. It is an optional input which should be in range [0, 1].
Default: 0.
num_classes (int): The number of classes in the task. It is an optional input. Default: 2.
@@ -225,7 +225,7 @@ class SoftmaxCrossEntropyWithLogits(_Loss):
Inputs:
- **logits** (Tensor) - Tensor of shape (N, C).
- **labels** (Tensor) - Tensor of shape (N, ). If `sparse` is True, the type of
`labels` is mindspore.int32. If `sparse` is False, the type of `labels` is the same as the type of `logits`.
Outputs:
Tensor, a tensor of the same shape as logits with the component-wise
@@ -282,8 +282,8 @@ class SoftmaxCrossEntropyExpand(Cell):
where :math:`x_i` is a 1D score Tensor, :math:`t_i` is the target class.
Note:
When argument sparse is set to True, the format of the label is the index
ranging from :math:`0` to :math:`C - 1` instead of one-hot vectors.
Args:
sparse(bool): Specifies whether labels use sparse format or not. Default: False.
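A small usage sketch for the `reduction` argument described above, with arbitrary shapes and values:

```python
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor

logits = Tensor(np.array([[1.0, 2.0, 3.0], [2.0, 2.0, 1.0]], dtype=np.float32))
labels = Tensor(np.array([[1.0, 2.0, 2.0], [2.0, 0.0, 1.0]], dtype=np.float32))

l1 = nn.L1Loss(reduction='mean')    # 'mean' | 'sum' | 'none'
mse = nn.MSELoss(reduction='sum')
print(l1(logits, labels))           # scalar mean absolute error
print(mse(logits, labels))          # scalar sum of squared errors
```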

@@ -69,7 +69,7 @@ def names():
def get_metric_fn(name, *args, **kwargs):
"""
Gets the metric method based on the input name.
Args:
name (str): The name of metric method. Refer to the '__factory__'

@@ -82,7 +82,7 @@ class Metric(metaclass=ABCMeta):
@abstractmethod
def clear(self):
"""
An interface that describes the behavior of clearing the internal evaluation result.
Note:
All subclasses should override this interface.
@@ -92,7 +92,7 @@ class Metric(metaclass=ABCMeta):
@abstractmethod
def eval(self):
"""
An interface that describes the behavior of computing the evaluation result.
Note:
All subclasses should override this interface.
@@ -102,7 +102,7 @@ class Metric(metaclass=ABCMeta):
@abstractmethod
def update(self, *inputs):
"""
An interface that describes the behavior of updating the internal evaluation result.
Note:
All subclasses should override this interface.
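A sketch of the clear/update/eval protocol these abstract methods define, written as a hypothetical subclass; it assumes the `_convert_data` helper of `Metric` for Tensor-to-NumPy conversion:

```python
import numpy as np
from mindspore.nn.metrics import Metric

class MeanAbsoluteErrorMetric(Metric):
    """Toy metric: average absolute difference between predictions and targets."""

    def __init__(self):
        super(MeanAbsoluteErrorMetric, self).__init__()
        self.clear()

    def clear(self):
        """Reset the internal evaluation result."""
        self._abs_error_sum = 0.0
        self._samples_num = 0

    def update(self, *inputs):
        """Accumulate statistics from one batch: update(y_pred, y)."""
        y_pred = self._convert_data(inputs[0])
        y = self._convert_data(inputs[1])
        self._abs_error_sum += np.abs(y_pred - y).sum()
        self._samples_num += y.shape[0]

    def eval(self):
        """Compute the final evaluation result."""
        return self._abs_error_sum / max(self._samples_num, 1)
```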

@@ -36,8 +36,8 @@ def _update_run_op(beta1, beta2, eps, lr, weight_decay, param, m, v, gradient, d
Update parameters.
Args:
beta1 (Tensor): The exponential decay rate for the 1st moment estimations. Should be in range (0.0, 1.0).
beta2 (Tensor): The exponential decay rate for the 2nd moment estimations. Should be in range (0.0, 1.0).
eps (Tensor): Term added to the denominator to improve numerical stability. Should be greater than 0.
lr (Tensor): Learning rate.
weight_decay (Number): Weight decay. Should be equal to or greater than 0.
@@ -180,12 +180,12 @@ class Adam(Optimizer):
the order will be followed in the optimizer. There are no other keys in the `dict` and the parameters
which in the 'order_params' should be in one of group parameters.
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning rate.
When the learning_rate is an Iterable or a 1-D Tensor, use the dynamic learning rate, then
the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule,
use dynamic learning rate, the i-th learning rate will be calculated during the process of training
according to the formula of LearningRateSchedule. When the learning_rate is a float or a 0-D
Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be
equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float.
Default: 1e-3.
beta1 (float): The exponential decay rate for the 1st moment estimations. Should be in range (0.0, 1.0).
@@ -195,11 +195,11 @@ class Adam(Optimizer):
eps (float): Term added to the denominator to improve numerical stability. Should be greater than 0. Default:
1e-8.
use_locking (bool): Whether to enable a lock to protect updating variable tensors.
If true, updates of the var, m, and v tensors will be protected by a lock.
If false, the result is unpredictable. Default: False.
use_nesterov (bool): Whether to use Nesterov Accelerated Gradient (NAG) algorithm to update the gradients.
If true, update the gradients using NAG.
If false, update the gradients without using NAG. Default: False.
weight_decay (float): Weight decay (L2 penalty). It should be equal to or greater than 0. Default: 0.0.
loss_scale (float): A floating point value for the loss scale. Should be greater than 0. Default: 1.0.
@@ -304,12 +304,12 @@ class AdamWeightDecay(Optimizer):
the order will be followed in the optimizer. There are no other keys in the `dict` and the parameters
which in the 'order_params' should be in one of group parameters.
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning rate.
When the learning_rate is an Iterable or a 1-D Tensor, use the dynamic learning rate, then
the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule,
use dynamic learning rate, the i-th learning rate will be calculated during the process of training
according to the formula of LearningRateSchedule. When the learning_rate is a float or a 0-D
Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be
equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float.
Default: 1e-3.
beta1 (float): The exponential decay rate for the 1st moment estimations. Default: 0.9.
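A usage sketch of the fixed and dynamic learning-rate forms described above; `TinyNet` is a made-up network used only so the snippet is self-contained:

```python
import mindspore.nn as nn

class TinyNet(nn.Cell):
    """Minimal network with trainable parameters for the optimizer sketches below."""
    def __init__(self):
        super(TinyNet, self).__init__()
        self.fc = nn.Dense(4, 2)

    def construct(self, x):
        return self.fc(x)

net = TinyNet()

# Fixed learning rate: a float (or a 0-D Tensor) uses the same value at every step.
optimizer = nn.Adam(net.trainable_params(), learning_rate=1e-3,
                    beta1=0.9, beta2=0.999, eps=1e-8, weight_decay=0.0)

# Dynamic learning rate: with a list (or a 1-D Tensor), the i-th step uses the i-th value.
lr_schedule = [0.01] * 1000 + [0.001] * 1000
optimizer = nn.Adam(net.trainable_params(), learning_rate=lr_schedule)
```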

@@ -114,12 +114,12 @@ class FTRL(Optimizer):
than or equal to zero. Use fixed learning rate if lr_power is zero. Default: -0.5.
l1 (float): l1 regularization strength, must be greater than or equal to zero. Default: 0.0.
l2 (float): l2 regularization strength, must be greater than or equal to zero. Default: 0.0.
use_locking (bool): If True, use locks for the update operation. Default: False.
loss_scale (float): Value for the loss scale. It should be equal to or greater than 1.0. Default: 1.0.
weight_decay (float): Weight decay value to multiply weight, must be zero or a positive value. Default: 0.0.
Inputs:
- **grads** (tuple[Tensor]) - The gradients of `params` in the optimizer, the shape is the same as the `params`
in the optimizer.
Outputs:

@@ -39,8 +39,8 @@ def _update_run_op(beta1, beta2, eps, global_step, lr, weight_decay, param, m, v
Update parameters.
Args:
beta1 (Tensor): The exponential decay rate for the 1st moment estimations. Should be in range (0.0, 1.0).
beta2 (Tensor): The exponential decay rate for the 2nd moment estimations. Should be in range (0.0, 1.0).
eps (Tensor): Term added to the denominator to improve numerical stability. Should be greater than 0.
lr (Tensor): Learning rate.
weight_decay (Number): Weight decay. Should be equal to or greater than 0.
@@ -122,8 +122,8 @@ def _update_run_op_graph_kernel(beta1, beta2, eps, global_step, lr, weight_decay
Update parameters.
Args:
beta1 (Tensor): The exponential decay rate for the 1st moment estimations. Should be in range (0.0, 1.0).
beta2 (Tensor): The exponential decay rate for the 2nd moment estimations. Should be in range (0.0, 1.0).
eps (Tensor): Term added to the denominator to improve numerical stability. Should be greater than 0.
lr (Tensor): Learning rate.
weight_decay (Number): Weight decay. Should be equal to or greater than 0.
@@ -184,7 +184,7 @@ def _check_param_value(beta1, beta2, eps, prim_name):
class Lamb(Optimizer):
"""
Lamb Dynamic Learning Rate.
LAMB is an optimization algorithm employing a layerwise adaptive large batch
optimization technique. Refer to the paper `LARGE BATCH OPTIMIZATION FOR DEEP LEARNING: TRAINING BERT IN 76
@@ -214,16 +214,16 @@ class Lamb(Optimizer):
the order will be followed in optimizer. There are no other keys in the `dict` and the parameters which
in the value of 'order_params' should be in one of group parameters.
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning rate.
When the learning_rate is an Iterable or a 1-D Tensor, use dynamic learning rate, then
the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule,
use dynamic learning rate, the i-th learning rate will be calculated during the process of training
according to the formula of LearningRateSchedule. When the learning_rate is a float or a 0-D
Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be
equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float.
beta1 (float): The exponential decay rate for the 1st moment estimations. Default: 0.9.
Should be in range (0.0, 1.0).
beta2 (float): The exponential decay rate for the 2nd moment estimations. Default: 0.999.
Should be in range (0.0, 1.0).
eps (float): Term added to the denominator to improve numerical stability. Default: 1e-6.
Should be greater than 0.
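The `LearningRateSchedule` option mentioned above can be sketched as follows, reusing `net` from the Adam sketch; the `PolynomialDecayLR` class and its argument names are assumptions about the schedule API of this release, not confirmed by the diff:

```python
import mindspore.nn as nn

# Polynomial decay from 0.01 to 1e-5 over 10000 steps; the i-th learning rate is
# computed from the schedule's formula during training.
lr = nn.PolynomialDecayLR(learning_rate=0.01, end_learning_rate=1e-5,
                          decay_steps=10000, power=1.0)
optimizer = nn.Lamb(net.trainable_params(), learning_rate=lr,
                    beta1=0.9, beta2=0.999, eps=1e-6, weight_decay=0.01)
```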

@@ -58,12 +58,12 @@ class LARS(Optimizer):
epsilon (float): Term added to the denominator to improve numerical stability. Default: 1e-05.
coefficient (float): Trust coefficient for calculating the local learning rate. Default: 0.001.
use_clip (bool): Whether to use clip operation for calculating the local learning rate. Default: False.
lars_filter (Function): A function to determine whether to apply the LARS algorithm. Default:
lambda x: 'LayerNorm' not in x.name and 'bias' not in x.name.
Inputs:
- **gradients** (tuple[Tensor]) - The gradients of `params` in the optimizer, the shape is the same
as the `params` in the optimizer.
Outputs:
Union[Tensor[bool], tuple[Parameter]], it depends on the output of `optimizer`.
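LARS wraps another optimizer rather than being used on its own; a minimal sketch, reusing `net` from above and the argument names listed in this docstring:

```python
import mindspore.nn as nn

base_opt = nn.Momentum(net.trainable_params(), learning_rate=0.1, momentum=0.9)

# Apply the LARS local learning-rate scaling to everything except biases and LayerNorm.
optimizer = nn.LARS(base_opt, epsilon=1e-5, coefficient=0.001,
                    lars_filter=lambda x: 'LayerNorm' not in x.name and 'bias' not in x.name)
```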

@@ -127,26 +127,26 @@ class LazyAdam(Optimizer):
the order will be followed in optimizer. There are no other keys in the `dict` and the parameters which
in the value of 'order_params' should be in one of group parameters.
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning rate.
When the learning_rate is an Iterable or a 1-D Tensor, use dynamic learning rate, then
the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule,
use dynamic learning rate, the i-th learning rate will be calculated during the process of training
according to the formula of LearningRateSchedule. When the learning_rate is a float or a 0-D
Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be
equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float.
Default: 1e-3.
beta1 (float): The exponential decay rate for the 1st moment estimations. Should be in range (0.0, 1.0).
Default: 0.9.
beta2 (float): The exponential decay rate for the 2nd moment estimations. Should be in range (0.0, 1.0).
Default: 0.999.
eps (float): Term added to the denominator to improve numerical stability. Should be greater than 0. Default:
1e-8.
use_locking (bool): Whether to enable a lock to protect updating variable tensors.
If true, updates of the var, m, and v tensors will be protected by a lock.
If false, the result is unpredictable. Default: False.
use_nesterov (bool): Whether to use Nesterov Accelerated Gradient (NAG) algorithm to update the gradients.
If true, update the gradients using NAG.
If false, update the gradients without using NAG. Default: False.
weight_decay (float): Weight decay (L2 penalty). Default: 0.0.
loss_scale (float): A floating point value for the loss scale. Should be equal to or greater than 1. Default:
1.0.

@@ -83,12 +83,12 @@ class Momentum(Optimizer):
the order will be followed in optimizer. There are no other keys in the `dict` and the parameters which
in the value of 'order_params' should be in one of group parameters.
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning rate.
When the learning_rate is an Iterable or a 1-D Tensor, use dynamic learning rate, then
the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule,
use dynamic learning rate, the i-th learning rate will be calculated during the process of training
according to the formula of LearningRateSchedule. When the learning_rate is a float or a 0-D
Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be
equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float.
momentum (float): Hyperparameter of type float, means momentum for the moving average.
It should be at least 0.0.

@@ -40,8 +40,6 @@ class Optimizer(Cell):
"""
Base class for all optimizers.
Note:
This class defines the API to add Ops to train a model. Never use
this class directly, but instead instantiate one of its subclasses.
@@ -55,12 +53,12 @@ class Optimizer(Cell):
To improve parameter groups performance, the customized order of parameters can be supported.
Args:
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning
rate. When the learning_rate is an Iterable or a 1-D Tensor, use dynamic learning rate, then
the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule,
use dynamic learning rate, the i-th learning rate will be calculated during the process of training
according to the formula of LearningRateSchedule. When the learning_rate is a float or a 0-D
Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be
equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float.
parameters (Union[list[Parameter], list[dict]]): When the `parameters` is a list of `Parameter` which will be
updated, the element in `parameters` should be class `Parameter`. When the `parameters` is a list of `dict`,
@@ -84,8 +82,8 @@ class Optimizer(Cell):
type of `loss_scale` input is int, it will be converted to float. Default: 1.0.
Raises:
ValueError: If the learning_rate is a Tensor, but the dimension of tensor is greater than 1.
TypeError: If the learning_rate is not any of the three types: float, Tensor, or Iterable.
"""
def __init__(self, learning_rate, parameters, weight_decay=0.0, loss_scale=1.0):
@@ -179,7 +177,7 @@ class Optimizer(Cell):
An approach to reduce the overfitting of a deep learning neural network model.
Args:
gradients (tuple[Tensor]): The gradients of `self.parameters`, and have the same shape as
`self.parameters`.
Returns:
@@ -204,7 +202,7 @@ class Optimizer(Cell):
network.
Args:
gradients (tuple[Tensor]): The gradients of `self.parameters`, and have the same shape as
`self.parameters`.
Returns:
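The grouped-parameter form of `parameters` described above can be sketched with `nn.Momentum`, reusing `net` from the Adam sketch; the name filters are purely illustrative:

```python
import mindspore.nn as nn

conv_params = list(filter(lambda p: 'conv' in p.name, net.trainable_params()))
other_params = list(filter(lambda p: 'conv' not in p.name, net.trainable_params()))

group_params = [{'params': conv_params, 'weight_decay': 0.01},
                {'params': other_params, 'lr': 0.01},
                {'order_params': net.trainable_params()}]

# conv parameters use weight_decay 0.01 and the default learning rate; the others use
# lr 0.01 and the default weight_decay; 'order_params' fixes the parameter order.
optimizer = nn.Momentum(group_params, learning_rate=0.1, momentum=0.9, weight_decay=0.0)
```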

@@ -87,22 +87,22 @@ class ProximalAdagrad(Optimizer):
in the value of 'order_params' should be in one of group parameters.
accum (float): The starting value for accumulators, must be zero or positive values. Default: 0.1.
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning rate.
When the learning_rate is an Iterable or a 1-D Tensor, use dynamic learning rate, then
the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule,
use dynamic learning rate, the i-th learning rate will be calculated during the process of training
according to the formula of LearningRateSchedule. When the learning_rate is a float or a 0-D
Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be
equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float.
Default: 0.001.
l1 (float): l1 regularization strength, must be greater than or equal to zero. Default: 0.0.
l2 (float): l2 regularization strength, must be greater than or equal to zero. Default: 0.0.
use_locking (bool): If True, use locks for the update operation. Default: False.
loss_scale (float): Value for the loss scale. It should be greater than 0.0. Default: 1.0.
weight_decay (float): Weight decay value to multiply weight, must be zero or a positive value. Default: 0.0.
Inputs:
- **grads** (tuple[Tensor]) - The gradients of `params` in the optimizer, the shape is the same as the `params`
in the optimizer.
Outputs:

@@ -106,12 +106,12 @@ class RMSProp(Optimizer):
the order will be followed in optimizer. There are no other keys in the `dict` and the parameters which
in the value of 'order_params' should be in one of group parameters.
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning rate.
When the learning_rate is an Iterable or a 1-D Tensor, use dynamic learning rate, then
the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule,
use dynamic learning rate, the i-th learning rate will be calculated during the process of training
according to the formula of LearningRateSchedule. When the learning_rate is a float or a 0-D
Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be
equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float.
Default: 0.1.
decay (float): Decay rate. Should be equal to or greater than 0. Default: 0.9.

@@ -78,12 +78,12 @@ class SGD(Optimizer):
the order will be followed in optimizer. There are no other keys in the `dict` and the parameters which
in the value of 'order_params' should be in one of group parameters.
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning rate.
When the learning_rate is an Iterable or a 1-D Tensor, use dynamic learning rate, then
the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule,
use dynamic learning rate, the i-th learning rate will be calculated during the process of training
according to the formula of LearningRateSchedule. When the learning_rate is a float or a 0-D
Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be
equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float.
Default: 0.1.
momentum (float): A floating point value for the momentum. It should be at least 0.0. Default: 0.0.
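A short SGD sketch with momentum and Nesterov acceleration, reusing `net` from above; Nesterov is assumed to require a positive momentum and zero dampening:

```python
import mindspore.nn as nn

optimizer = nn.SGD(net.trainable_params(), learning_rate=0.01, momentum=0.9,
                   dampening=0.0, weight_decay=1e-4, nesterov=True)
```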

@@ -138,9 +138,9 @@ class TrainOneStepCell(Cell):
r"""
Network training package class.
Wraps the network with an optimizer. The resulting Cell is trained with input *inputs.
The backward graph will be created in the construct function to update the parameters. Different
parallel modes are available for training.
Args:
network (Cell): The training network.
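A minimal training-step sketch around `TrainOneStepCell`, reusing `net` from the Adam sketch; the synthetic one-batch dataset below is a stand-in for a real data pipeline:

```python
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor

loss_fn = nn.SoftmaxCrossEntropyWithLogits(sparse=True)
optimizer = nn.Momentum(net.trainable_params(), learning_rate=0.1, momentum=0.9)

net_with_loss = nn.WithLossCell(net, loss_fn)        # attaches the loss to the network
train_net = nn.TrainOneStepCell(net_with_loss, optimizer)
train_net.set_train()

# A tiny synthetic "dataset": one batch of two 4-feature samples with int32 class labels.
dataset = [(Tensor(np.random.rand(2, 4).astype(np.float32)),
            Tensor(np.array([0, 1], dtype=np.int32)))]

for data, label in dataset:
    loss = train_net(data, label)                    # forward, backward and update in one call
```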
@@ -231,14 +231,14 @@ class DataWrapper(Cell):
class GetNextSingleOp(Cell):
"""
Cell to run the get next operation.
Args:
dataset_types (list[:class:`mindspore.dtype`]): The types of dataset.
dataset_shapes (list[tuple[int]]): The shapes of dataset.
queue_name (str): Queue name to fetch the data.
For detailed information, refer to `ops.operations.GetNext`.
"""
def __init__(self, dataset_types, dataset_shapes, queue_name):
@@ -360,7 +360,7 @@ class ParameterUpdate(Cell):
param (Parameter): The parameter to be updated manually.
Raises:
KeyError: If parameter with the specified name does not exist.
Examples:
>>> network = Net()

@@ -329,7 +329,7 @@ class DistributedGradReducer(Cell):
def construct(self, grads):
"""
Under certain circumstances, the data precision of grads could be mixed with float16 and float32. Thus, the
result of AllReduce is unreliable. To solve the problem, grads should be cast to float32 before AllReduce,
and cast back after the operation.

@@ -54,8 +54,8 @@ class DynamicLossScaleUpdateCell(Cell):
Dynamic Loss scale update cell.
For loss scaling training, the initial loss scaling value will be set to be `loss_scale_value`.
In each training step, the loss scaling value will be updated by loss scaling value/`scale_factor`
when there is an overflow. And it will be increased by loss scaling value * `scale_factor` if there is no
overflow for a continuous `scale_window` steps. This cell is used for Graph mode training in which all
logic will be executed on the device side (another training mode is the normal (non-sink) mode in which some
logic will be executed on the host).
@@ -133,7 +133,7 @@ class FixedLossScaleUpdateCell(Cell):
"""
Static scale update cell, the loss scaling value will not be updated.
For usage, refer to `DynamicLossScaleUpdateCell`.
Args:
loss_scale_value (float): Init loss scale.
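A sketch of plugging the dynamic loss-scale update cell into a loss-scaled training step, reusing `net_with_loss` and `optimizer` from the TrainOneStepCell sketch above; passing the update cell as the third positional argument of `TrainOneStepWithLossScaleCell` is an assumption about this release, and the keyword name may differ across versions:

```python
import mindspore.nn as nn

# Start at 2**12; divide by scale_factor on overflow, multiply by scale_factor
# after scale_window consecutive steps without overflow.
manager = nn.DynamicLossScaleUpdateCell(loss_scale_value=2 ** 12,
                                        scale_factor=2, scale_window=1000)
train_net = nn.TrainOneStepWithLossScaleCell(net_with_loss, optimizer, manager)
train_net.set_train()
```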

@@ -57,7 +57,7 @@ class _TupleGetItemTensor(base.TupleGetItemTensor_):
data (tuple): A tuple of items.
index (Tensor): The index in tensor.
Outputs:
Type, is the same as the element type of data.
"""
def __init__(self, name):
@@ -81,7 +81,7 @@ def _tuple_getitem_by_number(data, number_index):
number_index (Number): Index in scalar.
Outputs:
Type, is the same as the element type of data.
"""
return F.tuple_getitem(data, number_index)
@@ -96,7 +96,7 @@ def _tuple_getitem_by_slice(data, slice_index):
slice_index (Slice): Index in slice.
Outputs:
Tuple, element type is the same as the element type of data.
"""
return _tuple_slice(data, slice_index)
@@ -111,7 +111,7 @@ def _tuple_getitem_by_tensor(data, tensor_index):
tensor_index (Tensor): Index to select item.
Outputs:
Type, is the same as the element type of data.
"""
return _tuple_get_item_tensor(data, tensor_index)
@@ -126,7 +126,7 @@ def _list_getitem_by_number(data, number_index):
number_index (Number): Index in scalar.
Outputs:
Type is the same as the element type of data.
"""
return F.list_getitem(data, number_index)
@@ -186,7 +186,7 @@ def _tensor_getitem_by_slice(data, slice_index):
slice_index (Slice): Index in slice.
Outputs:
Tensor, element type is the same as the element type of data.
"""
return compile_utils.tensor_index_by_slice(data, slice_index)
@@ -201,7 +201,7 @@ def _tensor_getitem_by_tensor(data, tensor_index):
tensor_index (Tensor): An index expressed by tensor.
Outputs:
Tensor, element type is the same as the element type of data.
"""
return compile_utils.tensor_index_by_tensor(data, tensor_index)
@@ -216,7 +216,7 @@ def _tensor_getitem_by_tuple(data, tuple_index):
tuple_index (tuple): Index in tuple.
Outputs:
Tensor, element type is the same as the element type of data.
"""
return compile_utils.tensor_index_by_tuple(data, tuple_index)
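These overloads are what make Python-style indexing work inside a compiled `construct`; a small sketch under graph mode (the `TakeRows` cell is a made-up name, and the exact set of supported index forms depends on the version):

```python
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor, context

context.set_context(mode=context.GRAPH_MODE)

class TakeRows(nn.Cell):
    def construct(self, x, index):
        by_tensor = x[index]     # resolved by _tensor_getitem_by_tensor
        by_slice = x[0:2]        # resolved by _tensor_getitem_by_slice
        by_tuple = x[1, 2]       # resolved by _tensor_getitem_by_tuple
        return by_tensor, by_slice, by_tuple

x = Tensor(np.arange(12).reshape(3, 4).astype(np.float32))
index = Tensor(np.array([2, 0], dtype=np.int32))
rows, head, element = TakeRows()(x, index)
```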

@@ -32,7 +32,7 @@ def _list_setitem_with_string(data, number_index, value):
number_index (Number): Index of data.
Outputs:
list, type is the same as the element type of data.
"""
return F.list_setitem(data, number_index, value)
@@ -48,7 +48,7 @@ def _list_setitem_with_number(data, number_index, value):
value (Number): Value given.
Outputs:
list, type is the same as the element type of data.
"""
return F.list_setitem(data, number_index, value)
@@ -64,7 +64,7 @@ def _list_setitem_with_Tensor(data, number_index, value):
value (Tensor): Value given.
Outputs:
list, type is the same as the element type of data.
"""
return F.list_setitem(data, number_index, value)
@@ -80,7 +80,7 @@ def _list_setitem_with_List(data, number_index, value):
value (list): Value given.
Outputs:
list, type is the same as the element type of data.
"""
return F.list_setitem(data, number_index, value)
@@ -96,7 +96,7 @@ def _list_setitem_with_Tuple(data, number_index, value):
value (list): Value given.
Outputs:
list, type is the same as the element type of data.
"""
return F.list_setitem(data, number_index, value)

@@ -158,18 +158,18 @@ class ExtractImagePatches(PrimitiveWithInfer):
The input tensor must be a 4-D tensor and the data format is NHWC.
Args:
ksizes (Union[tuple[int], list[int]]): The size of sliding window, should be a tuple or a list of integers,
and the format is [1, ksize_row, ksize_col, 1].
strides (Union[tuple[int], list[int]]): Distance between the centers of the two consecutive patches,
should be a tuple or a list of integers, and the format is [1, stride_row, stride_col, 1].
rates (Union[tuple[int], list[int]]): In each extracted patch, the gap between the corresponding dimension
pixel positions, should be a tuple or a list of integers, and the format is [1, rate_row, rate_col, 1].
padding (str): The type of padding algorithm, is a string whose value is "same" or "valid",
not case sensitive. Default: "valid".
- same: Means that the patch can take the part beyond the original image, and this part is filled with 0.
- valid: Means that the taken patch area must be completely contained in the original image.
Inputs:
- **input_x** (Tensor) - A 4-D tensor whose shape is [in_batch, in_row, in_col, in_depth] and
@@ -177,7 +177,7 @@ class ExtractImagePatches(PrimitiveWithInfer):
Outputs:
Tensor, a 4-D tensor whose data type is the same as 'input_x',
and the shape is [out_batch, out_row, out_col, out_depth], the out_batch is the same as the in_batch.
"""
@prim_attr_register
@@ -436,8 +436,8 @@ class MatrixDiag(PrimitiveWithInfer):
Returns a batched diagonal tensor with given batched diagonal values.
Inputs:
- **x** (Tensor) - A tensor to be element-wise multiplied by `assist`. It can be one of the following data
types: float32, float16, int32, int8, and uint8.
- **assist** (Tensor) - An eye tensor of the same type as `x`. Its rank must be greater than or equal to 2 and
its last dimension must be equal to the second to last dimension.
@@ -490,7 +490,7 @@ class MatrixDiagPart(PrimitiveWithInfer):
Returns the batched diagonal part of a batched tensor.
Inputs:
- **x** (Tensor) - The batched tensor. It can be one of the following data types:
float32, float16, int32, int8, uint8.
- **assist** (Tensor) - An eye tensor of the same type as `x`, with the same shape as `x`.
@@ -531,7 +531,7 @@ class MatrixSetDiag(PrimitiveWithInfer):
Modify the batched diagonal part of a batched tensor.
Inputs:
- **x** (Tensor) - The batched tensor. It can be one of the following data types:
float32, float16, int32, int8, uint8.
- **assist** (Tensor) - An eye tensor of the same type as `x`, with the same shape as `x`.
- **diagonal** (Tensor) - The diagonal values.

@@ -178,8 +178,8 @@ class FakeQuantPerLayer(PrimitiveWithInfer):
quant_delay (int): Quantization delay parameter. Before the delay step in training, the simulated
quantization-aware function is not updated. After the delay step, the simulated
quantization-aware function takes effect. Default: 0.
symmetric (bool): Whether the quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): Whether the quantization algorithm uses narrow range or not. Default: False.
training (bool): Whether to train the network or not. Default: True.
Inputs:
@@ -318,8 +318,8 @@ class FakeQuantPerChannel(PrimitiveWithInfer):
quant_delay (int): Quantization delay parameter. Before the delay step in training, the weight data is
not updated to simulate the quantize operation. After the delay step, the simulated
quantize operation takes effect. Default: 0.
symmetric (bool): Whether the quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): Whether the quantization algorithm uses narrow range or not. Default: False.
training (bool): Whether to train the network or not. Default: True.
channel_axis (int): Quantization by channel axis. Ascend backend only supports 0 or 1. Default: 1.
View File
@@ -3359,7 +3359,7 @@ class InplaceUpdate(PrimitiveWithInfer):
indices (Union[int, tuple]): Indices into the left-most dimension of `x`.
Inputs:
- **x** (Tensor) - A tensor to be updated in place. It can be one of the following data types:
float32, float16, int32.
- **v** (Tensor) - A tensor of the same type as `x`. Same dimension size as `x` except
the first dimension, which must be the same as the size of `indices`.
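A minimal usage sketch based only on the inputs listed above (the concrete values are illustrative, not taken from the patch):

```python
import numpy as np
import mindspore
from mindspore import Tensor
from mindspore.ops import operations as P

# Replace rows 0 and 1 of x with the rows of v; the first dimension of v matches len(indices).
x = Tensor(np.array([[1, 2], [3, 4], [5, 6]]), mindspore.float32)
v = Tensor(np.array([[0.5, 1.0], [1.0, 1.5]]), mindspore.float32)
inplace_update = P.InplaceUpdate(indices=(0, 1))
output = inplace_update(x, v)
```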
@@ -3474,7 +3474,7 @@ class TransShape(PrimitiveWithInfer):
- **out_shape** (tuple[int]) - The shape of output data.
Outputs:
Tensor, a tensor whose data type is the same as `input_x` and whose shape is the same as `out_shape`.
"""
@prim_attr_register
def __init__(self):
View File
@@ -31,7 +31,7 @@ class ScalarCast(PrimitiveWithInfer):
- **input_y** (mindspore.dtype) - The type to cast to. Only a constant value is allowed.
Outputs:
Scalar. The type is the same as the Python type corresponding to `input_y`.
Examples:
>>> scalar_cast = P.ScalarCast()
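The docstring example above is truncated in this hunk; a hedged completion based on the documented inputs (the call form and result are assumptions, not part of the patch):

```python
import mindspore
from mindspore.ops import operations as P

scalar_cast = P.ScalarCast()
output = scalar_cast(255.0, mindspore.int32)  # casts the Python scalar 255.0 to int32, i.e. 255
```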
View File
@@ -132,7 +132,7 @@ class TensorAdd(_MathBinaryOp):
a bool when the first input is a tensor or a tensor whose data type is number or bool.
Outputs:
Tensor, the shape is the same as the shape after broadcasting,
and the data type is the one with higher precision or more digits among the two inputs.
Examples:
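A hedged sketch of the broadcasting and dtype behaviour described for this family of binary operators (the input values are illustrative):

```python
import numpy as np
import mindspore
from mindspore import Tensor
from mindspore.ops import operations as P

add = P.TensorAdd()
x = Tensor(np.array([[1.0], [2.0]]), mindspore.float32)       # shape (2, 1)
y = Tensor(np.array([10.0, 20.0, 30.0]), mindspore.float32)   # shape (3,)
out = add(x, y)  # broadcast to shape (2, 3); dtype follows the higher-precision input
```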
@@ -1067,7 +1067,7 @@ class Sub(_MathBinaryOp):
a bool when the first input is a tensor or a tensor whose data type is number or bool.
Outputs:
Tensor, the shape is the same as the shape after broadcasting,
and the data type is the one with higher precision or more digits among the two inputs.
Examples:
@@ -1105,7 +1105,7 @@ class Mul(_MathBinaryOp):
a bool when the first input is a tensor or a tensor whose data type is number or bool.
Outputs:
Tensor, the shape is the same as the shape after broadcasting,
and the data type is the one with higher precision or more digits among the two inputs.
Examples:
@@ -1144,7 +1144,7 @@ class SquaredDifference(_MathBinaryOp):
float16, float32, int32 or bool.
Outputs:
Tensor, the shape is the same as the shape after broadcasting,
and the data type is the one with higher precision or more digits among the two inputs.
Examples:
@@ -1333,7 +1333,7 @@ class Pow(_MathBinaryOp):
a bool when the first input is a tensor or a tensor whose data type is number or bool.
Outputs:
Tensor, the shape is the same as the shape after broadcasting,
and the data type is the one with higher precision or more digits among the two inputs.
Examples:
@@ -1618,7 +1618,7 @@ class Minimum(_MathBinaryOp):
a bool when the first input is a tensor or a tensor whose data type is number or bool.
Outputs:
Tensor, the shape is the same as the shape after broadcasting,
and the data type is the one with higher precision or more digits among the two inputs.
Examples:
@@ -1656,7 +1656,7 @@ class Maximum(_MathBinaryOp):
a bool when the first input is a tensor or a tensor whose data type is number or bool.
Outputs:
Tensor, the shape is the same as the shape after broadcasting,
and the data type is the one with higher precision or more digits among the two inputs.
Examples:
@@ -1694,7 +1694,7 @@ class RealDiv(_MathBinaryOp):
a bool when the first input is a tensor or a tensor whose data type is number or bool.
Outputs:
Tensor, the shape is the same as the shape after broadcasting,
and the data type is the one with higher precision or more digits among the two inputs.
Examples:
@@ -1733,7 +1733,7 @@ class Div(_MathBinaryOp):
is a number or a bool, the second input should be a tensor whose data type is number or bool.
Outputs:
Tensor, the shape is the same as the shape after broadcasting,
and the data type is the one with higher precision or more digits among the two inputs.
Raises:
@@ -1772,7 +1772,7 @@ class DivNoNan(_MathBinaryOp):
a bool when the first input is a tensor or a tensor whose data type is number or bool.
Outputs:
Tensor, the shape is the same as the shape after broadcasting,
and the data type is the one with higher precision or more digits among the two inputs.
Raises:
@@ -1814,7 +1814,7 @@ class FloorDiv(_MathBinaryOp):
a bool when the first input is a tensor or a tensor whose data type is number or bool.
Outputs:
Tensor, the shape is the same as the shape after broadcasting,
and the data type is the one with higher precision or more digits among the two inputs.
Examples:
@@ -1844,7 +1844,7 @@ class TruncateDiv(_MathBinaryOp):
a bool when the first input is a tensor or a tensor whose data type is number or bool.
Outputs:
Tensor, the shape is the same as the shape after broadcasting,
and the data type is the one with higher precision or more digits among the two inputs.
Examples:
@@ -1873,7 +1873,7 @@ class TruncateMod(_MathBinaryOp):
a bool when the first input is a tensor or a tensor whose data type is number or bool.
Outputs:
Tensor, the shape is the same as the shape after broadcasting,
and the data type is the one with higher precision or more digits among the two inputs.
Examples:
@@ -1900,7 +1900,7 @@ class Mod(_MathBinaryOp):
the second input should be a tensor whose data type is number.
Outputs:
Tensor, the shape is the same as the shape after broadcasting,
and the data type is the one with higher precision or more digits among the two inputs.
Raises:
@@ -1967,7 +1967,7 @@ class FloorMod(_MathBinaryOp):
a bool when the first input is a tensor or a tensor whose data type is number or bool.
Outputs:
Tensor, the shape is the same as the shape after broadcasting,
and the data type is the one with higher precision or more digits among the two inputs.
Examples:
@@ -2025,7 +2025,7 @@ class Xdivy(_MathBinaryOp):
a bool when the first input is a tensor or a tensor whose data type is float16, float32 or bool.
Outputs:
Tensor, the shape is the same as the shape after broadcasting,
and the data type is the one with higher precision or more digits among the two inputs.
Examples:
@@ -2059,7 +2059,7 @@ class Xlogy(_MathBinaryOp):
The value must be positive.
Outputs:
Tensor, the shape is the same as the shape after broadcasting,
and the data type is the one with higher precision or more digits among the two inputs.
Examples:
@@ -2219,7 +2219,7 @@ class Equal(_LogicBinaryOp):
a bool when the first input is a tensor or a tensor whose data type is number or bool.
Outputs:
Tensor, the shape is the same as the shape after broadcasting, and the data type is bool.
Examples:
>>> input_x = Tensor(np.array([1, 2, 3]), mindspore.float32)
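The Examples block above is truncated in this hunk; a hedged completion showing the bool output described (the second operand is illustrative):

```python
import numpy as np
import mindspore
from mindspore import Tensor
from mindspore.ops import operations as P

input_x = Tensor(np.array([1, 2, 3]), mindspore.float32)
input_y = Tensor(np.array([1, 2, 4]), mindspore.float32)
equal = P.Equal()
output = equal(input_x, input_y)  # element-wise comparison; shape (3,), dtype bool: [True, True, False]
```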
@@ -2250,7 +2250,7 @@ class ApproximateEqual(_LogicBinaryOp):
- **x2** (Tensor) - A tensor of the same type and shape as `x1`.
Outputs:
Tensor, the shape is the same as the shape of `x1`, and the data type is bool.
Examples:
>>> x1 = Tensor(np.array([1, 2, 3]), mindspore.float32)
@@ -2328,7 +2328,7 @@ class NotEqual(_LogicBinaryOp):
a bool when the first input is a tensor or a tensor whose data type is number or bool.
Outputs:
Tensor, the shape is the same as the shape after broadcasting, and the data type is bool.
Examples:
>>> input_x = Tensor(np.array([1, 2, 3]), mindspore.float32)
@@ -2364,7 +2364,7 @@ class Greater(_LogicBinaryOp):
a bool when the first input is a tensor or a tensor whose data type is number or bool.
Outputs:
Tensor, the shape is the same as the shape after broadcasting, and the data type is bool.
Examples:
>>> input_x = Tensor(np.array([1, 2, 3]), mindspore.int32)
@@ -2399,7 +2399,7 @@ class GreaterEqual(_LogicBinaryOp):
a bool when the first input is a tensor or a tensor whose data type is number or bool.
Outputs:
Tensor, the shape is the same as the shape after broadcasting, and the data type is bool.
Examples:
>>> input_x = Tensor(np.array([1, 2, 3]), mindspore.int32)
@@ -2434,7 +2434,7 @@ class Less(_LogicBinaryOp):
a bool when the first input is a tensor or a tensor whose data type is number or bool.
Outputs:
Tensor, the shape is the same as the shape after broadcasting, and the data type is bool.
Examples:
>>> input_x = Tensor(np.array([1, 2, 3]), mindspore.int32)
@@ -2469,7 +2469,7 @@ class LessEqual(_LogicBinaryOp):
a bool when the first input is a tensor or a tensor whose data type is number or bool.
Outputs:
Tensor, the shape is the same as the shape after broadcasting, and the data type is bool.
Examples:
>>> input_x = Tensor(np.array([1, 2, 3]), mindspore.int32)
@@ -2495,7 +2495,7 @@ class LogicalNot(PrimitiveWithInfer):
- **input_x** (Tensor) - The input tensor whose dtype is bool.
Outputs:
Tensor, the shape is the same as `input_x`, and the dtype is bool.
Examples:
>>> input_x = Tensor(np.array([True, False, True]), mindspore.bool_)
@@ -2533,7 +2533,7 @@ class LogicalAnd(_LogicBinaryOp):
a tensor whose data type is bool.
Outputs:
Tensor, the shape is the same as the shape after broadcasting, and the data type is bool.
Examples:
>>> input_x = Tensor(np.array([True, False, True]), mindspore.bool_)
@@ -2563,7 +2563,7 @@ class LogicalOr(_LogicBinaryOp):
a tensor whose data type is bool.
Outputs:
Tensor, the shape is the same as the shape after broadcasting, and the data type is bool.
Examples:
>>> input_x = Tensor(np.array([True, False, True]), mindspore.bool_)
@@ -3182,7 +3182,7 @@ class Atan2(_MathBinaryOp):
- **input_y** (Tensor) - The input tensor.
Outputs:
Tensor, the shape is the same as the shape after broadcasting, and the data type is the same as `input_x`.
Examples:
>>> input_x = Tensor(np.array([[0, 1]]), mindspore.float32)
View File
@@ -100,7 +100,7 @@ class Softmax(PrimitiveWithInfer):
Softmax operation.
Applies the Softmax operation to the input tensor on the specified axis.
Suppose a slice in the given axis is :math:`x`, then for each element :math:`x_i`,
the Softmax function is shown as follows:
.. math::
@@ -151,7 +151,7 @@ class LogSoftmax(PrimitiveWithInfer):
Log Softmax activation function.
Applies the Log Softmax function to the input tensor on the specified axis.
Suppose a slice in the given axis is :math:`x`, then for each element :math:`x_i`,
the Log Softmax function is shown as follows:
.. math::
@@ -429,7 +429,7 @@ class HSwish(PrimitiveWithInfer):
.. math::
\text{hswish}(x_{i}) = x_{i} * \frac{ReLU6(x_{i} + 3)}{6},
where :math:`x_{i}` is the :math:`i`-th slice in the given dimension of the input Tensor.
Inputs:
- **input_data** (Tensor) - The input of HSwish, data type should be float16 or float32.
@@ -502,7 +502,7 @@ class HSigmoid(PrimitiveWithInfer):
.. math::
\text{hsigmoid}(x_{i}) = max(0, min(1, \frac{x_{i} + 3}{6})),
where :math:`x_{i}` is the :math:`i`-th slice in the given dimension of the input Tensor.
Inputs:
- **input_data** (Tensor) - The input of HSigmoid, data type should be float16 or float32.
@@ -2234,7 +2234,7 @@ class DropoutDoMask(PrimitiveWithInfer):
shape of `input_x` must be the same as the value of `DropoutGenMask`'s input `shape`. If a wrong `mask` is input,
the output of `DropoutDoMask` is unpredictable.
- **keep_prob** (Tensor) - The keep rate, between 0 and 1, e.g. keep_prob = 0.9,
means dropping out 10% of input units. The value of `keep_prob` is the same as the input `keep_prob` of
`DropoutGenMask`.
Outputs:
@@ -2674,9 +2674,9 @@ class Pad(PrimitiveWithInfer):
Args:
paddings (tuple): The shape of parameter `paddings` is (N, 2). N is the rank of input data. All elements of
paddings are int type. For the `D` th dimension of the input, paddings[D, 0] indicates how many sizes to be
extended ahead of the input tensor in the `D` th dimension, and paddings[D, 1] indicates how many sizes to
be extended behind the input tensor in the `D` th dimension.
Inputs:
- **input_x** (Tensor) - The input tensor.
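A hedged sketch of the `paddings` layout described above (the concrete shapes and values are illustrative):

```python
import numpy as np
import mindspore
from mindspore import Tensor
from mindspore.ops import operations as P

# Pad 1 row before / 2 rows after dimension 0, and 2 columns before / 1 column after dimension 1.
pad_op = P.Pad(paddings=((1, 2), (2, 1)))
input_x = Tensor(np.ones((2, 3)), mindspore.float32)
output = pad_op(input_x)  # resulting shape: (2 + 1 + 2, 3 + 2 + 1) = (5, 6)
```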
@@ -2733,9 +2733,9 @@ class MirrorPad(PrimitiveWithInfer):
- **input_x** (Tensor) - The input tensor.
- **paddings** (Tensor) - The paddings tensor. The value of `paddings` is a matrix(list),
and its shape is (N, 2). N is the rank of input data. All elements of paddings
are int type. For the `D` th dimension of the input, paddings[D, 0] indicates how many sizes to be
extended ahead of the input tensor in the `D` th dimension, and paddings[D, 1] indicates how many sizes to
be extended behind the input tensor in the `D` th dimension.
Outputs:
Tensor, the tensor after padding.
@@ -2880,11 +2880,11 @@ class Adam(PrimitiveWithInfer):
Args:
use_locking (bool): Whether to enable a lock to protect the updating of variable tensors.
If true, updates of the var, m, and v tensors will be protected by a lock.
If false, the result is unpredictable. Default: False.
use_nesterov (bool): Whether to use the Nesterov Accelerated Gradient (NAG) algorithm to update the gradients.
If true, update the gradients using NAG.
If false, update the gradients without using NAG. Default: False.
Inputs:
- **var** (Tensor) - Weights to be updated.
@@ -2894,8 +2894,8 @@ class Adam(PrimitiveWithInfer):
- **beta1_power** (float) - :math:`beta_1^t` in the updating formula.
- **beta2_power** (float) - :math:`beta_2^t` in the updating formula.
- **lr** (float) - :math:`l` in the updating formula.
- **beta1** (float) - The exponential decay rate for the 1st moment estimations.
- **beta2** (float) - The exponential decay rate for the 2nd moment estimations.
- **epsilon** (float) - Term added to the denominator to improve numerical stability.
- **gradient** (Tensor) - Gradients. Has the same type as `var`.
@@ -2974,11 +2974,11 @@ class FusedSparseAdam(PrimitiveWithInfer):
Args:
use_locking (bool): Whether to enable a lock to protect the updating of variable tensors.
If true, updates of the var, m, and v tensors will be protected by a lock.
If false, the result is unpredictable. Default: False.
use_nesterov (bool): Whether to use the Nesterov Accelerated Gradient (NAG) algorithm to update the gradients.
If true, update the gradients using NAG.
If false, update the gradients without using NAG. Default: False.
Inputs:
- **var** (Parameter) - Parameters to be updated. With float32 data type.
@@ -2989,8 +2989,8 @@ class FusedSparseAdam(PrimitiveWithInfer):
- **beta1_power** (Tensor) - :math:`beta_1^t` in the updating formula. With float32 data type.
- **beta2_power** (Tensor) - :math:`beta_2^t` in the updating formula. With float32 data type.
- **lr** (Tensor) - :math:`l` in the updating formula. With float32 data type.
- **beta1** (Tensor) - The exponential decay rate for the 1st moment estimations. With float32 data type.
- **beta2** (Tensor) - The exponential decay rate for the 2nd moment estimations. With float32 data type.
- **epsilon** (Tensor) - Term added to the denominator to improve numerical stability. With float32 data type.
- **gradient** (Tensor) - Gradient value. With float32 data type.
- **indices** (Tensor) - Gradient indices. With int32 data type.
@@ -3108,11 +3108,11 @@ class FusedSparseLazyAdam(PrimitiveWithInfer):
Args:
use_locking (bool): Whether to enable a lock to protect the updating of variable tensors.
If true, updates of the var, m, and v tensors will be protected by a lock.
If false, the result is unpredictable. Default: False.
use_nesterov (bool): Whether to use the Nesterov Accelerated Gradient (NAG) algorithm to update the gradients.
If true, update the gradients using NAG.
If false, update the gradients without using NAG. Default: False.
Inputs:
- **var** (Parameter) - Parameters to be updated. With float32 data type.
@@ -3123,8 +3123,8 @@ class FusedSparseLazyAdam(PrimitiveWithInfer):
- **beta1_power** (Tensor) - :math:`beta_1^t` in the updating formula. With float32 data type.
- **beta2_power** (Tensor) - :math:`beta_2^t` in the updating formula. With float32 data type.
- **lr** (Tensor) - :math:`l` in the updating formula. With float32 data type.
- **beta1** (Tensor) - The exponential decay rate for the 1st moment estimations. With float32 data type.
- **beta2** (Tensor) - The exponential decay rate for the 2nd moment estimations. With float32 data type.
- **epsilon** (Tensor) - Term added to the denominator to improve numerical stability. With float32 data type.
- **gradient** (Tensor) - Gradient value. With float32 data type.
- **indices** (Tensor) - Gradient indices. With int32 data type.
@@ -3227,7 +3227,7 @@ class FusedSparseFtrl(PrimitiveWithInfer):
l2 (float): l2 regularization strength, must be greater than or equal to zero.
lr_power (float): Learning rate power controls how the learning rate decreases during training,
must be less than or equal to zero. Use fixed learning rate if `lr_power` is zero.
use_locking (bool): Use locks for the update operation if true. Default: False.
Inputs:
- **var** (Parameter) - The variable to be updated. The data type must be float32.
@@ -3320,7 +3320,7 @@ class FusedSparseProximalAdagrad(PrimitiveWithInfer):
var = \frac{sign(\text{prox_v})}{1 + lr * l2} * \max(\left| \text{prox_v} \right| - lr * l1, 0)
Args:
use_locking (bool): If true, updates of the var and accum tensors will be protected. Default: False.
Inputs:
- **var** (Parameter) - Variable tensor to be updated. The data type must be float32.
@@ -3415,7 +3415,7 @@ class KLDivLoss(PrimitiveWithInfer):
\end{cases}
Args:
reduction (str): Specifies the reduction to be applied to the output.
Its value should be one of 'none', 'mean', 'sum'. Default: 'mean'.
Inputs:
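A hedged sketch of how the `reduction` argument above is typically supplied (the input tensors are illustrative and not taken from the patch):

```python
import numpy as np
import mindspore
from mindspore import Tensor
from mindspore.ops import operations as P

kldiv_loss = P.KLDivLoss(reduction='mean')  # one of 'none', 'mean', 'sum'
input_x = Tensor(np.array([0.2, 0.7, 0.1]), mindspore.float32)  # predictions
input_y = Tensor(np.array([0.0, 1.0, 0.0]), mindspore.float32)  # targets
output = kldiv_loss(input_x, input_y)  # a scalar tensor when reduction='mean'
```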
@@ -3487,7 +3487,7 @@ class BinaryCrossEntropy(PrimitiveWithInfer):
\end{cases}
Args:
reduction (str): Specifies the reduction to be applied to the output.
Its value should be one of 'none', 'mean', 'sum'. Default: 'mean'.
Inputs:
@@ -3575,9 +3575,9 @@ class ApplyAdaMax(PrimitiveWithInfer):
With float32 or float16 data type.
- **lr** (Union[Number, Tensor]) - Learning rate, :math:`l` in the updating formula, should be scalar.
With float32 or float16 data type.
- **beta1** (Union[Number, Tensor]) - The exponential decay rate for the 1st moment estimations,
should be scalar. With float32 or float16 data type.
- **beta2** (Union[Number, Tensor]) - The exponential decay rate for the 2nd moment estimations,
should be scalar. With float32 or float16 data type.
- **epsilon** (Union[Number, Tensor]) - A small value added for numerical stability, should be scalar.
With float32 or float16 data type.
@@ -3939,7 +3939,7 @@ class SparseApplyAdagrad(PrimitiveWithInfer):
Args:
lr (float): Learning rate.
update_slots (bool): If `True`, `accum` will be updated. Default: True.
use_locking (bool): If true, updates of the var and accum tensors will be protected. Default: False.
Inputs:
- **var** (Parameter) - Variable to be updated. The data type must be float16 or float32.
@@ -4099,7 +4099,7 @@ class ApplyProximalAdagrad(PrimitiveWithInfer):
var = \frac{sign(\text{prox_v})}{1 + lr * l2} * \max(\left| \text{prox_v} \right| - lr * l1, 0)
Args:
use_locking (bool): If true, updates of the var and accum tensors will be protected. Default: False.
Inputs:
- **var** (Parameter) - Variable to be updated. The data type should be float16 or float32.
@@ -4195,7 +4195,7 @@ class SparseApplyProximalAdagrad(PrimitiveWithInfer):
var = \frac{sign(\text{prox_v})}{1 + lr * l2} * \max(\left| \text{prox_v} \right| - lr * l1, 0)
Args:
use_locking (bool): If true, updates of the var and accum tensors will be protected. Default: False.
Inputs:
- **var** (Parameter) - Variable tensor to be updated. The data type must be float16 or float32.
@@ -4697,7 +4697,7 @@ class ApplyFtrl(PrimitiveWithInfer):
Update relevant entries according to the FTRL scheme.
Args:
use_locking (bool): Use locks for the update operation if true. Default: False.
Inputs:
- **var** (Parameter) - The variable to be updated. The data type should be float16 or float32.
@@ -4788,7 +4788,7 @@ class SparseApplyFtrl(PrimitiveWithInfer):
l2 (float): l2 regularization strength, must be greater than or equal to zero.
lr_power (float): Learning rate power controls how the learning rate decreases during training,
must be less than or equal to zero. Use fixed learning rate if `lr_power` is zero.
use_locking (bool): Use locks for the update operation if true. Default: False.
Inputs:
- **var** (Parameter) - The variable to be updated. The data type must be float16 or float32.
@@ -4967,8 +4967,8 @@ class ConfusionMulGrad(PrimitiveWithInfer):
axis (Union[int, tuple[int], list[int]]): The dimensions to reduce.
Default: (), reduce all dimensions. Only constant value is allowed.
keep_dims (bool):
- If true, keep these reduced dimensions and the length is 1.
- If false, don't keep these dimensions. Default: False.
Inputs:
- **input_0** (Tensor) - The input Tensor.
@@ -5094,9 +5094,9 @@ class CTCLoss(PrimitiveWithInfer):
Calculates the CTC (Connectionist Temporal Classification) loss. Also calculates the gradient.
Args:
preprocess_collapse_repeated (bool): If true, repeated labels are collapsed prior to the CTC calculation.
Default: False.
ctc_merge_repeated (bool): If false, during CTC calculation, repeated non-blank labels will not be merged
and are interpreted as individual labels. This is a simplified version of CTC.
Default: True.
ignore_longer_outputs_than_inputs (bool): If true, sequences with longer outputs than inputs will be ignored.
@@ -5192,7 +5192,7 @@ class BasicLSTMCell(PrimitiveWithInfer):
keep_prob (float): If not 1.0, append `Dropout` layer on the outputs of each
LSTM layer except the last layer. Default: 1.0. The range of dropout is [0.0, 1.0].
forget_bias (float): Add forget bias to forget gate biases in order to decrease former scale. Default: 1.0.
state_is_tuple (bool): If true, the state is a tensor tuple containing h and c; if false, it is one tensor
that needs to be split first. Default: True.
activation (str): Activation. Default: "tanh".
View File
@@ -496,12 +496,11 @@ def convert_quant_network(network,
per_channel (bool, list or tuple): Quantization granularity based on layer or on channel. If `True`,
quantization is based on per channel; otherwise it is based on per layer. The first element represents
weights and the second element represents data flow. Default: (False, False)
symmetric (bool, list or tuple): Whether the quantization algorithm is symmetric or not. If `True`,
the algorithm is symmetric; otherwise it is asymmetric. The first element represents weights and the
second element represents data flow. Default: (False, False)
narrow_range (bool, list or tuple): Whether the quantization algorithm uses narrow range or not.
The first element represents weights and the second element represents data flow. Default: (False, False)
Returns:
Cell, the network that has been changed into a quantization aware training network cell.
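A hedged usage sketch based only on the parameters documented above; the import path, the stand-in network, and the keyword values are assumptions rather than part of this patch:

```python
import mindspore.nn as nn
# Assumed import path for the helper documented above; adjust to the installed MindSpore version.
from mindspore.train.quant import quant

net = nn.Dense(3, 2)  # stand-in for a full fp32 network (nn.Cell)
quant_net = quant.convert_quant_network(
    net,
    per_channel=(True, False),   # per-channel for weights, per-layer for data flow
    symmetric=(True, False),     # symmetric weight quantization, asymmetric data flow
    narrow_range=(False, False),
)
```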
View File
@@ -31,8 +31,8 @@ def cal_quantization_params(input_min,
input_max (numpy.ndarray): The dimension of channel or 1.
data_type (numpy type): Can be numpy int8 or numpy uint8.
num_bits (int): The number of quantization bits; 4 and 8 bits are supported. Default: 8.
symmetric (bool): Whether the quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): Whether the quantization algorithm uses narrow range or not. Default: False.
Returns:
scale (numpy.ndarray): quantization parameter.
View File
@@ -34,7 +34,7 @@ pkg_dir = os.path.join(pwd, 'build/package')
def _read_file(filename):
    with open(os.path.join(pwd, filename), encoding='UTF-8') as f:
        return f.read()
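The change above pins the file encoding when reading package metadata; a minimal standalone illustration of the same pattern (the filename is an assumption, not taken from the patch):

```python
import os

pwd = os.path.dirname(os.path.realpath(__file__))

def _read_file(filename):
    # Explicit UTF-8 avoids UnicodeDecodeError on platforms whose default locale
    # encoding is not UTF-8, which matters for files containing non-ASCII characters.
    with open(os.path.join(pwd, filename), encoding='UTF-8') as f:
        return f.read()

long_description = _read_file('README.md')  # hypothetical use as the package long description
```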