!29134 optimizes the documentation of Conv1d, Conv2d, Conv3d, Gather, etc.

Merge pull request !29134 from wangshuide/code_docs_wsd_master1
i-robot 2022-01-20 07:47:45 +00:00 committed by Gitee
commit 863daec7fe
GPG Key ID: 173E9B9CA92EEF8F
12 changed files with 590 additions and 449 deletions


@ -13,7 +13,7 @@ mindspore.nn.Conv1d
\sum_{k = 0}^{C_{in} - 1} \text{ccor}({\text{weight}(C_{\text{out}_j}, k), \text{X}(N_i, k)})
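where :math:`ccor` is the `cross-correlation <https://en.wikipedia.org/wiki/Cross-correlation>`_, :math:`C_{in}` is the channel number of the input, :math:`out_{j}` corresponds to the :math:`j`-th channel of the output, and :math:`j` is in the range :math:`[0, C_{out}-1]`.
:math:`\text{weight}(C_{\text{out}_j}, k)` is a convolution kernel slice with shape :math:`(\text{kernel_size})`, where :math:`\text{kernel_size}` is the width of the convolution kernel. :math:`\text{bias}` is the bias parameter.
:math:`\text{weight}(C_{\text{out}_j}, k)` is a convolution kernel slice with shape :math:`(\text{kernel_size})`, where :math:`\text{kernel_size}` is the width of the convolution kernel. :math:`\text{bias}` is the bias parameter and :math:`\text{X}` is the input Tensor.
The shape of the full convolution kernel is :math:`(C_{out}, C_{in} / \text{group}, \text{kernel_size})`, where `group` is the number of groups into which the input `x` is split along the channel dimension.
For more details, please refer to the paper `Gradient Based Learning Applied to Document Recognition <http://vision.stanford.edu/cs598_spring07/papers/Lecun98.pdf>`_.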
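The formula in this hunk can be traced by hand with a small pure-Python sketch. It is an illustration only, not MindSpore code: `conv1d_naive` is a hypothetical helper, and it assumes a single sample, `stride` 1, no padding, `dilation` 1 and `group` 1.

```python
def conv1d_naive(x, weight, bias):
    """Naive 1D convolution (cross-correlation) for one sample.

    x:      nested list of shape [C_in][L]
    weight: nested list of shape [C_out][C_in][K]
    bias:   list of length C_out
    Returns a nested list of shape [C_out][L - K + 1].
    """
    c_in, length = len(x), len(x[0])
    kernel = len(weight[0][0])
    out_len = length - kernel + 1
    out = []
    for j, w_j in enumerate(weight):          # j indexes the output channel
        row = []
        for t in range(out_len):              # t indexes the output position
            acc = bias[j]
            for k in range(c_in):             # sum over input channels
                # ccor: slide the kernel slice over channel k without flipping it
                acc += sum(w_j[k][m] * x[k][t + m] for m in range(kernel))
            row.append(acc)
        out.append(row)
    return out


x = [[1, 2, 3, 4], [0, 1, 0, 1]]   # C_in = 2, L = 4
weight = [[[1, 0], [0, 1]]]        # C_out = 1, C_in = 2, K = 2
bias = [10]
print(conv1d_naive(x, weight, bias))  # [[12, 12, 14]]
```

Each output value is the bias plus the per-channel cross-correlations, which is the sum over :math:`k` in the formula.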
@ -22,18 +22,18 @@ mindspore.nn.Conv1d
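- **in_channels** (int) - The channel number of the input Tensor of the Conv1d layer.
- **out_channels** (int) - The channel number of the output Tensor of the Conv1d layer.
- **kernel_size** (int) - Specifies the width of the 1D convolution kernel.
- **stride** (int) - The movement stride of the convolution kernel. Default: 1.
- **stride** (int) - The movement stride of the 1D convolution kernel. Default: 1.
- **pad_mode** (str) - Specifies the padding mode. The optional values are "same", "valid" and "pad". Default: "same".
- same: The output width is the same as the input width after integer division by `stride`. If this mode is set, the value of `padding` must be 0.
- valid: Returns the output of the valid computation without padding. Extra pixels that do not satisfy the computation are discarded. If this mode is set, the value of `padding` must be 0.
- pad: Pads the input. `padding` zeros are padded on both sides of the input. If this mode is set, `padding` must be greater than or equal to 0.
- pad: Pads the input. `padding` zeros are padded on both sides of the input. If this mode is set, the value of `padding` must be greater than or equal to 0.
- **padding** (int) - The number of paddings on both sides of the input. The value should be greater than or equal to 0. Default: 0.
- **dilation** (int) - The dilation size of the convolution kernel. If :math:`k > 1`, the kernel samples at intervals of `k` elements. The value of `k` is in the range [1, L]. Default: 1.
- **group** (int) - Splits the filter into groups; `in_ channels` and `out_channels` must be divisible by `group`. Default: 1.
- **dilation** (int) - The dilation size of the 1D convolution kernel. If :math:`k > 1`, the kernel samples at intervals of `k` elements. The value of k is in the range [1, L]. Default: 1.
- **group** (int) - Splits the filter into groups; `in_channels` and `out_channels` must be divisible by `group`. Default: 1.
- **has_bias** (bool) - Whether the Conv1d layer adds a bias parameter. Default: False.
- **weight_init** (Union[Tensor, str, Initializer, numbers.Number]) - Initialization method of the weight matrix. It can be a Tensor, a str, an Initializer or a numbers.Number. When a str is used, values from the "TruncatedNormal", "Normal", "Uniform", "HeUniform" and "XavierUniform" distributions as well as the constant "One" and "Zero" distributions can be chosen, and the aliases "xavier_uniform", "he_uniform", "ones" and "zeros" are accepted. Both uppercase and lowercase strings are acceptable. Refer to the values of Initializer for more details. Default: "normal".
- **weight_init** (Union[Tensor, str, Initializer, numbers.Number]) - Initialization method of the weight parameter. It can be a Tensor, a str, an Initializer or a numbers.Number. When a str is used, values from the "TruncatedNormal", "Normal", "Uniform", "HeUniform" and "XavierUniform" distributions as well as the constant "One" and "Zero" distributions can be chosen, and the aliases "xavier_uniform", "he_uniform", "ones" and "zeros" are accepted. Both uppercase and lowercase strings are acceptable. Refer to the values of Initializer for more details. Default: "normal".
- **bias_init** (Union[Tensor, str, Initializer, numbers.Number]) - Initialization method of the bias parameter. The available initialization methods are the same as for "weight_init". Refer to the values of Initializer for more details. Default: "zeros".
**Inputs:**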


@ -16,18 +16,18 @@ mindspore.nn.Conv1dTranspose
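- **in_channels** (int) - The channel number of the input Tensor of the Conv1dTranspose layer.
- **out_channels** (int) - The channel number of the output Tensor of the Conv1dTranspose layer.
- **kernel_size** (int) - Specifies the width of the 1D convolution kernel.
- **stride** (int) - The movement stride of the convolution kernel. Default: 1.
- **stride** (int) - The movement stride of the 1D convolution kernel. Default: 1.
- **pad_mode** (str) - Specifies the padding mode. The optional values are "same", "valid" and "pad". Default: "same".
- same: The output width is the same as the input width after integer division by `stride`. If this mode is set, the value of `padding` must be 0.
- valid: Returns the output of the valid computation without padding. Extra pixels that do not satisfy the computation are discarded. If this mode is set, the value of `padding` must be 0.
- pad: Pads the input.  `padding` zeros are padded on both sides of the input. If this mode is set, `padding` must be greater than or equal to 0.
- pad: Pads the input. `padding` zeros are padded on both sides of the input. If this mode is set, `padding` must be greater than or equal to 0.
- **padding** (int) - The number of paddings on both sides of the input. Default: 0.
- **dilation** (int) - The dilation size of the convolution kernel. If :math:`k > 1`, the kernel samples at intervals of `k` elements. The value of `k` is in the range [1, L]. Default: 1.
- **group** (int) - Splits the filter into groups; `in_ channels` and `out_channels` must be divisible by `group`. When `group` is greater than 1, the Ascend platform is not yet supported. Default: 1.
- **dilation** (int) - The dilation size of the 1D convolution kernel. If :math:`k > 1`, the kernel samples at intervals of `k` elements. The value of k is in the range [1, L]. Default: 1.
- **group** (int) - Splits the filter into groups; `in_channels` and `out_channels` must be divisible by `group`. When `group` is greater than 1, the Ascend platform is not yet supported. Default: 1.
- **has_bias** (bool) - Whether the Conv1dTranspose layer adds a bias parameter. Default: False.
- **weight_init** (Union[Tensor, str, Initializer, numbers.Number]) - Initialization method of the weight matrix. It can be a Tensor, a str, an Initializer or a numbers.Number. When a str is used, values from the "TruncatedNormal", "Normal", "Uniform", "HeUniform" and "XavierUniform" distributions as well as the constant "One" and "Zero" distributions can be chosen, and the aliases "xavier_uniform", "he_uniform", "ones" and "zeros" are accepted. Both uppercase and lowercase strings are acceptable. Refer to the values of Initializer for more details. Default: "normal".
- **weight_init** (Union[Tensor, str, Initializer, numbers.Number]) - Initialization method of the weight parameter. It can be a Tensor, a str, an Initializer or a numbers.Number. When a str is used, values from the "TruncatedNormal", "Normal", "Uniform", "HeUniform" and "XavierUniform" distributions as well as the constant "One" and "Zero" distributions can be chosen, and the aliases "xavier_uniform", "he_uniform", "ones" and "zeros" are accepted. Both uppercase and lowercase strings are acceptable. Refer to the values of Initializer for more details. Default: "normal".
- **bias_init** (Union[Tensor, str, Initializer, numbers.Number]) - Initialization method of the bias parameter. The available initialization methods are the same as for "weight_init". Refer to the values of Initializer for more details. Default: "zeros".
**Inputs:**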


@ -12,7 +12,7 @@ mindspore.nn.Conv3d
\sum_{k = 0}^{C_{in} - 1} \text{ccor}({\text{weight}(C_{\text{out}_j}, k), \text{X}(N_i, k)})
where :math:`ccor` is the `cross-correlation <https://en.wikipedia.org/wiki/Cross-correlation>`_, :math:`C_{in}` is the channel number of the input, :math:`out_{j}` corresponds to the :math:`j`-th channel of the output, and :math:`j` is in the range :math:`[0, C_{out}-1]`.
:math:`\text{weight}(C_{\text{out}_j}, k)` is a convolution kernel slice with shape :math:`(\text{kernel_size[0]}, \text{kernel_size[1]}, \text{kernel_size[2]})`, where :math:`\text{kernel_size[0]}`, :math:`\text{kernel_size[1]}` and :math:`\text{kernel_size[2]}` are the depth, height and width of the convolution kernel. :math:`\text{bias}` is the bias parameter.
:math:`\text{weight}(C_{\text{out}_j}, k)` is a convolution kernel slice with shape :math:`(\text{kernel_size[0]}, \text{kernel_size[1]}, \text{kernel_size[2]})`, where :math:`\text{kernel_size[0]}`, :math:`\text{kernel_size[1]}` and :math:`\text{kernel_size[2]}` are the depth, height and width of the convolution kernel. :math:`\text{bias}` is the bias parameter and :math:`\text{X}` is the input Tensor.
The shape of the full convolution kernel is :math:`(C_{out}, C_{in} / \text{group}, \text{kernel_size[0]}, \text{kernel_size[1]}, \text{kernel_size[2]})`, where `group` is the number of groups into which the input `x` is split along the channel dimension.
For more details, please refer to the paper `Gradient Based Learning Applied to Document Recognition <http://vision.stanford.edu/cs598_spring07/papers/Lecun98.pdf>`_.
@ -21,18 +21,18 @@ mindspore.nn.Conv3d
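- **in_channels** (int) - The channel number of the input Tensor of the Conv3d layer.
- **out_channels** (int) - The channel number of the output Tensor of the Conv3d layer.
- **kernel_size** (Union[int, tuple[int]]) - Specifies the depth, height and width of the 3D convolution kernel. The data type is an int or a tuple of three integers. A single integer means the depth, height and width of the kernel are all equal to this value. A tuple of three integers gives the depth, height and width of the kernel respectively.
- **stride** (Union[int, tuple[int]]) - The movement stride of the convolution kernel. The data type is an int or a tuple of three ints. A single integer means the strides in the depth, height and width directions are all equal to this value. A tuple of three integers gives the strides in the depth, height and width directions respectively. Default: 1.
- **stride** (Union[int, tuple[int]]) - The movement stride of the 3D convolution kernel. The data type is an int or a tuple of three ints. A single integer means the strides in the depth, height and width directions are all equal to this value. A tuple of three integers gives the strides in the depth, height and width directions respectively. Default: 1.
- **pad_mode** (str) - Specifies the padding mode. The optional values are "same", "valid" and "pad". Default: "same".
- same: The depth, height and width of the output are respectively the same as those of the input after integer division by `stride`. If this mode is set, the value of `padding` must be 0.
- valid: Returns the output of the valid computation without padding. Extra pixels that do not satisfy the computation are discarded. If this mode is set, the value of `padding` must be 0.
- pad: Pads the input.  `padding` zeros are padded in the front-back, vertical and left-right directions of the input. If this mode is set, `padding` must be greater than or equal to 0.
- pad: Pads the input.  `padding` zeros are padded in the depth, height and width directions of the input. If this mode is set, `padding` must be greater than or equal to 0.
- **padding** (Union(int, tuple[int])) - The number of paddings in the front-back, vertical and left-right directions of the input. The data type is an int or a tuple of 6 integers. If `padding` is a single integer, the head, tail, top, bottom, left and right paddings are all equal to `padding`. If `padding` is a tuple of 6 integers, the head, tail, top, bottom, left and right paddings are equal to padding[0], padding[1], padding[2], padding[3], padding[4] and padding[5] respectively. The value should be greater than or equal to 0. Default: 0.
- **dilation** (Union[int, tuple[int]]) - The dilation size of the convolution kernel. The data type is an int or a tuple of three integers. If :math:`k > 1`, the kernel samples at intervals of `k` elements. The value of k in the front-back, vertical and left-right directions is in the range [1, D], [1, H] and [1, W] respectively. Default: 1.
- **group** (int) - Splits the filter into groups; `in_ channels` and `out_channels` must be divisible by `group`. Default: 1. Currently only 1 is supported
- **padding** (Union(int, tuple[int])) - The number of paddings in the depth, height and width directions of the input. The data type is an int or a tuple of 6 integers. If `padding` is a single integer, the head, tail, top, bottom, left and right paddings are all equal to `padding`. If `padding` is a tuple of 6 integers, the head, tail, top, bottom, left and right paddings are equal to padding[0], padding[1], padding[2], padding[3], padding[4] and padding[5] respectively. The value should be greater than or equal to 0. Default: 0.
- **dilation** (Union[int, tuple[int]]) - The dilation size of the 3D convolution kernel. The data type is an int or a tuple of three integers. If :math:`k > 1`, the kernel samples at intervals of `k` elements. The value of k in the depth, height and width directions is in the range [1, D], [1, H] and [1, W] respectively. Default: 1.
- **group** (int) - Splits the filter into groups; `in_channels` and `out_channels` must be divisible by `group`. Default: 1. Currently only 1 is supported.
- **has_bias** (bool) - Whether the Conv3d layer adds a bias parameter. Default: False.
- **weight_init** (Union[Tensor, str, Initializer, numbers.Number]) - Initialization method of the weight matrix. It can be a Tensor, a str, an Initializer or a numbers.Number. When a str is used, values from the "TruncatedNormal", "Normal", "Uniform", "HeUniform" and "XavierUniform" distributions as well as the constant "One" and "Zero" distributions can be chosen, and the aliases "xavier_uniform", "he_uniform", "ones" and "zeros" are accepted. Both uppercase and lowercase strings are acceptable. Refer to the values of Initializer for more details. Default: "normal".
- **weight_init** (Union[Tensor, str, Initializer, numbers.Number]) - Initialization method of the weight parameter. It can be a Tensor, a str, an Initializer or a numbers.Number. When a str is used, values from the "TruncatedNormal", "Normal", "Uniform", "HeUniform" and "XavierUniform" distributions as well as the constant "One" and "Zero" distributions can be chosen, and the aliases "xavier_uniform", "he_uniform", "ones" and "zeros" are accepted. Both uppercase and lowercase strings are acceptable. Refer to the values of Initializer for more details. Default: "normal".
- **bias_init** (Union[Tensor, str, Initializer, numbers.Number]) - Initialization method of the bias parameter. The available initialization methods are the same as for "weight_init". Refer to the values of Initializer for more details. Default: "zeros".
- **data_format** (str) - The optional value for the data format. Currently only "NCDHW" is supported.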
@ -47,23 +47,35 @@ mindspore.nn.Conv3d
When pad_mode is "same":
.. math::
D_{out} = \left \lfloor{\frac{D_{in}}{\text{stride[0]}} + 1} \right \rfloor
H_{out} = \left \lfloor{\frac{H_{in}}{\text{stride[1]}} + 1} \right \rfloor
W_{out} = \left \lfloor{\frac{W_{in}}{\text{stride[2]}} + 1} \right \rfloor
\begin{array}{ll} \\
D_{out} = \left \lfloor{\frac{D_{in}}{\text{stride[0]}} + 1} \right \rfloor \\
H_{out} = \left \lfloor{\frac{H_{in}}{\text{stride[1]}} + 1} \right \rfloor \\
W_{out} = \left \lfloor{\frac{W_{in}}{\text{stride[2]}} + 1} \right \rfloor \\
\end{array}
When pad_mode is "valid":
.. math::
D_{out} = \left \lfloor{\frac{D_{in} - \text{dilation[0]} \times (\text{kernel_size[0]} - 1) }{\text{stride[0]}} + 1} \right \rfloor
H_{out} = \left \lfloor{\frac{H_{in} - \text{dilation[1]} \times (\text{kernel_size[1]} - 1) }{\text{stride[1]}} + 1} \right \rfloor
W_{out} = \left \lfloor{\frac{W_{in} - \text{dilation[2]} \times (\text{kernel_size[2]} - 1) }{\text{stride[2]}} + 1} \right \rfloor
\begin{array}{ll} \\
D_{out} = \left \lfloor{\frac{D_{in} - \text{dilation[0]} \times (\text{kernel_size[0]} - 1) }
{\text{stride[0]}} + 1} \right \rfloor \\
H_{out} = \left \lfloor{\frac{H_{in} - \text{dilation[1]} \times (\text{kernel_size[1]} - 1) }
{\text{stride[1]}} + 1} \right \rfloor \\
W_{out} = \left \lfloor{\frac{W_{in} - \text{dilation[2]} \times (\text{kernel_size[2]} - 1) }
{\text{stride[2]}} + 1} \right \rfloor \\
\end{array}
When pad_mode is "pad":
.. math::
D_{out} = \left \lfloor{\frac{D_{in} + padding[0] + padding[1] - (\text{dilation[0]} - 1) \times \text{kernel_size[0]} - 1 }{\text{stride[0]}} + 1} \right \rfloor
H_{out} = \left \lfloor{\frac{H_{in} + padding[2] + padding[3] - (\text{dilation[1]} - 1) \times \text{kernel_size[1]} - 1 }{\text{stride[1]}} + 1} \right \rfloor
W_{out} = \left \lfloor{\frac{W_{in} + padding[4] + padding[5] - (\text{dilation[2]} - 1) \times \text{kernel_size[2]} - 1 }{\text{stride[2]}} + 1} \right \rfloor
\begin{array}{ll} \\
D_{out} = \left \lfloor{\frac{D_{in} + padding[0] + padding[1] - (\text{dilation[0]} - 1) \times
\text{kernel_size[0]} - 1 }{\text{stride[0]}} + 1} \right \rfloor \\
H_{out} = \left \lfloor{\frac{H_{in} + padding[2] + padding[3] - (\text{dilation[1]} - 1) \times
\text{kernel_size[1]} - 1 }{\text{stride[1]}} + 1} \right \rfloor \\
W_{out} = \left \lfloor{\frac{W_{in} + padding[4] + padding[5] - (\text{dilation[2]} - 1) \times
\text{kernel_size[2]} - 1 }{\text{stride[2]}} + 1} \right \rfloor \\
\end{array}
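The three padding modes can be compared numerically with a small helper. This sketch is an assumption, not a transcription of the formulas in this hunk: it uses the commonly used output-size convention (a dilated kernel spans `dilation * (kernel_size - 1) + 1` elements), which may differ from the documented formulas at boundaries. For example, `ceil(in / stride)` for "same" equals `floor(in / stride + 1)` only when `stride` does not divide the input size. `conv_out_size` and `conv3d_out_shape` are hypothetical names, not MindSpore APIs.

```python
def conv_out_size(in_size, kernel_size, stride=1, dilation=1,
                  pad_mode="same", pad_before=0, pad_after=0):
    """Output length along one axis (depth, height or width)."""
    # Extent covered by a dilated kernel along this axis.
    span = dilation * (kernel_size - 1) + 1
    if pad_mode == "same":
        # Integer form of ceil(in_size / stride).
        return (in_size + stride - 1) // stride
    if pad_mode == "valid":
        return (in_size - span) // stride + 1
    if pad_mode == "pad":
        return (in_size + pad_before + pad_after - span) // stride + 1
    raise ValueError(f"unknown pad_mode: {pad_mode!r}")


def conv3d_out_shape(in_dhw, kernel_size, stride, dilation, pad_mode,
                     padding=(0, 0, 0, 0, 0, 0)):
    """Apply conv_out_size to depth, height and width.

    `padding` is ordered (head, tail, top, bottom, left, right).
    """
    return tuple(
        conv_out_size(in_dhw[i], kernel_size[i], stride[i], dilation[i],
                      pad_mode, padding[2 * i], padding[2 * i + 1])
        for i in range(3)
    )


ones = (1, 1, 1)
print(conv3d_out_shape((10, 32, 32), (4, 3, 3), ones, ones, "same"))   # (10, 32, 32)
print(conv3d_out_shape((10, 32, 32), (4, 3, 3), ones, ones, "valid"))  # (7, 30, 30)
print(conv3d_out_shape((10, 32, 32), (4, 3, 3), ones, ones, "pad",
                       (1, 1, 1, 1, 1, 1)))                            # (9, 32, 32)
```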
**Raises:**


@ -8,7 +8,7 @@ mindspore.nn.Conv3dTranspose
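Computes a 3D transposed convolution, which can be viewed as the gradient of Conv3d with respect to its input. It is also called deconvolution (although it is not an actual deconvolution).
The shape of the input is typically :math:`(N, C_{in}, D_{in}, H_{in}, W_{in})`, where :math:`N` is the batch size, :math:`C` is the channel number, and :math:`D_{in}, H_{in}, W_{in}` are the depth, height and width of the feature layer respectively.
When Conv3d and ConvTranspose3d are initialized with the same parameters and `pad_mode` is set to "pad", they pad :math:`dilation * (kernel\_size - 1) - padding` zeros in the front-back, vertical and left-right directions of the input; in this case their input and output shapes are inverses of each other.
When Conv3d and ConvTranspose3d are initialized with the same parameters and `pad_mode` is set to "pad", they pad :math:`dilation * (kernel\_size - 1) - padding` zeros in the depth, height and width directions of the input; in this case their input and output shapes are inverses of each other.
However, when `stride` is greater than 1, Conv3d maps multiple input shapes to the same output shape. For deconvolutional networks, refer to `Deconvolutional Networks <https://www.matthewzeiler.com/mattzeiler/deconvolutionalnetworks.pdf>`_.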
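The shape relationship described here can be checked along a single axis with a minimal sketch. It assumes the standard size formulas for a convolution and a transposed convolution with symmetric padding and no extra output padding; whether MindSpore computes sizes exactly this way is not stated in this diff. `conv_out` and `conv_transpose_out` are hypothetical helper names.

```python
def conv_out(in_size, kernel_size, stride, padding, dilation=1):
    """Output length of a convolution along one axis (symmetric padding)."""
    return (in_size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def conv_transpose_out(in_size, kernel_size, stride, padding, dilation=1):
    """Output length of a transposed convolution along one axis."""
    return (in_size - 1) * stride - 2 * padding + dilation * (kernel_size - 1) + 1


k, s, p = 3, 2, 1

# With stride 2, two different input lengths map to the same output length.
print(conv_out(7, k, s, p), conv_out(8, k, s, p))  # 4 4

# The transposed convolution maps that length back to one of them.
print(conv_transpose_out(4, k, s, p))              # 7

# Transposed convolution followed by convolution restores the original length.
print(all(conv_out(conv_transpose_out(n, k, s, p), k, s, p) == n
          for n in range(1, 20)))                  # True
```

The round trip in the other direction is not always exact, because the floor division in `conv_out` discards information when `stride` is greater than 1.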
**Parameters:**
@ -16,18 +16,18 @@ mindspore.nn.Conv3dTranspose
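- **in_channels** (int) - The channel number of the input Tensor of the Conv3dTranspose layer.
- **out_channels** (int) - The channel number of the output Tensor of the Conv3dTranspose layer.
- **kernel_size** (Union[int, tuple[int]]) - Specifies the depth, height and width of the 3D convolution kernel. The data type is an int or a tuple of three integers. A single integer means the depth, height and width of the kernel are all equal to this value. A tuple of three integers gives the depth, height and width of the kernel respectively.
- **stride** (Union[int, tuple[int]]) - The movement stride of the convolution kernel. The data type is an int or a tuple of three ints. A single integer means the strides in the depth, height and width directions are all equal to this value. A tuple of three integers gives the strides in the depth, height and width directions respectively. Default: 1.
- **stride** (Union[int, tuple[int]]) - The movement stride of the 3D convolution kernel. The data type is an int or a tuple of three ints. A single integer means the strides in the depth, height and width directions are all equal to this value. A tuple of three integers gives the strides in the depth, height and width directions respectively. Default: 1.
- **pad_mode** (str) - Specifies the padding mode. The optional values are "same", "valid" and "pad". Default: "same".
- same: The depth, height and width of the output are respectively the same as those of the input after integer division by `stride`. If this mode is set, the value of `padding` must be 0.
- valid: Returns the output of the valid computation without padding. Extra pixels that do not satisfy the computation are discarded. If this mode is set, the value of `padding` must be 0.
- pad: Pads the input.  `padding` zeros are padded in the front-back, vertical and left-right directions of the input. If this mode is set, `padding` must be greater than or equal to 0.
- pad: Pads the input.  `padding` zeros are padded in the depth, height and width directions of the input. If this mode is set, `padding` must be greater than or equal to 0.
- **padding** (Union(int, tuple[int])) - The number of paddings in the front-back, vertical and left-right directions of the input. The data type is an int or a tuple of 6 integers. If `padding` is a single integer, the head, tail, top, bottom, left and right paddings are all equal to `padding`. If `padding` is a tuple of 6 integers, the head, tail, top, bottom, left and right paddings are equal to padding[0], padding[1], padding[2], padding[3], padding[4] and padding[5] respectively. The value should be greater than or equal to 0. Default: 0.
- **dilation** (Union[int, tuple[int]]) - The dilation size of the convolution kernel. The data type is an int or a tuple of three integers. If :math:`k > 1`, the kernel samples at intervals of `k` elements. The value of k in the front-back, vertical and left-right directions is in the range [1, D], [1, H] and [1, W] respectively. Default: 1.
- **group** (int) - Splits the filter into groups; `in_ channels` and `out_channels` must be divisible by `group`. When `group` is greater than 1, the Ascend platform is not yet supported. Default: 1. Currently only 1 is supported
- **padding** (Union(int, tuple[int])) - The number of paddings in the depth, height and width directions of the input. The data type is an int or a tuple of 6 integers. If `padding` is a single integer, the head, tail, top, bottom, left and right paddings are all equal to `padding`. If `padding` is a tuple of 6 integers, the head, tail, top, bottom, left and right paddings are equal to padding[0], padding[1], padding[2], padding[3], padding[4] and padding[5] respectively. The value should be greater than or equal to 0. Default: 0.
- **dilation** (Union[int, tuple[int]]) - The dilation size of the 3D convolution kernel. The data type is an int or a tuple of three integers. If :math:`k > 1`, the kernel samples at intervals of `k` elements. The value of k in the depth, height and width directions is in the range [1, D], [1, H] and [1, W] respectively. Default: 1.
- **group** (int) - Splits the filter into groups; `in_channels` and `out_channels` must be divisible by `group`. When `group` is greater than 1, the Ascend platform is not yet supported. Default: 1. Currently only 1 is supported.
- **has_bias** (bool) - Whether the Conv3dTranspose layer adds a bias parameter. Default: False.
- **weight_init** (Union[Tensor, str, Initializer, numbers.Number]) - Initialization method of the weight matrix. It can be a Tensor, a str, an Initializer or a numbers.Number. When a str is used, values from the "TruncatedNormal", "Normal", "Uniform", "HeUniform" and "XavierUniform" distributions as well as the constant "One" and "Zero" distributions can be chosen, and the aliases "xavier_uniform", "he_uniform", "ones" and "zeros" are accepted. Both uppercase and lowercase strings are acceptable. Refer to the values of Initializer for more details. Default: "normal".
- **weight_init** (Union[Tensor, str, Initializer, numbers.Number]) - Initialization method of the weight parameter. It can be a Tensor, a str, an Initializer or a numbers.Number. When a str is used, values from the "TruncatedNormal", "Normal", "Uniform", "HeUniform" and "XavierUniform" distributions as well as the constant "One" and "Zero" distributions can be chosen, and the aliases "xavier_uniform", "he_uniform", "ones" and "zeros" are accepted. Both uppercase and lowercase strings are acceptable. Refer to the values of Initializer for more details. Default: "normal".
- **bias_init** (Union[Tensor, str, Initializer, numbers.Number]) - Initialization method of the bias parameter. The available initialization methods are the same as for "weight_init". Refer to the values of Initializer for more details. Default: "zeros".
- **data_format** (str) - The optional value for the data format. Currently only "NCDHW" is supported.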
@ -42,23 +42,35 @@ mindspore.nn.Conv3dTranspose
When pad_mode is "same":
.. math::
D_{out} = \left \lfloor{\frac{D_{in}}{\text{stride[0]}} + 1} \right \rfloor
H_{out} = \left \lfloor{\frac{H_{in}}{\text{stride[1]}} + 1} \right \rfloor
W_{out} = \left \lfloor{\frac{W_{in}}{\text{stride[2]}} + 1} \right \rfloor
\begin{array}{ll} \\
D_{out} = \left \lfloor{\frac{D_{in}}{\text{stride[0]}} + 1} \right \rfloor \\
H_{out} = \left \lfloor{\frac{H_{in}}{\text{stride[1]}} + 1} \right \rfloor \\
W_{out} = \left \lfloor{\frac{W_{in}}{\text{stride[2]}} + 1} \right \rfloor \\
\end{array}
When pad_mode is "valid":
.. math::
D_{out} = \left \lfloor{\frac{D_{in} - \text{dilation[0]} \times (\text{kernel_size[0]} - 1) }{\text{stride[0]}} + 1} \right \rfloor
H_{out} = \left \lfloor{\frac{H_{in} - \text{dilation[1]} \times (\text{kernel_size[1]} - 1) }{\text{stride[1]}} + 1} \right \rfloor
W_{out} = \left \lfloor{\frac{W_{in} - \text{dilation[2]} \times (\text{kernel_size[2]} - 1) }{\text{stride[2]}} + 1} \right \rfloor
\begin{array}{ll} \\
D_{out} = \left \lfloor{\frac{D_{in} - \text{dilation[0]} \times (\text{kernel_size[0]} - 1) }
{\text{stride[0]}} + 1} \right \rfloor \\
H_{out} = \left \lfloor{\frac{H_{in} - \text{dilation[1]} \times (\text{kernel_size[1]} - 1) }
{\text{stride[1]}} + 1} \right \rfloor \\
W_{out} = \left \lfloor{\frac{W_{in} - \text{dilation[2]} \times (\text{kernel_size[2]} - 1) }
{\text{stride[2]}} + 1} \right \rfloor \\
\end{array}
When pad_mode is "pad":
.. math::
D_{out} = \left \lfloor{\frac{D_{in} + padding[0] + padding[1] - (\text{dilation[0]} - 1) \times \text{kernel_size[0]} - 1 }{\text{stride[0]}} + 1} \right \rfloor
H_{out} = \left \lfloor{\frac{H_{in} + padding[2] + padding[3] - (\text{dilation[1]} - 1) \times \text{kernel_size[1]} - 1 }{\text{stride[1]}} + 1} \right \rfloor
W_{out} = \left \lfloor{\frac{W_{in} + padding[4] + padding[5] - (\text{dilation[2]} - 1) \times \text{kernel_size[2]} - 1 }{\text{stride[2]}} + 1} \right \rfloor
\begin{array}{ll} \\
D_{out} = \left \lfloor{\frac{D_{in} + padding[0] + padding[1] - (\text{dilation[0]} - 1) \times
\text{kernel_size[0]} - 1 }{\text{stride[0]}} + 1} \right \rfloor \\
H_{out} = \left \lfloor{\frac{H_{in} + padding[2] + padding[3] - (\text{dilation[1]} - 1) \times
\text{kernel_size[1]} - 1 }{\text{stride[1]}} + 1} \right \rfloor \\
W_{out} = \left \lfloor{\frac{W_{in} + padding[4] + padding[5] - (\text{dilation[2]} - 1) \times
\text{kernel_size[2]} - 1 }{\text{stride[2]}} + 1} \right \rfloor \\
\end{array}
**Raises:**


@ -12,7 +12,7 @@ mindspore.nn.Conv2d
\sum_{k = 0}^{C_{in} - 1} \text{ccor}({\text{weight}(C_{\text{out}_j}, k), \text{X}(N_i, k)})
where :math:`ccor` is the `cross-correlation <https://en.wikipedia.org/wiki/Cross-correlation>`_, :math:`C_{in}` is the channel number of the input, :math:`out_{j}` corresponds to the :math:`j`-th channel of the output, and :math:`j` is in the range :math:`[0, C_{out}-1]`.
:math:`\text{weight}(C_{\text{out}_j}, k)` is a convolution kernel slice with shape :math:`(\text{kernel_size[0]}, \text{kernel_size[1]})`, where :math:`\text{kernel_size[0]}` and :math:`\text{kernel_size[1]}` are the height and width of the convolution kernel. :math:`\text{bias}` is the bias parameter.
:math:`\text{weight}(C_{\text{out}_j}, k)` is a convolution kernel slice with shape :math:`(\text{kernel_size[0]}, \text{kernel_size[1]})`, where :math:`\text{kernel_size[0]}` and :math:`\text{kernel_size[1]}` are the height and width of the convolution kernel respectively. :math:`\text{bias}` is the bias parameter and :math:`\text{X}` is the input Tensor.
The shape of the full convolution kernel is :math:`(C_{out}, C_{in} / \text{group}, \text{kernel_size[0]}, \text{kernel_size[1]})`, where `group` is the number of groups into which the input `x` is split along the channel dimension.
For more details, please refer to the paper `Gradient Based Learning Applied to Document Recognition <http://vision.stanford.edu/cs598_spring07/papers/Lecun98.pdf>`_.
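The full kernel shape and the effect of `group` can be illustrated with a short sketch. `conv2d_weight_shape` is a hypothetical helper written for this note, not part of MindSpore; it only encodes the shape rule stated in this hunk together with the divisibility requirement from the `group` parameter description.

```python
def conv2d_weight_shape(out_channels, in_channels, kernel_size, group=1):
    """Return (C_out, C_in / group, kernel_size[0], kernel_size[1])."""
    # A single int means the same kernel height and width.
    if isinstance(kernel_size, int):
        kernel_size = (kernel_size, kernel_size)
    # Both channel counts must be divisible by `group`.
    if in_channels % group != 0 or out_channels % group != 0:
        raise ValueError("in_channels and out_channels must be divisible by group")
    return (out_channels, in_channels // group) + tuple(kernel_size)


print(conv2d_weight_shape(240, 120, 4))                 # (240, 120, 4, 4)
print(conv2d_weight_shape(240, 120, (3, 5), group=2))   # (240, 60, 3, 5)
print(conv2d_weight_shape(120, 120, 3, group=120))      # (120, 1, 3, 3), depthwise case
```

When `group` equals both channel counts, each filter sees a single input channel, which is the depthwise case mentioned in the `group` description.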
@ -21,18 +21,18 @@ mindspore.nn.Conv2d
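- **in_channels** (`int`) - The channel number of the input Tensor of the Conv2d layer.
- **out_channels** (`dict`) - The channel number of the output Tensor of the Conv2d layer.
- **kernel_size** (`Union[int, tuple[int]]`) - Specifies the height and width of the 2D convolution kernel. The data type is an int or a tuple of two ints. A single integer means the height and width of the kernel are both equal to this value. A tuple of two integers gives the height and width of the kernel respectively.
- **stride** (`Union[int, tuple[int]]`) - The movement stride of the convolution kernel. The data type is an int or a tuple of two ints. A single integer means the strides in the height and width directions are both equal to this value. A tuple of two integers gives the strides in the height and width directions respectively. Default: 1.
- **stride** (`Union[int, tuple[int]]`) - The movement stride of the 2D convolution kernel. The data type is an int or a tuple of two ints. A single integer means the strides in the height and width directions are both equal to this value. A tuple of two integers gives the strides in the height and width directions respectively. Default: 1.
- **pad_mode** (`str`) - Specifies the padding mode. The optional values are "same", "valid" and "pad". Default: "same".
- **same**: The height and width of the output are respectively the same as those of the input after integer division by `stride`. If this mode is set, the value of `padding` must be 0.
- **valid**: Returns the output of the valid computation without padding. Extra pixels that do not satisfy the computation are discarded. If this mode is set, the value of `padding` must be 0.
- **pad**: Pads the input.  `padding` zeros are padded in the vertical and horizontal directions of the input. If this mode is set, `padding` must be greater than or equal to 0.
- **pad**: Pads the input. `padding` zeros are padded in the height and width directions of the input. If this mode is set, `padding` must be greater than or equal to 0.
- **padding** (`Union[int, tuple[int]]`) - The number of paddings in the vertical and horizontal directions of the input. The data type is an int or a tuple of 4 integers. If `padding` is a single integer, the top, bottom, left and right paddings are all equal to `padding`. If `padding` is a tuple of 4 integers, the top, bottom, left and right paddings are equal to `padding[0]`, `padding[1]`, `padding[2]` and `padding[3]` respectively. The value should be greater than or equal to 0. Default: 0.
- **dilation** (`Union[int, tuple[int]]`) - The dilation size of the convolution kernel. The data type is an int or a tuple of two ints. If :math:`k > 1`, the kernel samples at intervals of `k` elements. The value of k in the vertical and horizontal directions is in the range [1, H] and [1, W] respectively. Default: 1.
- **group** (`int`) - Splits the filter into groups; `in_ channels` and `out_channels` must be divisible by `group`. If the number of groups equals `in_channels` and `out_channels`, this 2D convolution layer is also called a 2D depthwise convolution layer. Default: 1.
- **padding** (`Union[int, tuple[int]]`) - The number of paddings in the height and width directions of the input. The data type is an int or a tuple of 4 integers. If `padding` is a single integer, the top, bottom, left and right paddings are all equal to `padding`. If `padding` is a tuple of 4 integers, the top, bottom, left and right paddings are equal to `padding[0]`, `padding[1]`, `padding[2]` and `padding[3]` respectively. The value should be greater than or equal to 0. Default: 0.
- **dilation** (`Union[int, tuple[int]]`) - The dilation size of the 2D convolution kernel. The data type is an int or a tuple of two ints. If :math:`k > 1`, the kernel samples at intervals of `k` elements. The value of k in the vertical and horizontal directions is in the range [1, H] and [1, W] respectively. Default: 1.
- **group** (`int`) - Splits the filter into groups; `in_channels` and `out_channels` must be divisible by `group`. If the number of groups equals `in_channels` and `out_channels`, this 2D convolution layer is also called a 2D depthwise convolution layer. Default: 1.
- **has_bias** (`bool`) - Whether the Conv2d layer adds a bias parameter. Default: False.
- **weight_init** (Union[Tensor, str, Initializer, numbers.Number]) - Initialization method of the weight matrix. It can be a Tensor, a str, an Initializer or a numbers.Number. When a str is used, values from the "TruncatedNormal", "Normal", "Uniform", "HeUniform" and "XavierUniform" distributions as well as the constant "One" and "Zero" distributions can be chosen, and the aliases "xavier_uniform", "he_uniform", "ones" and "zeros" are accepted. Both uppercase and lowercase strings are acceptable. Refer to the values of Initializer for more details. Default: "normal".
- **weight_init** (Union[Tensor, str, Initializer, numbers.Number]) - Initialization method of the weight parameter. It can be a Tensor, a str, an Initializer or a numbers.Number. When a str is used, values from the "TruncatedNormal", "Normal", "Uniform", "HeUniform" and "XavierUniform" distributions as well as the constant "One" and "Zero" distributions can be chosen, and the aliases "xavier_uniform", "he_uniform", "ones" and "zeros" are accepted. Both uppercase and lowercase strings are acceptable. Refer to the values of Initializer for more details. Default: "normal".
- **bias_init** (Union[Tensor, str, Initializer, numbers.Number]) - Initialization method of the bias parameter. The available initialization methods are the same as for "weight_init". Refer to the values of Initializer for more details. Default: "zeros".
- **data_format** (`str`) - The optional values for the data format are "NHWC" and "NCHW". Default: "NCHW".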
@ -47,20 +47,30 @@ mindspore.nn.Conv2d
When pad_mode is "same":
.. math::
H_{out} = \left \lfloor{\frac{H_{in}}{\text{stride[0]}} + 1} \right \rfloor
W_{out} = \left \lfloor{\frac{W_{in}}{\text{stride[1]}} + 1} \right \rfloor
\begin{array}{ll} \\
H_{out} = \left \lfloor{\frac{H_{in}}{\text{stride[0]}} + 1} \right \rfloor \\
W_{out} = \left \lfloor{\frac{W_{in}}{\text{stride[1]}} + 1} \right \rfloor \\
\end{array}
When pad_mode is "valid":
.. math::
H_{out} = \left \lfloor{\frac{H_{in} - \text{dilation[0]} \times (\text{kernel_size[0]} - 1) }{\text{stride[0]}} + 1} \right \rfloor
W_{out} = \left \lfloor{\frac{W_{in} - \text{dilation[1]} \times (\text{kernel_size[1]} - 1) }{\text{stride[1]}} + 1} \right \rfloor
\begin{array}{ll} \\
H_{out} = \left \lfloor{\frac{H_{in} - \text{dilation[0]} \times (\text{kernel_size[0]} - 1) }
{\text{stride[0]}} + 1} \right \rfloor \\
W_{out} = \left \lfloor{\frac{W_{in} - \text{dilation[1]} \times (\text{kernel_size[1]} - 1) }
{\text{stride[1]}} + 1} \right \rfloor \\
\end{array}
When pad_mode is "pad":
.. math::
H_{out} = \left \lfloor{\frac{H_{in} + padding[0] + padding[1] - (\text{dilation[0]} - 1) \times \text{kernel_size[0]} - 1 }{\text{stride[0]}} + 1} \right \rfloor
W_{out} = \left \lfloor{\frac{W_{in} + padding[2] + padding[3] - (\text{dilation[1]} - 1) \times \text{kernel_size[1]} - 1 }{\text{stride[1]}} + 1} \right \rfloor
\begin{array}{ll} \\
H_{out} = \left \lfloor{\frac{H_{in} + padding[0] + padding[1] - (\text{dilation[0]} - 1) \times
\text{kernel_size[0]} - 1 }{\text{stride[0]}} + 1} \right \rfloor \\
W_{out} = \left \lfloor{\frac{W_{in} + padding[2] + padding[3] - (\text{dilation[1]} - 1) \times
\text{kernel_size[1]} - 1 }{\text{stride[1]}} + 1} \right \rfloor \\
\end{array}
**Raises:**


@ -8,7 +8,7 @@ mindspore.nn.Conv2dTranspose
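Computes a 2D transposed convolution, which can be viewed as the gradient of Conv2d with respect to its input. It is also called deconvolution (although it is not an actual deconvolution).
The shape of the input is typically :math:`(N, C, H, W)`, where :math:`N` is the batch size, :math:`C` is the channel number, and :math:`H_{in}, W_{in}` are the height and width of the feature layer respectively.
When Conv2d and ConvTranspose2d are initialized with the same parameters and `pad_mode` is set to "pad", they pad :math:`dilation * (kernel\_size - 1) - padding` zeros in the vertical and horizontal directions of the input; in this case their input and output shapes are inverses of each other.
When Conv2d and ConvTranspose2d are initialized with the same parameters and `pad_mode` is set to "pad", they pad :math:`dilation * (kernel\_size - 1) - padding` zeros in the height and width directions of the input; in this case their input and output shapes are inverses of each other.
However, when `stride` is greater than 1, Conv2d maps multiple input shapes to the same output shape. For deconvolutional networks, refer to `Deconvolutional Networks <https://www.matthewzeiler.com/mattzeiler/deconvolutionalnetworks.pdf>`_.
**Parameters:**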
@ -16,18 +16,18 @@ mindspore.nn.Conv2dTranspose
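- **in_channels** (`int`) - The channel number of the input Tensor of the Conv2dTranspose layer.
- **out_channels** (`dict`) - The channel number of the output Tensor of the Conv2dTranspose layer.
- **kernel_size** (`Union[int, tuple[int]]`) - Specifies the height and width of the 2D convolution kernel. The data type is an int or a tuple of two ints. A single integer means the height and width of the kernel are both equal to this value. A tuple of two integers gives the height and width of the kernel respectively.
- **stride** (`Union[int, tuple[int]]`) - The movement stride of the convolution kernel. The data type is an int or a tuple of two ints. A single integer means the strides in the height and width directions are both equal to this value. A tuple of two integers gives the strides in the height and width directions respectively. Default: 1.
- **stride** (`Union[int, tuple[int]]`) - The movement stride of the 2D convolution kernel. The data type is an int or a tuple of two ints. A single integer means the strides in the height and width directions are both equal to this value. A tuple of two integers gives the strides in the height and width directions respectively. Default: 1.
- **pad_mode** (`str`) - Specifies the padding mode. The optional values are "same", "valid" and "pad". Default: "same".
- **same**: The height and width of the output are respectively the same as those of the input after integer division by `stride`. If this mode is set, the value of `padding` must be 0.
- **valid**: Returns the output of the valid computation without padding. Extra pixels that do not satisfy the computation are discarded. If this mode is set, the value of `padding` must be 0.
- **pad**: Pads the input.  `padding` zeros are padded in the vertical and horizontal directions of the input. If this mode is set, `padding` must be greater than or equal to 0.
- **pad**: Pads the input. `padding` zeros are padded in the height and width directions of the input. If this mode is set, `padding` must be greater than or equal to 0.
- **padding** (`Union[int, tuple[int]]`) - The number of paddings in the vertical and horizontal directions of the input. The data type is an int or a tuple of 4 integers. If `padding` is a single integer, the top, bottom, left and right paddings are all equal to `padding`. If `padding` is a tuple of 4 integers, the top, bottom, left and right paddings are equal to `padding[0]`, `padding[1]`, `padding[2]` and `padding[3]` respectively. The value should be greater than or equal to 0. Default: 0.
- **dilation** (`Union[int, tuple[int]]`) - The dilation size of the convolution kernel. The data type is an int or a tuple of two ints. If :math:`k > 1`, the kernel samples at intervals of `k` elements. The value of k in the vertical and horizontal directions is in the range [1, H] and [1, W] respectively. Default: 1.
- **group** (`int`) - Splits the filter into groups; `in_ channels` and `out_channels` must be divisible by `group`. If the number of groups equals `in_channels` and `out_channels`, this 2D convolution layer is also called a 2D depthwise convolution layer. Default: 1.
- **padding** (`Union[int, tuple[int]]`) - The number of paddings in the height and width directions of the input. The data type is an integer or a tuple of four integers. If `padding` is a single integer, the top, bottom, left and right paddings are all equal to `padding`. If `padding` is a tuple of four integers, the top, bottom, left and right paddings are equal to `padding[0]`, `padding[1]`, `padding[2]` and `padding[3]` respectively. The value should be greater than or equal to 0. Default: 0.
- **dilation** (`Union[int, tuple[int]]`) - The dilation size of the 2D convolution kernel. The data type is an int or a tuple of two ints. If :math:`k > 1`, the kernel samples at intervals of `k` elements. The value of k in the height and width directions is in the range [1, H] and [1, W] respectively. Default: 1.
- **group** (`int`) - Splits the filter into groups; `in_channels` and `out_channels` must be divisible by `group`. If the number of groups equals `in_channels` and `out_channels`, this 2D convolution layer is also called a 2D depthwise convolution layer. Default: 1.
- **has_bias** (`bool`) - Whether the Conv2dTranspose layer adds a bias parameter. Default: False.
- **weight_init** (Union[Tensor, str, Initializer, numbers.Number]) - Initialization method of the weight matrix. It can be a Tensor, a str, an Initializer or a numbers.Number. When a str is used, values from the "TruncatedNormal", "Normal", "Uniform", "HeUniform" and "XavierUniform" distributions as well as the constant "One" and "Zero" distributions can be chosen, and the aliases "xavier_uniform", "he_uniform", "ones" and "zeros" are accepted. Both uppercase and lowercase strings are acceptable. Refer to the values of Initializer for more details. Default: "normal".
- **weight_init** (Union[Tensor, str, Initializer, numbers.Number]) - Initialization method of the weight parameter. It can be a Tensor, a str, an Initializer or a numbers.Number. When a str is used, values from the "TruncatedNormal", "Normal", "Uniform", "HeUniform" and "XavierUniform" distributions as well as the constant "One" and "Zero" distributions can be chosen, and the aliases "xavier_uniform", "he_uniform", "ones" and "zeros" are accepted. Both uppercase and lowercase strings are acceptable. Refer to the values of Initializer for more details. Default: "normal".
- **bias_init** (Union[Tensor, str, Initializer, numbers.Number]) - Initialization method of the bias parameter. The available initialization methods are the same as for "weight_init". Refer to the values of Initializer for more details. Default: "zeros".
**Inputs:**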
@ -41,20 +41,30 @@ mindspore.nn.Conv2dTranspose
When pad_mode is "same":
.. math::
H_{out} = \left \lfloor{\frac{H_{in}}{\text{stride[0]}} + 1} \right \rfloor
W_{out} = \left \lfloor{\frac{W_{in}}{\text{stride[1]}} + 1} \right \rfloor
\begin{array}{ll} \\
H_{out} = \left \lfloor{\frac{H_{in}}{\text{stride[0]}} + 1} \right \rfloor \\
W_{out} = \left \lfloor{\frac{W_{in}}{\text{stride[1]}} + 1} \right \rfloor \\
\end{array}
When pad_mode is "valid":
.. math::
H_{out} = \left \lfloor{\frac{H_{in} - \text{dilation[0]} \times (\text{kernel_size[0]} - 1) }{\text{stride[0]}} + 1} \right \rfloor
W_{out} = \left \lfloor{\frac{W_{in} - \text{dilation[1]} \times (\text{kernel_size[1]} - 1) }{\text{stride[1]}} + 1} \right \rfloor
\begin{array}{ll} \\
H_{out} = \left \lfloor{\frac{H_{in} - \text{dilation[0]} \times (\text{kernel_size[0]} - 1) }
{\text{stride[0]}} + 1} \right \rfloor \\
W_{out} = \left \lfloor{\frac{W_{in} - \text{dilation[1]} \times (\text{kernel_size[1]} - 1) }
{\text{stride[1]}} + 1} \right \rfloor \\
\end{array}
When pad_mode is "pad":
.. math::
H_{out} = \left \lfloor{\frac{H_{in} + padding[0] + padding[1] - (\text{dilation[0]} - 1) \times \text{kernel_size[0]} - 1 }{\text{stride[0]}} + 1} \right \rfloor
W_{out} = \left \lfloor{\frac{W_{in} + padding[2] + padding[3] - (\text{dilation[1]} - 1) \times \text{kernel_size[1]} - 1 }{\text{stride[1]}} + 1} \right \rfloor
\begin{array}{ll} \\
H_{out} = \left \lfloor{\frac{H_{in} + padding[0] + padding[1] - (\text{dilation[0]} - 1) \times
\text{kernel_size[0]} - 1 }{\text{stride[0]}} + 1} \right \rfloor \\
W_{out} = \left \lfloor{\frac{W_{in} + padding[2] + padding[3] - (\text{dilation[1]} - 1) \times
\text{kernel_size[1]} - 1 }{\text{stride[1]}} + 1} \right \rfloor \\
\end{array}
**Raises:**


@ -5,15 +5,21 @@ mindspore.ops.Gather
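Returns the slice of the input Tensor made up of the elements at the indices `input_indices` along the specified `axis`.
The following figure shows the typical computation process of Gather:
.. image:: api_img/Gather.png
where params stands for the input `input_params`, and indices stands for the indices to slice, `input_indices`.
.. note::
The values of input_indices must be in the range `[0, input_param.shape[axis])`; the result is undefined outside this range.
**Inputs:**
- **input_params** (Tensor) - The original Tensor, with shape :math:`(x_1, x_2, ..., x_R)`.
- **input_indices** (Tensor) - The index Tensor to slice with, with shape :math:`(y_1, y_2, ..., y_S)`. It specifies the indices of the elements to slice from the original Tensor. The data type must be int32 or int64.
- **axis** (int) - Specifies the index of the dimension along which to slice.
.. note::
The values of input_indices must be in the range `[0, input_param.shape[axis])`; an error is raised outside this range.
**Outputs:**
Tensor, with shape :math:`input\_params.shape[:axis] + input\_indices.shape + input\_params.shape[axis + 1:]`.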
@ -21,4 +27,5 @@ mindspore.ops.Gather
**Raises:**
- **TypeError** - `axis` is not an int.
- **TypeError** - `input_params` is not a Tensor.
- **TypeError** - `input_indices` is not a Tensor of int type.
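The output-shape rule and the index range stated in the note can be sketched in pure Python. This is an illustration on plain tuples and nested lists, not the MindSpore operator; `gather_output_shape` and `indices_in_range` are hypothetical helper names, and a non-negative `axis` is assumed.

```python
def gather_output_shape(params_shape, indices_shape, axis):
    """params.shape[:axis] + indices.shape + params.shape[axis + 1:]."""
    if not 0 <= axis < len(params_shape):
        raise ValueError("this sketch assumes 0 <= axis < rank of input_params")
    return (tuple(params_shape[:axis]) + tuple(indices_shape)
            + tuple(params_shape[axis + 1:]))


def indices_in_range(indices, params_shape, axis):
    """True if every index lies in [0, params_shape[axis])."""
    return all(0 <= i < params_shape[axis] for i in indices)


params = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]  # shape (3, 4)
indices = [0, 2, 2]                                      # shape (3,)

# axis = 0 picks whole rows; axis = 1 picks columns within every row.
rows = [params[i] for i in indices]
cols = [[row[i] for i in indices] for row in params]

print(gather_output_shape((3, 4), (3,), 0), rows)
# (3, 4) [[1, 2, 3, 4], [9, 10, 11, 12], [9, 10, 11, 12]]
print(gather_output_shape((3, 4), (3,), 1), cols)
# (3, 3) [[1, 3, 3], [5, 7, 7], [9, 11, 11]]
print(indices_in_range(indices, (3, 4), 0), indices_in_range([0, 3], (3, 4), 0))
# True False
```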

Binary file added: docs/api_img/Gather.png (21 KiB, not shown)


@ -48,7 +48,7 @@ class Conv2dBnAct(Cell):
dilation (int): Specifies the dilation rate to use for dilated convolution. If set to be :math:`k > 1`,
there will be :math:`k - 1` pixels skipped for each sampling location. Its value must be greater than
or equal to 1 and lower than any one of the height and width of the `x`. Default: 1.
group (int): Splits filter into groups, `in_ channels` and `out_channels` must be
group (int): Splits filter into groups, `in_channels` and `out_channels` must be
divisible by the number of groups. Default: 1.
has_bias (bool): Specifies whether the layer uses a bias vector. Default: False.
weight_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the convolution kernel.


@ -114,79 +114,74 @@ class Conv2d(_Conv):
r"""
2D convolution layer.
Applies a 2D convolution over an input tensor which is typically of shape :math:`(N, C_{in}, H_{in}, W_{in})`,
where :math:`N` is batch size, :math:`C_{in}` is channel number, and :math:`H_{in}, W_{in}` are height and width.
For each batch of shape :math:`(C_{in}, H_{in}, W_{in})`, the formula is defined as:
Calculates the 2D convolution on the input tensor which is typically of shape :math:`(N, C_{in}, H_{in}, W_{in})`,
where :math:`N` is batch size, :math:`C_{in}` is a number of channels,
:math:`H_{in}, W_{in}` are the height and width of the feature layer respectively.
For the tensor of each batch, its shape is :math:`(C_{in}, H_{in}, W_{in})`, the formula is defined as:
.. math::
out_j = \sum_{i=0}^{C_{in} - 1} ccor(W_{ij}, X_i) + b_j,
\text{out}(N_i, C_{\text{out}_j}) = \text{bias}(C_{\text{out}_j}) +
\sum_{k = 0}^{C_{in} - 1} \text{ccor}({\text{weight}(C_{\text{out}_j}, k), \text{X}(N_i, k)})
where :math:`ccor` is the cross-correlation operator, :math:`C_{in}` is the input channel number, :math:`j` ranges
from :math:`0` to :math:`C_{out} - 1`, :math:`W_{ij}` corresponds to the :math:`i`-th channel of the :math:`j`-th
filter and :math:`out_{j}` corresponds to the :math:`j`-th channel of the output. :math:`W_{ij}` is a slice
of kernel and it has shape :math:`(\text{kernel_size[0]}, \text{kernel_size[1]})`,
where :math:`\text{kernel_size[0]}` and :math:`\text{kernel_size[1]}` are the height and width of
the convolution kernel. The full kernel has shape
:math:`(C_{out}, C_{in} // \text{group}, \text{kernel_size[0]}, \text{kernel_size[1]})`,
where group is the group number to split the input `x` in the channel dimension.
where :math:`ccor` is the `cross-correlation <https://en.wikipedia.org/wiki/Cross-correlation>`_,
:math:`C_{in}` is the channel number of the input, :math:`out_{j}` corresponds to the jth channel of
the output and :math:`j` is in the range of :math:`[0, C_{out}-1]`. :math:`\text{weight}(C_{\text{out}_j}, k)`
is a convolution kernel slice with shape :math:`(\text{kernel_size[0]}, \text{kernel_size[1]})`,
where :math:`\text{kernel_size[0]}` and :math:`\text{kernel_size[1]}` are the height and width of the convolution
kernel respectively. :math:`\text{bias}` is the bias parameter and :math:`\text{X}` is the input tensor.
The shape of the full convolution kernel is
:math:`(C_{out}, C_{in} / \text{group}, \text{kernel_size[0]}, \text{kernel_size[1]})`,
where `group` is the number of groups to split the input `x` in the channel dimension.
If the 'pad_mode' is set to be "valid", the output height and width will be
:math:`\left \lfloor{1 + \frac{H_{in} + \text{padding[0]} + \text{padding[1]} - \text{kernel_size[0]} -
(\text{kernel_size[0]} - 1) \times (\text{dilation[0]} - 1) }{\text{stride[0]}}} \right \rfloor` and
:math:`\left \lfloor{1 + \frac{W_{in} + \text{padding[2]} + \text{padding[3]} - \text{kernel_size[1]} -
(\text{kernel_size[1]} - 1) \times (\text{dilation[1]} - 1) }{\text{stride[1]}}} \right \rfloor` respectively.
The first introduction can be found in paper `Gradient Based Learning Applied to Document Recognition
<http://vision.stanford.edu/cs598_spring07/papers/Lecun98.pdf>`_.
For more details, please refer to the paper `Gradient Based Learning Applied to Document
Recognition <http://vision.stanford.edu/cs598_spring07/papers/Lecun98.pdf>`_.
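As a rough illustration of the output-size expression quoted in this hunk, the sketch below evaluates :math:`\lfloor 1 + (size + pad - k - (k - 1)(d - 1)) / stride \rfloor` along a single dimension. The helper name `conv_out_size` and its argument names are hypothetical and not part of MindSpore; the function only mirrors the arithmetic.

```python
def conv_out_size(size, kernel, stride=1, dilation=1, pad_before=0, pad_after=0):
    """Output length along one axis (hypothetical helper, not a MindSpore API)."""
    # A dilated kernel spans kernel + (kernel - 1) * (dilation - 1) input elements.
    effective_kernel = kernel + (kernel - 1) * (dilation - 1)
    # Integer floor division implements the floor in the docstring expression.
    return 1 + (size + pad_before + pad_after - effective_kernel) // stride


# Example: a 32 x 32 feature map, 3 x 3 kernel, stride 2, no padding.
h_out = conv_out_size(32, 3, stride=2)
w_out = conv_out_size(32, 3, stride=2)
print(h_out, w_out)
```

The same helper can be applied per axis for the 1D and 3D layers, since each dimension is computed independently.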
Args:
in_channels (int): The number of input channel :math:`C_{in}`.
out_channels (int): The number of output channel :math:`C_{out}`.
kernel_size (Union[int, tuple[int]]): The data type is int or a tuple of 2 integers. Specifies the height
and width of the 2D convolution window. Single int means the value is for both the height and the width of
the kernel. A tuple of 2 ints means the first value is for the height and the other is for the
width of the kernel.
stride (Union[int, tuple[int]]): The distance of kernel moving, an int number that represents
the height and width of movement are both strides, or a tuple of two int numbers that
represent height and width of movement respectively. Default: 1.
in_channels (int): The channel number of the input tensor of the Conv2d layer.
out_channels (int): The channel number of the output tensor of the Conv2d layer.
kernel_size (Union[int, tuple[int]]): Specifies the height and width of the 2D convolution kernel.
The data type is an integer or a tuple of two integers. An integer represents the height
and width of the convolution kernel. A tuple of two integers represents the height
and width of the convolution kernel respectively.
stride (Union[int, tuple[int]]): The movement stride of the 2D convolution kernel.
The data type is an integer or a tuple of two integers. An integer represents the movement step size
in both height and width directions. A tuple of two integers represents the movement step size in the height
and width directions respectively. Default: 1.
pad_mode (str): Specifies padding mode. The optional values are
"same", "valid", "pad". Default: "same".
- same: Adopts the way of completion. The height and width of the output will be the same as
the input `x`. The total number of padding will be calculated in horizontal and vertical
directions and evenly distributed to top and bottom, left and right if possible. Otherwise, the
last extra padding will be done from the bottom and the right side. If this mode is set, `padding`
must be 0.
- same: The height and width of the output are the same as those of the input divided by `stride`.
If this mode is set, the value of `padding` must be 0.
- valid: Adopts the way of discarding. The possible largest height and width of output will be returned
without padding. Extra pixels will be discarded. If this mode is set, `padding`
must be 0.
- valid: Returns a valid calculated output without padding. Excess pixels that do not satisfy the
calculation will be discarded. If this mode is set, the value of `padding` must be 0.
- pad: Implicit paddings on both sides of the input `x`. The number of `padding` will be padded to the input
Tensor borders. `padding` must be greater than or equal to 0.
- pad: Pads the input with zeros. `padding` zeros are added on both sides of the input.
If this mode is set, the value of `padding` must be greater than or equal to 0.
padding (Union[int, tuple[int]]): Implicit paddings on both sides of the input `x`. If `padding` is one integer,
the paddings of top, bottom, left and right are the same, equal to padding. If `padding` is a tuple
with four integers, the paddings of top, bottom, left and right will be equal to padding[0],
padding[1], padding[2], and padding[3] accordingly. Default: 0.
dilation (Union[int, tuple[int]]): The data type is int or a tuple of 2 integers. Specifies the dilation rate
to use for dilated convolution. If set to be :math:`k > 1`, there will
be :math:`k - 1` pixels skipped for each sampling location. Its value must
be greater or equal to 1 and bounded by the height and width of the
input `x`. Default: 1.
group (int): Splits filter into groups, `in_ channels` and `out_channels` must be
divisible by the number of groups. If the group is equal to `in_channels` and `out_channels`,
padding (Union[int, tuple[int]]): The amount of padding in the height and width directions of the input.
The data type is an integer or a tuple of four integers. If `padding` is an integer,
then the top, bottom, left, and right padding are all equal to `padding`.
If `padding` is a tuple of four integers, then the top, bottom, left, and right padding
are equal to `padding[0]`, `padding[1]`, `padding[2]`, and `padding[3]` respectively.
The value should be greater than or equal to 0. Default: 0.
dilation (Union[int, tuple[int]]): Dilation size of the 2D convolution kernel.
The data type is an integer or a tuple of two integers. If :math:`k > 1`, the kernel is sampled
every `k` elements. The value of `k` in the height and width directions is in the range [1, H]
and [1, W] respectively. Default: 1.
group (int): Splits filter into groups, `in_channels` and `out_channels` must be
divisible by `group`. If `group` is equal to both `in_channels` and `out_channels`,
this 2D convolution layer is also called a 2D depthwise convolution layer. Default: 1.
has_bias (bool): Specifies whether the layer uses a bias vector. Default: False.
weight_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the convolution kernel.
It can be a Tensor, a string, an Initializer or a number. When a string is specified,
has_bias (bool): Whether the Conv2d layer has a bias parameter. Default: False.
weight_init (Union[Tensor, str, Initializer, numbers.Number]): Initialization method of weight parameter.
It can be a Tensor, a string, an Initializer or a numbers.Number. When a string is specified,
values from 'TruncatedNormal', 'Normal', 'Uniform', 'HeUniform' and 'XavierUniform' distributions as well
as constant 'One' and 'Zero' distributions are possible. Aliases 'xavier_uniform', 'he_uniform', 'ones'
and 'zeros' are acceptable. Uppercase and lowercase are both acceptable. Refer to the values of
Initializer for more details. Default: 'normal'.
bias_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the bias vector. Possible
Initializer and string are the same as 'weight_init'. Refer to the values of
bias_init (Union[Tensor, str, Initializer, numbers.Number]): Initialization method of bias parameter.
Available initialization methods are the same as 'weight_init'. Refer to the values of
Initializer for more details. Default: 'zeros'.
data_format (str): The data format. The optional values are 'NHWC' and 'NCHW'.
Default: 'NCHW'.
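To make the relation between `group` and the full kernel shape :math:`(C_{out}, C_{in} / \text{group}, \text{kernel_size[0]}, \text{kernel_size[1]})` concrete, here is a small sketch. The function `conv2d_kernel_shape` is a hypothetical helper written for illustration; it is not taken from the MindSpore source and only assumes the divisibility rule stated for `group`.

```python
def conv2d_kernel_shape(in_channels, out_channels, kernel_size, group=1):
    """Return the full kernel shape (hypothetical helper, not a MindSpore API)."""
    # A single int is used for both the height and the width of the kernel.
    if isinstance(kernel_size, int):
        kernel_size = (kernel_size, kernel_size)
    # Both channel counts must be divisible by `group`.
    if in_channels % group != 0 or out_channels % group != 0:
        raise ValueError("in_channels and out_channels must be divisible by group")
    return (out_channels, in_channels // group, kernel_size[0], kernel_size[1])


# Ordinary convolution: every filter sees all 8 input channels.
print(conv2d_kernel_shape(8, 16, 3))
# Depthwise case: group == in_channels == out_channels, one input channel per filter.
print(conv2d_kernel_shape(8, 8, (3, 5), group=8))
```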
@ -198,6 +193,34 @@ class Conv2d(_Conv):
Outputs:
Tensor of shape :math:`(N, C_{out}, H_{out}, W_{out})` or :math:`(N, H_{out}, W_{out}, C_{out})`.
pad_mode is 'same':
.. math::
\begin{array}{ll} \\
H_{out} = \left \lfloor{\frac{H_{in}}{\text{stride[0]}} + 1} \right \rfloor \\
W_{out} = \left \lfloor{\frac{W_{in}}{\text{stride[1]}} + 1} \right \rfloor \\
\end{array}
pad_mode is 'valid':
.. math::
\begin{array}{ll} \\
H_{out} = \left \lfloor{\frac{H_{in} - \text{dilation[0]} \times (\text{kernel_size[0]} - 1) }
{\text{stride[0]}} + 1} \right \rfloor \\
W_{out} = \left \lfloor{\frac{W_{in} - \text{dilation[1]} \times (\text{kernel_size[1]} - 1) }
{\text{stride[1]}} + 1} \right \rfloor \\
\end{array}
pad_mode is 'pad':
.. math::
\begin{array}{ll} \\
H_{out} = \left \lfloor{\frac{H_{in} + padding[0] + padding[1] - (\text{dilation[0]} - 1) \times
\text{kernel_size[0]} - 1 }{\text{stride[0]}} + 1} \right \rfloor \\
W_{out} = \left \lfloor{\frac{W_{in} + padding[2] + padding[3] - (\text{dilation[1]} - 1) \times
\text{kernel_size[1]} - 1 }{\text{stride[1]}} + 1} \right \rfloor \\
\end{array}
Raises:
TypeError: If `in_channels`, `out_channels` or `group` is not an int.
TypeError: If `kernel_size`, `stride`, `padding` or `dilation` is neither an int nor a tuple.
@ -298,75 +321,82 @@ class Conv1d(_Conv):
r"""
1D convolution layer.
Applies a 1D convolution over an input tensor which is typically of shape :math:`(N, C_{in}, W_{in})`,
where :math:`N` is batch size and :math:`C_{in}` is channel number. For each batch of shape
:math:`(C_{in}, W_{in})`, the formula is defined as:
Calculates the 1D convolution on the input tensor, which is typically of shape :math:`(N, C_{in}, L_{in})`,
where :math:`N` is the batch size, :math:`C_{in}` is the number of channels and :math:`L_{in}` is the length of the sequence.
For each batch of shape :math:`(C_{in}, L_{in})`, the formula is defined as:
.. math::
out_j = \sum_{i=0}^{C_{in} - 1} ccor(W_{ij}, X_i) + b_j,
\text{out}(N_i, C_{\text{out}_j}) = \text{bias}(C_{\text{out}_j}) +
\sum_{k = 0}^{C_{in} - 1} \text{ccor}({\text{weight}(C_{\text{out}_j}, k), \text{X}(N_i, k)})
where :math:`ccor` is the cross correlation operator, :math:`C_{in}` is the input channel number, :math:`j` ranges
from :math:`0` to :math:`C_{out} - 1`, :math:`W_{ij}` corresponds to the :math:`i`-th channel of the :math:`j`-th
filter and :math:`out_{j}` corresponds to the :math:`j`-th channel of the output. :math:`W_{ij}` is a slice
of kernel and it has shape :math:`(\text{ks_w})`, where :math:`\text{ks_w}` is the width of the convolution kernel.
The full kernel has shape :math:`(C_{out}, C_{in} // \text{group}, \text{ks_w})`, where group is the group number
to split the input `x` in the channel dimension.
where :math:`ccor` is the `cross-correlation <https://en.wikipedia.org/wiki/Cross-correlation>`_,
:math:`C_{in}` is the channel number of the input, :math:`out_{j}` corresponds to the :math:`j`-th channel of
the output and :math:`j` is in the range of :math:`[0, C_{out}-1]`. :math:`\text{weight}(C_{\text{out}_j}, k)`
is a convolution kernel slice with shape :math:`(\text{kernel_size})`, where :math:`\text{kernel_size}` is the width of
the convolution kernel. :math:`\text{bias}` is the bias parameter, and :math:`\text{X}` is the input tensor.
The shape of the full convolution kernel is :math:`(C_{out}, C_{in} / \text{group}, \text{kernel_size})`,
where `group` is the number of groups to split the input `x` in the channel dimension.
If the 'pad_mode' is set to be "valid", the output width will be
:math:`\left \lfloor{1 + \frac{W_{in} + 2 \times \text{padding} - \text{ks_w} -
(\text{ks_w} - 1) \times (\text{dilation} - 1) }{\text{stride}}} \right \rfloor` respectively.
The first introduction of convolution layer can be found in paper `Gradient Based Learning Applied to Document
For more details, please refer to the paper `Gradient Based Learning Applied to Document
Recognition <http://vision.stanford.edu/cs598_spring07/papers/Lecun98.pdf>`_.
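The summation over input channels in the formula can be spelled out with plain Python lists. This is only a minimal sketch under simplifying assumptions that are not stated in the docstring: one batch element, `stride` 1, `dilation` 1, `group` 1 and no padding. The function name `conv1d_valid` is hypothetical.

```python
def conv1d_valid(x, weight, bias):
    """Cross-correlate x with each filter and add the bias.

    x:      [C_in][L_in]      one batch element
    weight: [C_out][C_in][K]  full kernel with group == 1
    bias:   [C_out]
    Returns [C_out][L_out] with L_out = L_in - K + 1 (assumed: no padding, stride 1).
    """
    c_in, l_in = len(x), len(x[0])
    k = len(weight[0][0])
    l_out = l_in - k + 1
    out = []
    for j, b in enumerate(bias):            # j indexes the output channel
        row = []
        for pos in range(l_out):            # position of the sliding window
            acc = b                         # bias(C_out_j)
            for c in range(c_in):           # sum over input channels k
                for t in range(k):          # ccor(weight(C_out_j, k), X(N_i, k))
                    acc += weight[j][c][t] * x[c][pos + t]
            row.append(acc)
        out.append(row)
    return out


# One input channel, one filter that computes x[pos] - x[pos + 2].
print(conv1d_valid([[1, 2, 3, 4]], [[[1, 0, -1]]], [0]))
```

Note that the kernel is not flipped, which is what distinguishes cross-correlation from a mathematical convolution.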
Args:
in_channels (int): The number of input channel :math:`C_{in}`.
out_channels (int): The number of output channel :math:`C_{out}`.
kernel_size (int): The data type is int. Specifies the
width of the 1D convolution window.
stride (int): The distance of kernel moving, an int number that represents
the width of movement. Default: 1.
in_channels (int): The channel number of the input tensor of the Conv1d layer.
out_channels (int): The channel number of the output tensor of the Conv1d layer.
kernel_size (int): Specifies the width of the 1D convolution kernel.
stride (int): The movement stride of the 1D convolution kernel. Default: 1.
pad_mode (str): Specifies padding mode. The optional values are
"same", "valid", "pad". Default: "same".
- same: Adopts the way of completion. The output width will be the same as the input `x`.
The total number of padding will be calculated in the horizontal
direction and evenly distributed to left and right if possible. Otherwise, the
last extra padding will be done from the bottom and the right side. If this mode is set, `padding`
must be 0.
- same: The width of the output is the same as the value of the input divided by `stride`.
If this mode is set, the value of `padding` must be 0.
- valid: Adopts the way of discarding. The possible largest width of the output will be returned
without padding. Extra pixels will be discarded. If this mode is set, `padding`
must be 0.
- valid: Returns a valid calculated output without padding. Excess pixels that do not satisfy the
calculation will be discarded. If this mode is set, the value of `padding` must be 0.
- pad: Implicit paddings on both sides of the input `x`. The number of `padding` will be padded to the input
Tensor borders. `padding` must be greater than or equal to 0.
- pad: Pads the input with zeros. `padding` zeros are added on both sides of the input.
If this mode is set, the value of `padding` must be greater than or equal to 0.
padding (int): Implicit paddings on both sides of the input `x`. Default: 0.
dilation (int): The data type is int. Specifies the dilation rate
to use for dilated convolution. If set to be :math:`k > 1`, there will
be :math:`k - 1` pixels skipped for each sampling location. Its value must
be greater or equal to 1 and bounded by the height and width of the
input `x`. Default: 1.
group (int): Splits filter into groups, `in_ channels` and `out_channels` must be
divisible by the number of groups. Default: 1.
has_bias (bool): Specifies whether the layer uses a bias vector. Default: False.
weight_init (Union[Tensor, str, Initializer, numbers.Number]): An initializer for the convolution kernel.
It can be a Tensor, a string, an Initializer or a number. When a string is specified,
padding (int): The amount of padding on both sides of the input.
The value should be greater than or equal to 0. Default: 0.
dilation (int): Dilation size of the 1D convolution kernel. If :math:`k > 1`, the kernel is sampled
every `k` elements. The value of `k` is in the range [1, L]. Default: 1.
group (int): Splits filter into groups, `in_channels` and `out_channels` must be
divisible by `group`. Default: 1.
has_bias (bool): Whether the Conv1d layer has a bias parameter. Default: False.
weight_init (Union[Tensor, str, Initializer, numbers.Number]): Initialization method of weight parameter.
It can be a Tensor, a string, an Initializer or a numbers.Number. When a string is specified,
values from 'TruncatedNormal', 'Normal', 'Uniform', 'HeUniform' and 'XavierUniform' distributions as well
as constant 'One' and 'Zero' distributions are possible. Aliases 'xavier_uniform', 'he_uniform', 'ones'
and 'zeros' are acceptable. Uppercase and lowercase are both acceptable. Refer to the values of
Initializer for more details. Default: 'normal'.
bias_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the bias vector. Possible
Initializer and string are the same as 'weight_init'. Refer to the values of
bias_init (Union[Tensor, str, Initializer, numbers.Number]): Initialization method of bias parameter.
Available initialization methods are the same as 'weight_init'. Refer to the values of
Initializer for more details. Default: 'zeros'.
Inputs:
- **x** (Tensor) - Tensor of shape :math:`(N, C_{in}, W_{in})`.
- **x** (Tensor) - Tensor of shape :math:`(N, C_{in}, L_{in})`.
Outputs:
Tensor of shape :math:`(N, C_{out}, W_{out})`.
Tensor of shape :math:`(N, C_{out}, L_{out})`.
pad_mode is 'same':
.. math::
L_{out} = \left \lfloor{\frac{L_{in}}{\text{stride}} + 1} \right \rfloor
pad_mode is 'valid':
.. math::
L_{out} = \left \lfloor{\frac{L_{in} - \text{dilation} \times (\text{kernel_size} - 1) }
{\text{stride}} + 1} \right \rfloor
pad_mode is 'pad':
.. math::
L_{out} = \left \lfloor{\frac{L_{in} + 2 \times padding - (\text{dilation} - 1) \times
\text{kernel_size} - 1 }{\text{stride}} + 1} \right \rfloor
Raises:
TypeError: If `in_channels`, `out_channels`, `kernel_size`, `stride`, `padding` or `dilation` is not an int.
@ -487,75 +517,76 @@ class Conv3d(_Conv):
r"""
3D convolution layer.
Applies a 3D convolution over an input tensor which is typically of shape
:math:`(N, C_{in}, D_{in}, H_{in}, W_{in})` and output shape
:math:`(N, C_{out}, D_{out}, H_{out}, W_{out})`. where :math:`N` is batch size. :math:`C` is channel number.
the formula is defined as:
Calculates the 3D convolution on the input tensor which is typically of shape
:math:`(N, C_{in}, D_{in}, H_{in}, W_{in})`,
where :math:`N` is the batch size, :math:`C_{in}` is the number of channels, and
:math:`D_{in}, H_{in}, W_{in}` are the depth, height and width of the feature layer respectively.
For each batch of shape :math:`(C_{in}, D_{in}, H_{in}, W_{in})`, the formula is defined as:
.. math::
\operatorname{out}\left(N_{i}, C_{\text {out}_j}\right)=\operatorname{bias}\left(C_{\text {out}_j}\right)+
\sum_{k=0}^{C_{in}-1} ccor(\text {weight}\left(C_{\text {out}_j}, k\right),
\operatorname{input}\left(N_{i}, k\right))
\text{out}(N_i, C_{\text{out}_j}) = \text{bias}(C_{\text{out}_j}) +
\sum_{k = 0}^{C_{in} - 1} \text{ccor}({\text{weight}(C_{\text{out}_j}, k), \text{X}(N_i, k)})
where :math:`ccor` is the cross-correlation operator.
where :math:`ccor` is the `cross-correlation <https://en.wikipedia.org/wiki/Cross-correlation>`_,
:math:`C_{in}` is the channel number of the input, :math:`out_{j}` corresponds to the :math:`j`-th channel of
the output and :math:`j` is in the range of :math:`[0, C_{out}-1]`. :math:`\text{weight}(C_{\text{out}_j}, k)`
is a convolution kernel slice with shape
:math:`(\text{kernel_size[0]}, \text{kernel_size[1]}, \text{kernel_size[2]})`,
where :math:`\text{kernel_size[0]}`, :math:`\text{kernel_size[1]}` and :math:`\text{kernel_size[2]}` are
the depth, height and width of the convolution kernel respectively. :math:`\text{bias}` is the bias parameter
and :math:`\text{X}` is the input tensor.
The shape of the full convolution kernel is
:math:`(C_{out}, C_{in} / \text{group}, \text{kernel_size[0]}, \text{kernel_size[1]}, \text{kernel_size[2]})`,
where `group` is the number of groups to split the input `x` in the channel dimension.
If the 'pad_mode' is set to be "valid", the output depth, height and width will be
:math:`\left \lfloor{1 + \frac{D_{in} + \text{padding[0]} + \text{padding[1]} - \text{kernel_size[0]} -
(\text{kernel_size[0]} - 1) \times (\text{dilation[0]} - 1) }{\text{stride[0]}}} \right \rfloor` and
:math:`\left \lfloor{1 + \frac{H_{in} + \text{padding[2]} + \text{padding[3]} - \text{kernel_size[1]} -
(\text{kernel_size[1]} - 1) \times (\text{dilation[1]} - 1) }{\text{stride[1]}}} \right \rfloor` and
:math:`\left \lfloor{1 + \frac{W_{in} + \text{padding[4]} + \text{padding[5]} - \text{kernel_size[2]} -
(\text{kernel_size[2]} - 1) \times (\text{dilation[2]} - 1) }{\text{stride[2]}}} \right \rfloor` respectively.
For more details, please refer to the paper `Gradient Based Learning Applied to Document
Recognition <http://vision.stanford.edu/cs598_spring07/papers/Lecun98.pdf>`_.
Args:
in_channels (int): The number of input channel :math:`C_{in}`.
out_channels (int): The number of output channel :math:`C_{out}`.
kernel_size (Union[int, tuple[int]]): The data type is int or a tuple of 3 integers.
Specifies the depth, height and width of the 3D convolution window.
Single int means the value is for the depth, height and the width of the kernel.
A tuple of 3 ints means the first value is for the depth, the second value is for the height and the
other is for the width of the kernel.
stride (Union[int, tuple[int]]): The distance of kernel moving, an int number that represents
the depth, height and width of movement are both strides, or a tuple of three int numbers that
represent depth, height and width of movement respectively. Default: 1.
in_channels (int): The channel number of the input tensor of the Conv3d layer.
out_channels (int): The channel number of the output tensor of the Conv3d layer.
kernel_size (Union[int, tuple[int]]): Specifies the depth, height and width of the 3D convolution kernel.
The data type is an integer or a tuple of three integers. An integer represents the depth, height
and width of the convolution kernel. A tuple of three integers represents the depth, height
and width of the convolution kernel respectively.
stride (Union[int, tuple[int]]): The movement stride of the 3D convolution kernel.
The data type is an integer or a tuple of three integers. An integer represents the movement step size
in depth, height and width directions. A tuple of three integers represents the movement step size
in the depth, height and width directions respectively. Default: 1.
pad_mode (str): Specifies padding mode. The optional values are
"same", "valid", "pad". Default: "same".
- same: Adopts the way of completion. The depth, height and width of the output will be the same as
the input `x`. The total number of padding will be calculated in depth, horizontal and vertical
directions and evenly distributed to head and tail, top and bottom, left and right if possible.
Otherwise, the last extra padding will be done from the tail, bottom and the right side.
If this mode is set, `padding` must be 0.
- same: The depth, height and width of the output are the same as those of the input divided by `stride`.
If this mode is set, the value of `padding` must be 0.
- valid: Adopts the way of discarding. The possible largest depth, height and width of output
will be returned without padding. Extra pixels will be discarded. If this mode is set, `padding`
must be 0.
- valid: Returns a valid calculated output without padding. Excess pixels that do not satisfy the
calculation will be discarded. If this mode is set, the value of `padding` must be 0.
- pad: Implicit paddings on both sides of the input `x` in depth, height, width. The number of `padding`
will be padded to the input Tensor borders. `padding` must be greater than or equal to 0.
- pad: Pads the input with zeros. `padding` zeros are added on both sides of the input.
If this mode is set, the value of `padding` must be greater than or equal to 0.
padding (Union(int, tuple[int])): Implicit paddings on both sides of the input `x`.
The data type is int or a tuple of 6 integers. Default: 0. If `padding` is an integer,
the paddings of head, tail, top, bottom, left and right are the same, equal to padding.
If `paddings` is a tuple of six integers, the padding of head, tail, top, bottom, left and right equal to
padding[0], padding[1], padding[2], padding[3], padding[4] and padding[5] correspondingly.
dilation (Union[int, tuple[int]]): The data type is int or a tuple of 3 integers
:math:`(dilation_d, dilation_h, dilation_w)`. Currently, dilation on depth only supports the case of 1.
Specifies the dilation rate to use for dilated convolution. If set to be :math:`k > 1`,
there will be :math:`k - 1` pixels skipped for each sampling location.
Its value must be greater or equal to 1 and bounded by the height and width of the input `x`. Default: 1.
group (int): Splits filter into groups, `in_ channels` and `out_channels` must be
divisible by the number of groups. Default: 1. Only 1 is currently supported.
has_bias (bool): Specifies whether the layer uses a bias vector. Default: False.
weight_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the convolution kernel.
It can be a Tensor, a string, an Initializer or a number. When a string is specified,
padding (Union[int, tuple[int]]): The amount of padding in the depth, height and width directions of the input.
The data type is an integer or a tuple of six integers. If `padding` is an integer,
then the head, tail, top, bottom, left, and right padding are all equal to `padding`.
If `padding` is a tuple of six integers, then the head, tail, top, bottom, left, and right padding
are equal to `padding[0]`, `padding[1]`, `padding[2]`, `padding[3]`, `padding[4]` and `padding[5]`
respectively. The value should be greater than or equal to 0. Default: 0.
dilation (Union[int, tuple[int]]): Dilation size of the 3D convolution kernel.
The data type is an integer or a tuple of three integers. If :math:`k > 1`, the kernel is sampled
every `k` elements. The value of `k` in the depth, height and width directions is in the range
[1, D], [1, H] and [1, W] respectively. Default: 1.
group (int): Splits filter into groups, `in_channels` and `out_channels` must be
divisible by `group`. Default: 1. Only 1 is currently supported.
has_bias (bool): Whether the Conv3d layer has a bias parameter. Default: False.
weight_init (Union[Tensor, str, Initializer, numbers.Number]): Initialization method of weight parameter.
It can be a Tensor, a string, an Initializer or a numbers.Number. When a string is specified,
values from 'TruncatedNormal', 'Normal', 'Uniform', 'HeUniform' and 'XavierUniform' distributions as well
as constant 'One' and 'Zero' distributions are possible. Aliases 'xavier_uniform', 'he_uniform', 'ones'
and 'zeros' are acceptable. Uppercase and lowercase are both acceptable. Refer to the values of
Initializer for more details. Default: 'normal'.
bias_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the bias vector. Possible
Initializer and string are the same as 'weight_init'. Refer to the values of
bias_init (Union[Tensor, str, Initializer, numbers.Number]): Initialization method of bias parameter.
Available initialization methods are the same as 'weight_init'. Refer to the values of
Initializer for more details. Default: 'zeros'.
data_format (str): The optional value for data format. Currently only "NCDHW" is supported.
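The int-or-tuple convention described for the Conv3d `padding` argument can be sketched as follows. The function `expand_padding_3d` is a hypothetical illustration of that convention, not the validation code used inside MindSpore.

```python
def expand_padding_3d(padding):
    """Map `padding` to named per-side values (hypothetical helper)."""
    # A single integer applies to all six sides.
    if isinstance(padding, int):
        padding = (padding,) * 6
    padding = tuple(padding)
    if len(padding) != 6:
        raise ValueError("padding must be an int or a tuple of six ints")
    if any(p < 0 for p in padding):
        raise ValueError("padding must be greater than or equal to 0")
    # Order follows the docstring: padding[0] .. padding[5].
    sides = ("head", "tail", "top", "bottom", "left", "right")
    return dict(zip(sides, padding))


print(expand_padding_3d(1))
print(expand_padding_3d((0, 0, 1, 1, 2, 2)))
```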
@ -564,7 +595,41 @@ class Conv3d(_Conv):
Currently the input data type only supports float16 and float32.
Outputs:
Tensor, the value that applied 3D convolution. The shape is :math:`(N, C_{out}, D_{out}, H_{out}, W_{out})`.
Tensor of shape :math:`(N, C_{out}, D_{out}, H_{out}, W_{out})`.
pad_mode is 'same':
.. math::
\begin{array}{ll} \\
D_{out} = \left \lfloor{\frac{D_{in}}{\text{stride[0]}} + 1} \right \rfloor \\
H_{out} = \left \lfloor{\frac{H_{in}}{\text{stride[1]}} + 1} \right \rfloor \\
W_{out} = \left \lfloor{\frac{W_{in}}{\text{stride[2]}} + 1} \right \rfloor \\
\end{array}
pad_mode is 'valid':
.. math::
\begin{array}{ll} \\
D_{out} = \left \lfloor{\frac{D_{in} - \text{dilation[0]} \times (\text{kernel_size[0]} - 1) }
{\text{stride[0]}} + 1} \right \rfloor \\
H_{out} = \left \lfloor{\frac{H_{in} - \text{dilation[1]} \times (\text{kernel_size[1]} - 1) }
{\text{stride[1]}} + 1} \right \rfloor \\
W_{out} = \left \lfloor{\frac{W_{in} - \text{dilation[2]} \times (\text{kernel_size[2]} - 1) }
{\text{stride[2]}} + 1} \right \rfloor \\
\end{array}
pad_mode is 'pad':
.. math::
\begin{array}{ll} \\
D_{out} = \left \lfloor{\frac{D_{in} + padding[0] + padding[1] - (\text{dilation[0]} - 1) \times
\text{kernel_size[0]} - 1 }{\text{stride[0]}} + 1} \right \rfloor \\
H_{out} = \left \lfloor{\frac{H_{in} + padding[2] + padding[3] - (\text{dilation[1]} - 1) \times
\text{kernel_size[1]} - 1 }{\text{stride[1]}} + 1} \right \rfloor \\
W_{out} = \left \lfloor{\frac{W_{in} + padding[4] + padding[5] - (\text{dilation[2]} - 1) \times
\text{kernel_size[2]} - 1 }{\text{stride[2]}} + 1} \right \rfloor \\
\end{array}
Raises:
TypeError: If `in_channels`, `out_channels` or `group` is not an int.
@ -663,102 +728,64 @@ class Conv3d(_Conv):
class Conv3dTranspose(_Conv):
r"""
Compute a 3D transposed convolution, which is also known as a deconvolution
(although it is not an actual deconvolution).
The transposed convolution operator multiplies each input value element-wise by a learnable kernel,
and sums over the outputs from all input feature planes.
This module can be seen as the gradient of Conv3d with respect to its input.
3D transposed convolution layer.
`x` is typically of shape :math:`(N, C, D, H, W)`, where :math:`N` is batch size, :math:`C` is channel number,
:math:`D` is the characteristic depth, :math:`H` is the height of the characteristic layer,
and :math:`W` is the width of the characteristic layer.
The calculation process of transposed convolution is equivalent to the reverse calculation of convolution.
Calculates a 3D transposed convolution, which can be regarded as the gradient of Conv3d with respect to its input.
It is also called deconvolution (although it is not an actual deconvolution).
The pad_mode argument effectively adds :math:`dilation * (kernel\_size - 1) - padding` amount of zero padding
to both sizes of the input. So that when a Conv3d and a ConvTranspose3d are initialized with same parameters,
they are inverses of each other in regard to the input and output shapes.
However, when stride > 1, Conv3d maps multiple input shapes to the same output shape.
ConvTranspose3d provide padding argument to increase the calculated output shape on one or more side.
The input is typically of shape :math:`(N, C, D, H, W)`, where :math:`N` is the batch size, :math:`C` is the number of
channels, and :math:`D_{in}, H_{in}, W_{in}` are the depth, height and width of the feature layer respectively.
The height and width of output are defined as:
if the 'pad_mode' is set to be "pad",
.. math::
D_{out} = (D_{in} - 1) \times \text{stride_d} - 2 \times \text{padding_d} + \text{dilation_d} \times
(\text{kernel_size_d} - 1) + \text{output_padding_d} + 1
H_{out} = (H_{in} - 1) \times \text{stride_h} - 2 \times \text{padding_h} + \text{dilation_h} \times
(\text{kernel_size_h} - 1) + \text{output_padding_h} + 1
W_{out} = (W_{in} - 1) \times \text{stride_w} - 2 \times \text{padding_w} + \text{dilation_w} \times
(\text{kernel_size_w} - 1) + \text{output_padding_w} + 1
if the 'pad_mode' is set to be "SAME",
.. math::
D_{out} = (D_{in} + \text{stride_d} - 1)/\text{stride_d} \\
H_{out} = (H_{in} + \text{stride_h} - 1)/\text{stride_h} \\
W_{out} = (W_{in} + \text{stride_w} - 1)/\text{stride_w}
if the 'pad_mode' is set to be "VALID",
.. math::
D_{out} = (D_{in} - 1) \times \text{stride_d} + \text{dilation_d} \times
(\text{kernel_size_d} - 1) + 1 \\
H_{out} = (H_{in} - 1) \times \text{stride_h} + \text{dilation_h} \times
(\text{kernel_size_h} - 1) + 1 \\
W_{out} = (W_{in} - 1) \times \text{stride_w} + \text{dilation_w} \times
(\text{kernel_size_w} - 1) + 1
When Conv3d and Conv3dTranspose are initialized with the same parameters and `pad_mode` is set to 'pad',
:math:`dilation * (kernel\_size - 1) - padding` zeros are padded in the depth, height and width
directions of the input, so that the two layers are inverses of each other with regard to the input and output shapes.
However, when `stride` > 1, Conv3d maps multiple input shapes to the same output shape. For details on deconvolutional
networks, refer to `Deconvolutional Networks <https://www.matthewzeiler.com/matzeiler/deconvolutionalnetworks.pdf>`_.
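The shape relationship between the two layers can be checked numerically along one axis. The sketch below uses the 'pad' mode expression for :math:`D_{out}` shown in this hunk for the transposed layer, and assumes the commonly used forward expression :math:`\lfloor (D + 2p - d(k - 1) - 1) / s \rfloor + 1` for the ordinary convolution with symmetric padding. Both function names are hypothetical.

```python
def transposed_out_size(size, kernel, stride=1, padding=0, dilation=1, output_padding=0):
    """Output length of a transposed convolution in 'pad' mode (one axis)."""
    return ((size - 1) * stride - 2 * padding
            + dilation * (kernel - 1) + output_padding + 1)


def conv_out_size(size, kernel, stride=1, padding=0, dilation=1):
    """Output length of an ordinary convolution with symmetric padding (assumed form)."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


# Same parameters on both layers: kernel 3, stride 2, padding 1.
up = transposed_out_size(5, 3, stride=2, padding=1)
down = conv_out_size(up, 3, stride=2, padding=1)
print(up, down)

# With stride > 1 two different input lengths give the same convolution output,
# which is the ambiguity that output_padding resolves.
print(conv_out_size(9, 3, stride=2, padding=1), conv_out_size(10, 3, stride=2, padding=1))
```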
Args:
in_channels (int): The number of input channel :math:`C_{in}`.
out_channels (int): The number of output channel :math:`C_{out}`.
kernel_size (Union[int, tuple[int]]): The kernel size of the 3D convolution.
stride (Union[int, tuple[int]]): The distance of kernel moving, an int number that represents
the depth, height and width of movement are both strides, or a tuple of three int numbers that
represent depth, height and width of movement respectively. Its value must be equal to or greater than 1.
Default: 1.
pad_mode (str): Select the mode of the pad. The optional values are
"pad", "same", "valid". Default: "same".
in_channels (int): The channel number of the input tensor of the Conv3dTranspose layer.
out_channels (int): The channel number of the output tensor of the Conv3dTranspose layer.
kernel_size (Union[int, tuple[int]]): Specifies the depth, height and width of the 3D convolution kernel.
The data type is an integer or a tuple of three integers. An integer represents the depth, height
and width of the convolution kernel. A tuple of three integers represents the depth, height
and width of the convolution kernel respectively.
stride (Union[int, tuple[int]]): The movement stride of the 3D convolution kernel.
The data type is an integer or a tuple of three integers. An integer represents the movement step size
in depth, height and width directions. A tuple of three integers represents the movement step size
in the depth, height and width directions respectively. Default: 1.
pad_mode (str): Specifies padding mode. The optional values are
"same", "valid", "pad". Default: "same".
- same: Adopts the way of completion. The depth, height and width of the output will be the same as
the input `x`. The total number of padding will be calculated in depth, horizontal and vertical
directions and evenly distributed to head and tail, top and bottom, left and right if possible.
Otherwise, the last extra padding will be done from the tail, bottom and the right side.
If this mode is set, `padding` and `output_padding` must be 0.
- same: The width of the output is the same as the value of the input divided by `stride`.
If this mode is set, the value of `padding` must be 0.
- valid: Adopts the way of discarding. The possible largest depth, height and width of output
will be returned without padding. Extra pixels will be discarded. If this mode is set, `padding`
and `output_padding` must be 0.
- valid: Returns a valid calculated output without padding. Excess pixels that do not satisfy the
calculation will be discarded. If this mode is set, the value of `padding` must be 0.
- pad: Implicit paddings on both sides of the input `x` in depth, height, width. The number of `pad` will
be padded to the input Tensor borders. `padding` must be greater than or equal to 0.
- pad: Pads the input. Padding `padding` size of zero on both sides of the input.
If this mode is set, the value of `padding` must be greater than or equal to 0.
padding (Union(int, tuple[int])): The pad value to be filled. Default: 0. If `padding` is an integer,
the paddings of head, tail, top, bottom, left and right are the same, equal to padding.
If `padding` is a tuple of six integers, the padding of head, tail, top, bottom, left and right equal to
padding[0], padding[1], padding[2], padding[3], padding[4] and padding[5] correspondingly.
dilation (Union(int, tuple[int])): The data type is int or a tuple of 3 integers
:math:`(dilation_d, dilation_h, dilation_w)`. Currently, dilation on depth only supports the case of 1.
Specifies the dilation rate to use for dilated convolution. If set to be :math:`k > 1`,
there will be :math:`k - 1` pixels skipped for each sampling location.
Its value must be greater or equal to 1 and bounded by the height and width of the input `x`. Default: 1.
group (int): Splits filter into groups, `in_ channels` and `out_channels` must be
divisible by the number of groups. Default: 1. Only 1 is currently supported.
output_padding (Union(int, tuple[int])): Add extra size to each dimension of the output. Default: 0.
Must be greater than or equal to 0.
has_bias (bool): Specifies whether the layer uses a bias vector. Default: False.
weight_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the convolution kernel.
It can be a Tensor, a string, an Initializer or a number. When a string is specified,
padding (Union[int, tuple[int]]): The number of padding elements in the depth, height and width directions of the input.
The data type is an integer or a tuple of six integers. If `padding` is an integer,
then the head, tail, top, bottom, left, and right padding are all equal to `padding`.
If `padding` is a tuple of six integers, then the head, tail, top, bottom, left, and right padding
is equal to `padding[0]`, `padding[1]`, `padding[2]`, `padding[3]`, `padding[4]` and `padding[5]`
respectively. The value should be greater than or equal to 0. Default: 0.
dilation (Union[int, tuple[int]]): Dilation size of the 3D convolution kernel.
The data type is an integer or a tuple of three integers. If :math:`k > 1`, the kernel is sampled
every `k` elements. The value of `k` on the depth, height and width directions is in range of
[1, D], [1, H] and [1, W] respectively. Default: 1.
group (int): Splits filter into groups, `in_channels` and `out_channels` must be
divisible by `group`. Default: 1. Only 1 is currently supported.
has_bias (bool): Whether the Conv3dTranspose layer has a bias parameter. Default: False.
weight_init (Union[Tensor, str, Initializer, numbers.Number]): Initialization method of weight parameter.
It can be a Tensor, a string, an Initializer or a numbers.Number. When a string is specified,
values from 'TruncatedNormal', 'Normal', 'Uniform', 'HeUniform' and 'XavierUniform' distributions as well
as constant 'One' and 'Zero' distributions are possible. Alias 'xavier_uniform', 'he_uniform', 'ones'
and 'zeros' are acceptable. Uppercase and lowercase are both acceptable. Refer to the values of
Initializer for more details. Default: 'normal'.
bias_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the bias vector. Possible
Initializer and string are the same as 'weight_init'. Refer to the values of
bias_init (Union[Tensor, str, Initializer, numbers.Number]): Initialization method of bias parameter.
Available initialization methods are the same as 'weight_init'. Refer to the values of
Initializer for more details. Default: 'zeros'.
data_format (str): The optional value for data format. Currently only 'NCDHW' is supported.
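The `padding` rule for the 3D case (one integer, or six integers ordered head, tail, top, bottom, left, right) can be sketched as a small normalization helper. `normalize_padding_3d` is a hypothetical name used only for illustration; the checks the real layer performs may differ.

```python
def normalize_padding_3d(padding):
    """Expand `padding` to (head, tail, top, bottom, left, right).

    Hypothetical helper: an int is repeated six times, a 6-tuple is kept as is.
    """
    if isinstance(padding, int):
        values = (padding,) * 6
    else:
        values = tuple(padding)
        if len(values) != 6:
            raise ValueError("padding must be an int or a tuple of six ints")
    if any(p < 0 for p in values):
        raise ValueError("padding values must be greater than or equal to 0")
    return values


print(normalize_padding_3d(2))
print(normalize_padding_3d((0, 1, 2, 3, 4, 5)))
```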
@ -769,6 +796,40 @@ class Conv3dTranspose(_Conv):
Outputs:
Tensor, the shape is :math:`(N, C_{out}, D_{out}, H_{out}, W_{out})`.
pad_mode is 'same':
.. math::
\begin{array}{ll} \\
D_{out} = \left \lfloor{\frac{D_{in}}{\text{stride[0]}} + 1} \right \rfloor \\
H_{out} = \left \lfloor{\frac{H_{in}}{\text{stride[1]}} + 1} \right \rfloor \\
W_{out} = \left \lfloor{\frac{W_{in}}{\text{stride[2]}} + 1} \right \rfloor \\
\end{array}
pad_mode is 'valid':
.. math::
\begin{array}{ll} \\
D_{out} = \left \lfloor{\frac{D_{in} - \text{dilation[0]} \times (\text{kernel_size[0]} - 1) }
{\text{stride[0]}} + 1} \right \rfloor \\
H_{out} = \left \lfloor{\frac{H_{in} - \text{dilation[1]} \times (\text{kernel_size[1]} - 1) }
{\text{stride[1]}} + 1} \right \rfloor \\
W_{out} = \left \lfloor{\frac{W_{in} - \text{dilation[2]} \times (\text{kernel_size[2]} - 1) }
{\text{stride[2]}} + 1} \right \rfloor \\
\end{array}
pad_mode is 'pad':
.. math::
\begin{array}{ll} \\
D_{out} = \left \lfloor{\frac{D_{in} + padding[0] + padding[1] - (\text{dilation[0]} - 1) \times
\text{kernel_size[0]} - 1 }{\text{stride[0]}} + 1} \right \rfloor \\
H_{out} = \left \lfloor{\frac{H_{in} + padding[2] + padding[3] - (\text{dilation[1]} - 1) \times
\text{kernel_size[1]} - 1 }{\text{stride[1]}} + 1} \right \rfloor \\
W_{out} = \left \lfloor{\frac{W_{in} + padding[4] + padding[5] - (\text{dilation[2]} - 1) \times
\text{kernel_size[2]} - 1 }{\text{stride[2]}} + 1} \right \rfloor \\
\end{array}
Supported Platforms:
``Ascend`` ``GPU``
@ -890,89 +951,62 @@ class Conv2dTranspose(_Conv):
r"""
2D transposed convolution layer.
Compute a 2D transposed convolution, which is also known as a deconvolution
(although it is not an actual deconvolution).
This module can be seen as the gradient of Conv2d with respect to its input.
Calculates a 2D transposed convolution, which can be regarded as the gradient of Conv2d with respect to its input.
It is also called deconvolution (although it is not an actual deconvolution).
`x` is typically of shape :math:`(N, C, H, W)`, where :math:`N` is batch size, :math:`C` is channel number,
:math:`H` is the height of the characteristic layer and :math:`W` is the width of the characteristic layer.
The input is typically of shape :math:`(N, C, H_{in}, W_{in})`, where :math:`N` is batch size, :math:`C` is the number of
channels, and :math:`H_{in}` and :math:`W_{in}` are the height and width of the feature layer respectively.
The pad_mode argument effectively adds :math:`dilation * (kernel\_size - 1) - padding` amount of zero padding
to both sizes of the input. So that when a Conv2d and a ConvTranspose2d are initialized with same parameters,
they are inverses of each other in regard to the input and output shapes.
However, when stride > 1, Conv2d maps multiple input shapes to the same output shape.
ConvTranspose2d provide padding argument to increase the calculated output shape on one or more side.
The height and width of output are defined as:
if the 'pad_mode' is set to be "pad",
.. math::
H_{out} = (H_{in} - 1) \times \text{stride[0]} - \left (\text{padding[0]} + \text{padding[1]}\right ) +
\text{dilation[0]} \times (\text{kernel_size[0]} - 1) + 1
W_{out} = (W_{in} - 1) \times \text{stride[1]} - \left (\text{padding[2]} + \text{padding[3]}\right ) +
\text{dilation[1]} \times (\text{kernel_size[1]} - 1) + 1
if the 'pad_mode' is set to be "SAME",
.. math::
H_{out} = (H_{in} + \text{stride[0]} - 1)/\text{stride[0]} \\
W_{out} = (W_{in} + \text{stride[1]} - 1)/\text{stride[1]}
if the 'pad_mode' is set to be "VALID",
.. math::
H_{out} = (H_{in} - 1) \times \text{stride[0]} + \text{dilation[0]} \times
(\text{ks_w[0]} - 1) + 1 \\
W_{out} = (W_{in} - 1) \times \text{stride[1]} + \text{dilation[1]} \times
(\text{ks_w[1]} - 1) + 1
where :math:`\text{kernel_size[0]}` is the height of the convolution kernel and :math:`\text{kernel_size[1]}`
is the width of the convolution kernel.
When Conv2d and Conv2dTranspose are initialized with the same parameters, and `pad_mode` is set to 'pad',
:math:`dilation * (kernel\_size - 1) - padding` amount of zero will be padded to the height and width
directions of the input. In this case, they are inverses of each other in regard to the input and output shapes.
However, when `stride` > 1, Conv2d maps multiple input shapes to the same output shape. For more details on
deconvolutional networks, refer to `Deconvolutional Networks <https://www.matthewzeiler.com/matzeiler/deconvolutionalnetworks.pdf>`_.
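As a worked example of the "pad" mode height and width formula, the following sketch evaluates it directly. The function name `conv2d_transpose_out_hw` is hypothetical; `padding` is assumed to be ordered (top, bottom, left, right), and the other arguments are assumed to be 2-tuples ordered (height, width).

```python
def conv2d_transpose_out_hw(h_in, w_in, kernel_size, stride, padding, dilation):
    """Evaluate the 'pad' mode output height and width of a 2D transposed convolution.

    kernel_size, stride, dilation: (height, width) tuples.
    padding: (top, bottom, left, right) tuple.
    """
    h_out = ((h_in - 1) * stride[0] - (padding[0] + padding[1])
             + dilation[0] * (kernel_size[0] - 1) + 1)
    w_out = ((w_in - 1) * stride[1] - (padding[2] + padding[3])
             + dilation[1] * (kernel_size[1] - 1) + 1)
    return h_out, w_out


# A 4 x 5 feature map, 3 x 3 kernel, stride 2, one element of padding per side.
print(conv2d_transpose_out_hw(4, 5, (3, 3), (2, 2), (1, 1, 1, 1), (1, 1)))  # (7, 9)
```

Setting every padding entry to 0 reduces the expression to the no-padding form `(in - 1) * stride + dilation * (kernel_size - 1) + 1`.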
Args:
in_channels (int): The number of channels in the input space.
out_channels (int): The number of channels in the output space.
kernel_size (Union[int, tuple]): int or a tuple of 2 integers, which specifies the height
and width of the 2D convolution window. Single int means the value is for both the height and the width of
the kernel. A tuple of 2 ints means the first value is for the height and the other is for the
width of the kernel.
stride (Union[int, tuple[int]]): The distance of kernel moving, an int number that represents
the height and width of movement are both strides, or a tuple of two int numbers that
represent height and width of movement respectively. Its value must be equal to or greater than 1.
in_channels (int): The channel number of the input tensor of the Conv2dTranspose layer.
out_channels (int): The channel number of the output tensor of the Conv2dTranspose layer.
kernel_size (Union[int, tuple]): Specifies the height and width of the 2D convolution kernel.
The data type is an integer or a tuple of two integers. An integer represents the height
and width of the convolution kernel. A tuple of two integers represents the height
and width of the convolution kernel respectively.
stride (Union[int, tuple[int]]): The movement stride of the 2D convolution kernel.
The data type is an integer or a tuple of two integers. An integer represents the movement step size
in both height and width directions. A tuple of two integers represents the movement step size in the height
and width directions respectively. Default: 1.
pad_mode (str): Specifies padding mode. The optional values are
"same", "valid", "pad". Default: "same".
- same: The width of the output is the same as the value of the input divided by `stride`.
If this mode is set, the value of `padding` must be 0.
- valid: Returns a valid calculated output without padding. Excess pixels that do not satisfy the
calculation will be discarded. If this mode is set, the value of `padding` must be 0.
- pad: Pads the input. Padding `padding` size of zero on both sides of the input.
If this mode is set, the value of `padding` must be greater than or equal to 0.
padding (Union[int, tuple[int]]): The number of padding on the height and width directions of the input.
The data type is an integer or a tuple of four integers. If `padding` is an integer,
then the top, bottom, left, and right padding are all equal to `padding`.
If `padding` is a tuple of 4 integers, then the top, bottom, left, and right padding
is equal to `padding[0]`, `padding[1]`, `padding[2]`, and `padding[3]` respectively.
The value should be greater than or equal to 0. Default: 0.
dilation (Union[int, tuple[int]]): Dilation size of 2D convolution kernel.
The data type is an integer or a tuple of two integers. If :math:`k > 1`, the kernel is sampled
every `k` elements. The value of `k` on the height and width directions is in range of [1, H]
and [1, W] respectively. Default: 1.
group (int): Splits filter into groups, `in_channels` and `out_channels` must be divisible by `group`.
Default: 1.
pad_mode (str): Select the mode of the pad. The optional values are
"pad", "same", "valid". Default: "same".
- pad: Implicit paddings on both sides of the input `x`.
- same: Adopted the way of completion.
- valid: Adopted the way of discarding.
padding (Union[int, tuple[int]]): Implicit paddings on both sides of the input `x`. If `padding` is one integer,
the paddings of top, bottom, left and right are the same, equal to padding. If `padding` is a tuple
with four integers, the paddings of top, bottom, left and right will be equal to padding[0],
padding[1], padding[2], and padding[3] accordingly. Default: 0.
dilation (Union[int, tuple[int]]): The data type is int or a tuple of 2 integers. Specifies the dilation rate
to use for dilated convolution. If set to be :math:`k > 1`, there will
be :math:`k - 1` pixels skipped for each sampling location. Its value must
be greater than or equal to 1 and bounded by the height and width of the
input `x`. Default: 1.
group (int): Splits filter into groups, `in_channels` and `out_channels` must be
divisible by the number of groups. This does not support for Davinci devices when group > 1. Default: 1.
has_bias (bool): Specifies whether the layer uses a bias vector. Default: False.
weight_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the convolution kernel.
It can be a Tensor, a string, an Initializer or a number. When a string is specified,
has_bias (bool): Whether the Conv2dTranspose layer has a bias parameter. Default: False.
weight_init (Union[Tensor, str, Initializer, numbers.Number]): Initialization method of weight parameter.
It can be a Tensor, a string, an Initializer or a numbers.Number. When a string is specified,
values from 'TruncatedNormal', 'Normal', 'Uniform', 'HeUniform' and 'XavierUniform' distributions as well
as constant 'One' and 'Zero' distributions are possible. Alias 'xavier_uniform', 'he_uniform', 'ones'
and 'zeros' are acceptable. Uppercase and lowercase are both acceptable. Refer to the values of
Initializer for more details. Default: 'normal'.
bias_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the bias vector. Possible
Initializer and string are the same as 'weight_init'. Refer to the values of
bias_init (Union[Tensor, str, Initializer, numbers.Number]): Initialization method of bias parameter.
Available initialization methods are the same as 'weight_init'. Refer to the values of
Initializer for more details. Default: 'zeros'.
Inputs:
@ -981,6 +1015,34 @@ class Conv2dTranspose(_Conv):
Outputs:
Tensor of shape :math:`(N, C_{out}, H_{out}, W_{out})`.
pad_mode is 'same':
.. math::
\begin{array}{ll} \\
H_{out} = \left \lfloor{\frac{H_{in}}{\text{stride[0]}} + 1} \right \rfloor \\
W_{out} = \left \lfloor{\frac{W_{in}}{\text{stride[1]}} + 1} \right \rfloor \\
\end{array}
pad_mode is 'valid':
.. math::
\begin{array}{ll} \\
H_{out} = \left \lfloor{\frac{H_{in} - \text{dilation[0]} \times (\text{kernel_size[0]} - 1) }
{\text{stride[0]}} + 1} \right \rfloor \\
W_{out} = \left \lfloor{\frac{W_{in} - \text{dilation[1]} \times (\text{kernel_size[1]} - 1) }
{\text{stride[1]}} + 1} \right \rfloor \\
\end{array}
pad_mode is 'pad':
.. math::
\begin{array}{ll} \\
H_{out} = \left \lfloor{\frac{H_{in} + padding[0] + padding[1] - (\text{dilation[0]} - 1) \times
\text{kernel_size[0]} - 1 }{\text{stride[0]}} + 1} \right \rfloor \\
W_{out} = \left \lfloor{\frac{W_{in} + padding[2] + padding[3] - (\text{dilation[1]} - 1) \times
\text{kernel_size[1]} - 1 }{\text{stride[1]}} + 1} \right \rfloor \\
\end{array}
Raises:
TypeError: If `in_channels`, `out_channels` or `group` is not an int.
TypeError: If `kernel_size`, `stride`, `padding` or `dilation` is neither an int nor a tuple.
@ -1099,70 +1161,74 @@ class Conv1dTranspose(_Conv):
r"""
1D transposed convolution layer.
Compute a 1D transposed convolution, which is also known as a deconvolution
(although it is not an actual deconvolution).
This module can be seen as the gradient of Conv1d with respect to its input.
Calculates a 1D transposed convolution, which can be regarded as the gradient of Conv1d with respect to its input.
It is also called deconvolution (although it is not an actual deconvolution).
`x` is typically of shape :math:`(N, C, W)`, where :math:`N` is batch size, :math:`C` is channel number and
:math:`W` is the characteristic length.
The input is typically of shape :math:`(N, C, L_{in})`, where :math:`N` is batch size, :math:`C` is the number of channels
and :math:`L_{in}` is the sequence length.
The padding argument effectively adds :math:`dilation * (kernel\_size - 1) - padding` amount of zero padding to
both sizes of the input. So that when a Conv1d and a ConvTranspose1d are initialized with same parameters,
they are inverses of each other in regard to the input and output shapes. However, when stride > 1,
Conv1d maps multiple input shapes to the same output shape.
The width of output is defined as:
.. math::
W_{out} = \begin{cases}
(W_{in} - 1) \times \text{stride} - 2 \times \text{padding} + \text{dilation} \times
(\text{ks_w} - 1) + 1, & \text{if pad_mode='pad'}\\
(W_{in} + \text{stride} - 1)/\text{stride}, & \text{if pad_mode='same'}\\
(W_{in} - 1) \times \text{stride} + \text{dilation} \times
(\text{ks_w} - 1) + 1, & \text{if pad_mode='valid'}
\end{cases}
where :math:`\text{ks_w}` is the width of the convolution kernel.
When Conv1d and Conv1dTranspose are initialized with the same parameters, and `pad_mode` is set to 'pad',
:math:`dilation * (kernel\_size - 1) - padding` amount of zero will be padded to both sides of the input.
In this case, they are inverses of each other in regard to the input and output shapes.
However, when `stride` > 1, Conv1d maps multiple input shapes to the same output shape. For more details on
deconvolutional networks, refer to `Deconvolutional Networks <https://www.matthewzeiler.com/matzeiler/deconvolutionalnetworks.pdf>`_.
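To make the 1D case concrete, here is a naive single-channel transposed convolution in plain Python. It is a minimal sketch, not the operator's implementation: `conv1d_transpose_naive` is a hypothetical name, there is no batch or channel dimension, and `padding` is assumed to trim that many elements from each end of the full result, which gives a length of `(in - 1) * stride - 2 * padding + dilation * (kernel_size - 1) + 1`.

```python
def conv1d_transpose_naive(x, w, stride=1, padding=0, dilation=1):
    """Naive single-channel 1D transposed convolution via scatter-add."""
    k = len(w)
    full_len = (len(x) - 1) * stride + dilation * (k - 1) + 1
    full = [0.0] * full_len
    for i, xi in enumerate(x):
        for j, wj in enumerate(w):
            # Each input element spreads a scaled copy of the kernel into the output.
            full[i * stride + j * dilation] += xi * wj
    # Assumed 'pad' behaviour: drop `padding` elements from both ends.
    return full[padding:full_len - padding]


out = conv1d_transpose_naive([1, 2, 3], [1, 1, 1], stride=2)
print(out)       # [1.0, 1.0, 3.0, 2.0, 5.0, 3.0, 3.0]
print(len(out))  # 7 == (3 - 1) * 2 + 1 * (3 - 1) + 1
```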
Args:
in_channels (int): The number of channels in the input space.
out_channels (int): The number of channels in the output space.
kernel_size (int): int, which specifies the width of the 1D convolution window.
stride (int): The distance of kernel moving, an int number that represents
the width of movement. Default: 1.
pad_mode (str): Select the mode of the pad. The optional values are
"pad", "same", "valid". Default: "same".
in_channels (int): The channel number of the input tensor of the Conv1dTranspose layer.
out_channels (int): The channel number of the output tensor of the Conv1dTranspose layer.
kernel_size (int): Specifies the width of the 1D convolution kernel.
stride (int): The movement stride of the 1D convolution kernel. Default: 1.
pad_mode (str): Specifies padding mode. The optional values are
"same", "valid", "pad". Default: "same".
- pad: Implicit paddings on both sides of the input `x`.
- same: The width of the output is the same as the value of the input divided by `stride`.
If this mode is set, the value of `padding` must be 0.
- same: Adopted the way of completion.
- valid: Returns a valid calculated output without padding. Excess pixels that do not satisfy the
calculation will be discarded. If this mode is set, the value of `padding` must be 0.
- valid: Adopted the way of discarding.
padding (int): Implicit paddings on both sides of the input `x`. Default: 0.
dilation (int): The data type is int. Specifies the dilation rate
to use for dilated convolution. If set to be :math:`k > 1`, there will
be :math:`k - 1` pixels skipped for each sampling location. Its value must
be greater or equal to 1 and bounded by the width of the
input `x`. Default: 1.
- pad: Pads the input. Padding `padding` size of zero on both sides of the input.
If this mode is set, the value of `padding` must be greater than or equal to 0.
padding (int): The number of padding on both sides of input.
The value should be greater than or equal to 0. Default: 0.
dilation (int): Dilation size of 1D convolution kernel. If :math:`k > 1`, the kernel is sampled
every `k` elements. The value of `k` is in range of [1, L]. Default: 1.
group (int): Splits filter into groups, `in_channels` and `out_channels` must be
divisible by the number of groups. This is not support for Davinci devices when group > 1. Default: 1.
has_bias (bool): Specifies whether the layer uses a bias vector. Default: False.
weight_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the convolution kernel.
divisible by `group`. When `group` > 1, the Ascend platform is not supported yet. Default: 1.
has_bias (bool): Whether the Conv1dTranspose layer has a bias parameter. Default: False.
weight_init (Union[Tensor, str, Initializer, numbers.Number]): Initialization method of weight parameter.
It can be a Tensor, a string, an Initializer or a numbers.Number. When a string is specified,
values from 'TruncatedNormal', 'Normal', 'Uniform', 'HeUniform' and 'XavierUniform' distributions as well
as constant 'One' and 'Zero' distributions are possible. Alias 'xavier_uniform', 'he_uniform', 'ones'
and 'zeros' are acceptable. Uppercase and lowercase are both acceptable. Refer to the values of
Initializer for more details. Default: 'normal'.
bias_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the bias vector. Possible
Initializer and string are the same as 'weight_init'. Refer to the values of
bias_init (Union[Tensor, str, Initializer, numbers.Number]): Initialization method of bias parameter.
Available initialization methods are the same as 'weight_init'. Refer to the values of
Initializer for more details. Default: 'zeros'.
Inputs:
- **x** (Tensor) - Tensor of shape :math:`(N, C_{in}, W_{in})`.
- **x** (Tensor) - Tensor of shape :math:`(N, C_{in}, L_{in})`.
Outputs:
Tensor of shape :math:`(N, C_{out}, W_{out})`.
Tensor of shape :math:`(N, C_{out}, L_{out})`.
pad_mode is 'same':
.. math::
L_{out} = \left \lfloor{\frac{L_{in}}{\text{stride}} + 1} \right \rfloor
pad_mode is 'valid':
.. math::
L_{out} = \left \lfloor{\frac{L_{in} - \text{dilation} \times (\text{kernel_size} - 1) }
{\text{stride}} + 1} \right \rfloor
pad_mode is 'pad':
.. math::
L_{out} = \left \lfloor{\frac{L_{in} + 2 \times padding - (\text{dilation} - 1) \times
\text{kernel_size} - 1 }{\text{stride}} + 1} \right \rfloor
Raises:
TypeError: If `in_channels`, `out_channels`, `kernel_size`, `stride`, `padding` or `dilation` is not an int.

View File

@ -611,7 +611,7 @@ class Conv2dBnFoldQuantOneConv(Cell):
pad_mode (str): Specifies padding mode. The optional values are "same", "valid", "pad". Default: "same".
padding (Union[int, tuple[int]]): Implicit paddings on both sides of the `x`. Default: 0.
dilation (Union[int, tuple[int]]): Specifies the dilation rate to use for dilated convolution. Default: 1.
group (int): Splits filter into groups, `in_ channels` and `out_channels` must be
group (int): Splits filter into groups, `in_channels` and `out_channels` must be
divisible by the number of groups. Default: 1.
eps (float): Parameters for Batch Normalization. Default: 1e-5.
momentum (float): Parameters for Batch Normalization op. Default: 0.997.
@ -849,7 +849,7 @@ class Conv2dBnFoldQuant(Cell):
pad_mode (str): Specifies padding mode. The optional values are "same", "valid", "pad". Default: "same".
padding (Union[int, tuple[int]]): Implicit paddings on both sides of the `x`. Default: 0.
dilation (Union[int, tuple[int]]): Specifies the dilation rate to use for dilated convolution. Default: 1.
group (int): Splits filter into groups, `in_ channels` and `out_channels` must be
group (int): Splits filter into groups, `in_channels` and `out_channels` must be
divisible by the number of groups. Default: 1.
eps (float): Parameters for Batch Normalization. Default: 1e-5.
momentum (float): Parameters for Batch Normalization op. Default: 0.997.

View File

@ -845,7 +845,17 @@ class Unique(Primitive):
class Gather(Primitive):
r"""
Returns the slice of the input Tensor corresponding to the elements of `input_indices` on the specified `axis`.
Returns the slice of the input tensor corresponding to the elements of `input_indices` on the specified `axis`.
The following figure shows the typical calculation process of Gather:
.. image:: api_img/Gather.png
where params represents the input `input_params`, and indices represents the index to be sliced `input_indices`.
.. note::
The value of input_indices must be in the range of `[0, input_params.shape[axis])`. The result is undefined
if a value is out of range.
Inputs:
- **input_params** (Tensor) - The original Tensor. The shape of tensor is :math:`(x_1, x_2, ..., x_R)`.
@ -853,36 +863,50 @@ class Gather(Primitive):
Specifies the indices of elements of the original Tensor. The data type can be int32 or int64.
- **axis** (int) - Specifies the dimension index to gather indices.
.. note::
The value of input_indices must be in the range of `[0, input_param.shape[axis])`, and report an error if it
exceeds this range.
Outputs:
Tensor, the shape of tensor is
:math:`input\_params.shape[:axis] + input\_indices.shape + input\_params.shape[axis + 1:]`.
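The output shape rule can be checked with a small pure-Python model of the operation on nested lists. This is only an illustrative sketch: `gather` and `shape_of` are hypothetical helpers, not MindSpore APIs, and unlike the operator (whose out-of-range result is described as undefined) this sketch raises `IndexError`.

```python
def gather(params, indices, axis):
    """Gather slices of the nested list `params` along `axis`."""
    if axis > 0:
        # Dimensions before `axis` are kept unchanged.
        return [gather(sub, indices, axis - 1) for sub in params]
    if isinstance(indices, int):
        if not 0 <= indices < len(params):
            raise IndexError("index out of range for the gathered axis")
        return params[indices]
    # Recurse through the (possibly nested) index structure.
    return [gather(params, idx, 0) for idx in indices]


def shape_of(nested):
    """Shape of a rectangular nested list."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


params = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
indices = [[0, 2], [2, 3]]
axis = 1
out = gather(params, indices, axis)
expected = shape_of(params)[:axis] + shape_of(indices) + shape_of(params)[axis + 1:]
print(shape_of(out), expected)  # (3, 2, 2) (3, 2, 2)
```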
Raises:
TypeError: If `axis` is not an int.
TypeError: If `input_indices` is not an int type Tensor.
TypeError: If `input_indices` is not an int.
TypeError: If `input_params` is not a tensor.
TypeError: If `input_indices` is not a tensor of type int.
Supported Platforms:
``Ascend`` ``GPU`` ``CPU``
Examples:
>>> input_params = Tensor(np.array([[1, 2, 7, 42], [3, 4, 54, 22], [2, 2, 55, 3]]), mindspore.float32)
>>> input_indices = Tensor(np.array([1, 2]), mindspore.int32)
>>> axis = 1
>>> output = ops.Gather()(input_params, input_indices, axis)
>>> print(output)
[[ 2. 7.]
[ 4. 54.]
[ 2. 55.]]
>>> # case1: input_indices is a Tensor with shape (5, ).
>>> input_params = Tensor(np.array([1, 2, 3, 4, 5, 6, 7]), mindspore.float32)
>>> input_indices = Tensor(np.array([0, 2, 4, 2, 6]), mindspore.int32)
>>> axis = 0
>>> output = ops.Gather()(input_params, input_indices, axis)
>>> print(output)
[[3. 4. 54. 22.]
[2. 2. 55. 3.]]
[1. 3. 5. 3. 7.]
>>> # case2: input_indices is a Tensor with shape (2, 2). When the input_params has one dimension, the output shape is equal to the input_indices shape.
>>> input_indices = Tensor(np.array([[0, 2], [2, 6]]), mindspore.int32)
>>> axis = 0
>>> output = ops.Gather()(input_params, input_indices, axis)
>>> print(output)
[[ 1. 3.]
[ 3. 7.]]
>>> # case3: input_indices is a Tensor with shape (2, ). input_params is a Tensor with shape (3, 4) and axis is 0.
>>> input_params = Tensor(np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]), mindspore.float32)
>>> input_indices = Tensor(np.array([0, 2]), mindspore.int32)
>>> axis = 0
>>> output = ops.Gather()(input_params, input_indices, axis)
>>> print(output)
[[1. 2. 3. 4.]
[9. 10. 11. 12.]]
>>> # case4: input_indices is a Tensor with shape (2, ). input_params is a Tensor with shape (3, 4) and axis is 1.
>>> input_params = Tensor(np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]), mindspore.float32)
>>> input_indices = Tensor(np.array([0, 2]), mindspore.int32)
>>> axis = 1
>>> output = ops.Gather()(input_params, input_indices, axis)
>>> print(output)
[[1. 3.]
[5. 7.]
[9. 11.]]
"""
@prim_attr_register