!50119 [API] Add bias to ops.conv2d
Merge pull request !50119 from shaojunsong/conv_bias
Commit e4f2b77af5
@@ -34,6 +34,9 @@ mindspore/mindspore/python/mindspore/ops/function/nn_func.py:max_unpool3d
 mindspore/mindspore/python/mindspore/ops/function/nn_func.py:max_unpool2d
 mindspore/mindspore/python/mindspore/ops/function/nn_func.py:max_unpool1d
 mindspore/mindspore/python/mindspore/ops/function/nn_func.py:pad
 mindspore/mindspore/python/mindspore/ops/function/nn_func.py:conv1d
 mindspore/mindspore/python/mindspore/ops/function/nn_func.py:conv3d
 mindspore/mindspore/python/mindspore/ops/function/nn_func.py:conv2d
 mindspore/mindspore/python/mindspore/ops/function/math_func.py:cov
 mindspore/mindspore/python/mindspore/ops/function/math_func.py:norm
 mindspore/mindspore/python/mindspore/ops/function/math_func.py:einsum
@@ -25,6 +25,7 @@ Compared with the previous version, the `mindspore.ops` interfaces in MindSpore that are added, deleted and
 mindspore.ops.batch_norm
 mindspore.ops.bias_add
 mindspore.ops.ctc_greedy_decoder
+mindspore.ops.conv1d
 mindspore.ops.conv2d
 mindspore.ops.conv3d
 mindspore.ops.deformable_conv2d
@@ -0,0 +1,48 @@
mindspore.ops.conv1d
====================

.. py:function:: mindspore.ops.conv1d(input, weight, bias=None, stride=1, pad_mode="valid", padding=0, dilation=1, groups=1)

    Applies a 1D convolution over an input Tensor. The input Tensor is typically of shape :math:`(N, C_{in}, W_{in})`, where :math:`N` is batch size, :math:`C_{in}` is the number of channels, :math:`W_{in}` is the width of the feature map, :math:`X_i` is the :math:`i^{th}` input value and :math:`b_i` is the bias term for the :math:`i^{th}` input value. For each batch of shape :math:`(C_{in}, W_{in})`, the formula is defined as:

    .. math::
        out_j = \sum_{i=0}^{C_{in} - 1} ccor(W_{ij}, X_i) + b_j,

    where :math:`ccor` is the `cross-correlation <https://en.wikipedia.org/wiki/Cross-correlation>`_ operator, :math:`C_{in}` is the number of input channels, :math:`j` ranges from :math:`0` to :math:`C_{out} - 1`, :math:`W_{ij}` corresponds to the :math:`i`-th channel of the :math:`j`-th filter and :math:`out_{j}` corresponds to the :math:`j`-th channel of the output. :math:`W_{ij}` is a slice of the kernel with shape :math:`(\text{kernel_size})`, where :math:`\text{kernel_size}` is the width of the convolution kernel. The full kernel has shape :math:`(C_{out}, C_{in} / \text{groups}, \text{kernel_size})`, where `groups` is the number of groups used to split `input` in the channel dimension.

    If `pad_mode` is set to "valid", the output width is :math:`\left \lfloor{1 + \frac{W_{in} + \text{padding[0]} - \text{kernel_size} - (\text{kernel_size} - 1) \times (\text{dilation} - 1)}{\text{stride}}} \right \rfloor`,
    where :math:`dilation` is the spacing between kernel elements, :math:`stride` is the step length, and :math:`padding` is the zero-padding added to both sides of the input.
    For the output width under other values of `pad_mode`, refer to the formulas in :class:`mindspore.nn.Conv1d`.

    Refer to the paper `Gradient Based Learning Applied to Document Recognition <http://vision.stanford.edu/cs598_spring07/papers/Lecun98.pdf>`_. For a more detailed introduction, see `ConvNets <http://cs231n.github.io/convolutional-networks/>`_.

    .. note::
        On the Ascend platform, only group convolution in depthwise convolution scenarios is currently supported. That is, when `groups>1`, the constraint `C_{in}` = `C_{out}` = `groups` must be satisfied.

    Parameters:
        - **input** (Tensor) - Tensor of shape :math:`(N, C_{in}, W_{in})`.
        - **weight** (Tensor) - Tensor of shape :math:`(C_{out}, C_{in} / \text{groups}, W_{kernel})`, so the kernel has shape :math:`(W_{kernel})`.
        - **bias** (Tensor) - Bias Tensor of shape :math:`(C_{out})`. If `bias` is None, no bias is added. Default: None.
        - **pad_mode** (str, optional) - Specifies the padding mode. The optional values are "same", "valid" and "pad". Default: "valid".

          - **same**: The output width equals the input width divided by `stride`. Padding is added evenly to both sides, with any remainder added at the end of the dimension. If this mode is set, `padding` must be 0.
          - **valid**: Returns the output of a valid computation without padding; extra pixels that cannot complete a window are discarded. If this mode is set, `padding` must be 0.
          - **pad**: Pads the input `input` with `padding` zeros. If this mode is set, `padding` must be greater than or equal to 0.

        - **padding** (Union(int, tuple[int]), optional) - The amount of padding in the width direction of `input`, an int or a tuple of one int; the same amount is applied to both the left and right sides. The value must be greater than or equal to 0. Default: 0.
        - **stride** (Union(int, tuple[int]), optional) - The step length of the kernel in the width direction, an int or a tuple of one int. Default: 1.
        - **dilation** (Union(int, tuple[int]), optional) - The gap between kernel elements, an int or a tuple of one int. If :math:`k > 1`, the kernel samples every `k` elements. The value range is [1, W]. Default: 1.
        - **groups** (int, optional) - Splits the filter into groups. Default: 1.

    Returns:
        Tensor, the result of the convolution, of shape :math:`(N, C_{out}, W_{out})`.

    Raises:
        - **TypeError** - `stride`, `padding` or `dilation` is neither an int nor a tuple.
        - **TypeError** - `groups` is not an int.
        - **TypeError** - `bias` is not a Tensor.
        - **ValueError** - The shape of `bias` is not :math:`(C_{out})`.
        - **ValueError** - `stride` or `dilation` is less than 1.
        - **ValueError** - `pad_mode` is not "same", "valid" or "pad".
        - **ValueError** - `padding` is a tuple whose length is not equal to 1.
        - **ValueError** - `pad_mode` is not "pad" while `padding` is greater than 0.
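To make the new conv1d signature concrete, a minimal usage sketch (shapes mirror the tests added below; illustration, not part of the patch):

import numpy as np
import mindspore as ms
from mindspore import Tensor, ops

x = Tensor(np.arange(32).reshape((4, 2, 4)), ms.float32)      # (N, C_in, W_in)
weight = Tensor(np.arange(8).reshape((2, 2, 2)), ms.float32)  # (C_out, C_in/groups, kernel_size)
bias = Tensor([-0.12345, 2.7683], ms.float32)                 # (C_out,)
out = ops.conv1d(x, weight, bias, pad_mode='pad', padding=(1,))
print(out.shape)  # (4, 2, 5): W_out = floor((4 + 2*1 - 2)/1) + 1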
@@ -1,14 +1,14 @@
 mindspore.ops.conv2d
 ====================

-.. py:function:: mindspore.ops.conv2d(inputs, weight, pad_mode="valid", padding=0, stride=1, dilation=1, group=1)
+.. py:function:: mindspore.ops.conv2d(input, weight, bias=None, stride=1, pad_mode="valid", padding=0, dilation=1, groups=1)

 Applies a 2D convolution over an input Tensor. The input Tensor is typically of shape :math:`(N, C_{in}, H_{in}, W_{in})`, where :math:`N` is batch size, :math:`C_{in}` is the number of channels, :math:`H_{in}, W_{in}` are the height and width of the feature map, :math:`X_i` is the :math:`i^{th}` input value and :math:`b_i` is the bias term for the :math:`i^{th}` input value. For each batch of shape :math:`(C_{in}, H_{in}, W_{in})`, the formula is defined as:

 .. math::
     out_j = \sum_{i=0}^{C_{in} - 1} ccor(W_{ij}, X_i) + b_j,

-where :math:`ccor` is the `cross-correlation <https://en.wikipedia.org/wiki/Cross-correlation>`_ operator, :math:`C_{in}` is the number of input channels, :math:`j` ranges from :math:`0` to :math:`C_{out} - 1`, :math:`W_{ij}` corresponds to the :math:`i`-th channel of the :math:`j`-th filter and :math:`out_{j}` corresponds to the :math:`j`-th channel of the output. :math:`W_{ij}` is a slice of the kernel with shape :math:`(\text{kernel_size[0]}, \text{kernel_size[1]})`, where :math:`\text{kernel_size[0]}` and :math:`\text{kernel_size[1]}` are the height and width of the convolution kernel. The full kernel has shape :math:`(C_{out}, C_{in} / \text{group}, \text{kernel_size[0]}, \text{kernel_size[1]})`, where `group` is the number of groups used to split the input `inputs` in the channel dimension.
+where :math:`ccor` is the `cross-correlation <https://en.wikipedia.org/wiki/Cross-correlation>`_ operator, :math:`C_{in}` is the number of input channels, :math:`j` ranges from :math:`0` to :math:`C_{out} - 1`, :math:`W_{ij}` corresponds to the :math:`i`-th channel of the :math:`j`-th filter and :math:`out_{j}` corresponds to the :math:`j`-th channel of the output. :math:`W_{ij}` is a slice of the kernel with shape :math:`(\text{kernel_size[0]}, \text{kernel_size[1]})`, where :math:`\text{kernel_size[0]}` and :math:`\text{kernel_size[1]}` are the height and width of the convolution kernel. The full kernel has shape :math:`(C_{out}, C_{in} / \text{groups}, \text{kernel_size[0]}, \text{kernel_size[1]})`, where `groups` is the number of groups used to split `input` in the channel dimension.

 If `pad_mode` is set to "valid", the output height and width are :math:`\left \lfloor{1 + \frac{H_{in} + \text{padding[0]} + \text{padding[1]} - \text{kernel_size[0]} - (\text{kernel_size[0]} - 1) \times (\text{dilation[0]} - 1)}{\text{stride[0]}}} \right \rfloor` and :math:`\left \lfloor{1 + \frac{W_{in} + \text{padding[2]} + \text{padding[3]} - \text{kernel_size[1]} - (\text{kernel_size[1]} - 1) \times (\text{dilation[1]} - 1)}{\text{stride[1]}}} \right \rfloor` respectively,
 where :math:`dilation` is the spacing between kernel elements, :math:`stride` is the step length, and :math:`padding` is the zero-padding added to both sides of the input.
 For the output height and width under other values of `pad_mode`, refer to the formulas in :class:`mindspore.nn.Conv2d`.

@@ -17,29 +17,32 @@ mindspore.ops.conv2d
 Refer to the paper `Gradient Based Learning Applied to Document Recognition <http://vision.stanford.edu/cs598_spring07/papers/Lecun98.pdf>`_. For a more detailed introduction, see `ConvNets <http://cs231n.github.io/convolutional-networks/>`_.

 .. note::
-    On the Ascend platform, only group convolution in depthwise convolution scenarios is currently supported. That is, when `group>1`, the constraint `C_{in}` = `C_{out}` = `group` must be satisfied.
+    On the Ascend platform, only group convolution in depthwise convolution scenarios is currently supported. That is, when `groups>1`, the constraint `C_{in}` = `C_{out}` = `groups` must be satisfied.

 Parameters:
-    - **inputs** (Tensor) - Tensor of shape :math:`(N, C_{in}, H_{in}, W_{in})`.
-    - **weight** (Tensor) - With the kernel size set to :math:`(\text{kernel_size[0]}, \text{kernel_size[1]})`, the shape is :math:`(C_{out}, C_{in}, \text{kernel_size[0]}, \text{kernel_size[1]})`.
+    - **input** (Tensor) - Tensor of shape :math:`(N, C_{in}, H_{in}, W_{in})`.
+    - **weight** (Tensor) - Tensor of shape :math:`(C_{out}, C_{in} / \text{groups}, \text{kernel_size[0]}, \text{kernel_size[1]})`, so the kernel size is :math:`(\text{kernel_size[0]}, \text{kernel_size[1]})`.
+    - **bias** (Tensor) - Bias Tensor of shape :math:`(C_{out})`. If `bias` is None, no bias is added. Default: None.
+    - **stride** (Union(int, tuple[int]), optional) - The step length of the kernel, an int for both the height and width directions, or a tuple of two ints for the height and width directions respectively. Default: 1.
     - **pad_mode** (str, optional) - Specifies the padding mode. The optional values are "same", "valid" and "pad". Default: "valid".

       - **same**: The output height and width equal the input height and width divided by `stride`, respectively. Padding is added evenly to both sides of the height and width, with any remainder added at the end of the dimension. If this mode is set, `padding` must be 0.
       - **valid**: Returns the output of a valid computation without padding; extra pixels that cannot complete a window are discarded. If this mode is set, `padding` must be 0.
-      - **pad**: Pads the input `inputs` with `padding` zeros in the height and width directions. If this mode is set, `padding` must be greater than or equal to 0.
+      - **pad**: Pads the input `input` with `padding` zeros in the height and width directions. If this mode is set, `padding` must be greater than or equal to 0.

-    - **padding** (Union(int, tuple[int]), optional) - The amount of padding in the height and width directions of `inputs`, an int or a tuple of 4 ints. If `padding` is an int, the top, bottom, left and right paddings all equal `padding`. If `padding` is a tuple of 4 ints, the top, bottom, left and right paddings equal `padding[0]`, `padding[1]`, `padding[2]` and `padding[3]` respectively. The value must be greater than or equal to 0. Default: 0.
-    - **stride** (Union(int, tuple[int]), optional) - The step length of the kernel, an int for both the height and width directions, or a tuple of two ints for the height and width directions respectively. Default: 1.
-    - **dilation** (Union(int, tuple[int]), optional) - The dilation size of the kernel, an int or a tuple of 2 ints. If :math:`k > 1`, the kernel samples every `k` elements. The value ranges are [1, H] and [1, W] in the vertical and horizontal directions. Default: 1.
-    - **group** (int, optional) - Splits the filter into groups. Default: 1.
+    - **padding** (Union(int, tuple[int]), optional) - The amount of padding in the height and width directions of `input`, an int or a tuple of 2 ints. If `padding` is an int, the top, bottom, left and right paddings all equal `padding`. If `padding` is a tuple of 2 ints, the top and bottom paddings are `padding[0]` and the left and right paddings are `padding[1]`. The value must be greater than or equal to 0. Default: 0.
+    - **dilation** (Union(int, tuple[int]), optional) - The gap between kernel elements, an int or a tuple of 2 ints. If :math:`k > 1`, the kernel samples every `k` elements. The value ranges are [1, H] and [1, W] in the vertical and horizontal directions. Default: 1.
+    - **groups** (int, optional) - Splits the filter into groups. Default: 1.

 Returns:
     Tensor, the result of the convolution, of shape :math:`(N, C_{out}, H_{out}, W_{out})`.

 Raises:
     - **TypeError** - `stride`, `padding` or `dilation` is neither an int nor a tuple.
-    - **TypeError** - `out_channel` or `group` is not an int.
+    - **TypeError** - `groups` is not an int.
+    - **TypeError** - `bias` is not a Tensor.
+    - **ValueError** - The shape of `bias` is not :math:`(C_{out})`.
     - **ValueError** - `stride` or `dilation` is less than 1.
     - **ValueError** - `pad_mode` is not "same", "valid" or "pad".
     - **ValueError** - `padding` is a tuple whose length is not equal to 2.
-    - **ValueError** - `pad_mode` is not "pad" while `padding` is not equal to (0, 0, 0, 0).
+    - **ValueError** - `pad_mode` is not "pad" while `padding` is greater than 0.
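Likewise for conv2d with the new bias and 2-int padding (values adapted from the 'pad' branch of the tests added below; illustration only, bias rounded):

import numpy as np
import mindspore as ms
from mindspore import Tensor, ops

x = Tensor(np.arange(3 * 2 * 3 * 3).reshape((3, 2, 3, 3)), ms.float32)
weight = Tensor(np.arange(16).reshape((2, 2, 2, 2)), ms.float32)
bias = Tensor([0.7297, 0.6473], ms.float32)
out = ops.conv2d(x, weight, bias, 2, 'pad', (1, 1))  # stride=2, 1 row/column of zeros per side
print(out.shape)  # (3, 2, 2, 2): H_out = W_out = floor((3 + 2 - 2)/2) + 1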
@@ -1,7 +1,7 @@
 mindspore.ops.conv3d
 ====================

-.. py:function:: mindspore.ops.conv3d(inputs, weight, pad_mode="valid", padding=0, stride=1, dilation=1, group=1)
+.. py:function:: mindspore.ops.conv3d(input, weight, bias=None, stride=1, pad_mode="valid", padding=0, dilation=1, groups=1)

 Applies a 3D convolution over an input Tensor. The input Tensor is typically of shape :math:`(N, C_{in}, D_{in}, H_{in}, W_{in})`, where :math:`N` is batch size, :math:`C_{in}` is the number of channels, :math:`D` is depth, and :math:`H_{in}, W_{in}` are the height and width of the feature map. :math:`X_i` is the :math:`i^{th}` input value and :math:`b_i` is the bias term for the :math:`i^{th}` input value. For each batch of shape :math:`(C_{in}, D_{in}, H_{in}, W_{in})`, the formula is defined as:
@@ -17,24 +17,28 @@ mindspore.ops.conv3d
 :math:`\text{bias}` is the bias parameter and :math:`\text{X}` is the input Tensor.
-The full kernel has shape :math:`(C_{out}, C_{in} / \text{group}, \text{kernel_size[0]}, \text{kernel_size[1]}, \text{kernel_size[2]})`, where `group` is the number of groups used to split the input `inputs` in the channel dimension.
+The full kernel has shape :math:`(C_{out}, C_{in} / \text{groups}, \text{kernel_size[0]}, \text{kernel_size[1]}, \text{kernel_size[2]})`, where `groups` is the number of groups used to split `input` in the channel dimension.

 For details, refer to the paper `Gradient Based Learning Applied to Document Recognition <http://vision.stanford.edu/cs598_spring07/papers/Lecun98.pdf>`_.

 .. note::
-    On the Ascend platform, only group convolution in depthwise convolution scenarios is currently supported. That is, when `group>1`, the constraint :math:`C_{in} = C_{out} = group` must be satisfied.
+    1. On the Ascend platform, only group convolution in depthwise convolution scenarios is currently supported. That is, when `groups>1`, the constraint :math:`C_{in} = C_{out} = groups` must be satisfied.
+    2. On the Ascend platform, only :math:`dilation=1` is currently supported.

 Parameters:
-    - **inputs** (Tensor) - Tensor of shape :math:`(N, C_{in}, D_{in}, H_{in}, W_{in})`.
-    - **weight** (Tensor) - With the kernel size set to :math:`(\text{kernel_size[0]}, \text{kernel_size[1]}, \text{kernel_size[2]})`, the shape is :math:`(C_{out}, C_{in}, \text{kernel_size[0]}, \text{kernel_size[1]}, \text{kernel_size[2]})`.
+    - **input** (Tensor) - Tensor of shape :math:`(N, C_{in}, D_{in}, H_{in}, W_{in})`.
+    - **weight** (Tensor) - Tensor of shape :math:`(C_{out}, C_{in} / \text{groups}, \text{kernel_size[0]}, \text{kernel_size[1]}, \text{kernel_size[2]})`, so the kernel size is :math:`(\text{kernel_size[0]}, \text{kernel_size[1]}, \text{kernel_size[2]})`.
+    - **bias** (Tensor) - Bias Tensor of shape :math:`(C_{out})`. If `bias` is None, no bias is added. Default: None.
+    - **stride** (Union[int, tuple[int]], optional) - The step length of the kernel, an int for the depth, height and width directions, or a tuple of three ints for the depth, height and width directions respectively. Default: 1.
     - **pad_mode** (str, optional) - Specifies the padding mode. The optional values are "same", "valid" and "pad". Default: "valid".

       - **same**: The output depth, height and width equal the input depth, height and width divided by `stride`, respectively. Padding is added evenly to both sides of each dimension, with any remainder added at the end of the dimension. If this mode is set, `padding` must be 0.
       - **valid**: Returns the output of a valid computation without padding; extra pixels that cannot complete a window are discarded. If this mode is set, `padding` must be 0.
-      - **pad**: Pads the input `inputs` with `padding` zeros in the height and width directions. If this mode is set, `padding` must be greater than or equal to 0.
+      - **pad**: Pads the input `input` with `padding` zeros in the depth, height and width directions. If this mode is set, `padding` must be greater than or equal to 0.

-    - **padding** (Union[int, tuple[int]], optional) - The amount of padding in the depth, height and width directions of `inputs`, an int or a tuple of 6 ints. If `padding` is an int, the front, back, top, bottom, left and right paddings all equal `padding`. If `padding` is a tuple of 6 ints, the front, back, top, bottom, left and right paddings equal `padding[0]`, `padding[1]`, `padding[2]`, `padding[3]`, `padding[4]` and `padding[5]` respectively. The value must be greater than or equal to 0. Default: 0.
-    - **stride** (Union[int, tuple[int]], optional) - The step length of the kernel, an int for the height and width directions, or a tuple of two ints for the height and width directions respectively. Default: 1.
-    - **dilation** (Union[int, tuple[int]], optional) - The dilation size of the kernel, an int or a tuple of 3 ints. If :math:`k > 1`, the kernel samples every `k` elements. The value ranges are [1, D], [1, H] and [1, W] in the depth, vertical and horizontal directions. Default: 1.
-    - **group** (int, optional) - Splits the filter into groups. Default: 1. Currently only the value 1 is supported.
+    - **padding** (Union[int, tuple[int]], optional) - The amount of padding in the depth, height and width directions of `input`, an int or a tuple of 3 ints. If `padding` is an int, the front, back, top, bottom, left and right paddings all equal `padding`. If `padding` is a tuple of 3 ints, the front and back paddings are `padding[0]`, the top and bottom paddings are `padding[1]`, and the left and right paddings are `padding[2]`. The value must be greater than or equal to 0. Default: 0.
+    - **dilation** (Union[int, tuple[int]], optional) - The gap between kernel elements, an int or a tuple of 3 ints. If :math:`k > 1`, the kernel samples every `k` elements. The value ranges are [1, D], [1, H] and [1, W] in the depth, vertical and horizontal directions. Default: 1.
+    - **groups** (int, optional) - Splits the filter into groups. Default: 1. Currently only the value 1 is supported.

 Returns:
     Tensor, the result of the convolution, of shape :math:`(N, C_{out}, D_{out}, H_{out}, W_{out})`.
@@ -64,18 +68,20 @@ mindspore.ops.conv3d

 .. math::
     \begin{array}{ll} \\
-        D_{out} = \left \lfloor{\frac{D_{in} + padding[0] + padding[1] - \text{dilation[0]} \times (\text{kernel_size[0]} - 1) - 1}{\text{stride[0]}} + 1} \right \rfloor \\
+        D_{out} = \left \lfloor{\frac{D_{in} + padding[0] + padding[0] - \text{dilation[0]} \times (\text{kernel_size[0]} - 1) - 1}{\text{stride[0]}} + 1} \right \rfloor \\
-        H_{out} = \left \lfloor{\frac{H_{in} + padding[2] + padding[3] - \text{dilation[1]} \times (\text{kernel_size[1]} - 1) - 1}{\text{stride[1]}} + 1} \right \rfloor \\
+        H_{out} = \left \lfloor{\frac{H_{in} + padding[1] + padding[1] - \text{dilation[1]} \times (\text{kernel_size[1]} - 1) - 1}{\text{stride[1]}} + 1} \right \rfloor \\
-        W_{out} = \left \lfloor{\frac{W_{in} + padding[4] + padding[5] - \text{dilation[2]} \times (\text{kernel_size[2]} - 1) - 1}{\text{stride[2]}} + 1} \right \rfloor \\
+        W_{out} = \left \lfloor{\frac{W_{in} + padding[2] + padding[2] - \text{dilation[2]} \times (\text{kernel_size[2]} - 1) - 1}{\text{stride[2]}} + 1} \right \rfloor \\
     \end{array}

 Raises:
     - **TypeError** - `stride`, `padding` or `dilation` is neither an int nor a tuple.
-    - **TypeError** - `out_channel` or `group` is not an int.
+    - **TypeError** - `groups` is not an int.
+    - **TypeError** - `bias` is not a Tensor.
+    - **ValueError** - The shape of `bias` is not :math:`(C_{out})`.
     - **ValueError** - `stride` or `dilation` is less than 1.
     - **ValueError** - `pad_mode` is not "same", "valid" or "pad".
-    - **ValueError** - `padding` is a tuple whose length is not equal to 6.
-    - **ValueError** - `pad_mode` is not "pad" while `padding` is not equal to (0, 0, 0, 0, 0, 0).
+    - **ValueError** - `padding` is a tuple whose length is not equal to 3.
+    - **ValueError** - `pad_mode` is not "pad" while `padding` is greater than 0.
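A quick numeric check of these output-size formulas against the conv3d test added below (pure Python, illustration only):

import math

def out_dim(size, pad, dilation, kernel, stride):
    # one spatial dimension: floor((size + 2*pad - dilation*(kernel - 1) - 1) / stride + 1)
    return math.floor((size + 2 * pad - dilation * (kernel - 1) - 1) / stride + 1)

# The test uses x of shape (3, 2, 4, 4, 4) and a 2x2x2 kernel with "valid" padding:
assert out_dim(4, 0, 1, 2, 1) == 3  # D_out = H_out = W_out = 3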
@@ -25,6 +25,7 @@ Neural Network
 mindspore.ops.batch_norm
 mindspore.ops.bias_add
 mindspore.ops.ctc_greedy_decoder
+mindspore.ops.conv1d
 mindspore.ops.conv2d
 mindspore.ops.conv3d
 mindspore.ops.deformable_conv2d
@@ -483,6 +483,7 @@ from .nn_func import (
     ctc_loss,
     dropout,
     conv3d_transpose,
+    conv1d,
     conv2d,
     sigmoid,
     logsigmoid,
@@ -4389,7 +4389,148 @@ def conv3d_transpose(inputs, weight, pad_mode='valid', padding=0, stride=1, dila
     return _conv_3d_transpose(inputs, weight)


-def conv2d(inputs, weight, pad_mode="valid", padding=0, stride=1, dilation=1, group=1):
+def _manipulate_padding(padding, dim):
+    """convert padding to Conv2D padding"""
+    ms_padding = ()
+    if not isinstance(padding, (tuple, list)):
+        raise TypeError(f"For 'conv{dim}d', 'padding' must be a tuple, list or int, but got {type(padding)}.")
+    if len(padding) != dim:
+        raise ValueError(f"For 'conv{dim}d', 'padding' must be a tuple or list of {dim} integers, but got {padding}.")
+    for i in range(dim):
+        ms_padding += (padding[i], padding[i])
+    return ms_padding
+
+
+def _manipulate_dilation(dilation, dim=1):
+    """convert 1d dilation to 2d"""
+    if isinstance(dilation, int):
+        return 1, dilation
+    if isinstance(dilation, (tuple, list)):
+        if len(dilation) != 1:
+            raise ValueError(f"For 'conv{dim}d', dilation must be a tuple/list with 1 element or int, "
+                             f"but got {dilation}.")
+        return 1, dilation[0]
+    return dilation
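# Illustration (not part of the patch): these helpers normalize the user-facing
# one-value-per-dimension style into the per-side layout P.Conv2D/P.Conv3D expect:
#   _manipulate_padding((2, 3), dim=2)    -> (2, 2, 3, 3)       # top, bottom, left, right
#   _manipulate_padding((1, 2, 3), dim=3) -> (1, 1, 2, 2, 3, 3)
#   _manipulate_dilation(2)               -> (1, 2)             # dummy height dilation of 1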
+def conv1d(input, weight, bias=None, stride=1, pad_mode="valid", padding=0, dilation=1, groups=1):
    r"""
    Applies a 1D convolution over an input tensor.
    The input tensor is typically of shape :math:`(N, C_{in}, W_{in})`,
    where :math:`N` is batch size, :math:`C` is channel number, :math:`W` is width, :math:`X_i` is
    the :math:`i^{th}` input value and :math:`b_i` indicates the bias term of the :math:`i^{th}` input value.
    For each batch of shape :math:`(C_{in}, W_{in})`, the formula is defined as:

    .. math::

        out_j = \sum_{i=0}^{C_{in} - 1} ccor(W_{ij}, X_i) + b_j,

    where :math:`ccor` is the `cross-correlation <https://en.wikipedia.org/wiki/Cross-correlation>`_ operator,
    :math:`C_{in}` is the input channel number, :math:`j` ranges
    from :math:`0` to :math:`C_{out} - 1`, :math:`W_{ij}` corresponds to the :math:`i`-th channel of the :math:`j`-th
    filter and :math:`out_{j}` corresponds to the :math:`j`-th channel of the output. :math:`W_{ij}` is a slice
    of kernel, and it has shape :math:`(\text{kernel_size})`, where :math:`\text{kernel_size}` is the width of
    the convolution kernel. The full kernel has shape :math:`(C_{out}, C_{in} / \text{groups}, \text{kernel_size})`,
    where `groups` is the group number to split the input in the channel dimension.

    If the `pad_mode` is set to be "valid", the output width will be :math:`\left \lfloor{
    1 + \frac{W_{in} + \text{padding[0]} - \text{kernel_size} - (\text{kernel_size} - 1) \times(\text{dilation} - 1)}
    {\text{stride}}} \right \rfloor`,

    where :math:`dilation` is the spacing between kernel elements, :math:`stride` is the step length of each step and
    :math:`padding` is the zero-padding added to both sides of the input.
    For the output width on other `pad_mode`, please refer to the formula on `mindspore.nn.Conv1d
    <https://www.mindspore.cn/docs/en/master/api_python/nn/mindspore.nn.Conv1d.html>`_.

    The first introduction can be found in the paper `Gradient Based Learning Applied to Document Recognition
    <http://vision.stanford.edu/cs598_spring07/papers/Lecun98.pdf>`_. A more detailed introduction can be found here:
    `ConvNets <http://cs231n.github.io/convolutional-networks/>`_ .

    Note:
        On Ascend platform, only group convolution in depthwise convolution scenarios is supported.
        That is, when `groups>1`, the condition `C_{in}` = `C_{out}` = `groups` must be satisfied.

    Args:
        input (Tensor): Tensor of shape :math:`(N, C_{in}, W_{in})`.
        weight (Tensor): Tensor of shape
            :math:`(C_{out}, C_{in} / \text{groups}, \text{kernel_size})`, then the size of kernel is
            :math:`(\text{kernel_size})`.
        bias (Tensor): Bias Tensor with shape :math:`(C_{out})`. When bias is None, zeros will be used. Default: None.
        stride (Union(int, tuple[int]), optional): The distance of kernel moving, an int number or a tuple of one int
            that represents width of movement. Default: 1.
        pad_mode (str, optional): Specifies padding mode. The optional values are
            "same", "valid" and "pad". Default: "valid".

            - same: Adopts the way of completion. The width of the output will be equal to
              the input `input` divided by stride. The padding will be evenly calculated on the left and right
              if possible. Otherwise, the last extra padding will be calculated from the right side.
              If this mode is set, `padding` must be 0.

            - valid: Adopts the way of discarding. The possible largest width of the output will be returned
              without padding. Extra pixels will be discarded. If this mode is set, `padding` must be 0.

            - pad: Implicit paddings on both sides of the input `input`. The number of `padding` will be padded to
              the input Tensor borders. `padding` must be greater than or equal to 0.
        padding (Union(int, tuple[int]), optional): Implicit paddings on both sides of `input`, meaning the paddings of
            left and right are the same, equal to padding or padding[0] when padding is a tuple of 1 integer.
            Default: 0.
        dilation (Union(int, tuple[int]), optional): Gaps between kernel elements. The data type is int or a tuple of
            1 integer. Specifies the dilation rate to use for dilated convolution. If set to be :math:`k > 1`,
            there will be :math:`k - 1` pixels skipped for each sampling location. Its value must be greater than or
            equal to 1 and bounded by the width of `input`. Default: 1.
        groups (int, optional): Splits `input` into groups. Default: 1.

    Returns:
        Tensor, the value that applied 1D convolution. The shape is :math:`(N, C_{out}, W_{out})`.

    Raises:
        TypeError: If `stride`, `padding` or `dilation` is neither an int nor a tuple.
        TypeError: If `bias` is not a Tensor.
        ValueError: If the shape of `bias` is not :math:`C_{out}` .
        ValueError: If `stride` or `dilation` is less than 1.
        ValueError: If `pad_mode` is not one of 'same', 'valid' or 'pad'.
        ValueError: If `padding` is a tuple whose length is not equal to 1.
        ValueError: If `pad_mode` is not equal to 'pad' and `padding` is greater than 0.

    Supported Platforms:
        ``Ascend`` ``GPU`` ``CPU``

    Examples:
        >>> x = Tensor(np.arange(64).reshape((4, 4, 4)), mindspore.float32)
        >>> weight = Tensor(np.arange(8).reshape((2, 2, 2)), mindspore.float32)
        >>> bias = Tensor([-0.12345, 2.7683], mindspore.float32)
        >>> output = ops.conv1d(x, weight, pad_mode='pad', padding=(1,), bias=bias, groups=2)
        >>> print(output.shape)
        (4, 2, 5)
    """
    _expand = _get_cache_prim(P.ExpandDims)()
    expanded_input = _expand(input, 2)
    sqz = _get_cache_prim(P.Squeeze)(2)
    weight_shape = weight.shape
    out_channel = weight_shape[0]
    kernel_size = (1, weight_shape[2])
    expanded_weight = _expand(weight, 2)
    if isinstance(padding, int):
        padding = (0, 0, padding, padding)
    elif isinstance(padding, (tuple, list)):
        if len(padding) != 1:
            raise ValueError(f"For 'conv1d', padding must be a tuple or list with 1 element or int, but got {padding}.")
        padding = (0, 0, padding[0], padding[0])
    else:
        raise ValueError(f"For 'conv1d', padding must be a tuple, list or int, but got {type(padding)}.")
    dilation = _manipulate_dilation(dilation)
    conv = _get_cache_prim(P.Conv2D)(out_channel, kernel_size, 1, pad_mode, padding, stride, dilation, groups, "NCHW")
    conv_res = conv(expanded_input, expanded_weight)
    squeezed_conv_res = sqz(conv_res)
    if bias is None:
        return squeezed_conv_res
    if not isinstance(bias, Tensor):
        raise TypeError(f"For 'conv1d', the 'bias' must be a Tensor, but got {type(bias)}.")
    output = bias_add(squeezed_conv_res, bias)
    return output
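# Illustration (not part of the patch): conv1d lifts the 1D problem to 2D --
# input (N, C, W) -> (N, C, 1, W) and weight (C_out, C_in/groups, k) ->
# (C_out, C_in/groups, 1, k) -- runs P.Conv2D with kernel (1, k), then squeezes
# the dummy height axis back out before (optionally) applying bias_add.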


+def conv2d(input, weight, bias=None, stride=1, pad_mode="valid", padding=0, dilation=1, groups=1):
    r"""
    Applies a 2D convolution over an input tensor.
    The input tensor is typically of shape :math:`(N, C_{in}, H_{in}, W_{in})`,
@@ -4405,10 +4546,10 @@ def conv2d(inputs, weight, pad_mode="valid", padding=0, stride=1, dilation=1, gr
    :math:`C_{in}` is the input channel number, :math:`j` ranges
    from :math:`0` to :math:`C_{out} - 1`, :math:`W_{ij}` corresponds to the :math:`i`-th channel of the :math:`j`-th
    filter and :math:`out_{j}` corresponds to the :math:`j`-th channel of the output. :math:`W_{ij}` is a slice
-   of kernel and it has shape :math:`(\text{kernel_size[0]}, \text{kernel_size[1]})`, where :math:`\text{
+   of kernel, and it has shape :math:`(\text{kernel_size[0]}, \text{kernel_size[1]})`, where :math:`\text{
    kernel_size[0]}` and :math:`\text{kernel_size[1]}` are the height and width of the convolution kernel.
-   The full kernel has shape :math:`(C_{out}, C_{in} / \text{group}, \text{kernel_size[0]}, \text{kernel_size[1]})`,
-   where `group` is the group number to split the input in the channel dimension.
+   The full kernel has shape :math:`(C_{out}, C_{in} / \text{groups}, \text{kernel_size[0]}, \text{kernel_size[1]})`,
+   where `groups` is the group number to split the input in the channel dimension.

    If the `pad_mode` is set to be "valid", the output height and width will be :math:`\left \lfloor{
    1 + \frac{H_{in} + \text{padding[0]} + \text{padding[1]} - \text{kernel_size[0]} -

@@ -4417,7 +4558,7 @@ def conv2d(inputs, weight, pad_mode="valid", padding=0, stride=1, dilation=1, gr
    :math:`\left \lfloor{1 + \frac{W_{in} + \text{padding[2]} + \text{padding[3]} - \text{kernel_size[1]} -
    (\text{kernel_size[1]} - 1) \times(\text{dilation[1]} - 1)} {\text{stride[1]}}} \right \rfloor` respectively.

-   Where :math:`dilation` is Spacing between kernel elements, :math:`stride` is The step length of each step,
+   where :math:`dilation` is the spacing between kernel elements, :math:`stride` is the step length of each step,
    :math:`padding` is zero-padding added to both sides of the input.
    For output height and width on other `pad_mode`, please refer to formula on `mindspore.nn.Conv2d
    <https://www.mindspore.cn/docs/en/master/api_python/nn/mindspore.nn.Conv2d.html>`_.
@@ -4428,12 +4569,17 @@ def conv2d(inputs, weight, pad_mode="valid", padding=0, stride=1, dilation=1, gr

    Note:
        On Ascend platform, only group convolution in depthwise convolution scenarios is supported.
-       That is, when `group>1`, condition `C_{in}` = `C_{out}` = `group` must be satisfied.
+       That is, when `groups>1`, condition `C_{in}` = `C_{out}` = `groups` must be satisfied.

    Args:
-       inputs (Tensor): Tensor of shape :math:`(N, C_{in}, H_{in}, W_{in})`.
-       weight (Tensor): Set size of kernel is :math:`(\text{kernel_size[0]}, \text{kernel_size[1]})`,
-           then the shape is :math:`(C_{out}, C_{in}, \text{kernel_size[0]}, \text{kernel_size[1]})`.
+       input (Tensor): Tensor of shape :math:`(N, C_{in}, H_{in}, W_{in})`.
+       weight (Tensor): Tensor of shape
+           :math:`(C_{out}, C_{in} / \text{groups}, \text{kernel_size[0]}, \text{kernel_size[1]})`, then the size
+           of kernel is :math:`(\text{kernel_size[0]}, \text{kernel_size[1]})`.
+       bias (Tensor): Bias Tensor with shape :math:`(C_{out})`. When bias is None, zeros will be used. Default: None.
+       stride (Union(int, tuple[int]), optional): The distance of kernel moving, an int number that represents
+           the height and width of movement are both strides, or a tuple of two int numbers that
+           represent height and width of movement respectively. Default: 1.
        pad_mode (str, optional): Specifies padding mode. The optional values are
            "same", "valid" and "pad". Default: "valid".
@@ -4449,27 +4595,25 @@ def conv2d(inputs, weight, pad_mode="valid", padding=0, stride=1, dilation=1, gr
            Tensor borders. `padding` must be greater than or equal to 0.
        padding (Union(int, tuple[int]), optional): Implicit paddings on both sides of the input `input`.
            If `padding` is one integer, the paddings of top, bottom, left and right are the same, equal to padding.
-           If `padding` is a tuple with four integers, the paddings of top, bottom, left and right will be equal
-           to padding[0], padding[1], padding[2], and padding[3] accordingly. Default: 0.
-       stride (Union(int, tuple[int]), optional): The distance of kernel moving, an int number that represents
-           the height and width of movement are both strides, or a tuple of two int numbers that
-           represent height and width of movement respectively. Default: 1.
-       dilation (Union(int, tuple[int]), optional): The data type is int or a tuple of 2 integers.
-           Specifies the dilation rate to use for dilated convolution. If set to be :math:`k > 1`, there will
-           be :math:`k - 1` pixels skipped for each sampling location. Its value must
+           If `padding` is a tuple with two integers, the padding of top and bottom is padding[0], and the padding of
+           left and right is padding[1]. Default: 0.
+       dilation (Union(int, tuple[int]), optional): Gaps between kernel elements. The data type is int or a tuple of
+           2 integers. Specifies the dilation rate to use for dilated convolution. If set to be :math:`k > 1`,
+           there will be :math:`k - 1` pixels skipped for each sampling location. Its value must
            be greater than or equal to 1 and bounded by the height and width of the input `input`. Default: 1.
-       group (int, optional): Splits inputs into groups. Default: 1.
+       groups (int, optional): Splits `input` into groups. Default: 1.

    Returns:
        Tensor, the value that applied 2D convolution. The shape is :math:`(N, C_{out}, H_{out}, W_{out})`.

    Raises:
        TypeError: If `stride`, `padding` or `dilation` is neither an int nor a tuple.
-       TypeError: If `out_channel` or `group` is not an int.
+       TypeError: If `bias` is not a Tensor.
+       ValueError: If the shape of `bias` is not :math:`C_{out}` .
        ValueError: If `stride` or `dilation` is less than 1.
        ValueError: If `pad_mode` is not one of 'same', 'valid' or 'pad'.
-       ValueError: If `padding` is a tuple whose length is not equal to 4.
-       ValueError: If `pad_mode` is not equal to 'pad' and `padding` is not equal to (0, 0, 0, 0).
+       ValueError: If `padding` is a tuple whose length is not equal to 2.
+       ValueError: If `pad_mode` is not equal to 'pad' and `padding` is greater than 0.

    Supported Platforms:
        ``Ascend`` ``GPU`` ``CPU``
@@ -4481,11 +4625,18 @@ def conv2d(inputs, weight, pad_mode="valid", padding=0, stride=1, dilation=1, gr
        >>> print(output.shape)
        (10, 32, 30, 30)
    """
+   if isinstance(padding, (tuple, list)):
+       padding = _manipulate_padding(padding, dim=2)
    weight_shape = weight.shape
    out_channel = weight_shape[0]
    kernel_size = weight_shape[2:4]
-   conv = _get_cache_prim(P.Conv2D)(out_channel, kernel_size, 1, pad_mode, padding, stride, dilation, group, "NCHW")
-   output = conv(inputs, weight)
+   conv = _get_cache_prim(P.Conv2D)(out_channel, kernel_size, 1, pad_mode, padding, stride, dilation, groups, "NCHW")
+   if bias is None:
+       return conv(input, weight)
+   if not isinstance(bias, Tensor):
+       raise TypeError(f"For 'conv2d', the 'bias' must be a Tensor, but got {type(bias)}.")
+   conv_result = conv(input, weight)
+   output = bias_add(conv_result, bias)
    return output
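For intuition, bias_add broadcasts the 1-D bias over the channel axis of the NCHW output, so the bias path above is equivalent to this NumPy sketch (illustration only):

import numpy as np

conv_result = np.zeros((2, 3, 4, 4), dtype=np.float32)  # stand-in NCHW conv output
bias = np.array([1.0, 2.0, 3.0], dtype=np.float32)      # shape (C_out,)
output = conv_result + bias.reshape(1, -1, 1, 1)        # same result as bias_add on NCHW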
@@ -4887,7 +5038,7 @@ def binary_cross_entropy(logits, labels, weight=None, reduction='mean'):
    return binary_cross_entropy_op(logits, labels, weight)


-def conv3d(inputs, weight, pad_mode="valid", padding=0, stride=1, dilation=1, group=1):
+def conv3d(input, weight, bias=None, stride=1, pad_mode="valid", padding=0, dilation=1, groups=1):
    r"""
    Applies a 3D convolution over an input tensor. The input tensor is typically of shape
    :math:`(N, C_{in}, D_{in}, H_{in}, W_{in})` and output shape
|
@ -4910,21 +5061,27 @@ def conv3d(inputs, weight, pad_mode="valid", padding=0, stride=1, dilation=1, gr
|
|||
the depth, height and width of the convolution kernel respectively. :math:`\text{bias}` is the bias parameter
|
||||
and :math:`\text{X}` is the input tensor.
|
||||
The shape of full convolution kernel is
|
||||
:math:`(C_{out}, C_{in} / \text{group}, \text{kernel_size[0]}, \text{kernel_size[1]}, \text{kernel_size[2]})`,
|
||||
where `group` is the number of groups to split the input `x` in the channel dimension.
|
||||
:math:`(C_{out}, C_{in} / \text{groups}, \text{kernel_size[0]}, \text{kernel_size[1]}, \text{kernel_size[2]})`,
|
||||
where `groups` is the number of groups to split `input` in the channel dimension.
|
||||
|
||||
For more details, please refer to the paper `Gradient Based Learning Applied to Document
|
||||
Recognition <http://vision.stanford.edu/cs598_spring07/papers/Lecun98.pdf>`_ .
|
||||
|
||||
Note:
|
||||
On Ascend platform, only group convolution in depthwise convolution scenarios is supported.
|
||||
That is, when `group>1`, condition :math:`C_{in} = C_{out} = group` must be satisfied.
|
||||
1. On Ascend platform, only group convolution in depthwise convolution scenarios is supported. That is, when
|
||||
`groups>1`, condition :math:`C_{in} = C_{out} = groups` must be satisfied.
|
||||
2. On Ascend dilation on depth only supports the case of 1.
|
||||
|
||||
Args:
|
||||
inputs (Tensor): Tensor of shape :math:`(N, C_{in}, D_{in}, H_{in}, W_{in})`.
|
||||
input (Tensor): Tensor of shape :math:`(N, C_{in}, D_{in}, H_{in}, W_{in})`.
|
||||
weight (Tensor): Set size of kernel is :math:`(\text{kernel_size[0]}, \text{kernel_size[1]},
|
||||
\text{kernel_size[2]})`, then the shape is :math:`(C_{out}, C_{in}, \text{kernel_size[0]},
|
||||
\text{kernel_size[1]}, \text{kernel_size[1]})`.
|
||||
bias (Tensor): Bias Tensor with shape `:math:`(C_{out})`. When bias is None, zeros will be used. Default:
|
||||
None.
|
||||
stride (Union[int, tuple[int]], optional): The distance of kernel moving, an int number that represents
|
||||
the height and width of movement are both strides, or a tuple of two int numbers that
|
||||
represent height and width of movement respectively. Default: 1
|
||||
pad_mode (str, optional): Specifies padding mode. The optional values are
|
||||
"same", "valid" and "pad". Default: "valid".
|
||||
|
||||
|
@@ -4943,17 +5100,14 @@ def conv3d(inputs, weight, pad_mode="valid", padding=0, stride=1, dilation=1, gr

        padding (Union[int, tuple[int]], optional): The pad value to be filled. If `pad` is an integer,
            the paddings of head, tail, top, bottom, left and right are the same, equal to pad.
-           If `pad` is a tuple of six integers, the padding of head, tail, top, bottom,
-           left and right equal to pad[0], pad[1], pad[2], pad[3], pad[4] and pad[5] correspondingly. Default: 0.
-       stride (Union[int, tuple[int]], optional): The distance of kernel moving, an int number that represents
-           the height and width of movement are both strides, or a tuple of two int numbers that
-           represent height and width of movement respectively. Default: 1.
+           If `pad` is a tuple of 3 integers, the padding of head, tail, top, bottom,
+           left and right equal to pad[0], pad[0], pad[1], pad[1], pad[2] and pad[2] correspondingly. Default: 0.
        dilation (Union[int, tuple[int]], optional): The data type is int or a tuple of 3 integers
            :math:`(dilation_d, dilation_h, dilation_w)`. Currently, dilation on depth only supports the case of 1
            on Ascend backend. Specifies the dilation rate to use for dilated convolution. If set :math:`k > 1`,
-           there will be :math:`k - 1` pixels skipped for each sampling location.
-           Its value must be greater than or equal to 1 and bounded by the height and width of the input. Default: 1.
-       group (int, optional): Splits filter into groups. Default: 1.
+           there will be :math:`k - 1` pixels skipped for each sampling location. Its value must be greater than or
+           equal to 1 and bounded by the height and width of the input. Default: 1.
+       groups (int, optional): Splits `input` into groups. Default: 1.

    Returns:
        Tensor, the value that applied 3D convolution. The shape is :math:`(N, C_{out}, D_{out}, H_{out}, W_{out})`.
@@ -4992,12 +5146,14 @@ def conv3d(inputs, weight, pad_mode="valid", padding=0, stride=1, dilation=1, gr
        \end{array}

    Raises:
-       TypeError: If `out_channel` or `group` is not an int.
+       TypeError: If `out_channel` or `groups` is not an int.
        TypeError: If `stride`, `padding` or `dilation` is neither an int nor a tuple.
+       TypeError: If `bias` is not a Tensor.
+       ValueError: If the shape of `bias` is not :math:`C_{out}`.
        ValueError: If `stride` or `dilation` is less than 1.
        ValueError: If `pad_mode` is not one of 'same', 'valid' or 'pad'.
        ValueError: If `padding` is a tuple whose length is not equal to 3.
-       ValueError: If `pad_mode` is not equal to 'pad' and `pad` is not equal to (0, 0, 0, 0, 0, 0).
+       ValueError: If `pad_mode` is not equal to 'pad' and `pad` is greater than 0.

    Supported Platforms:
        ``Ascend`` ``GPU`` ``CPU``
@@ -5012,8 +5168,15 @@ def conv3d(inputs, weight, pad_mode="valid", padding=0, stride=1, dilation=1, gr
    weight_shape = weight.shape
    out_channel = weight_shape[0]
    kernel_size = weight_shape[2:5]
-   conv = _get_cache_prim(P.Conv3D)(out_channel, kernel_size, 1, pad_mode, padding, stride, dilation, group, "NCDHW")
-   output = conv(inputs, weight)
+   if isinstance(padding, (list, tuple)):
+       padding = _manipulate_padding(padding, dim=3)
+   conv = _get_cache_prim(P.Conv3D)(out_channel, kernel_size, 1, pad_mode, padding, stride, dilation, groups, "NCDHW")
+   if bias is None:
+       return conv(input, weight)
+   if not isinstance(bias, Tensor):
+       raise TypeError(f"For 'conv3d', the 'bias' must be a Tensor, but got {type(bias)}.")
+   conv_result = conv(input, weight)
+   output = bias_add(conv_result, bias)
    return output
@@ -6314,6 +6477,7 @@ __all__ = [
    'ctc_greedy_decoder',
    'dropout',
    'conv3d_transpose',
+   'conv1d',
    'conv2d',
    'sigmoid',
    'logsigmoid',
@@ -809,7 +809,7 @@ def test_grad_mutable_dynamic_len_sequence():
    def construct(self, input1, input2):
        x = (input1, input2)
        x = mutable(x, True)
-       output = ops.conv2d(x[0], x[1], pad_mode="pad", padding=(2, 2, 3, 3))
+       output = ops.conv2d(x[0], x[1], pad_mode="pad", padding=(2, 3))
        return output

    context.set_context(mode=context.GRAPH_MODE)
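The test change above reflects the new 2-integer padding semantics; the mapping back to the old 4-integer form (illustration only):

old_style = (2, 2, 3, 3)  # top, bottom, left, right
new_style = (2, 3)        # one value per spatial dimension, applied to both sides
assert (new_style[0], new_style[0], new_style[1], new_style[1]) == old_style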
@@ -0,0 +1,243 @@
# Copyright 2023 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
import numpy as np
import pytest

import mindspore as ms
import mindspore.nn as nn
from mindspore import Tensor
from mindspore import ops


class Net1d(nn.Cell):
    def __init__(self, pad_mode="valid", padding=0, stride=1):
        super().__init__()
        self.pad_mode = pad_mode
        self.padding = padding
        self.stride = stride

    def construct(self, x, weight, dilation=1, groups=1, bias=None):
        return ops.conv1d(x, weight, bias, self.stride, self.pad_mode, self.padding, dilation, groups)


class Net2d(nn.Cell):
    def __init__(self, pad_mode="valid", padding=0, stride=1):
        super().__init__()
        self.pad_mode = pad_mode
        self.padding = padding
        self.stride = stride

    def construct(self, x, weight, dilation=1, groups=1, bias=None):
        return ops.conv2d(x, weight, bias, self.stride, self.pad_mode, self.padding, dilation, groups)


class Net3d(nn.Cell):
    def __init__(self, pad_mode="valid", padding=0, stride=1):
        super().__init__()
        self.pad_mode = pad_mode
        self.padding = padding
        self.stride = stride

    def construct(self, x, weight, dilation=1, groups=1, bias=None):
        return ops.conv3d(x, weight, bias, self.stride, self.pad_mode, self.padding, dilation, groups)


@pytest.mark.level0
@pytest.mark.platform_x86_cpu
@pytest.mark.platform_arm_cpu
@pytest.mark.platform_x86_gpu_training
@pytest.mark.platform_arm_ascend_training
@pytest.mark.platform_x86_ascend_training
@pytest.mark.env_onecard
@pytest.mark.parametrize('mode', [ms.GRAPH_MODE, ms.PYNATIVE_MODE])
def test_ops_conv1d(mode):
    """
    Feature: ops.conv1d with padding = (1, )
    Description: Verify the result of conv1d
    Expectation: success
    """
    ms.set_context(mode=mode)
    x = Tensor(np.arange(32).reshape((4, 2, 4)), ms.float32)
    weight = Tensor(np.arange(8).reshape((2, 2, 2)), ms.float32)
    bias = Tensor([-0.12345, 2.7683], ms.float32)
    net = Net1d(pad_mode='pad', padding=(1,))
    output = net(x, weight, bias=bias, groups=1)
    expected = np.array([[[11.8765, 23.8766, 29.8765, 35.8765, 13.8765],
                          [30.7683, 66.7683, 88.7683, 110.7683, 56.7683]],
                         [[43.8765, 71.8765, 77.8765, 83.8765, 29.8766],
                          [126.7683, 242.7683, 264.7683, 286.7683, 136.7683]],
                         [[75.8765, 119.8765, 125.8765, 131.8766, 45.8765],
                          [222.7683, 418.7683, 440.7683, 462.7683, 216.7683]],
                         [[107.8765, 167.8766, 173.8766, 179.8766, 61.8765],
                          [318.7683, 594.7683, 616.7683, 638.7683, 296.7683]]])
    assert np.allclose(output.asnumpy(), expected, atol=1e-5, rtol=1e-5)
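# Sanity check of expected[0, 0, 0] by hand: with one zero padded on the left,
# the first window of batch 0 is [0, 0] (channel 0) and [0, 4] (channel 1);
# filter 0 has taps [0, 1] and [2, 3], so the cross-correlation is
# 0*0 + 0*1 + 0*2 + 4*3 = 12, and 12 + bias[0] = 12 - 0.12345 = 11.87655.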


@pytest.mark.level0
@pytest.mark.platform_x86_cpu
@pytest.mark.platform_arm_cpu
@pytest.mark.platform_x86_gpu_training
@pytest.mark.env_onecard
@pytest.mark.parametrize('mode', [ms.GRAPH_MODE, ms.PYNATIVE_MODE])
@pytest.mark.parametrize('pad_mode', ['valid', 'same', 'pad'])
def test_ops_conv2d(mode, pad_mode):
    """
    Feature: ops.conv2d
    Description: Verify the result of conv2d
    Expectation: success
    Note: There is a precision problem on Ascend, #I6PT9L
    """
    ms.set_context(mode=mode)
    x = Tensor([[[[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]],
                 [[9.0, 10.0, 11.0], [12.0, 13.0, 14.0], [15.0, 16.0, 17.0]]],
                [[[18.0, 19.0, 20.0], [21.0, 22.0, 23.0], [24.0, 25.0, 26.0]],
                 [[27.0, 28.0, 29.0], [30.0, 31.0, 32.0], [33.0, 34.0, 35.0]]],
                [[[36.0, 37.0, 38.0], [39.0, 40.0, 41.0], [42.0, 43.0, 44.0]],
                 [[45.0, 46.0, 47.0], [48.0, 49.0, 50.0], [51.0, 52.0, 53.0]]]], ms.float32)
    bias = Tensor([0.7297250824055579, 0.6472988621466479], ms.float32)
    if pad_mode == 'valid':
        net = Net2d(pad_mode)
        weight = Tensor([[[[-1.090221803810641]], [[-0.044567894776783905]]],
                         [[[0.04005113957734308]], [[0.22892450020231897]]]], ms.float32)
        output = net(x, weight, bias=bias)
        expected = np.array([[[[0.3286140263080597, -0.8061756491661072, -1.940965175628662],
                               [-3.0757548809051514, -4.210544586181641, -5.345334529876709],
                               [-6.480123996734619, -7.6149139404296875, -8.749703407287598]],
                              [[2.7076191902160645, 2.976594924926758, 3.245570659637451],
                               [3.5145461559295654, 3.783521890640259, 4.052497386932373],
                               [4.321473121643066, 4.59044885635376, 4.859424591064453]]],
                             [[[-20.097599029541016, -21.232391357421875, -22.36717987060547],
                               [-23.501968383789062, -24.63675880432129, -25.771547317504883],
                               [-26.90633773803711, -28.041128158569336, -29.17591667175293]],
                              [[7.54918098449707, 7.8181562423706055, 8.087132453918457],
                               [8.356107711791992, 8.625083923339844, 8.894059181213379],
                               [9.163034439086914, 9.432010650634766, 9.7009859085083]]],
                             [[[-40.52381134033203, -41.658599853515625, -42.79339599609375],
                               [-43.928184509277344, -45.06297302246094, -46.19776153564453],
                               [-47.33255386352539, -48.467342376708984, -49.60213088989258]],
                              [[12.390742301940918, 12.659717559814453, 12.928693771362305],
                               [13.19766902923584, 13.466644287109375, 13.735620498657227],
                               [14.004595756530762, 14.273571968078613, 14.542547225952148]]]])
        assert np.allclose(output.asnumpy(), expected, atol=1e-5, rtol=1e-5)
    elif pad_mode == 'pad':
        net = Net2d(pad_mode=pad_mode, padding=(1, 1), stride=2)
        weight = Tensor(np.arange(16).reshape((2, 2, 2, 2)), ms.float32)
        output = net(x, weight, bias=bias)
        expected = np.array([[[[63.7297, 145.7297], [186.7297, 380.7297]],
                              [[135.6473, 337.6473], [474.6473, 1052.6473]]],
                             [[[243.7297, 469.7297], [474.7297, 884.7297]],
                              [[603.6473, 1237.6472], [1338.6472, 2708.6472]]],
                             [[[423.7297, 793.7297], [762.7297, 1388.7297]],
                              [[1071.6473, 2137.6475], [2202.6475, 4364.6475]]]])
        assert np.allclose(output.asnumpy(), expected, atol=1e-5, rtol=1e-5)
    else:
        net = Net2d(pad_mode=pad_mode)
        weight = Tensor(np.arange(16).reshape((2, 2, 2, 2)), ms.float32)
        output = net(x, weight, bias=bias)
        expected = np.array([[[[268.7297, 296.7297, 138.7297],
                               [352.7297, 380.7297, 174.7297],
                               [147.7297, 157.7297, 68.7297]],
                              [[684.6473, 776.6473, 394.6473],
                               [960.6473, 1052.6473, 526.6473],
                               [499.6473, 541.6473, 268.6473]]],
                             [[[772.7297, 800.7297, 354.7297],
                               [856.7297, 884.7297, 390.7297],
                               [327.7297, 337.7297, 140.7297]],
                              [[2340.6472, 2432.6472, 1186.6472],
                               [2616.6472, 2708.6472, 1318.6472],
                               [1255.6472, 1297.6472, 628.6473]]],
                             [[[1276.7297, 1304.7297, 570.7297],
                               [1360.7297, 1388.7297, 606.7297],
                               [507.7297, 517.7297, 212.7297]],
                              [[3996.6475, 4088.6475, 1978.6473],
                               [4272.6475, 4364.6475, 2110.6475],
                               [2011.6473, 2053.6475, 988.6473]]]])
        assert np.allclose(output.asnumpy(), expected, atol=1e-5, rtol=1e-5)


@pytest.mark.level0
@pytest.mark.platform_x86_cpu
@pytest.mark.platform_arm_cpu
@pytest.mark.platform_x86_gpu_training
@pytest.mark.platform_arm_ascend_training
@pytest.mark.platform_x86_ascend_training
@pytest.mark.env_onecard
@pytest.mark.parametrize('mode', [ms.GRAPH_MODE, ms.PYNATIVE_MODE])
def test_ops_conv3d(mode):
    """
    Feature: ops.conv3d
    Description: Verify the result of conv3d
    Expectation: success
    """
    ms.set_context(mode=mode)
    x = Tensor(np.arange(3 * 2 * 4 * 4 * 4).reshape((3, 2, 4, 4, 4)), ms.float32)
    weight = Tensor(np.arange(32).reshape((2, 2, 2, 2, 2)), ms.float32)
    bias = Tensor(np.array([-0.12345, 2.7683]), ms.float32)
    net = Net3d(pad_mode='valid')
    output = net(x, weight, bias=bias)
    expect_output = np.array([[[[[7439.8765, 7559.8765, 7679.8765],
                                 [7919.8765, 8039.8765, 8159.8765],
                                 [8399.8770, 8519.8770, 8639.8770]],
                                [[9359.8770, 9479.8770, 9599.8770],
                                 [9839.8770, 9959.8770, 10079.8770],
                                 [10319.8770, 10439.8770, 10559.8770]],
                                [[11279.8770, 11399.8770, 11519.8770],
                                 [11759.8770, 11879.8770, 11999.8770],
                                 [12239.8770, 12359.8770, 12479.8770]]],
                               [[[18322.7695, 18698.7695, 19074.7695],
                                 [19826.7695, 20202.7695, 20578.7695],
                                 [21330.7695, 21706.7695, 22082.7695]],
                                [[24338.7695, 24714.7695, 25090.7695],
                                 [25842.7695, 26218.7695, 26594.7695],
                                 [27346.7695, 27722.7695, 28098.7695]],
                                [[30354.7695, 30730.7695, 31106.7695],
                                 [31858.7695, 32234.7695, 32610.7695],
                                 [33362.7695, 33738.7695, 34114.7695]]]],
                              [[[[22799.8770, 22919.8770, 23039.8770],
                                 [23279.8770, 23399.8770, 23519.8770],
                                 [23759.8770, 23879.8770, 23999.8770]],
                                [[24719.8770, 24839.8770, 24959.8770],
                                 [25199.8770, 25319.8770, 25439.8770],
                                 [25679.8770, 25799.8770, 25919.8770]],
                                [[26639.8770, 26759.8770, 26879.8770],
                                 [27119.8770, 27239.8770, 27359.8770],
                                 [27599.8770, 27719.8770, 27839.8770]]],
                               [[[66450.7656, 66826.7656, 67202.7656],
                                 [67954.7656, 68330.7656, 68706.7656],
                                 [69458.7656, 69834.7656, 70210.7656]],
                                [[72466.7656, 72842.7656, 73218.7656],
                                 [73970.7656, 74346.7656, 74722.7656],
                                 [75474.7656, 75850.7656, 76226.7656]],
                                [[78482.7656, 78858.7656, 79234.7656],
                                 [79986.7656, 80362.7656, 80738.7656],
                                 [81490.7656, 81866.7656, 82242.7656]]]],
                              [[[[38159.8750, 38279.8750, 38399.8750],
                                 [38639.8750, 38759.8750, 38879.8750],
                                 [39119.8750, 39239.8750, 39359.8750]],
                                [[40079.8750, 40199.8750, 40319.8750],
                                 [40559.8750, 40679.8750, 40799.8750],
                                 [41039.8750, 41159.8750, 41279.8750]],
                                [[41999.8750, 42119.8750, 42239.8750],
                                 [42479.8750, 42599.8750, 42719.8750],
                                 [42959.8750, 43079.8750, 43199.8750]]],
                               [[[114578.7656, 114954.7656, 115330.7656],
                                 [116082.7656, 116458.7656, 116834.7656],
                                 [117586.7656, 117962.7656, 118338.7656]],
                                [[120594.7656, 120970.7656, 121346.7656],
                                 [122098.7656, 122474.7656, 122850.7656],
                                 [123602.7656, 123978.7656, 124354.7656]],
                                [[126610.7656, 126986.7656, 127362.7656],
                                 [128114.7656, 128490.7656, 128866.7656],
                                 [129618.7656, 129994.7656, 130370.7656]]]]])
    assert np.allclose(output.asnumpy(), expect_output, atol=1e-5, rtol=1e-5)