Add functional APIs for BoundingBoxEncode and BoundingBoxDecode.

LV 2022-10-10 11:08:26 +08:00
parent 2ea5414a93
commit a65653c65e
8 changed files with 132 additions and 65 deletions

View File

@@ -22,6 +22,8 @@ mindspore.ops.function
 mindspore.ops.avg_pool3d
 mindspore.ops.batch_norm
 mindspore.ops.bias_add
+mindspore.ops.boundingbox_encode
+mindspore.ops.boundingbox_decode
 mindspore.ops.ctc_greedy_decoder
 mindspore.ops.conv2d
 mindspore.ops.conv3d

View File

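@@ -7,20 +7,4 @@ mindspore.ops.BoundingBoxDecode
     This operator converts offsets (deltas) into bounding boxes (Bbox), which are used, for example, to mark targets in subsequent images.
-    Args:
-        - **means** (tuple) - Means used in the `deltas` calculation. Default: (0.0, 0.0, 0.0, 0.0).
-        - **stds** (tuple) - Standard deviations used in the `deltas` calculation. Default: (1.0, 1.0, 1.0, 1.0).
-        - **max_shape** (tuple) - Upper size limit for the decoded boxes.
-        - **wh_ratio_clip** (float) - Width-to-height ratio limit for the decoded boxes. Default: 0.016.
-    Inputs:
-        - **anchor_box** (Tensor) - Anchor boxes. The shape must be :math:`(n,4)`.
-        - **deltas** (Tensor) - Box deltas. It has the same shape as `anchor_box`.
-    Outputs:
-        Tensor, the decoded boxes. It has the same data type and shape as `anchor_box`.
-    Raises:
-        - **TypeError** - If `means`, `stds` or `max_shape` is not a tuple.
-        - **TypeError** - If `wh_ratio_clip` is not a float.
-        - **TypeError** - If `anchor_box` or `deltas` is not a Tensor.
+    For more details, see :func:`mindspore.ops.boundingbox_decode`.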

View File

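@@ -7,17 +7,4 @@ mindspore.ops.BoundingBoxEncode
     This operator calculates the offset between the predicted bounding boxes and the ground truth bounding boxes; the offset is then used as a variable for the loss.
-    Args:
-        - **means** (tuple) - Means used when computing the encoded bounding boxes. Default: (0.0, 0.0, 0.0, 0.0).
-        - **stds** (tuple) - Standard deviations used when computing the deltas. Default: (1.0, 1.0, 1.0, 1.0).
-    Inputs:
-        - **anchor_box** (Tensor) - Anchor boxes. The shape must be :math:`(n,4)`.
-        - **groundtruth_box** (Tensor) - Ground truth boxes. It has the same shape as `anchor_box`.
-    Outputs:
-        Tensor, the encoded bounding boxes. It has the same data type and shape as the input `anchor_box`.
-    Raises:
-        - **TypeError** - If `means` or `stds` is not a tuple.
-        - **TypeError** - If `anchor_box` or `groundtruth_box` is not a Tensor.
+    For more details, see :func:`mindspore.ops.boundingbox_encode`.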

View File

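@@ -0,0 +1,24 @@
+mindspore.ops.boundingbox_decode
+================================
+
+.. py:function:: mindspore.ops.boundingbox_decode(anchor_box, deltas, max_shape, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0), wh_ratio_clip=0.016)
+
+    Decodes bounding box locations.
+
+    This function converts offsets (deltas) into bounding boxes (Bbox), which are used, for example, to mark targets in subsequent images.
+
+    Args:
+        - **anchor_box** (Tensor) - Anchor boxes. The shape must be :math:`(n,4)`.
+        - **deltas** (Tensor) - Box deltas. It has the same shape as `anchor_box`.
+        - **max_shape** (tuple) - Upper size limit for the decoded boxes.
+        - **means** (tuple) - Means used in the `deltas` calculation. Default: (0.0, 0.0, 0.0, 0.0).
+        - **stds** (tuple) - Standard deviations used in the `deltas` calculation. Default: (1.0, 1.0, 1.0, 1.0).
+        - **wh_ratio_clip** (float) - Width-to-height ratio limit for the decoded boxes. Default: 0.016.
+
+    Returns:
+        Tensor, the decoded boxes. It has the same data type and shape as `anchor_box`.
+
+    Raises:
+        - **TypeError** - If `means`, `stds` or `max_shape` is not a tuple.
+        - **TypeError** - If `wh_ratio_clip` is not a float.
+        - **TypeError** - If `anchor_box` or `deltas` is not a Tensor.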
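
The decoding step documented above can be illustrated with a plain-Python sketch. This is not MindSpore's kernel: it assumes the operator follows the common delta-to-box convention (boxes as `(x1, y1, x2, y2)`, legacy `+1` width/height, size deltas clamped to `abs(log(wh_ratio_clip))`, and `max_shape` given as `(height, width)`). None of this is stated in the commit; the assumption is only consistent with the example output in the docstring. The name `decode_boxes_reference` is hypothetical.

```python
import math


def decode_boxes_reference(anchor_box, deltas, max_shape,
                           means=(0.0, 0.0, 0.0, 0.0),
                           stds=(1.0, 1.0, 1.0, 1.0),
                           wh_ratio_clip=0.016):
    """Hypothetical reference decoder; inputs are lists of 4-element boxes."""
    max_ratio = abs(math.log(wh_ratio_clip))
    height, width = max_shape  # assumed (height, width) order
    limits = [width - 1, height - 1, width - 1, height - 1]
    decoded = []
    for (x1, y1, x2, y2), delta in zip(anchor_box, deltas):
        # Undo the normalization applied at encoding time.
        dx, dy, dw, dh = (d * s + m for d, s, m in zip(delta, stds, means))
        # Clamp the log-size deltas so exp() cannot explode.
        dw = max(-max_ratio, min(max_ratio, dw))
        dh = max(-max_ratio, min(max_ratio, dh))
        # Anchor center and size (legacy "+1" pixel convention, assumed).
        px, py = (x1 + x2) * 0.5, (y1 + y2) * 0.5
        pw, ph = x2 - x1 + 1.0, y2 - y1 + 1.0
        # Decoded center and size.
        gx, gy = px + pw * dx, py + ph * dy
        gw, gh = pw * math.exp(dw), ph * math.exp(dh)
        box = [gx - gw * 0.5 + 0.5, gy - gh * 0.5 + 0.5,
               gx + gw * 0.5 - 0.5, gy + gh * 0.5 - 0.5]
        # Clip every coordinate into the image.
        decoded.append([max(0.0, min(float(lim), v))
                        for v, lim in zip(box, limits)])
    return decoded
```

Called with the anchors and deltas from the docstring example and `max_shape=(768, 1280)`, this sketch yields values that agree with the documented output to about two decimal places. The small remaining difference may come from the precision of the real kernel.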

View File

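@@ -0,0 +1,21 @@
+mindspore.ops.boundingbox_encode
+================================
+
+.. py:function:: mindspore.ops.boundingbox_encode(anchor_box, groundtruth_box, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0))
+
+    Encodes bounding box locations.
+
+    This function calculates the offset between the predicted bounding boxes and the ground truth bounding boxes; the offset is then used as a variable for the loss.
+
+    Args:
+        - **anchor_box** (Tensor) - Anchor boxes. The shape must be :math:`(n,4)`.
+        - **groundtruth_box** (Tensor) - Ground truth boxes. It has the same shape as `anchor_box`.
+        - **means** (tuple) - Means used when computing the encoded bounding boxes. Default: (0.0, 0.0, 0.0, 0.0).
+        - **stds** (tuple) - Standard deviations used when computing the deltas. Default: (1.0, 1.0, 1.0, 1.0).
+
+    Returns:
+        Tensor, the encoded bounding boxes. It has the same data type and shape as the input `anchor_box`.
+
+    Raises:
+        - **TypeError** - If `means` or `stds` is not a tuple.
+        - **TypeError** - If `anchor_box` or `groundtruth_box` is not a Tensor.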
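
The encoding documented above can likewise be sketched in plain Python. This is only an illustration under assumptions the commit does not state: boxes are `(x1, y1, x2, y2)`, widths and heights use the legacy `+1` convention, and the offsets are normalized as `(delta - means) / stds`. These assumptions agree with the example output in the docstring. The name `encode_boxes_reference` is hypothetical.

```python
import math


def encode_boxes_reference(anchor_box, groundtruth_box,
                           means=(0.0, 0.0, 0.0, 0.0),
                           stds=(1.0, 1.0, 1.0, 1.0)):
    """Hypothetical reference encoder; inputs are lists of 4-element boxes."""
    encoded = []
    for anchor, truth in zip(anchor_box, groundtruth_box):
        ax1, ay1, ax2, ay2 = anchor
        tx1, ty1, tx2, ty2 = truth
        # Centers and sizes (legacy "+1" pixel convention, assumed).
        px, py = (ax1 + ax2) * 0.5, (ay1 + ay2) * 0.5
        pw, ph = ax2 - ax1 + 1.0, ay2 - ay1 + 1.0
        gx, gy = (tx1 + tx2) * 0.5, (ty1 + ty2) * 0.5
        gw, gh = tx2 - tx1 + 1.0, ty2 - ty1 + 1.0
        # Center shift relative to anchor size, and log of the size ratio.
        raw = [(gx - px) / pw, (gy - py) / ph,
               math.log(gw / pw), math.log(gh / ph)]
        # Normalize with the configured means and standard deviations.
        encoded.append([(r - m) / s for r, m, s in zip(raw, means, stds)])
    return encoded
```

For the anchor `[2, 2, 2, 3]` and ground truth `[1, 2, 1, 4]` from the docstring example, the sketch gives approximately `[-1.0, 0.25, 0.0, 0.4055]`, in line with the documented output.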

View File

@@ -325,6 +325,8 @@ from .nn_func import (
     glu,
     multi_margin_loss,
     multi_label_margin_loss,
+    boundingbox_decode,
+    boundingbox_encode
 )
 from .linalg_func import (
     svd,

View File

@@ -3286,6 +3286,82 @@ def multi_label_margin_loss(inputs, target, reduction='mean'):
     return outputs
+
+
+def boundingbox_encode(anchor_box, groundtruth_box, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0)):
+    """
+    Encodes bounding box locations.
+
+    This function calculates the offset between the predicted bounding boxes and the ground truth bounding boxes,
+    and this offset is then used as a variable for the loss.
+
+    Args:
+        anchor_box (Tensor): Anchor boxes. The shape of `anchor_box` must be (n, 4).
+        groundtruth_box (Tensor): Ground truth boxes. It has the same shape as `anchor_box`.
+        means (tuple): Means for encoding bounding boxes calculation. Default: (0.0, 0.0, 0.0, 0.0).
+        stds (tuple): The standard deviations of deltas calculation. Default: (1.0, 1.0, 1.0, 1.0).
+
+    Returns:
+        Tensor, encoded bounding boxes. It has the same data type and shape as input `anchor_box`.
+
+    Raises:
+        TypeError: If `means` or `stds` is not a tuple.
+        TypeError: If `anchor_box` or `groundtruth_box` is not a Tensor.
+
+    Supported Platforms:
+        ``Ascend`` ``GPU`` ``CPU``
+
+    Examples:
+        >>> import mindspore
+        >>> from mindspore import Tensor, ops
+        >>> anchor_box = Tensor([[2, 2, 2, 3], [2, 2, 2, 3]], mindspore.float32)
+        >>> groundtruth_box = Tensor([[1, 2, 1, 4], [1, 2, 1, 4]], mindspore.float32)
+        >>> output = ops.boundingbox_encode(anchor_box, groundtruth_box, means=(0.0, 0.0, 0.0, 0.0),
+        ...                                 stds=(1.0, 1.0, 1.0, 1.0))
+        >>> print(output)
+        [[ -1. 0.25 0. 0.40551758]
+         [ -1. 0.25 0. 0.40551758]]
+    """
+    # Build (or fetch from the cache) the primitive configured with these attributes.
+    boundingbox_encode_op = _get_cache_prim(P.BoundingBoxEncode)(means, stds)
+    return boundingbox_encode_op(anchor_box, groundtruth_box)
+
+
+def boundingbox_decode(anchor_box, deltas, max_shape, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0),
+                       wh_ratio_clip=0.016):
+    """
+    Decodes bounding box locations.
+
+    This function converts the offsets (deltas) into bounding boxes (Bbox),
+    which are used, for example, to mark the targets in subsequent images.
+
+    Args:
+        anchor_box (Tensor): Anchor boxes. The shape of `anchor_box` must be (n, 4).
+        deltas (Tensor): Delta of boxes. It has the same shape as `anchor_box`.
+        max_shape (tuple): The max size limit for decoding box calculation.
+        means (tuple): The means of deltas calculation. Default: (0.0, 0.0, 0.0, 0.0).
+        stds (tuple): The standard deviations of deltas calculation. Default: (1.0, 1.0, 1.0, 1.0).
+        wh_ratio_clip (float): The limit of width and height ratio for decoding box calculation. Default: 0.016.
+
+    Returns:
+        Tensor, decoded boxes. It has the same data type and shape as `anchor_box`.
+
+    Raises:
+        TypeError: If `means`, `stds` or `max_shape` is not a tuple.
+        TypeError: If `wh_ratio_clip` is not a float.
+        TypeError: If `anchor_box` or `deltas` is not a Tensor.
+
+    Supported Platforms:
+        ``Ascend`` ``GPU`` ``CPU``
+
+    Examples:
+        >>> import mindspore
+        >>> from mindspore import Tensor, ops
+        >>> anchor_box = Tensor([[4, 1, 2, 1], [2, 2, 2, 3]], mindspore.float32)
+        >>> deltas = Tensor([[3, 1, 2, 2], [1, 2, 1, 4]], mindspore.float32)
+        >>> output = ops.boundingbox_decode(anchor_box, deltas, max_shape=(768, 1280), means=(0.0, 0.0, 0.0, 0.0),
+        ...                                 stds=(1.0, 1.0, 1.0, 1.0), wh_ratio_clip=0.016)
+        >>> print(output)
+        [[ 4.1953125 0. 0. 5.1953125]
+         [ 2.140625 0. 3.859375 60.59375 ]]
+    """
+    # Note that the primitive takes max_shape as its first attribute.
+    boundingbox_decode_op = _get_cache_prim(P.BoundingBoxDecode)(max_shape, means, stds, wh_ratio_clip)
+    return boundingbox_decode_op(anchor_box, deltas)
+
+
 __all__ = [
     'adaptive_avg_pool1d',
     'adaptive_avg_pool2d',
@@ -3336,6 +3412,8 @@ __all__ = [
     'conv3d',
     'glu',
     'multi_margin_loss',
-    'multi_label_margin_loss'
+    'multi_label_margin_loss',
+    'boundingbox_decode',
+    'boundingbox_encode'
 ]
 __all__.sort()

View File

@@ -113,20 +113,7 @@ class BoundingBoxEncode(PrimitiveWithInfer):
     This operator will calculate the offset between the predicted bounding boxes and the real bounding boxes,
     and this offset will be used as a variable for the loss.
-    Args:
-        means (tuple): Means for encoding bounding boxes calculation. Default: (0.0, 0.0, 0.0, 0.0).
-        stds (tuple): The standard deviations of deltas calculation. Default: (1.0, 1.0, 1.0, 1.0).
-    Inputs:
-        - **anchor_box** (Tensor) - Anchor boxes. The shape of `anchor_box` must be (n, 4).
-        - **groundtruth_box** (Tensor) - Ground truth boxes. It has the same shape as `anchor_box`.
-    Outputs:
-        Tensor, encoded bounding boxes. It has the same data type and shape as input `anchor_box`.
-    Raises:
-        TypeError: If `means` or `stds` is not a tuple.
-        TypeError: If `anchor_box` or `groundtruth_box` is not a Tensor.
+    Refer to :func:`mindspore.ops.boundingbox_encode` for more detail.
     Supported Platforms:
         ``Ascend`` ``GPU`` ``CPU``
@@ -140,7 +127,6 @@ class BoundingBoxEncode(PrimitiveWithInfer):
         [[ -1. 0.25 0. 0.40551758]
          [ -1. 0.25 0. 0.40551758]]
     """
-
     @prim_attr_register
     def __init__(self, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0)):
         """Initialize BoundingBoxEncode."""
@@ -223,23 +209,7 @@ class BoundingBoxDecode(Primitive):
     The function of the operator is to calculate the offset, and this operator converts the offset into a Bbox,
     which is used to mark the target in the subsequent images, etc.
-    Args:
-        means (tuple): The means of deltas calculation. Default: (0.0, 0.0, 0.0, 0.0).
-        stds (tuple): The standard deviations of deltas calculation. Default: (1.0, 1.0, 1.0, 1.0).
-        max_shape (tuple): The max size limit for decoding box calculation.
-        wh_ratio_clip (float): The limit of width and height ratio for decoding box calculation. Default: 0.016.
-    Inputs:
-        - **anchor_box** (Tensor) - Anchor boxes. The shape of `anchor_box` must be (n, 4).
-        - **deltas** (Tensor) - Delta of boxes. It has the same shape as `anchor_box`.
-    Outputs:
-        Tensor, decoded boxes. It has the same data type and shape as `anchor_box`.
-    Raises:
-        TypeError: If `means`, `stds` or `max_shape` is not a tuple.
-        TypeError: If `wh_ratio_clip` is not a float.
-        TypeError: If `anchor_box` or `deltas` is not a Tensor.
+    Refer to :func:`mindspore.ops.boundingbox_decode` for more detail.
     Supported Platforms:
         ``Ascend`` ``GPU`` ``CPU``
@@ -255,7 +225,6 @@ class BoundingBoxDecode(Primitive):
          [ 2.140625 0. 3.859375 60.59375 ]]
     """
-
     @prim_attr_register
     def __init__(self, max_shape, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0), wh_ratio_clip=0.016):
         """Initialize BoundingBoxDecode."""