Add function API for BoundingBoxEncode and BoundingBoxDecode.

2022-10-10 11:08:26 +08:00 · a65653c65e
parent 2ea5414a93
commit a65653c65e
8 changed files with 132 additions and 65 deletions
--- a/docs/api/api_python/mindspore.ops.function.rst
+++ b/docs/api/api_python/mindspore.ops.function.rst
@@ -22,6 +22,8 @@ mindspore.ops.function
    mindspore.ops.avg_pool3d
    mindspore.ops.batch_norm
    mindspore.ops.bias_add
+    mindspore.ops.boundingbox_encode
+    mindspore.ops.boundingbox_decode
    mindspore.ops.ctc_greedy_decoder
    mindspore.ops.conv2d
    mindspore.ops.conv3d
--- a/docs/api/api_python/ops/mindspore.ops.BoundingBoxDecode.rst
+++ b/docs/api/api_python/ops/mindspore.ops.BoundingBoxDecode.rst
@@ -7,20 +7,4 @@ mindspore.ops.BoundingBoxDecode

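    This operator converts offsets into bounding boxes (Bbox), which are used, for example, to mark targets in subsequent images.

-    Args:
-        - **means** (tuple) - The means used in the `deltas` calculation. Default: (0.0, 0.0, 0.0, 0.0).
-        - **stds** (tuple) - The standard deviations used in the `deltas` calculation. Default: (1.0, 1.0, 1.0, 1.0).
-        - **max_shape** (tuple) - The upper size limit for the decoded box calculation.
-        - **wh_ratio_clip** (float) - The width-to-height ratio limit for the decoded box calculation. Default: 0.016.
-
-    Inputs:
-        - **anchor_box** (Tensor) - Anchor boxes. The shape of `anchor_box` must be :math:`(n,4)`.
-        - **deltas** (Tensor) - Deltas of the boxes. It has the same shape as `anchor_box`.
-
-    Outputs:
-        Tensor, the decoded boxes. It has the same data type and shape as `anchor_box`.
-
-    Raises:
-        - **TypeError** - If `means`, `stds` or `max_shape` is not a tuple.
-        - **TypeError** - If `wh_ratio_clip` is not a float.
-        - **TypeError** - If `anchor_box` or `deltas` is not a Tensor.
+    Refer to :func:`mindspore.ops.boundingbox_decode` for more details.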
--- a/docs/api/api_python/ops/mindspore.ops.BoundingBoxEncode.rst
+++ b/docs/api/api_python/ops/mindspore.ops.BoundingBoxEncode.rst
@@ -7,17 +7,4 @@ mindspore.ops.BoundingBoxEncode

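    This operator calculates the offsets between the predicted bounding boxes and the ground truth bounding boxes, and uses these offsets as a variable for the loss.

-    Args:
-        - **means** (tuple) - The means used in the encoded bounding box calculation. Default: (0.0, 0.0, 0.0, 0.0).
-        - **stds** (tuple) - The standard deviations used in the deltas calculation. Default: (1.0, 1.0, 1.0, 1.0).
-
-    Inputs:
-        - **anchor_box** (Tensor) - Anchor boxes. The shape of `anchor_box` must be :math:`(n,4)`.
-        - **groundtruth_box** (Tensor) - Ground truth boxes. It has the same shape as `anchor_box`.
-
-    Outputs:
-        Tensor, the encoded bounding boxes. It has the same data type and shape as the input `anchor_box`.
-
-    Raises:
-        - **TypeError** - If `means` or `stds` is not a tuple.
-        - **TypeError** - If `anchor_box` or `groundtruth_box` is not a Tensor.
+    Refer to :func:`mindspore.ops.boundingbox_encode` for more details.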
--- a/docs/api/api_python/ops/mindspore.ops.func_boundingbox_decode.rst
+++ b/docs/api/api_python/ops/mindspore.ops.func_boundingbox_decode.rst
@@ -0,0 +1,24 @@
+mindspore.ops.boundingbox_decode
+================================
+
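+.. py:function:: mindspore.ops.boundingbox_decode(anchor_box, deltas, max_shape, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0), wh_ratio_clip=0.016)
+
+    Decodes bounding box locations.
+
+    This operator converts offsets into bounding boxes (Bbox), which are used, for example, to mark targets in subsequent images.
+
+    Args:
+        - **anchor_box** (Tensor) - Anchor boxes. The shape of `anchor_box` must be :math:`(n,4)`.
+        - **deltas** (Tensor) - Deltas of the boxes. It has the same shape as `anchor_box`.
+        - **max_shape** (tuple) - The upper size limit for the decoded box calculation.
+        - **means** (tuple) - The means used in the `deltas` calculation. Default: (0.0, 0.0, 0.0, 0.0).
+        - **stds** (tuple) - The standard deviations used in the `deltas` calculation. Default: (1.0, 1.0, 1.0, 1.0).
+        - **wh_ratio_clip** (float) - The width-to-height ratio limit for the decoded box calculation. Default: 0.016.
+
+    Returns:
+        Tensor, the decoded boxes. It has the same data type and shape as `anchor_box`.
+
+    Raises:
+        - **TypeError** - If `means`, `stds` or `max_shape` is not a tuple.
+        - **TypeError** - If `wh_ratio_clip` is not a float.
+        - **TypeError** - If `anchor_box` or `deltas` is not a Tensor.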
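Reviewer note: the decode documentation above does not state the arithmetic. The following pure-Python sketch is one plausible reading, assuming the common center/size delta convention with a `+1` pixel width, `max_shape` interpreted as `(height, width)`, and `wh_ratio_clip` bounding the log-scale deltas. The name `decode_boxes` is hypothetical and this is not taken from the operator's kernel; it is offered only because it reproduces the example values in the `boundingbox_decode` docstring to within rounding.

```python
import math


def decode_boxes(anchors, deltas, max_shape,
                 means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0),
                 wh_ratio_clip=0.016):
    """Reference sketch (assumed formulas): turn deltas back into boxes."""
    max_ratio = abs(math.log(wh_ratio_clip))  # bound for the log-scale deltas
    max_h, max_w = max_shape[0], max_shape[1]  # assumption: (height, width)

    def clamp(value, low, high):
        return min(max(value, low), high)

    decoded = []
    for (ax1, ay1, ax2, ay2), delta in zip(anchors, deltas):
        # Undo the normalisation applied at encoding time.
        dx, dy, dw, dh = (d * s + m for d, m, s in zip(delta, means, stds))
        dw = clamp(dw, -max_ratio, max_ratio)
        dh = clamp(dh, -max_ratio, max_ratio)

        # Anchor centre and size, using the +1 pixel convention (assumption).
        aw = ax2 - ax1 + 1.0
        ah = ay2 - ay1 + 1.0
        acx = (ax1 + ax2) * 0.5
        acy = (ay1 + ay2) * 0.5

        # Shift the centre and scale the size.
        cx = acx + aw * dx
        cy = acy + ah * dy
        w = aw * math.exp(dw)
        h = ah * math.exp(dh)

        # Back to corner form, clipped to the image bounds.
        decoded.append([
            clamp(cx - w * 0.5 + 0.5, 0.0, max_w - 1.0),
            clamp(cy - h * 0.5 + 0.5, 0.0, max_h - 1.0),
            clamp(cx + w * 0.5 - 0.5, 0.0, max_w - 1.0),
            clamp(cy + h * 0.5 - 0.5, 0.0, max_h - 1.0),
        ])
    return decoded


boxes = decode_boxes([[4, 1, 2, 1], [2, 2, 2, 3]],
                     [[3, 1, 2, 2], [1, 2, 1, 4]],
                     max_shape=(768, 1280))
print(boxes)  # roughly [[4.19, 0.0, 0.0, 5.19], [2.14, 0.0, 3.86, 60.60]]
```

The small differences from the docstring output (for example 4.1945 against 4.1953125) are consistent with reduced-precision arithmetic in the operator, though that explanation is also an assumption.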
--- a/docs/api/api_python/ops/mindspore.ops.func_boundingbox_encode.rst
+++ b/docs/api/api_python/ops/mindspore.ops.func_boundingbox_encode.rst
@@ -0,0 +1,21 @@
+mindspore.ops.boundingbox_encode
+================================
+
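+.. py:function:: mindspore.ops.boundingbox_encode(anchor_box, groundtruth_box, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0))
+
+    Encodes bounding box locations.
+
+    This operator calculates the offsets between the predicted bounding boxes and the ground truth bounding boxes, and uses these offsets as a variable for the loss.
+
+    Args:
+        - **anchor_box** (Tensor) - Anchor boxes. The shape of `anchor_box` must be :math:`(n,4)`.
+        - **groundtruth_box** (Tensor) - Ground truth boxes. It has the same shape as `anchor_box`.
+        - **means** (tuple) - The means used in the encoded bounding box calculation. Default: (0.0, 0.0, 0.0, 0.0).
+        - **stds** (tuple) - The standard deviations used in the deltas calculation. Default: (1.0, 1.0, 1.0, 1.0).
+
+    Returns:
+        Tensor, the encoded bounding boxes. It has the same data type and shape as the input `anchor_box`.
+
+    Raises:
+        - **TypeError** - If `means` or `stds` is not a tuple.
+        - **TypeError** - If `anchor_box` or `groundtruth_box` is not a Tensor.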
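Reviewer note: the encode documentation above describes the result as an offset but does not give the formula. The pure-Python sketch below shows one plausible reading, assuming boxes are given as `(x1, y1, x2, y2)` corners, widths and heights use a `+1` pixel convention, and the raw deltas are normalised as `(delta - mean) / std`. The name `encode_boxes` is hypothetical and the formulas are not confirmed against the operator's kernel; they are included because they reproduce the example values in the `boundingbox_encode` docstring to within rounding.

```python
import math


def encode_boxes(anchors, groundtruths,
                 means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0)):
    """Reference sketch (assumed formulas): offsets from anchors to targets."""
    encoded = []
    for (ax1, ay1, ax2, ay2), (gx1, gy1, gx2, gy2) in zip(anchors, groundtruths):
        # Centre and size of the anchor box (+1 pixel convention, assumption).
        aw = ax2 - ax1 + 1.0
        ah = ay2 - ay1 + 1.0
        acx = (ax1 + ax2) * 0.5
        acy = (ay1 + ay2) * 0.5

        # Centre and size of the ground truth box.
        gw = gx2 - gx1 + 1.0
        gh = gy2 - gy1 + 1.0
        gcx = (gx1 + gx2) * 0.5
        gcy = (gy1 + gy2) * 0.5

        # Centre shift relative to the anchor size, and log of the size ratio.
        raw = (
            (gcx - acx) / aw,
            (gcy - acy) / ah,
            math.log(gw / aw),
            math.log(gh / ah),
        )
        encoded.append([(d - m) / s for d, m, s in zip(raw, means, stds)])
    return encoded


offsets = encode_boxes([[2, 2, 2, 3]], [[1, 2, 1, 4]])
print(offsets)  # roughly [[-1.0, 0.25, 0.0, 0.405]]
```

Under these assumptions, feeding the encoded offsets and the same anchors into the matching decode step recovers the ground truth boxes, provided no clipping is triggered.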
--- a/mindspore/python/mindspore/ops/function/__init__.py
+++ b/mindspore/python/mindspore/ops/function/__init__.py
@@ -325,6 +325,8 @@ from .nn_func import (
    glu,
    multi_margin_loss,
    multi_label_margin_loss,
+    boundingbox_decode,
+    boundingbox_encode
 )
 from .linalg_func import (
    svd,
--- a/mindspore/python/mindspore/ops/function/nn_func.py
+++ b/mindspore/python/mindspore/ops/function/nn_func.py
@@ -3286,6 +3286,82 @@ def multi_label_margin_loss(inputs, target, reduction='mean'):
    return outputs


+def boundingbox_encode(anchor_box, groundtruth_box, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0)):
+    """
+    Encodes bounding box locations.
+
+    This operator calculates the offsets between the predicted bounding boxes and the ground truth bounding boxes.
+    These offsets are used as a variable for the loss.
+
+    Args:
+        anchor_box (Tensor): Anchor boxes. The shape of `anchor_box` must be (n, 4).
+        groundtruth_box (Tensor): Ground truth boxes. It has the same shape as `anchor_box`.
+        means (tuple): The means used in the encoded bounding box calculation. Default: (0.0, 0.0, 0.0, 0.0).
+        stds (tuple): The standard deviations used in the deltas calculation. Default: (1.0, 1.0, 1.0, 1.0).
+
+    Returns:
+        Tensor, encoded bounding boxes. It has the same data type and shape as input `anchor_box`.
+
+    Raises:
+        TypeError: If `means` or `stds` is not a tuple.
+        TypeError: If `anchor_box` or `groundtruth_box` is not a Tensor.
+
+    Supported Platforms:
+        ``Ascend`` ``GPU`` ``CPU``
+
+    Examples:
+        >>> anchor_box = Tensor([[2, 2, 2, 3], [2, 2, 2, 3]], mindspore.float32)
+        >>> groundtruth_box = Tensor([[1, 2, 1, 4], [1, 2, 1, 4]], mindspore.float32)
+        >>> output = ops.boundingbox_encode(anchor_box, groundtruth_box, means=(0.0, 0.0, 0.0, 0.0),
+        ...                                 stds=(1.0, 1.0, 1.0, 1.0))
+        >>> print(output)
+        [[ -1.  0.25  0.  0.40551758]
+         [ -1.  0.25  0.  0.40551758]]
+    """
+    boundingbox_encode_op = _get_cache_prim(P.BoundingBoxEncode)(means, stds)
+    return boundingbox_encode_op(anchor_box, groundtruth_box)
+
+
+def boundingbox_decode(anchor_box, deltas, max_shape, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0),
+                       wh_ratio_clip=0.016):
+    """
+    Decodes bounding box locations.
+
+    This operator converts the offsets (deltas) into bounding boxes, which are used, for example, to mark the
+    targets in subsequent images.
+
+    Args:
+        anchor_box (Tensor): Anchor boxes. The shape of `anchor_box` must be (n, 4).
+        deltas (Tensor): Deltas of the boxes. It has the same shape as `anchor_box`.
+        max_shape (tuple): The max size limit for the decoded box calculation.
+        means (tuple): The means used in the deltas calculation. Default: (0.0, 0.0, 0.0, 0.0).
+        stds (tuple): The standard deviations used in the deltas calculation. Default: (1.0, 1.0, 1.0, 1.0).
+        wh_ratio_clip (float): The width-to-height ratio limit for the decoded box calculation. Default: 0.016.
+
+    Returns:
+        Tensor, decoded boxes. It has the same data type and shape as `anchor_box`.
+
+    Raises:
+        TypeError: If `means`, `stds` or `max_shape` is not a tuple.
+        TypeError: If `wh_ratio_clip` is not a float.
+        TypeError: If `anchor_box` or `deltas` is not a Tensor.
+
+    Supported Platforms:
+        ``Ascend`` ``GPU`` ``CPU``
+
+    Examples:
+        >>> anchor_box = Tensor([[4, 1, 2, 1], [2, 2, 2, 3]], mindspore.float32)
+        >>> deltas = Tensor([[3, 1, 2, 2], [1, 2, 1, 4]], mindspore.float32)
+        >>> output = ops.boundingbox_decode(anchor_box, deltas, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0),
+        ...                                 max_shape=(768, 1280), wh_ratio_clip=0.016)
+        >>> print(output)
+        [[ 4.1953125  0.         0.         5.1953125]
+         [ 2.140625   0.         3.859375  60.59375  ]]
+    """
+    boundingbox_decode_op = _get_cache_prim(P.BoundingBoxDecode)(max_shape, means, stds, wh_ratio_clip)
+    return boundingbox_decode_op(anchor_box, deltas)
+
+
 __all__ = [
    'adaptive_avg_pool1d',
    'adaptive_avg_pool2d',
@@ -3336,6 +3412,8 @@ __all__ = [
    'conv3d',
    'glu',
    'multi_margin_loss',
-    'multi_label_margin_loss'
+    'multi_label_margin_loss',
+    'boundingbox_decode',
+    'boundingbox_encode'
 ]
 __all__.sort()
--- a/mindspore/python/mindspore/ops/operations/other_ops.py
+++ b/mindspore/python/mindspore/ops/operations/other_ops.py
@@ -113,20 +113,7 @@ class BoundingBoxEncode(PrimitiveWithInfer):
    This operator will calculate the offset between the predicted bounding boxes and the real bounding boxes,
    and this offset will be used as a variable for the loss.

-    Args:
-        means (tuple): Means for encoding bounding boxes calculation. Default: (0.0, 0.0, 0.0, 0.0).
-        stds (tuple): The standard deviations of deltas calculation. Default: (1.0, 1.0, 1.0, 1.0).
-
-    Inputs:
-        - **anchor_box** (Tensor) - Anchor boxes. The shape of anchor_box must be (n, 4).
-        - **groundtruth_box** (Tensor) - Ground truth boxes. Which has the same shape with anchor_box.
-
-    Outputs:
-        Tensor, encoded bounding boxes. It has the same data type and shape as input `anchor_box`.
-
-    Raises:
-        TypeError: If `means` or `stds` is not a tuple.
-        TypeError: If `anchor_box` or `groundtruth_box` is not a Tensor.
+    Refer to :func:`mindspore.ops.boundingbox_encode` for more detail.

    Supported Platforms:
        ``Ascend`` ``GPU`` ``CPU``
@@ -140,7 +127,6 @@ class BoundingBoxEncode(PrimitiveWithInfer):
        [[ -1.  0.25  0.  0.40551758]
         [ -1.  0.25  0.  0.40551758]]
    """
-
    @prim_attr_register
    def __init__(self, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0)):
        """Initialize BoundingBoxEncode."""
@@ -223,23 +209,7 @@ class BoundingBoxDecode(Primitive):
    The function of the operator is to calculate the offset, and this operator converts the offset into a Bbox,
    which is used to mark the target in the subsequent images, etc.

-    Args:
-        means (tuple): The means of deltas calculation. Default: (0.0, 0.0, 0.0, 0.0).
-        stds (tuple): The standard deviations of deltas calculation. Default: (1.0, 1.0, 1.0, 1.0).
-        max_shape (tuple): The max size limit for decoding box calculation.
-        wh_ratio_clip (float): The limit of width and height ratio for decoding box calculation. Default: 0.016.
-
-    Inputs:
-        - **anchor_box** (Tensor) - Anchor boxes. The shape of `anchor_box` must be (n, 4).
-        - **deltas** (Tensor) - Delta of boxes. Which has the same shape with `anchor_box`.
-
-    Outputs:
-        Tensor, decoded boxes. It has the same data type and shape as `anchor_box`.
-
-    Raises:
-        TypeError: If `means`, `stds` or `max_shape` is not a tuple.
-        TypeError: If `wh_ratio_clip` is not a float.
-        TypeError: If `anchor_box` or `deltas` is not a Tensor.
+    Refer to :func:`mindspore.ops.boundingbox_decode` for more detail.

    Supported Platforms:
        ``Ascend`` ``GPU`` ``CPU``
@@ -255,7 +225,6 @@ class BoundingBoxDecode(Primitive):
         [ 2.140625   0.         3.859375  60.59375  ]]

    """
-
    @prim_attr_register
    def __init__(self, max_shape, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0), wh_ratio_clip=0.016):
        """Initialize BoundingBoxDecode."""