Add functional API for BoundingBoxEncode and BoundingBoxDecode.

2022-10-10 11:08:26 +08:00 · 2022-10-10 11:08:26 +08:00 · a65653c65e
parent 2ea5414a93
commit a65653c65e
8 changed files with 132 additions and 65 deletions
--- a/docs/api/api_python/mindspore.ops.function.rst
+++ b/docs/api/api_python/mindspore.ops.function.rst
@ -22,6 +22,8 @@ mindspore.ops.function
    mindspore.ops.avg_pool3d
    mindspore.ops.batch_norm
    mindspore.ops.bias_add
    mindspore.ops.boundingbox_encode
    mindspore.ops.boundingbox_decode
    mindspore.ops.ctc_greedy_decoder
    mindspore.ops.conv2d
    mindspore.ops.conv3d
--- a/docs/api/api_python/ops/mindspore.ops.BoundingBoxDecode.rst
+++ b/docs/api/api_python/ops/mindspore.ops.BoundingBoxDecode.rst
@ -7,20 +7,4 @@ mindspore.ops.BoundingBoxDecode
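    This operator converts the computed offsets (deltas) into bounding boxes (Bbox), which are used to mark targets in subsequent images, etc.
-    Args:
+    Refer to :func:`mindspore.ops.boundingbox_decode` for more details.
        - **means** (tuple) - The means used in the `deltas` calculation. Default: (0.0, 0.0, 0.0, 0.0).
        - **stds** (tuple) - The standard deviations used in the `deltas` calculation. Default: (1.0, 1.0, 1.0, 1.0).
        - **max_shape** (tuple) - The upper size limit for the decoded box calculation.
        - **wh_ratio_clip** (float) - The limit on the width and height ratio for the decoded box calculation. Default: 0.016.
    Inputs:
        - **anchor_box** (Tensor) - Anchor boxes. The shape of `anchor_box` must be :math:`(n,4)`.
        - **deltas** (Tensor) - Deltas of the boxes. It has the same shape as `anchor_box`.
    Outputs:
        Tensor, the decoded boxes. It has the same data type and shape as `anchor_box`.
    Raises:
        - **TypeError** - If `means`, `stds` or `max_shape` is not a tuple.
        - **TypeError** - If `wh_ratio_clip` is not a float.
        - **TypeError** - If `anchor_box` or `deltas` is not a Tensor.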
--- a/docs/api/api_python/ops/mindspore.ops.BoundingBoxEncode.rst
+++ b/docs/api/api_python/ops/mindspore.ops.BoundingBoxEncode.rst
@ -7,17 +7,4 @@ mindspore.ops.BoundingBoxEncode
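    This operator calculates the offset between the predicted bounding boxes and the ground-truth bounding boxes; this offset is used as a variable for the loss.
-    Args:
+    Refer to :func:`mindspore.ops.boundingbox_encode` for more details.
        - **means** (tuple) - The means used in the bounding box encoding calculation. Default: (0.0, 0.0, 0.0, 0.0).
        - **stds** (tuple) - The standard deviations used in the deltas calculation. Default: (1.0, 1.0, 1.0, 1.0).
    Inputs:
        - **anchor_box** (Tensor) - Anchor boxes. The shape of `anchor_box` must be :math:`(n,4)`.
        - **groundtruth_box** (Tensor) - Ground-truth boxes. It has the same shape as `anchor_box`.
    Outputs:
        Tensor, the encoded bounding boxes. It has the same data type and shape as the input `anchor_box`.
    Raises:
        - **TypeError** - If `means` or `stds` is not a tuple.
        - **TypeError** - If `anchor_box` or `groundtruth_box` is not a Tensor.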
--- a/docs/api/api_python/ops/mindspore.ops.func_boundingbox_decode.rst
+++ b/docs/api/api_python/ops/mindspore.ops.func_boundingbox_decode.rst
@ -0,0 +1,24 @@
 mindspore.ops.boundingbox_decode
 ================================
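 .. py:function:: mindspore.ops.boundingbox_decode(anchor_box, deltas, max_shape, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0), wh_ratio_clip=0.016)
    Decodes bounding box locations.
    This operator converts the computed offsets (deltas) into bounding boxes (Bbox), which are used to mark targets in subsequent images, etc.
    Args:
        - **anchor_box** (Tensor) - Anchor boxes. The shape of `anchor_box` must be :math:`(n,4)`.
        - **deltas** (Tensor) - Deltas of the boxes. It has the same shape as `anchor_box`.
        - **means** (tuple) - The means used in the `deltas` calculation. Default: (0.0, 0.0, 0.0, 0.0).
        - **stds** (tuple) - The standard deviations used in the `deltas` calculation. Default: (1.0, 1.0, 1.0, 1.0).
        - **max_shape** (tuple) - The upper size limit for the decoded box calculation.
        - **wh_ratio_clip** (float) - The limit on the width and height ratio for the decoded box calculation. Default: 0.016.
    Returns:
        Tensor, the decoded boxes. It has the same data type and shape as `anchor_box`.
    Raises:
        - **TypeError** - If `means`, `stds` or `max_shape` is not a tuple.
        - **TypeError** - If `wh_ratio_clip` is not a float.
        - **TypeError** - If `anchor_box` or `deltas` is not a Tensor.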
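The decoding step documented above can be illustrated with a small NumPy sketch. This is not the MindSpore kernel: `bbox_decode_reference` is a hypothetical helper, and the formula (inclusive-pixel widths with a `+ 1`, clipping of the log-size deltas by `abs(log(wh_ratio_clip))`, and clamping to `max_shape` read as `(height, width)`) is an assumption borrowed from the common delta-to-box convention. It appears to reproduce the example output quoted in this commit up to rounding.

```python
import numpy as np


def bbox_decode_reference(anchor_box, deltas, max_shape,
                          means=(0.0, 0.0, 0.0, 0.0),
                          stds=(1.0, 1.0, 1.0, 1.0),
                          wh_ratio_clip=0.016):
    """NumPy sketch of delta decoding (hypothetical helper, not MindSpore code)."""
    anchor_box = np.asarray(anchor_box, dtype=np.float64)
    # Undo the normalisation applied at encoding time (assumed order: scale, then shift).
    deltas = np.asarray(deltas, dtype=np.float64) * np.asarray(stds) + np.asarray(means)

    # Limit the log-scale size deltas so exp() cannot explode.
    max_ratio = abs(np.log(wh_ratio_clip))
    dx, dy = deltas[:, 0], deltas[:, 1]
    dw = np.clip(deltas[:, 2], -max_ratio, max_ratio)
    dh = np.clip(deltas[:, 3], -max_ratio, max_ratio)

    # Anchor centre and size; "+ 1" treats coordinates as inclusive pixels.
    px = (anchor_box[:, 0] + anchor_box[:, 2]) * 0.5
    py = (anchor_box[:, 1] + anchor_box[:, 3]) * 0.5
    pw = anchor_box[:, 2] - anchor_box[:, 0] + 1.0
    ph = anchor_box[:, 3] - anchor_box[:, 1] + 1.0

    # Decoded centre and size.
    gx, gy = px + pw * dx, py + ph * dy
    gw, gh = pw * np.exp(dw), ph * np.exp(dh)

    # Back to corner form, clamped to the image bounds.
    height, width = max_shape
    x1 = np.clip(gx - gw * 0.5 + 0.5, 0, width - 1)
    y1 = np.clip(gy - gh * 0.5 + 0.5, 0, height - 1)
    x2 = np.clip(gx + gw * 0.5 - 0.5, 0, width - 1)
    y2 = np.clip(gy + gh * 0.5 - 0.5, 0, height - 1)
    return np.stack([x1, y1, x2, y2], axis=1)


decoded = bbox_decode_reference([[4, 1, 2, 1], [2, 2, 2, 3]],
                                [[3, 1, 2, 2], [1, 2, 1, 4]],
                                max_shape=(768, 1280))
print(decoded)
```

The first anchor in this example has `x2 < x1`, so its width is negative; the sketch keeps that behaviour rather than validating the input, which is why two of its decoded coordinates clamp to 0.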
--- a/docs/api/api_python/ops/mindspore.ops.func_boundingbox_encode.rst
+++ b/docs/api/api_python/ops/mindspore.ops.func_boundingbox_encode.rst
@ -0,0 +1,21 @@
 mindspore.ops.boundingbox_encode
 ================================
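 .. py:function:: mindspore.ops.boundingbox_encode(anchor_box, groundtruth_box, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0))
    Encodes bounding box locations.
    This operator calculates the offset between the predicted bounding boxes and the ground-truth bounding boxes; this offset is used as a variable for the loss.
    Args:
        - **anchor_box** (Tensor) - Anchor boxes. The shape of `anchor_box` must be :math:`(n,4)`.
        - **groundtruth_box** (Tensor) - Ground-truth boxes. It has the same shape as `anchor_box`.
        - **means** (tuple) - The means used in the bounding box encoding calculation. Default: (0.0, 0.0, 0.0, 0.0).
        - **stds** (tuple) - The standard deviations used in the deltas calculation. Default: (1.0, 1.0, 1.0, 1.0).
    Returns:
        Tensor, the encoded bounding boxes. It has the same data type and shape as the input `anchor_box`.
    Raises:
        - **TypeError** - If `means` or `stds` is not a tuple.
        - **TypeError** - If `anchor_box` or `groundtruth_box` is not a Tensor.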
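The encoding described above can be sketched in NumPy to make the offsets concrete. This is an illustration under assumptions, not the operator's implementation: `bbox_encode_reference` is a hypothetical name, and the formula (centre offsets scaled by anchor size, log size ratios, inclusive-pixel widths with a `+ 1`, then `(delta - means) / stds`) is the common box-to-delta convention. It appears to match the example output in this commit up to rounding.

```python
import numpy as np


def bbox_encode_reference(anchor_box, groundtruth_box,
                          means=(0.0, 0.0, 0.0, 0.0),
                          stds=(1.0, 1.0, 1.0, 1.0)):
    """NumPy sketch of delta encoding (hypothetical helper, not MindSpore code)."""
    anchor_box = np.asarray(anchor_box, dtype=np.float64)
    groundtruth_box = np.asarray(groundtruth_box, dtype=np.float64)

    # Anchor centre and size; "+ 1" treats coordinates as inclusive pixels.
    px = (anchor_box[:, 0] + anchor_box[:, 2]) * 0.5
    py = (anchor_box[:, 1] + anchor_box[:, 3]) * 0.5
    pw = anchor_box[:, 2] - anchor_box[:, 0] + 1.0
    ph = anchor_box[:, 3] - anchor_box[:, 1] + 1.0

    # Ground-truth centre and size, computed the same way.
    gx = (groundtruth_box[:, 0] + groundtruth_box[:, 2]) * 0.5
    gy = (groundtruth_box[:, 1] + groundtruth_box[:, 3]) * 0.5
    gw = groundtruth_box[:, 2] - groundtruth_box[:, 0] + 1.0
    gh = groundtruth_box[:, 3] - groundtruth_box[:, 1] + 1.0

    # Centre offsets are relative to the anchor size; sizes use a log ratio.
    deltas = np.stack([(gx - px) / pw,
                       (gy - py) / ph,
                       np.log(gw / pw),
                       np.log(gh / ph)], axis=1)

    # Normalise with the configured means and standard deviations (assumed order).
    return (deltas - np.asarray(means)) / np.asarray(stds)


encoded = bbox_encode_reference([[2, 2, 2, 3], [2, 2, 2, 3]],
                                [[1, 2, 1, 4], [1, 2, 1, 4]])
print(encoded)
```

With the default `means` and `stds` the normalisation is a no-op, so the example cannot confirm the order of the subtract and divide steps; treat that part as the least certain.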
--- a/mindspore/python/mindspore/ops/function/__init__.py
+++ b/mindspore/python/mindspore/ops/function/__init__.py
@ -325,6 +325,8 @@ from .nn_func import (
    glu,
    multi_margin_loss,
    multi_label_margin_loss,
    boundingbox_decode,
    boundingbox_encode
 )
 from .linalg_func import (
    svd,
--- a/mindspore/python/mindspore/ops/function/nn_func.py
+++ b/mindspore/python/mindspore/ops/function/nn_func.py
@ -3286,6 +3286,82 @@ def multi_label_margin_loss(inputs, target, reduction='mean'):
    return outputs
 def boundingbox_encode(anchor_box, groundtruth_box, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0)):
    """
    Encodes bounding box locations.
    This operator calculates the offset between the predicted bounding boxes and the ground-truth bounding boxes;
    this offset is used as a variable for the loss.
    Args:
        anchor_box (Tensor): Anchor boxes. The shape of `anchor_box` must be (n, 4).
        groundtruth_box (Tensor): Ground-truth boxes. It has the same shape as `anchor_box`.
        means (tuple): Means for encoding bounding boxes calculation. Default: (0.0, 0.0, 0.0, 0.0).
        stds (tuple): The standard deviations of deltas calculation. Default: (1.0, 1.0, 1.0, 1.0).
    Returns:
        Tensor, encoded bounding boxes. It has the same data type and shape as input `anchor_box`.
    Raises:
        TypeError: If `means` or `stds` is not a tuple.
        TypeError: If `anchor_box` or `groundtruth_box` is not a Tensor.
    Supported Platforms:
        ``Ascend`` ``GPU`` ``CPU``
    Examples:
        >>> import mindspore
        >>> from mindspore import Tensor, ops
        >>> anchor_box = Tensor([[2, 2, 2, 3], [2, 2, 2, 3]], mindspore.float32)
        >>> groundtruth_box = Tensor([[1, 2, 1, 4], [1, 2, 1, 4]], mindspore.float32)
        >>> output = ops.boundingbox_encode(anchor_box, groundtruth_box, means=(0.0, 0.0, 0.0, 0.0),
        ...                                 stds=(1.0, 1.0, 1.0, 1.0))
        >>> print(output)
        [[ -1.  0.25  0.  0.40551758]
         [ -1.  0.25  0.  0.40551758]]
    """
    boundingbox_encode_op = _get_cache_prim(P.BoundingBoxEncode)(means, stds)
    return boundingbox_encode_op(anchor_box, groundtruth_box)
 def boundingbox_decode(anchor_box, deltas, max_shape, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0),
                       wh_ratio_clip=0.016):
    """
    Decodes bounding box locations.
    This operator converts the computed offsets (deltas) into bounding boxes (Bbox),
    which are used to mark targets in subsequent images, etc.
    Args:
        anchor_box (Tensor): Anchor boxes. The shape of `anchor_box` must be (n, 4).
        deltas (Tensor): Deltas of the boxes. It has the same shape as `anchor_box`.
        means (tuple): The means of deltas calculation. Default: (0.0, 0.0, 0.0, 0.0).
        stds (tuple): The standard deviations of deltas calculation. Default: (1.0, 1.0, 1.0, 1.0).
        max_shape (tuple): The max size limit for decoding box calculation.
        wh_ratio_clip (float): The limit of width and height ratio for decoding box calculation. Default: 0.016.
    Returns:
        Tensor, decoded boxes. It has the same data type and shape as `anchor_box`.
    Raises:
        TypeError: If `means`, `stds` or `max_shape` is not a tuple.
        TypeError: If `wh_ratio_clip` is not a float.
        TypeError: If `anchor_box` or `deltas` is not a Tensor.
    Supported Platforms:
        ``Ascend`` ``GPU`` ``CPU``
    Examples:
        >>> import mindspore
        >>> from mindspore import Tensor, ops
        >>> anchor_box = Tensor([[4, 1, 2, 1], [2, 2, 2, 3]], mindspore.float32)
        >>> deltas = Tensor([[3, 1, 2, 2], [1, 2, 1, 4]], mindspore.float32)
        >>> output = ops.boundingbox_decode(anchor_box, deltas, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0),
        ...                                 max_shape=(768, 1280), wh_ratio_clip=0.016)
        >>> print(output)
        [[ 4.1953125  0.         0.         5.1953125]
         [ 2.140625   0.         3.859375  60.59375  ]]
    """
    boundingbox_decode_op = _get_cache_prim(P.BoundingBoxDecode)(max_shape, means, stds, wh_ratio_clip)
    return boundingbox_decode_op(anchor_box, deltas)
 __all__ = [
    'adaptive_avg_pool1d',
    'adaptive_avg_pool2d',
@ -3336,6 +3412,8 @@ __all__ = [
    'conv3d',
    'glu',
    'multi_margin_loss',
-    'multi_label_margin_loss'
+    'multi_label_margin_loss',
    'boundingbox_decode',
    'boundingbox_encode'
 ]
 __all__.sort()
--- a/mindspore/python/mindspore/ops/operations/other_ops.py
+++ b/mindspore/python/mindspore/ops/operations/other_ops.py
@ -113,20 +113,7 @@ class BoundingBoxEncode(PrimitiveWithInfer):
    This operator will calculate the offset between the predicted bounding boxes and the real bounding boxes,
    and this offset will be used as a variable for the loss.
-    Args:
+    Refer to :func:`mindspore.ops.boundingbox_encode` for more details.
        means (tuple): Means for encoding bounding boxes calculation. Default: (0.0, 0.0, 0.0, 0.0).
        stds (tuple): The standard deviations of deltas calculation. Default: (1.0, 1.0, 1.0, 1.0).
    Inputs:
        - **anchor_box** (Tensor) - Anchor boxes. The shape of anchor_box must be (n, 4).
        - **groundtruth_box** (Tensor) - Ground truth boxes. Which has the same shape with anchor_box.
    Outputs:
        Tensor, encoded bounding boxes. It has the same data type and shape as input `anchor_box`.
    Raises:
        TypeError: If `means` or `stds` is not a tuple.
        TypeError: If `anchor_box` or `groundtruth_box` is not a Tensor.
    Supported Platforms:
        ``Ascend`` ``GPU`` ``CPU``
@ -140,7 +127,6 @@ class BoundingBoxEncode(PrimitiveWithInfer):
        [[ -1.  0.25  0.  0.40551758]
         [ -1.  0.25  0.  0.40551758]]
    """
    @prim_attr_register
    def __init__(self, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0)):
        """Initialize BoundingBoxEncode."""
@ -223,23 +209,7 @@ class BoundingBoxDecode(Primitive):
    The function of the operator is to calculate the offset, and this operator converts the offset into a Bbox,
    which is used to mark the target in the subsequent images, etc.
-    Args:
+    Refer to :func:`mindspore.ops.boundingbox_decode` for more details.
        means (tuple): The means of deltas calculation. Default: (0.0, 0.0, 0.0, 0.0).
        stds (tuple): The standard deviations of deltas calculation. Default: (1.0, 1.0, 1.0, 1.0).
        max_shape (tuple): The max size limit for decoding box calculation.
        wh_ratio_clip (float): The limit of width and height ratio for decoding box calculation. Default: 0.016.
    Inputs:
        - **anchor_box** (Tensor) - Anchor boxes. The shape of `anchor_box` must be (n, 4).
        - **deltas** (Tensor) - Delta of boxes. Which has the same shape with `anchor_box`.
    Outputs:
        Tensor, decoded boxes. It has the same data type and shape as `anchor_box`.
    Raises:
        TypeError: If `means`, `stds` or `max_shape` is not a tuple.
        TypeError: If `wh_ratio_clip` is not a float.
        TypeError: If `anchor_box` or `deltas` is not a Tensor.
    Supported Platforms:
        ``Ascend`` ``GPU`` ``CPU``
@ -255,7 +225,6 @@ class BoundingBoxDecode(Primitive):
         [ 2.140625   0.         3.859375  60.59375  ]]
    """
    @prim_attr_register
    def __init__(self, max_shape, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0), wh_ratio_clip=0.016):
        """Initialize BoundingBoxDecode."""