diff --git a/docs/api/api_python/mindspore.ops.function.rst b/docs/api/api_python/mindspore.ops.function.rst index e0ac741bf1e..4891257ec0a 100644 --- a/docs/api/api_python/mindspore.ops.function.rst +++ b/docs/api/api_python/mindspore.ops.function.rst @@ -22,6 +22,8 @@ mindspore.ops.function mindspore.ops.avg_pool3d mindspore.ops.batch_norm mindspore.ops.bias_add + mindspore.ops.boundingbox_encode + mindspore.ops.boundingbox_decode mindspore.ops.ctc_greedy_decoder mindspore.ops.conv2d mindspore.ops.conv3d diff --git a/docs/api/api_python/ops/mindspore.ops.BoundingBoxDecode.rst b/docs/api/api_python/ops/mindspore.ops.BoundingBoxDecode.rst index 1b95b046b12..4bd1cab2ff3 100644 --- a/docs/api/api_python/ops/mindspore.ops.BoundingBoxDecode.rst +++ b/docs/api/api_python/ops/mindspore.ops.BoundingBoxDecode.rst @@ -7,20 +7,4 @@ mindspore.ops.BoundingBoxDecode 算子的功能是计算偏移量,此算子将偏移量转换为Bbox,用于在后续图像中标记目标等。 - 参数: - - **means** (tuple) - 计算 `deltas` 的均值。默认值:(0.0, 0.0, 0.0, 0.0, 0.0)。 - - **stds** (tuple) - 计算 `deltas` 的标准差。默认值:(1.0, 1.0, 1.0, 1.0)。 - - **max_shape** (tuple) - 解码框计算的上限值。 - - **wh_ratio_clip** (float) - 解码框计算的宽高比限制。默认值:0.016。 - - 输入: - - **anchor_box** (Tensor) - 锚框。锚框的shape必须为 :math:`(n,4)` 。 - - **deltas** (Tensor) - 框的增量。它的shape与 `anchor_box` 相同。 - - 输出: - Tensor,解码框。它的数据类型和shape与 `anchor_box` 相同。 - - 异常: - - **TypeError** - 如果 `means` 、 `stds` 或 `max_shape` 不是tuple。 - - **TypeError** - 如果 `wh_ratio_clip` 不是float。 - - **TypeError** - 如果 `anchor_box` 或 `deltas` 不是Tensor。 + 更多细节详见 :func:`mindspore.ops.boundingbox_decode`。 diff --git a/docs/api/api_python/ops/mindspore.ops.BoundingBoxEncode.rst b/docs/api/api_python/ops/mindspore.ops.BoundingBoxEncode.rst index c404ae9f3d3..a6eaf177db6 100644 --- a/docs/api/api_python/ops/mindspore.ops.BoundingBoxEncode.rst +++ b/docs/api/api_python/ops/mindspore.ops.BoundingBoxEncode.rst @@ -7,17 +7,4 @@ mindspore.ops.BoundingBoxEncode 算子的功能是计算预测边界框和真实边界框之间的偏移,并将此偏移作为损失变量。 - 参数: - - **means** (tuple) - 计算编码边界框的均值。默认值:(0.0, 0.0, 
0.0, 0.0, 0.0)。 - - **stds** (tuple) - 计算增量的标准偏差。默认值:(1.0、1.0、1.0、1.0)。 - - 输入: - - **anchor_box** (Tensor) - 锚框。锚框的shape必须为 :math:`(n,4)` 。 - - **groundtruth_box** (Tensor) - 真实边界框。它的shape与锚框相同。 - - 输出: - Tensor,编码边界框。数据类型和shape与输入 `anchor_box` 相同。 - - 异常: - - **TypeError** - 如果 `means` 或 `stds` 不是tuple。 - - **TypeError** - 如果 `anchor_box` 或 `groundtruth_box` 不是Tensor。 + 更多细节详见 :func:`mindspore.ops.boundingbox_encode`。 \ No newline at end of file diff --git a/docs/api/api_python/ops/mindspore.ops.func_boundingbox_decode.rst b/docs/api/api_python/ops/mindspore.ops.func_boundingbox_decode.rst new file mode 100644 index 00000000000..9461441b0d7 --- /dev/null +++ b/docs/api/api_python/ops/mindspore.ops.func_boundingbox_decode.rst @@ -0,0 +1,24 @@ +mindspore.ops.boundingbox_decode +================================ + +.. py:function:: mindspore.ops.boundingbox_decode(anchor_box, deltas, max_shape, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0), wh_ratio_clip=0.016) + + 解码边界框位置信息。 + + 算子的功能是计算偏移量,此算子将偏移量转换为Bbox,用于在后续图像中标记目标等。 + + 参数: + - **anchor_box** (Tensor) - 锚框。锚框的shape必须为 :math:`(n,4)` 。 + - **deltas** (Tensor) - 框的增量。它的shape与 `anchor_box` 相同。 + - **means** (tuple) - 计算 `deltas` 的均值。默认值:(0.0, 0.0, 0.0, 0.0)。 + - **stds** (tuple) - 计算 `deltas` 的标准差。默认值:(1.0, 1.0, 1.0, 1.0)。 + - **max_shape** (tuple) - 解码框计算的上限值。 + - **wh_ratio_clip** (float) - 解码框计算的宽高比限制。默认值:0.016。 + + 返回: + Tensor,解码框。它的数据类型和shape与 `anchor_box` 相同。 + + 异常: + - **TypeError** - 如果 `means` 、 `stds` 或 `max_shape` 不是tuple。 + - **TypeError** - 如果 `wh_ratio_clip` 不是float。 + - **TypeError** - 如果 `anchor_box` 或 `deltas` 不是Tensor。 diff --git a/docs/api/api_python/ops/mindspore.ops.func_boundingbox_encode.rst b/docs/api/api_python/ops/mindspore.ops.func_boundingbox_encode.rst new file mode 100644 index 00000000000..d45ab30a757 --- /dev/null +++ b/docs/api/api_python/ops/mindspore.ops.func_boundingbox_encode.rst @@ -0,0 +1,21 @@ +mindspore.ops.boundingbox_encode +================================ + +..
py:function:: mindspore.ops.boundingbox_encode(anchor_box, groundtruth_box, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0)) + + 编码边界框位置信息。 + + 算子的功能是计算预测边界框和真实边界框之间的偏移,并将此偏移作为损失变量。 + + 参数: + - **anchor_box** (Tensor) - 锚框。锚框的shape必须为 :math:`(n,4)` 。 + - **groundtruth_box** (Tensor) - 真实边界框。它的shape与锚框相同。 + - **means** (tuple) - 计算编码边界框的均值。默认值:(0.0, 0.0, 0.0, 0.0)。 + - **stds** (tuple) - 计算增量的标准偏差。默认值:(1.0, 1.0, 1.0, 1.0)。 + + 返回: + Tensor,编码边界框。数据类型和shape与输入 `anchor_box` 相同。 + + 异常: + - **TypeError** - 如果 `means` 或 `stds` 不是tuple。 + - **TypeError** - 如果 `anchor_box` 或 `groundtruth_box` 不是Tensor。 diff --git a/mindspore/python/mindspore/ops/function/__init__.py b/mindspore/python/mindspore/ops/function/__init__.py index e436596950a..106b25cbff5 100644 --- a/mindspore/python/mindspore/ops/function/__init__.py +++ b/mindspore/python/mindspore/ops/function/__init__.py @@ -325,6 +325,8 @@ from .nn_func import ( glu, multi_margin_loss, multi_label_margin_loss, + boundingbox_decode, + boundingbox_encode ) from .linalg_func import ( svd, diff --git a/mindspore/python/mindspore/ops/function/nn_func.py b/mindspore/python/mindspore/ops/function/nn_func.py index 2b0ceca32b4..95757713dbd 100644 --- a/mindspore/python/mindspore/ops/function/nn_func.py +++ b/mindspore/python/mindspore/ops/function/nn_func.py @@ -3286,6 +3286,82 @@ def multi_label_margin_loss(inputs, target, reduction='mean'): return outputs +def boundingbox_encode(anchor_box, groundtruth_box, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0)): + """ + Encodes bounding box locations. + + This operator will calculate the offset between the predicted bounding boxes and the real bounding boxes, + and this offset will be used as a variable for the loss. + + Args: + anchor_box (Tensor): Anchor boxes. The shape of `anchor_box` must be (n, 4). + groundtruth_box (Tensor): Ground truth boxes. It has the same shape as `anchor_box`. + means (tuple): Means for encoding bounding boxes calculation. Default: (0.0, 0.0, 0.0, 0.0).
+ stds (tuple): The standard deviations of deltas calculation. Default: (1.0, 1.0, 1.0, 1.0). + + Returns: + Tensor, encoded bounding boxes. It has the same data type and shape as input `anchor_box`. + + Raises: + TypeError: If `means` or `stds` is not a tuple. + TypeError: If `anchor_box` or `groundtruth_box` is not a Tensor. + + Supported Platforms: + ``Ascend`` ``GPU`` ``CPU`` + + Examples: + >>> anchor_box = Tensor([[2, 2, 2, 3], [2, 2, 2, 3]], mindspore.float32) + >>> groundtruth_box = Tensor([[1, 2, 1, 4], [1, 2, 1, 4]], mindspore.float32) + >>> output = ops.boundingbox_encode(anchor_box, groundtruth_box, means=(0.0, 0.0, 0.0, 0.0), + ... stds=(1.0, 1.0, 1.0, 1.0)) + >>> print(output) + [[ -1. 0.25 0. 0.40551758] + [ -1. 0.25 0. 0.40551758]] + """ + boundingbox_encode_op = _get_cache_prim(P.BoundingBoxEncode)(means, stds) + return boundingbox_encode_op(anchor_box, groundtruth_box) + + +def boundingbox_decode(anchor_box, deltas, max_shape, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0), + wh_ratio_clip=0.016): + """ + Decodes bounding box locations. + + The function of the operator is to calculate the offset, and this operator converts the offset into a Bbox, + which is used to mark the target in the subsequent images, etc. + + Args: + anchor_box (Tensor): Anchor boxes. The shape of `anchor_box` must be (n, 4). + deltas (Tensor): Delta of boxes. It has the same shape as `anchor_box`. + means (tuple): The means of deltas calculation. Default: (0.0, 0.0, 0.0, 0.0). + stds (tuple): The standard deviations of deltas calculation. Default: (1.0, 1.0, 1.0, 1.0). + max_shape (tuple): The max size limit for decoding box calculation. + wh_ratio_clip (float): The limit of width and height ratio for decoding box calculation. Default: 0.016. + + Returns: + Tensor, decoded boxes. It has the same data type and shape as `anchor_box`. + + Raises: + TypeError: If `means`, `stds` or `max_shape` is not a tuple. + TypeError: If `wh_ratio_clip` is not a float.
+ TypeError: If `anchor_box` or `deltas` is not a Tensor. + + Supported Platforms: + ``Ascend`` ``GPU`` ``CPU`` + + Examples: + >>> anchor_box = Tensor([[4, 1, 2, 1], [2, 2, 2, 3]], mindspore.float32) + >>> deltas = Tensor([[3, 1, 2, 2], [1, 2, 1, 4]], mindspore.float32) + >>> output = ops.boundingbox_decode(anchor_box, deltas, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0), + ... max_shape=(768, 1280), wh_ratio_clip=0.016) + >>> print(output) + [[ 4.1953125 0. 0. 5.1953125] + [ 2.140625 0. 3.859375 60.59375 ]] + """ + boundingbox_decode_op = _get_cache_prim(P.BoundingBoxDecode)(max_shape, means, stds, wh_ratio_clip) + return boundingbox_decode_op(anchor_box, deltas) + + __all__ = [ 'adaptive_avg_pool1d', 'adaptive_avg_pool2d', @@ -3336,6 +3412,8 @@ __all__ = [ 'conv3d', 'glu', 'multi_margin_loss', - 'multi_label_margin_loss' + 'multi_label_margin_loss', + 'boundingbox_decode', + 'boundingbox_encode' ] __all__.sort() diff --git a/mindspore/python/mindspore/ops/operations/other_ops.py b/mindspore/python/mindspore/ops/operations/other_ops.py index 9ac8f003bb6..e07275089bd 100644 --- a/mindspore/python/mindspore/ops/operations/other_ops.py +++ b/mindspore/python/mindspore/ops/operations/other_ops.py @@ -113,20 +113,7 @@ class BoundingBoxEncode(PrimitiveWithInfer): This operator will calculate the offset between the predicted bounding boxes and the real bounding boxes, and this offset will be used as a variable for the loss. - Args: - means (tuple): Means for encoding bounding boxes calculation. Default: (0.0, 0.0, 0.0, 0.0). - stds (tuple): The standard deviations of deltas calculation. Default: (1.0, 1.0, 1.0, 1.0). - - Inputs: - - **anchor_box** (Tensor) - Anchor boxes. The shape of anchor_box must be (n, 4). - - **groundtruth_box** (Tensor) - Ground truth boxes. Which has the same shape with anchor_box. - - Outputs: - Tensor, encoded bounding boxes. It has the same data type and shape as input `anchor_box`. 
- - Raises: - TypeError: If `means` or `stds` is not a tuple. - TypeError: If `anchor_box` or `groundtruth_box` is not a Tensor. + Refer to :func:`mindspore.ops.boundingbox_encode` for more detail. Supported Platforms: ``Ascend`` ``GPU`` ``CPU`` @@ -140,7 +127,6 @@ class BoundingBoxEncode(PrimitiveWithInfer): [[ -1. 0.25 0. 0.40551758] [ -1. 0.25 0. 0.40551758]] """ - @prim_attr_register def __init__(self, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0)): """Initialize BoundingBoxEncode.""" @@ -223,23 +209,7 @@ class BoundingBoxDecode(Primitive): The function of the operator is to calculate the offset, and this operator converts the offset into a Bbox, which is used to mark the target in the subsequent images, etc. - Args: - means (tuple): The means of deltas calculation. Default: (0.0, 0.0, 0.0, 0.0). - stds (tuple): The standard deviations of deltas calculation. Default: (1.0, 1.0, 1.0, 1.0). - max_shape (tuple): The max size limit for decoding box calculation. - wh_ratio_clip (float): The limit of width and height ratio for decoding box calculation. Default: 0.016. - - Inputs: - - **anchor_box** (Tensor) - Anchor boxes. The shape of `anchor_box` must be (n, 4). - - **deltas** (Tensor) - Delta of boxes. Which has the same shape with `anchor_box`. - - Outputs: - Tensor, decoded boxes. It has the same data type and shape as `anchor_box`. - - Raises: - TypeError: If `means`, `stds` or `max_shape` is not a tuple. - TypeError: If `wh_ratio_clip` is not a float. - TypeError: If `anchor_box` or `deltas` is not a Tensor. + Refer to :func:`mindspore.ops.boundingbox_decode` for more detail. Supported Platforms: ``Ascend`` ``GPU`` ``CPU`` @@ -255,7 +225,6 @@ class BoundingBoxDecode(Primitive): [ 2.140625 0. 3.859375 60.59375 ]] """ - @prim_attr_register def __init__(self, max_shape, means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0), wh_ratio_clip=0.016): """Initialize BoundingBoxDecode."""
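Reviewer note: the docstrings in this patch describe the encode/decode behavior only in words, so the following NumPy sketch may help when checking the example outputs. It is not taken from the MindSpore kernels. It assumes the common center/size delta parameterization with a "+1" pixel-inclusive width and height, that `max_shape` is `(height, width)`, and that `wh_ratio_clip` bounds the log-scale deltas; these assumptions were chosen because they reproduce the docstring example values to within about 1e-2. The names `bbox_encode_ref` and `bbox_decode_ref` are hypothetical.

```python
import numpy as np


def _center_size(boxes):
    """Return (cx, cy, w, h) for boxes given as (x1, y1, x2, y2).

    Assumption: width/height use the pixel-inclusive "+1" convention.
    """
    cx = (boxes[:, 0] + boxes[:, 2]) * 0.5
    cy = (boxes[:, 1] + boxes[:, 3]) * 0.5
    w = boxes[:, 2] - boxes[:, 0] + 1.0
    h = boxes[:, 3] - boxes[:, 1] + 1.0
    return cx, cy, w, h


def bbox_encode_ref(anchor_box, groundtruth_box,
                    means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0)):
    """Hypothetical reference: offsets from anchors to ground-truth boxes."""
    a = np.asarray(anchor_box, dtype=np.float64)
    g = np.asarray(groundtruth_box, dtype=np.float64)
    px, py, pw, ph = _center_size(a)
    gx, gy, gw, gh = _center_size(g)
    deltas = np.stack(
        [(gx - px) / pw, (gy - py) / ph, np.log(gw / pw), np.log(gh / ph)],
        axis=1,
    )
    # Normalize the raw deltas with the given means and standard deviations.
    return (deltas - np.asarray(means)) / np.asarray(stds)


def bbox_decode_ref(anchor_box, deltas, max_shape,
                    means=(0.0, 0.0, 0.0, 0.0), stds=(1.0, 1.0, 1.0, 1.0),
                    wh_ratio_clip=0.016):
    """Hypothetical reference: apply deltas to anchors and clip to the image."""
    a = np.asarray(anchor_box, dtype=np.float64)
    d = np.asarray(deltas, dtype=np.float64) * np.asarray(stds) + np.asarray(means)
    # Bound the log-scale terms so exp() cannot blow up the box size.
    max_ratio = abs(np.log(wh_ratio_clip))
    dw = np.clip(d[:, 2], -max_ratio, max_ratio)
    dh = np.clip(d[:, 3], -max_ratio, max_ratio)
    px, py, pw, ph = _center_size(a)
    gx = px + pw * d[:, 0]
    gy = py + ph * d[:, 1]
    gw = pw * np.exp(dw)
    gh = ph * np.exp(dh)
    height, width = max_shape  # assumption: (height, width) ordering
    x1 = np.clip(gx - gw * 0.5 + 0.5, 0, width - 1)
    y1 = np.clip(gy - gh * 0.5 + 0.5, 0, height - 1)
    x2 = np.clip(gx + gw * 0.5 - 0.5, 0, width - 1)
    y2 = np.clip(gy + gh * 0.5 - 0.5, 0, height - 1)
    return np.stack([x1, y1, x2, y2], axis=1)
```

With the inputs from the two docstring examples, this sketch agrees with the printed outputs up to the rounding visible in those outputs, which suggests (but does not prove) that the operators follow the same convention.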