!41228 add more comment for api: bucket_batch_by_length

Merge pull request !41228 from guozhijian/code_docs_comment_for_bucket
2022-08-31 08:20:16 +00:00 · 2022-08-31 08:20:16 +00:00 · 9b0c87bbba
parent 87371498fd 9e9362d3c8
commit 9b0c87bbba
6 changed files with 8 additions and 0 deletions
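
The docstrings touched by this change describe the bucketing flow in prose and add a figure. As a rough illustration of that flow, here is a minimal pure-Python sketch. It is not MindSpore's implementation. It assumes that bucket `i` holds lengths in the half-open range `[bucket_boundaries[i-1], bucket_boundaries[i])`, that there is one more bucket than there are boundaries, that a full bucket is padded to its longest row, and that leftover rows are emitted as partial batches at the end. The `pad_value` parameter is a hypothetical stand-in for `pad_info`.

```python
from bisect import bisect_right


def bucket_batch_by_length(rows, bucket_boundaries, bucket_batch_sizes, pad_value=0):
    """Illustrative sketch only; parameter semantics are assumptions."""
    # Assumption: n boundaries define n + 1 buckets, so one batch size per bucket.
    if len(bucket_batch_sizes) != len(bucket_boundaries) + 1:
        raise ValueError("need len(bucket_boundaries) + 1 batch sizes")

    buckets = [[] for _ in bucket_batch_sizes]

    def pad(bucket):
        # Pad every row in the bucket to the longest row it contains.
        width = max(len(r) for r in bucket)
        return [r + [pad_value] * (width - len(r)) for r in bucket]

    for row in rows:
        # The element length here is simply len(row); pick the matching bucket.
        idx = bisect_right(bucket_boundaries, len(row))
        buckets[idx].append(list(row))
        # Once the bucket reaches its configured size, pad it and emit a batch.
        if len(buckets[idx]) == bucket_batch_sizes[idx]:
            yield pad(buckets[idx])
            buckets[idx] = []

    # Assumption: rows left in partially filled buckets are still emitted.
    for bucket in buckets:
        if bucket:
            yield pad(bucket)


rows = [[1], [1, 2, 3, 4], [1, 2], [1, 2, 3, 4, 5], [1, 2, 3]]
batches = list(bucket_batch_by_length(rows, bucket_boundaries=[3], bucket_batch_sizes=[2, 2]))
```

With a single boundary of 3, rows shorter than 3 go to the first bucket and all others to the second; each bucket emits a padded batch as soon as it holds two rows.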
--- a/docs/api/api_python/dataset/bucket_batch_by_length_cn.png
+++ b/docs/api/api_python/dataset/bucket_batch_by_length_cn.png
--- a/docs/api/api_python/dataset/bucket_batch_by_length_en.png
+++ b/docs/api/api_python/dataset/bucket_batch_by_length_en.png
--- a/docs/api/api_python/dataset/mindspore.dataset.Dataset.b.rst
+++ b/docs/api/api_python/dataset/mindspore.dataset.Dataset.b.rst
@@ -5,6 +5,10 @@
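    Computes the length of each data row in the dataset. Based on the computed length and the range of each bucket, the row is assigned to a specific bucket.
    When the number of rows in a bucket reaches the size specified in `bucket_batch_sizes`, the bucket is padded according to `pad_info` and then batched.

+    Refer to the following figure for the execution process:
+
+    .. image:: bucket_batch_by_length_cn.png
+
    Parameters:
        - **column_names** (list[str]) - Columns passed to `element_length_function`, used to compute the length of the data.
        - **bucket_boundaries** (list[int]) - The upper boundary of each bucket. The values in the list must be strictly increasing.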
--- a/docs/api/api_python/dataset_pipeline.png
+++ b/docs/api/api_python/dataset_pipeline.png
--- a/docs/api/api_python_en/dataset_pipeline_en.png
+++ b/docs/api/api_python_en/dataset_pipeline_en.png
--- a/mindspore/python/mindspore/dataset/engine/datasets.py
+++ b/mindspore/python/mindspore/dataset/engine/datasets.py
@@ -479,6 +479,10 @@ class Dataset:
        corresponding size specified in bucket_batch_sizes, the entire bucket will be
        padded according to pad_info, and then form a batch.

+        Refer to the following figure for the execution process:
+
+        .. image:: bucket_batch_by_length_en.png
+
        Args:
            column_names (list[str]): Columns passed to element_length_function.
            bucket_boundaries (list[int]): A list consisting of the upper boundaries