!41228 add more comment for api: bucket_batch_by_length

Merge pull request !41228 from guozhijian/code_docs_comment_for_bucket
i-robot 2022-08-31 08:20:16 +00:00 committed by Gitee
commit 9b0c87bbba
6 changed files with 8 additions and 0 deletions

Binary file not shown (new image; size after: 17 KiB).

Binary file not shown (new image; size after: 16 KiB).

@@ -5,6 +5,10 @@
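Compute the length of each row in the dataset. Based on that length and the range of each bucket, assign the row to a specific bucket.
When the number of rows in a bucket reaches the size specified by `bucket_batch_sizes`, the bucket is padded according to `pad_info` and then batched.
Refer to the following figure for the execution process:
.. image:: bucket_batch_by_length_cn.png
Parameters:
- **column_names** (list[str]) - Columns passed to `element_length_function`, used to compute the length of the data.
- **bucket_boundaries** (list[int]) - Upper boundaries of the buckets. The values in the list must be strictly increasing.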
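
The bucketing flow described above can be sketched in plain Python. This is only an illustration of the idea, not the MindSpore implementation: the function name `bucket_batch_sketch`, the `pad_value` parameter, and the choice to pad each batch to its longest row are hypothetical. It also assumes that n boundaries define n + 1 buckets with half-open ranges, so a row whose length equals a boundary falls into the next bucket.

```python
from bisect import bisect_right


def bucket_batch_sketch(rows, bucket_boundaries, bucket_batch_sizes, pad_value=0):
    """Illustrative sketch of length-based bucketing (hypothetical helper)."""
    # Assumption: n boundaries give n + 1 buckets with ranges
    # [0, b0), [b0, b1), ..., [b_last, inf).
    if len(bucket_batch_sizes) != len(bucket_boundaries) + 1:
        raise ValueError("need exactly one batch size per bucket")
    if any(a >= b for a, b in zip(bucket_boundaries, bucket_boundaries[1:])):
        raise ValueError("bucket_boundaries must be strictly increasing")

    buckets = [[] for _ in bucket_batch_sizes]
    batches = []
    for row in rows:
        # Number of boundaries <= len(row) is the index of the target bucket.
        idx = bisect_right(bucket_boundaries, len(row))
        buckets[idx].append(row)
        if len(buckets[idx]) == bucket_batch_sizes[idx]:
            # Bucket is full: pad every row to the longest row, then emit a batch.
            width = max(len(r) for r in buckets[idx])
            batches.append([r + [pad_value] * (width - len(r)) for r in buckets[idx]])
            buckets[idx] = []
    # Return finished batches plus whatever is still waiting in each bucket.
    return batches, buckets
```

Calling `bucket_batch_sketch(rows, [3, 6], [2, 2, 1])` on a list of integer lists returns the completed batches together with the rows still waiting in each bucket, which makes it easy to see how a remainder could be dropped or flushed.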

Binary file not shown (image; 56 KiB before, 52 KiB after).

Binary file not shown (image; 35 KiB before, 53 KiB after).

@@ -479,6 +479,10 @@ class Dataset:
corresponding size specified in bucket_batch_sizes, the entire bucket will be
padded according to pad_info and then formed into a batch.
Refer to the following figure for the execution process:
.. image:: bucket_batch_by_length_en.png
Args:
column_names (list[str]): Columns passed to element_length_function.
bucket_boundaries (list[int]): A list consisting of the upper boundaries