forked from mindspore-Ecosystem/mindspore
add comment for api: bucket_batch_by_length
parent
0765c2cc3c
commit
9e9362d3c8
Binary file not shown (added; 17 KiB).
Binary file not shown (added; 16 KiB).
@@ -5,6 +5,10 @@

Computes the length of each sample in the dataset, and assigns the sample to a particular bucket based on that length and each bucket's range.

When the number of samples in a bucket reaches the size specified in `bucket_batch_sizes`, the bucket is padded according to `pad_info` and then formed into a batch.

Refer to the following figure for the execution process:

.. image:: bucket_batch_by_length_cn.png

Args:

- **column_names** (list[str]) - The columns passed to `element_length_function` to compute each sample's length.
- **bucket_boundaries** (list[int]) - A list specifying the upper boundary of each bucket; the values must be strictly increasing.
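The `bucket_boundaries` semantics can be sketched in plain Python: n strictly increasing boundaries define n + 1 buckets, and a sample's bucket is found by where its length falls among the boundaries. This is a minimal illustration, not MindSpore's implementation; `bucket_index` is a hypothetical helper.

```python
import bisect

def bucket_index(length, bucket_boundaries):
    # Boundaries [b1, ..., bn] define n + 1 buckets:
    # [0, b1), [b1, b2), ..., [bn, inf).
    # bisect_right returns the index of the first boundary
    # strictly greater than `length`, i.e. the bucket index.
    return bisect.bisect_right(bucket_boundaries, length)

boundaries = [3, 6, 9]  # hypothetical, strictly increasing
print(bucket_index(2, boundaries))  # 0: lengths in [0, 3)
print(bucket_index(6, boundaries))  # 2: lengths in [6, 9)
print(bucket_index(9, boundaries))  # 3: lengths in [9, inf)
```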
Binary file not shown (changed; 56 KiB → 52 KiB).
Binary file not shown (changed; 35 KiB → 53 KiB).
@@ -479,6 +479,10 @@ class Dataset:

corresponding size specified in bucket_batch_sizes, the entire bucket will be
padded according to pad_info, and then form a batch.

Refer to the following figure for the execution process:

.. image:: bucket_batch_by_length_en.png

Args:
column_names (list[str]): Columns passed to element_length_function.
bucket_boundaries (list[int]): A list consisting of the upper boundaries
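The pad-then-batch behavior this docstring describes can be sketched in plain Python. This is a hedged illustration under simplified assumptions, not the library function: a scalar `pad_value` stands in for `pad_info`, partially filled buckets are simply dropped at the end, and `bucket_batch_by_length_sketch` is a hypothetical name.

```python
import bisect

def bucket_batch_by_length_sketch(samples, bucket_boundaries,
                                  bucket_batch_sizes, pad_value=0):
    # Boundaries [b1, ..., bn] define n + 1 buckets: [0, b1), ..., [bn, inf).
    buckets = [[] for _ in range(len(bucket_boundaries) + 1)]
    batches = []
    for sample in samples:
        idx = bisect.bisect_right(bucket_boundaries, len(sample))
        buckets[idx].append(sample)
        if len(buckets[idx]) == bucket_batch_sizes[idx]:
            # The bucket is full: pad every sample to the bucket's longest
            # length, then emit the bucket as one batch.
            longest = max(len(s) for s in buckets[idx])
            batches.append([s + [pad_value] * (longest - len(s))
                            for s in buckets[idx]])
            buckets[idx] = []
    return batches

batches = bucket_batch_by_length_sketch(
    [[1, 2], [3], [1, 2, 3, 4, 5, 6], [7, 8], [9]],
    bucket_boundaries=[3, 6],
    bucket_batch_sizes=[2, 2, 2],
)
# Two full batches come from the first bucket (lengths < 3),
# each padded to that bucket's longest sample length (2).
```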