!44822 Fix problems with some Chinese API reviews

Merge pull request !44822 from 刘勇琪/code_docs_modify_chinese_api
i-robot 2022-11-02 06:36:55 +00:00 committed by Gitee
commit d4db709ee0
22 changed files with 454 additions and 415 deletions


@@ -3,7 +3,7 @@ mindspore.dataset.Dataset.shuffle
.. py:method:: mindspore.dataset.Dataset.shuffle(buffer_size)
Shuffle the rows of this dataset using the following strategy:
Shuffle this dataset by creating a buffer of `buffer_size` rows.
1. Generate a shuffle buffer containing `buffer_size` rows.
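The buffer-based strategy described above can be sketched in plain Python as follows. This is a minimal illustration of the idea, not MindSpore's actual implementation; the function name and seed handling are assumptions for the example.

```python
import random

def buffered_shuffle(rows, buffer_size, seed=None):
    """Yield rows in shuffled order using a fixed-size shuffle buffer."""
    rng = random.Random(seed)
    buffer = []
    for row in rows:
        if len(buffer) < buffer_size:
            buffer.append(row)        # 1. fill the buffer with buffer_size rows
        else:
            idx = rng.randrange(len(buffer))
            yield buffer[idx]         # 2. emit a random buffered row ...
            buffer[idx] = row         #    ... and replace it with the next input row
    rng.shuffle(buffer)               # 3. drain the remaining buffered rows
    yield from buffer
```

With `buffer_size` equal to the dataset size this degenerates to a full global shuffle; with `buffer_size=1` the original order is preserved.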


@@ -12,17 +12,23 @@ mindspore.dataset.AGNewsDataset
- **usage** (str, optional) - Subset of the dataset to load, one of 'train', 'test' or 'all'. Default: None, read all samples.
- **num_samples** (int, optional) - Number of samples to read from the dataset. Default: None, read all samples.
- **num_parallel_workers** (int, optional) - Number of worker threads used to read the data. Default: None, use the number of threads configured in mindspore.dataset.config.
- **shuffle** (Union[bool, Shuffle], optional) - Per-epoch shuffle mode, specified either as a bool or as an enum value, default: mindspore.dataset.Shuffle.GLOBAL.
- **shuffle** (Union[bool, Shuffle], optional) - Per-epoch shuffle mode, specified either as a bool or as an enum value. Default: `Shuffle.GLOBAL` .
If `shuffle` is False, the data is not shuffled; if `shuffle` is True, it is equivalent to setting `shuffle` to mindspore.dataset.Shuffle.GLOBAL.
The shuffle mode can be set by passing an enum value:
- **Shuffle.GLOBAL**: shuffle both the files and the samples.
- **Shuffle.FILES**: shuffle the files only.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training, default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training, default: None. Can only be specified when `num_shards` is also specified.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training. Default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training. Default: None. Can only be specified when `num_shards` is also specified.
- **cache** (DatasetCache, optional) - Single-node data cache service used to speed up dataset processing; see `Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/zh-CN/master/dataset/cache.html>`_ for details. Default: None, no cache is used.
Raises:
- **RuntimeError** - The directory pointed to by `dataset_dir` does not exist or is missing dataset files.
- **RuntimeError** - `num_shards` is specified but `shard_id` is not.
- **RuntimeError** - `shard_id` is specified but `num_shards` is not.
- **ValueError** - `num_parallel_workers` exceeds the maximum number of system threads.
**About the AGNews dataset**
AG is a large collection of more than 1 million news articles, gathered from over 2,000 news sources by ComeToMyHead over more than a year of activity. ComeToMyHead is an academic news search engine that has been operating since July 2004.
@@ -52,5 +58,5 @@ mindspore.dataset.AGNewsDataset
archivePrefix={arXiv},
primaryClass={cs.LG}
}
.. include:: mindspore.dataset.api_list_nlp.rst
.. include:: mindspore.dataset.api_list_nlp.rst
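The `num_shards` / `shard_id` contract documented above can be modeled in a few lines of plain Python. This is an illustrative sketch (the round-robin assignment is an assumption for the example, not MindSpore's internal sampler):

```python
def shard_indices(total, num_shards=None, shard_id=None):
    """Return the sample indices one worker reads under the
    num_shards/shard_id contract described in the parameter list above."""
    if (num_shards is None) != (shard_id is None):
        raise RuntimeError("num_shards and shard_id must be specified together")
    if num_shards is None:
        return list(range(total))           # no sharding: read everything
    if not 0 <= shard_id < num_shards:
        raise ValueError("shard_id must be in [0, num_shards)")
    # round-robin assignment: shard k reads samples k, k + num_shards, ...
    return list(range(shard_id, total, num_shards))
```

Note how specifying only one of the two parameters raises, matching the RuntimeError conditions listed above.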


@@ -14,22 +14,22 @@ mindspore.dataset.AmazonReviewDataset
For the Full dataset, 'train' reads 3,000,000 training samples, 'test' reads 650,000 test samples, and 'all' reads all 3,650,000 samples. Default: None, read all samples.
- **num_samples** (int, optional) - Number of samples to read from the dataset.
- **num_parallel_workers** (int, optional) - Number of worker threads used to read the data. Default: None, use the number of threads configured in mindspore.dataset.config.
- **shuffle** (Union[bool, Shuffle], optional) - Per-epoch shuffle mode, specified either as a bool or as an enum value, default: mindspore.dataset.Shuffle.GLOBAL.
- **shuffle** (Union[bool, Shuffle], optional) - Per-epoch shuffle mode, specified either as a bool or as an enum value. Default: `Shuffle.GLOBAL` .
If `shuffle` is False, the data is not shuffled; if `shuffle` is True, it is equivalent to setting `shuffle` to mindspore.dataset.Shuffle.GLOBAL.
The shuffle mode can be set by passing an enum value:
- **Shuffle.GLOBAL**: shuffle both the files and the samples.
- **Shuffle.FILES**: shuffle the files only.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training, default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training, default: None. Can only be specified when `num_shards` is also specified.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training. Default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training. Default: None. Can only be specified when `num_shards` is also specified.
- **cache** (DatasetCache, optional) - Single-node data cache service used to speed up dataset processing; see `Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/zh-CN/master/dataset/cache.html>`_ for details. Default: None, no cache is used.
Raises:
- **RuntimeError** - The directory pointed to by `dataset_dir` does not exist or is missing dataset files.
- **ValueError** - `num_parallel_workers` exceeds the maximum number of system threads.
- **RuntimeError** - `num_shards` is specified but `shard_id` is not.
- **RuntimeError** - `shard_id` is specified but `num_shards` is not.
- **ValueError** - `num_parallel_workers` exceeds the maximum number of system threads.
**About the AmazonReview dataset**
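The difference between the two enum modes listed above can be made concrete with a toy model of a multi-file text dataset. This is illustrative only; the function and data layout are assumptions for the example, not MindSpore code:

```python
import random

def load(files, mode="global", seed=0):
    """Toy model of Shuffle.FILES vs Shuffle.GLOBAL for a multi-file dataset.

    files: mapping of file name -> list of samples.
    """
    rng = random.Random(seed)
    names = list(files)
    rng.shuffle(names)                    # both modes shuffle the file order
    samples = [s for n in names for s in files[n]]
    if mode == "global":                  # Shuffle.GLOBAL also shuffles samples
        rng.shuffle(samples)
    return samples
```

Under `mode="files"` samples from the same file stay contiguous; under `mode="global"` they are fully mixed across files.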


@@ -10,26 +10,26 @@ mindspore.dataset.DBpediaDataset
Parameters:
- **dataset_dir** (str) - Path to the root directory containing the dataset files.
- **usage** (str, optional) - Subset of the dataset to load, one of 'train', 'test' or 'all'.
'train' reads 560,000 training samples, 'test' reads 70,000 test samples, and 'all' reads all 63 samples. Default: None, read all samples.
'train' reads 560,000 training samples, 'test' reads 70,000 test samples, and 'all' reads all 630,000 samples. Default: None, read all samples.
- **num_samples** (int, optional) - Number of samples to read from the dataset. Default: None, read all samples.
- **num_parallel_workers** (int, optional) - Number of worker threads used to read the data. Default: None, use the number of threads configured in mindspore.dataset.config.
- **shuffle** (Union[bool, Shuffle], optional) - Per-epoch shuffle mode, specified either as a bool or as an enum value, default: mindspore.dataset.Shuffle.GLOBAL.
- **shuffle** (Union[bool, Shuffle], optional) - Per-epoch shuffle mode, specified either as a bool or as an enum value. Default: `Shuffle.GLOBAL` .
If `shuffle` is False, the data is not shuffled; if `shuffle` is True, it is equivalent to setting `shuffle` to mindspore.dataset.Shuffle.GLOBAL.
The shuffle mode can be set by passing an enum value:
- **Shuffle.GLOBAL**: shuffle both the files and the samples.
- **Shuffle.FILES**: shuffle the files only.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training, default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training, default: None. Can only be specified when `num_shards` is also specified.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training. Default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training. Default: None. Can only be specified when `num_shards` is also specified.
- **cache** (DatasetCache, optional) - Single-node data cache service used to speed up dataset processing; see `Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/zh-CN/master/dataset/cache.html>`_ for details. Default: None, no cache is used.
Raises:
- **RuntimeError** - The directory pointed to by `dataset_dir` does not exist or is missing dataset files.
- **ValueError** - `num_parallel_workers` exceeds the maximum number of system threads.
- **RuntimeError** - `num_shards` is specified but `shard_id` is not.
- **RuntimeError** - `shard_id` is specified but `num_shards` is not.
- **ValueError** - `shard_id` is invalid (less than 0 or greater than or equal to `num_shards` ).
- **ValueError** - `num_parallel_workers` exceeds the maximum number of system threads.
- **ValueError** - `shard_id` is invalid (less than 0 or greater than or equal to `num_shards`).
**About the DBpedia dataset**
@@ -61,5 +61,5 @@ mindspore.dataset.DBpediaDataset
howpublished = {http://dbpedia.org}
}
.. include:: mindspore.dataset.api_list_nlp.rst
.. include:: mindspore.dataset.api_list_nlp.rst
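The corrected subset sizes above (the PR fixes "63" to "630,000") are internally consistent: 'all' is the sum of 'train' and 'test'. A tiny helper can resolve a `usage` value to its expected sample count; this is an illustrative sketch, not MindSpore code:

```python
DBPEDIA_SIZES = {"train": 560_000, "test": 70_000}  # counts from the docs above

def subset_size(usage, sizes=DBPEDIA_SIZES):
    """Resolve a `usage` value to its expected sample count ('all' = sum)."""
    if usage in (None, "all"):
        return sum(sizes.values())
    if usage not in sizes:
        raise ValueError(f"unknown usage: {usage!r}")
    return sizes[usage]
```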


@@ -5,19 +5,19 @@ mindspore.dataset.EMnistDataset
Read and parse the source files of the EMNIST dataset to build a dataset.
The generated dataset has two columns: `[image, label]`. The `image` column is of type uint8, and the `label` column is of type uint32.
The generated dataset has two columns: `[image, label]` . The `image` column is of type uint8, and the `label` column is of type uint32.
Parameters:
- **dataset_dir** (str) - Path to the root directory containing the dataset files.
- **name** (str) - Split of the dataset, one of 'byclass', 'bymerge', 'balanced', 'letters', 'digits' or 'mnist'.
- **usage** (str, optional) - Subset of the dataset to load, one of 'train', 'test' or 'all'.
'train' reads 60,000 training samples, 'test' reads 10,000 test samples, and 'all' reads all 70,000 samples. Default: None, read all sample images.
- **num_samples** (int, optional) - Number of samples to read from the dataset, which can be smaller than the dataset size. Default: None, read all sample images.
- **num_samples** (int, optional) - Number of samples to read from the dataset. Default: None, read all sample images.
- **num_parallel_workers** (int, optional) - Number of worker threads used to read the data. Default: None, use the number of threads configured in mindspore.dataset.config.
- **shuffle** (bool, optional) - Whether to shuffle the dataset. Default: None; the table below shows the expected behavior for different configurations.
- **sampler** (Sampler, optional) - Sampler used to pick samples from the dataset, default: None; the table below shows the expected behavior for different configurations.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training, default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training, default: None. Can only be specified when `num_shards` is also specified.
- **sampler** (Sampler, optional) - Sampler used to pick samples from the dataset. Default: None; the table below shows the expected behavior for different configurations.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training. Default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training. Default: None. Can only be specified when `num_shards` is also specified.
- **cache** (DatasetCache, optional) - Single-node data cache service used to speed up dataset processing; see `Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/zh-CN/master/dataset/cache.html>`_ for details. Default: None, no cache is used.
Raises:
@@ -25,7 +25,7 @@ mindspore.dataset.EMnistDataset
- **RuntimeError** - Both `sampler` and `num_shards` are specified, or both `sampler` and `shard_id` are specified.
- **RuntimeError** - `num_shards` is specified but `shard_id` is not.
- **RuntimeError** - `shard_id` is specified but `num_shards` is not.
- **ValueError** - `shard_id` is invalid (less than 0 or greater than or equal to `num_shards` ).
- **ValueError** - `shard_id` is invalid (less than 0 or greater than or equal to `num_shards`).
.. note:: This dataset can accept a `sampler`, but `sampler` and `shuffle` are mutually exclusive. The table below shows several valid combinations of the input parameters and the expected behavior.
@@ -96,5 +96,5 @@ mindspore.dataset.EMnistDataset
publication_support_materials/emnist}
}
.. include:: mindspore.dataset.api_list_vision.rst
.. include:: mindspore.dataset.api_list_vision.rst
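The mutual-exclusion rules in the note above amount to a parameter check like the following sketch. The helper name is hypothetical; this only mirrors the documented RuntimeError conditions, it is not MindSpore's validation code:

```python
def check_sampler_args(shuffle=None, sampler=None, num_shards=None, shard_id=None):
    """Reject the illegal parameter combinations listed in the docs above."""
    if sampler is not None:
        if shuffle is not None:
            raise RuntimeError("sampler and shuffle cannot be specified together")
        if num_shards is not None or shard_id is not None:
            raise RuntimeError("sampler cannot be combined with num_shards/shard_id")
    if (num_shards is None) != (shard_id is None):
        raise RuntimeError("num_shards and shard_id must be specified together")
```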


@@ -3,7 +3,7 @@ mindspore.dataset.EnWik9Dataset
.. py:class:: mindspore.dataset.EnWik9Dataset(dataset_dir, num_samples=None, num_parallel_workers=None, shuffle=True, num_shards=None, shard_id=None, cache=None)
Read and parse the source files of the EnWik9 dataset.
Read and parse the EnWik9 Full and EnWik9 Polarity datasets.
The generated dataset has one column `[text]` , of type string.
@@ -13,17 +13,23 @@ mindspore.dataset.EnWik9Dataset
For the Polarity dataset, 'train' reads 3,600,000 training samples, 'test' reads 400,000 test samples, and 'all' reads all 4,000,000 samples.
For the Full dataset, 'train' reads 3,000,000 training samples, 'test' reads 650,000 test samples, and 'all' reads all 3,650,000 samples. Default: None, read all samples.
- **num_parallel_workers** (int, optional) - Number of worker threads used to read the data. Default: None, use the number of threads configured in mindspore.dataset.config.
- **shuffle** (Union[bool, Shuffle], optional) - Per-epoch shuffle mode, specified either as a bool or as an enum value, default: True.
- **shuffle** (Union[bool, Shuffle], optional) - Per-epoch shuffle mode, specified either as a bool or as an enum value. Default: True.
If `shuffle` is False, the data is not shuffled; if `shuffle` is True, it is equivalent to setting `shuffle` to mindspore.dataset.Shuffle.GLOBAL.
The shuffle mode can be set by passing an enum value:
- **Shuffle.GLOBAL**: shuffle both the files and the samples.
- **Shuffle.FILES**: shuffle the files only.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training, default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training, default: None. Can only be specified when `num_shards` is also specified.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training. Default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training. Default: None. Can only be specified when `num_shards` is also specified.
- **cache** (DatasetCache, optional) - Single-node data cache service used to speed up dataset processing; see `Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/zh-CN/master/dataset/cache.html>`_ for details. Default: None, no cache is used.
Raises:
- **RuntimeError** - The directory pointed to by `dataset_dir` does not exist or is missing dataset files.
- **RuntimeError** - `num_shards` is specified but `shard_id` is not.
- **RuntimeError** - `shard_id` is specified but `num_shards` is not.
- **ValueError** - `num_parallel_workers` exceeds the maximum number of system threads.
**About the EnWik9 dataset**
The EnWik9 data is a sequence of UTF-8 encoded XML consisting mainly of English text. The dataset contains 243,426 article titles, of which 85,560 are redirects to fix broken web links; the rest are regular articles.
@@ -50,5 +56,5 @@ mindspore.dataset.EnWik9Dataset
year = {2006}
}
.. include:: mindspore.dataset.api_list_nlp.rst
.. include:: mindspore.dataset.api_list_nlp.rst


@@ -5,28 +5,28 @@ mindspore.dataset.FakeImageDataset
Generate fake images to build a dataset.
The generated dataset has two columns: `[image, label]`. The `image` column is of type uint8, and the `label` column is of type uint32.
The generated dataset has two columns: `[image, label]` . The `image` column is of type uint8, and the `label` column is of type uint32.
Parameters:
- **num_images** (int, optional) - Number of fake images to generate, default: 1000.
- **image_size** (tuple, optional) - Size of the fake images, default: (224, 224, 3).
- **num_classes** (int, optional) - Number of classes in the dataset, default: 10.
- **base_seed** (int, optional) - Random seed used to generate the random images, default: 0.
- **num_images** (int, optional) - Number of fake images to generate. Default: 1000.
- **image_size** (tuple, optional) - Size of the fake images. Default: (224, 224, 3).
- **num_classes** (int, optional) - Number of classes in the dataset. Default: 10.
- **base_seed** (int, optional) - Random seed used to generate the random images. Default: 0.
- **num_samples** (int, optional) - Number of samples to read from the dataset, which can be smaller than the dataset size. Default: None, read all sample images.
- **num_parallel_workers** (int, optional) - Number of worker threads used to read the data. Default: None, use the number of threads configured in mindspore.dataset.config.
- **shuffle** (bool, optional) - Whether to shuffle the dataset. Default: None; the table below shows the expected behavior for different configurations.
- **sampler** (Sampler, optional) - Sampler used to pick samples from the dataset, default: None; the table below shows the expected behavior for different configurations.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training, default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training, default: None. Can only be specified when `num_shards` is also specified.
- **sampler** (Sampler, optional) - Sampler used to pick samples from the dataset. Default: None; the table below shows the expected behavior for different configurations.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training. Default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training. Default: None. Can only be specified when `num_shards` is also specified.
- **cache** (DatasetCache, optional) - Single-node data cache service used to speed up dataset processing; see `Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/zh-CN/master/dataset/cache.html>`_ for details. Default: None, no cache is used.
Raises:
- **ValueError** - `num_parallel_workers` exceeds the maximum number of system threads.
- **RuntimeError** - Both `sampler` and `shuffle` are specified.
- **RuntimeError** - Both `sampler` and `num_shards` are specified, or both `sampler` and `shard_id` are specified.
- **RuntimeError** - `num_shards` is specified but `shard_id` is not.
- **RuntimeError** - `shard_id` is specified but `num_shards` is not.
- **ValueError** - `shard_id` is invalid (less than 0 or greater than or equal to `num_shards` ).
- **ValueError** - `num_parallel_workers` exceeds the maximum number of system threads.
- **ValueError** - `shard_id` is invalid (less than 0 or greater than or equal to `num_shards`).
.. note:: This dataset can accept a `sampler`, but `sampler` and `shuffle` are mutually exclusive. The table below shows several valid combinations of the input parameters and the expected behavior.
@@ -56,5 +56,5 @@ mindspore.dataset.FakeImageDataset
- False
- Not allowed
.. include:: mindspore.dataset.api_list_vision.rst
.. include:: mindspore.dataset.api_list_vision.rst
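Deterministic fake-image generation from a `base_seed` can be sketched as follows. The per-image seed scheme (`base_seed + index`) is an assumption made for this illustration; it is not necessarily how FakeImageDataset derives its seeds:

```python
import random

def fake_image(index, image_size=(4, 4, 3), base_seed=0):
    """Generate one reproducible fake image as nested lists of uint8 pixels."""
    rng = random.Random(base_seed + index)   # per-image seed -> reproducible
    h, w, c = image_size
    return [[[rng.randrange(256) for _ in range(c)]
             for _ in range(w)] for _ in range(h)]
```

Calling `fake_image` twice with the same `index` and `base_seed` yields identical pixels, which is what makes the dataset repeatable across epochs and workers.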


@@ -5,7 +5,7 @@ mindspore.dataset.FashionMnistDataset
Read and parse the source files of the Fashion-MNIST dataset to build a dataset.
The generated dataset has two columns: `[image, label]`. The `image` column is of type uint8, and the `label` column is of type uint32.
The generated dataset has two columns: `[image, label]` . The `image` column is of type uint8, and the `label` column is of type uint32.
Parameters:
- **dataset_dir** (str) - Path to the root directory containing the dataset files.
@@ -14,19 +14,19 @@ mindspore.dataset.FashionMnistDataset
- **num_samples** (int, optional) - Number of samples to read from the dataset, which can be smaller than the dataset size. Default: None, read all sample images.
- **num_parallel_workers** (int, optional) - Number of worker threads used to read the data. Default: None, use the number of threads configured in mindspore.dataset.config.
- **shuffle** (bool, optional) - Whether to shuffle the dataset. Default: None; the table below shows the expected behavior for different configurations.
- **sampler** (Sampler, optional) - Sampler used to pick samples from the dataset, default: None; the table below shows the expected behavior for different configurations.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training, default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training, default: None. Can only be specified when `num_shards` is also specified.
- **sampler** (Sampler, optional) - Sampler used to pick samples from the dataset. Default: None; the table below shows the expected behavior for different configurations.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training. Default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training. Default: None. Can only be specified when `num_shards` is also specified.
- **cache** (DatasetCache, optional) - Single-node data cache service used to speed up dataset processing; see `Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/zh-CN/master/dataset/cache.html>`_ for details. Default: None, no cache is used.
Raises:
- **RuntimeError** - `dataset_dir` does not contain data files.
- **ValueError** - `num_parallel_workers` exceeds the maximum number of system threads.
- **RuntimeError** - Both `sampler` and `shuffle` are specified.
- **RuntimeError** - Both `sampler` and `num_shards` are specified, or both `sampler` and `shard_id` are specified.
- **RuntimeError** - `num_shards` is specified but `shard_id` is not.
- **RuntimeError** - `shard_id` is specified but `num_shards` is not.
- **ValueError** - `shard_id` is invalid (less than 0 or greater than or equal to `num_shards` ).
- **ValueError** - `num_parallel_workers` exceeds the maximum number of system threads.
- **ValueError** - `shard_id` is invalid (less than 0 or greater than or equal to `num_shards`).
.. note:: This dataset can accept a `sampler`, but `sampler` and `shuffle` are mutually exclusive. The table below shows several valid combinations of the input parameters and the expected behavior.


@@ -7,29 +7,29 @@ mindspore.dataset.Flowers102Dataset
According to the given `task` configuration, the generated dataset has different output columns:
- `task` = 'Classification', output columns: `[image, dtype=uint8]` , `[label, dtype=uint32]` .
- `task` = 'Segmentation', output columns: `[image, dtype=uint8]` , `[segmentation, dtype=uint8]` , `[label, dtype=uint32]` .
- `task` = 'Classification', output columns: `[image, dtype=uint8]` and `[label, dtype=uint32]` .
- `task` = 'Segmentation', output columns: `[image, dtype=uint8]` , `[segmentation, dtype=uint8]` and `[label, dtype=uint32]` .
Parameters:
- **dataset_dir** (str) - Path to the root directory containing the dataset files.
- **task** (str, optional) - Task type to read, one of 'Classification' and 'Segmentation'. Default: 'Classification'.
- **usage** (str, optional) - Subset of the dataset to load, one of 'train', 'valid', 'test' or 'all'. Default: 'all', read all samples.
- **num_samples** (int, optional) - Number of samples to read from the dataset. Default: None, all image samples.
- **num_parallel_workers** (int, optional) - Number of worker threads used to read the data. Default: None, use the number of threads configured in mindspore.dataset.config.
- **num_parallel_workers** (int, optional) - Number of worker threads used to read the data. Default: 1.
- **shuffle** (bool, optional) - Whether to shuffle the dataset. Default: None; the table below shows the expected behavior for different configurations.
- **decode** (bool, optional) - Whether to decode the images after reading, default: False, do not decode.
- **sampler** (Union[Sampler, Iterable], optional) - Sampler used to pick samples from the dataset, default: None; the table below shows the expected behavior for different configurations.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training, default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training, default: None. Can only be specified when `num_shards` is also specified.
- **decode** (bool, optional) - Whether to decode the images after reading. Default: False, do not decode.
- **sampler** (Union[Sampler, Iterable], optional) - Sampler used to pick samples from the dataset. Default: None; the table below shows the expected behavior for different configurations.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training. Default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training. Default: None. Can only be specified when `num_shards` is also specified.
Raises:
- **RuntimeError** - `dataset_dir` does not contain any data files.
- **ValueError** - `num_parallel_workers` exceeds the maximum number of system threads.
- **RuntimeError** - Both `sampler` and `shuffle` are specified.
- **RuntimeError** - Both `sampler` and `num_shards` are specified, or both `sampler` and `shard_id` are specified.
- **RuntimeError** - `num_shards` is specified but `shard_id` is not.
- **RuntimeError** - `shard_id` is specified but `num_shards` is not.
- **ValueError** - `shard_id` is invalid (less than 0 or greater than or equal to `num_shards` ).
- **ValueError** - `num_parallel_workers` exceeds the maximum number of system threads.
- **ValueError** - `shard_id` is invalid (less than 0 or greater than or equal to `num_shards`).
.. note:: This dataset can accept a `sampler`, but `sampler` and `shuffle` are mutually exclusive. The table below shows several valid combinations of the input parameters and the expected behavior.
@@ -93,5 +93,5 @@ mindspore.dataset.Flowers102Dataset
year = "2008",
}
.. include:: mindspore.dataset.api_list_vision.rst
.. include:: mindspore.dataset.api_list_vision.rst


@@ -14,26 +14,20 @@ mindspore.dataset.IMDBDataset
For the Polarity dataset, 'train' reads 3,600,000 training samples, 'test' reads 400,000 test samples, and 'all' reads all 4,000,000 samples.
For the Full dataset, 'train' reads 3,000,000 training samples, 'test' reads 650,000 test samples, and 'all' reads all 3,650,000 samples. Default: None, read all samples.
- **num_parallel_workers** (int, optional) - Number of worker threads used to read the data. Default: None, use the number of threads configured in mindspore.dataset.config.
- **shuffle** (bool, optional) - Per-epoch shuffle mode, specified either as a bool or as an enum value, default: mindspore.dataset.Shuffle.GLOBAL.
If `shuffle` is False, the data is not shuffled; if `shuffle` is True, it is equivalent to setting `shuffle` to mindspore.dataset.Shuffle.GLOBAL.
The shuffle mode can be set by passing an enum value:
- **Shuffle.GLOBAL**: shuffle both the files and the samples.
- **Shuffle.FILES**: shuffle the files only.
- **sampler** (Sampler, optional) - Sampler used to pick samples from the dataset, default: None; the table below shows the expected behavior for different configurations.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training, default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training, default: None. Can only be specified when `num_shards` is also specified.
- **shuffle** (bool, optional) - Whether to shuffle the dataset. Default: None; the table below shows the expected behavior for different configurations.
- **sampler** (Sampler, optional) - Sampler used to pick samples from the dataset. Default: None; the table below shows the expected behavior for different configurations.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training. Default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training. Default: None. Can only be specified when `num_shards` is also specified.
- **cache** (DatasetCache, optional) - Single-node data cache service used to speed up dataset processing; see `Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/zh-CN/master/dataset/cache.html>`_ for details. Default: None, no cache is used.
Raises:
- **RuntimeError** - The directory pointed to by `dataset_dir` does not exist or is missing dataset files.
- **ValueError** - `num_parallel_workers` exceeds the maximum number of system threads.
- **RuntimeError** - Both `sampler` and `shuffle` are specified.
- **RuntimeError** - Both `sampler` and `num_shards` are specified, or both `sampler` and `shard_id` are specified.
- **RuntimeError** - `num_shards` is specified but `shard_id` is not.
- **RuntimeError** - `shard_id` is specified but `num_shards` is not.
- **ValueError** - `shard_id` is invalid (less than 0 or greater than or equal to `num_shards` ).
- **ValueError** - `num_parallel_workers` exceeds the maximum number of system threads.
- **ValueError** - `shard_id` is invalid (less than 0 or greater than or equal to `num_shards`).
.. note:: This dataset can accept a `sampler`, but `sampler` and `shuffle` are mutually exclusive. The table below shows several valid combinations of the input parameters and the expected behavior.
@@ -112,5 +106,5 @@ mindspore.dataset.IMDBDataset
url = {http://www.aclweb.org/anthology/P11-1015}
}
.. include:: mindspore.dataset.api_list_nlp.rst
.. include:: mindspore.dataset.api_list_nlp.rst


@@ -10,31 +10,31 @@ mindspore.dataset.IWSLT2016Dataset
Parameters:
- **dataset_dir** (str) - Path to the root directory containing the dataset files.
- **usage** (str, optional) - Subset of the dataset to load, one of 'train', 'valid', 'test' or 'all'. Default: None, read all samples.
- **language_pair** (sequence, optional) - Sequence containing the source and target languages; supported values are ('en', 'fr'), ('en', 'de'), ('en', 'cs'), ('en', 'ar'), ('de', 'en', ('cs', 'en', ('ar', 'en', default: ('de', 'en').
- **valid_set** (str, optional) - String identifying the validation set; supported values are 'dev2010', 'tst2010', 'tst2011', 'tst'2012, 'tst2013' and 'tst2014', default: 'tst2013'.
- **test_set** (str, optional) - String identifying the test set; supported values are 'dev2010', 'tst2010', 'tst2011', 'tst'2012, 'tst2013' and 'tst2014', default: 'tst2014'.
- **num_samples** (int, optional) - Number of samples to read from the dataset.
- **shuffle** (Union[bool, Shuffle], optional) - Per-epoch shuffle mode, specified either as a bool or as an enum value, default: mindspore.dataset.Shuffle.GLOBAL.
- **language_pair** (sequence, optional) - Sequence containing the source and target languages; supported values are ('en', 'fr'), ('en', 'de'), ('en', 'cs'), ('en', 'ar'), ('de', 'en'), ('cs', 'en') and ('ar', 'en'). Default: ('de', 'en').
- **valid_set** (str, optional) - String identifying the validation set; supported values are 'dev2010', 'tst2010', 'tst2011', 'tst2012', 'tst2013' and 'tst2014'. Default: 'tst2013'.
- **test_set** (str, optional) - String identifying the test set; supported values are 'dev2010', 'tst2010', 'tst2011', 'tst2012', 'tst2013' and 'tst2014'. Default: 'tst2014'.
- **num_samples** (int, optional) - Number of samples to read from the dataset. Default: None, read all samples.
- **shuffle** (Union[bool, Shuffle], optional) - Per-epoch shuffle mode, specified either as a bool or as an enum value. Default: `Shuffle.GLOBAL` .
If `shuffle` is False, the data is not shuffled; if `shuffle` is True, it is equivalent to setting `shuffle` to mindspore.dataset.Shuffle.GLOBAL.
The shuffle mode can be set by passing an enum value:
- **Shuffle.GLOBAL**: shuffle both the files and the samples.
- **Shuffle.FILES**: shuffle the files only.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training, default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training, default: None. Can only be specified when `num_shards` is also specified.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training. Default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training. Default: None. Can only be specified when `num_shards` is also specified.
- **num_parallel_workers** (int, optional) - Number of worker threads used to read the data. Default: None, use the number of threads configured in mindspore.dataset.config.
- **cache** (DatasetCache, optional) - Single-node data cache service used to speed up dataset processing; see `Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/zh-CN/master/dataset/cache.html>`_ for details. Default: None, no cache is used.
Raises:
- **RuntimeError** - The directory pointed to by `dataset_dir` does not exist or is missing dataset files.
- **ValueError** - `num_parallel_workers` exceeds the maximum number of system threads.
- **RuntimeError** - `num_shards` is specified but `shard_id` is not.
- **RuntimeError** - `shard_id` is specified but `num_shards` is not.
- **ValueError** - `num_parallel_workers` exceeds the maximum number of system threads.
**About the IWSLT2016 dataset**
IWSLT is a major annual scientific conference dedicated to all aspects of spoken language translation. The MT task of the IWSLT evaluation campaign is organized into a dataset that is publicly available via wit3.fbk.eu.
IWSLT is a major annual scientific conference dedicated to all aspects of spoken language translation. The MT task of the IWSLT evaluation campaign is organized into a dataset that is publicly available via `wit3 <https://wit3.fbk.eu>`_ .
The IWSLT2016 dataset includes translations from English to Arabic, Czech, French and German, and from Arabic, Czech, French and German to English.
The original IWSLT2016 dataset files can be decompressed into this directory structure and read by MindSpore's API. After decompression, the dataset to be read also needs to be decompressed into the specified folder. For example, to read the de-en dataset, decompress the tgz file under the de/en directory; the dataset is located in the decompressed folder.


@@ -5,34 +5,37 @@ mindspore.dataset.IWSLT2017Dataset
Read and parse the source files of the IWSLT2017 dataset.
The generated dataset has two columns `[text, translation]`. The `text` column is of type string. The `translation` column is of type string.
The generated dataset has two columns `[text, translation]`. Both the `text` and `translation` columns are of type string.
Parameters:
- **dataset_dir** (str) - Path to the root directory containing the dataset files.
- **usage** (str, optional) - Subset of the dataset to load, one of 'train', 'valid', 'test' or 'all'. Default: None, read all samples.
- **language_pair** (sequence, optional) - List containing the source and target languages; supported pairs are ('en', 'nl'), ('en', 'de'), ('en', 'it'), ('en', 'ro'), ('nl', 'en', 'de'), ('nl', 'it'), ('nl', 'ro'), ('de', 'en'), ('de', 'nl'), ('de', 'it', ('it', 'en'), ('it', 'nl'), ('it', 'de'), ('it', 'ro'), ('ro', 'en'), ('ro', 'nl'), ('ro', 'de'), ('ro', 'it'), default: ('de', 'en').
- **num_samples** (int, optional) - Number of samples to read from the dataset.
- **shuffle** (Union[bool, Shuffle], optional) - Per-epoch shuffle mode, specified either as a bool or as an enum value, default: mindspore.dataset.Shuffle.GLOBAL.
- **language_pair** (sequence, optional) - List containing the source and target languages; supported pairs are ('en', 'nl'),
('en', 'de'), ('en', 'it'), ('en', 'ro'), ('nl', 'en'), ('nl', 'de'), ('nl', 'it'), ('nl', 'ro'),
('de', 'en'), ('de', 'nl'), ('de', 'it'), ('de', 'ro'), ('it', 'en'), ('it', 'nl'), ('it', 'de'),
('it', 'ro'), ('ro', 'en'), ('ro', 'nl'), ('ro', 'de'), ('ro', 'it'). Default: ('de', 'en').
- **num_samples** (int, optional) - Number of samples to read from the dataset. Default: None, read all samples.
- **shuffle** (Union[bool, Shuffle], optional) - Per-epoch shuffle mode, specified either as a bool or as an enum value. Default: `Shuffle.GLOBAL` .
If `shuffle` is False, the data is not shuffled; if `shuffle` is True, it is equivalent to setting `shuffle` to mindspore.dataset.Shuffle.GLOBAL.
The shuffle mode can be set by passing an enum value:
- **Shuffle.GLOBAL**: shuffle both the files and the samples.
- **Shuffle.FILES**: shuffle the files only.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training, default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training, default: None. Can only be specified when `num_shards` is also specified.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training. Default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training. Default: None. Can only be specified when `num_shards` is also specified.
- **num_parallel_workers** (int, optional) - Number of worker threads used to read the data. Default: None, use the number of threads configured in mindspore.dataset.config.
- **cache** (DatasetCache, optional) - Single-node data cache service used to speed up dataset processing; see `Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/zh-CN/master/dataset/cache.html>`_ for details. Default: None, no cache is used.
Raises:
- **RuntimeError** - The directory pointed to by `dataset_dir` does not exist or is missing dataset files.
- **ValueError** - `num_parallel_workers` exceeds the maximum number of system threads.
- **RuntimeError** - `num_shards` is specified but `shard_id` is not.
- **RuntimeError** - `shard_id` is specified but `num_shards` is not.
- **ValueError** - `num_parallel_workers` exceeds the maximum number of system threads.
**About the IWSLT2017 dataset**
IWSLT is a major annual scientific conference dedicated to all aspects of spoken language translation. The MT task of the IWSLT evaluation campaign is organized into a dataset that is publicly available via wit3.fbk.eu.
IWSLT is a major annual scientific conference dedicated to all aspects of spoken language translation. The MT task of the IWSLT evaluation campaign is organized into a dataset that is publicly available via `wit3 <https://wit3.fbk.eu>`_ .
The IWSLT2017 dataset contains German, English, Italian, Dutch and Romanian data, including translations between any two of these languages.
The original IWSLT2017 dataset files can be decompressed into this directory structure and read by MindSpore's API. After decompression, the dataset to be read also needs to be decompressed into the specified folder. For example, to read the de-en dataset, decompress the tgz file under the de/en directory; the dataset is located in the decompressed folder.
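The 20 supported IWSLT2017 pairs listed above are exactly all ordered pairs of the five languages, so validating a requested `language_pair` can be sketched compactly (the helper name is hypothetical, not a MindSpore API):

```python
from itertools import permutations

LANGS = ("en", "nl", "de", "it", "ro")
SUPPORTED_PAIRS = set(permutations(LANGS, 2))   # the 20 pairs listed above

def check_language_pair(pair):
    """Validate a (source, target) pair against the supported IWSLT2017 pairs."""
    pair = tuple(pair)
    if pair not in SUPPORTED_PAIRS:
        raise ValueError(f"unsupported language_pair: {pair}")
    return pair
```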


@@ -5,7 +5,7 @@ mindspore.dataset.KMnistDataset
Read and parse the source files of the KMNIST dataset to build a dataset.
The generated dataset has two columns: `[image, label]`. The `image` column is of type uint8, and the `label` column is of type uint32.
The generated dataset has two columns: `[image, label]` . The `image` column is of type uint8, and the `label` column is of type uint32.
Parameters:
- **dataset_dir** (str) - Path to the root directory containing the dataset files.
@@ -14,19 +14,19 @@ mindspore.dataset.KMnistDataset
- **num_samples** (int, optional) - Number of samples to read from the dataset, which can be smaller than the dataset size. Default: None, read all sample images.
- **num_parallel_workers** (int, optional) - Number of worker threads used to read the data. Default: None, use the number of threads configured in mindspore.dataset.config.
- **shuffle** (bool, optional) - Whether to shuffle the dataset. Default: None; the table below shows the expected behavior for different configurations.
- **sampler** (Sampler, optional) - Sampler used to pick samples from the dataset, default: None; the table below shows the expected behavior for different configurations.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training, default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training, default: None. Can only be specified when `num_shards` is also specified.
- **sampler** (Sampler, optional) - Sampler used to pick samples from the dataset. Default: None; the table below shows the expected behavior for different configurations.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training. Default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training. Default: None. Can only be specified when `num_shards` is also specified.
- **cache** (DatasetCache, optional) - Single-node data cache service used to speed up dataset processing; see `Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/zh-CN/master/dataset/cache.html>`_ for details. Default: None, no cache is used.
Raises:
- **RuntimeError** - `dataset_dir` does not contain data files.
- **ValueError** - `num_parallel_workers` exceeds the maximum number of system threads.
- **RuntimeError** - Both `sampler` and `shuffle` are specified.
- **RuntimeError** - Both `sampler` and `num_shards` are specified, or both `sampler` and `shard_id` are specified.
- **RuntimeError** - `num_shards` is specified but `shard_id` is not.
- **RuntimeError** - `shard_id` is specified but `num_shards` is not.
- **ValueError** - `shard_id` is invalid (less than 0 or greater than or equal to `num_shards` ).
- **ValueError** - `num_parallel_workers` exceeds the maximum number of system threads.
- **ValueError** - `shard_id` is invalid (less than 0 or greater than or equal to `num_shards`).
.. note:: This dataset can accept a `sampler`, but `sampler` and `shuffle` are mutually exclusive. The table below shows several valid combinations of the input parameters and the expected behavior.


@@ -5,7 +5,7 @@ mindspore.dataset.LJSpeechDataset
Read and parse the source files of the LJSpeech dataset to build a dataset.
The generated dataset has columns: `[waveform, sample_rate, transcription, normalized_transcript]` .
The generated dataset has columns: `[waveform, sample_rate, transcription, normalized_transcript]` .
The `waveform` column is of type float32, the `sample_rate` column is of type int32, the `transcription` column is of type string, and the `normalized_transcript` column is of type string.
Parameters:
@@ -13,19 +13,19 @@ mindspore.dataset.LJSpeechDataset
- **num_samples** (int, optional) - Number of samples to read from the dataset, which can be smaller than the dataset size. Default: None, read all audio samples.
- **num_parallel_workers** (int, optional) - Number of worker threads used to read the data. Default: None, use the number of threads configured in mindspore.dataset.config.
- **shuffle** (bool, optional) - Whether to shuffle the dataset. Default: None; the table below shows the expected behavior for different configurations.
- **sampler** (Sampler, optional) - Sampler used to pick samples from the dataset, default: None; the table below shows the expected behavior for different configurations.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training, default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training, default: None. Can only be specified when `num_shards` is also specified.
- **sampler** (Sampler, optional) - Sampler used to pick samples from the dataset. Default: None; the table below shows the expected behavior for different configurations.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training. Default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training. Default: None. Can only be specified when `num_shards` is also specified.
- **cache** (DatasetCache, optional) - Single-node data cache service used to speed up dataset processing; see `Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/zh-CN/master/dataset/cache.html>`_ for details. Default: None, no cache is used.
Raises:
- **RuntimeError** - `dataset_dir` does not contain data files.
- **ValueError** - `num_parallel_workers` exceeds the maximum number of system threads.
- **RuntimeError** - Both `sampler` and `shuffle` are specified.
- **RuntimeError** - Both `sampler` and `num_shards` are specified, or both `sampler` and `shard_id` are specified.
- **RuntimeError** - `num_shards` is specified but `shard_id` is not.
- **RuntimeError** - `shard_id` is specified but `num_shards` is not.
- **ValueError** - `shard_id` is invalid (less than 0 or greater than or equal to `num_shards` ).
- **ValueError** - `num_parallel_workers` exceeds the maximum number of system threads.
- **ValueError** - `shard_id` is invalid (less than 0 or greater than or equal to `num_shards`).
.. note:: This dataset can accept a `sampler`, but `sampler` and `shuffle` are mutually exclusive. The table below shows several valid combinations of the input parameters and the expected behavior.


@@ -5,7 +5,7 @@ mindspore.dataset.PennTreebankDataset
Read and parse the source files of the PennTreebank dataset.
The generated dataset has one column `[text]` . The data type is string.
The generated dataset has one column `[text]` . The `text` column is of type string.
Parameters:
- **dataset_dir** (str) - Path to the root directory containing the dataset files.
@@ -13,17 +13,23 @@ mindspore.dataset.PennTreebankDataset
'train' reads 42,068 samples, 'valid' reads 3,370 samples, 'test' reads 3,761 samples, and 'all' reads all 49,199 samples. Default: None, read all samples.
- **num_samples** (int, optional) - Number of samples to read from the dataset. Default: None, read all samples.
- **num_parallel_workers** (int, optional) - Number of worker threads used to read the data. Default: None, use the number of threads configured in mindspore.dataset.config.
- **shuffle** (Union[bool, Shuffle], optional) - Per-epoch shuffle mode, specified either as a bool or as an enum value, default: True.
- **shuffle** (Union[bool, Shuffle], optional) - Per-epoch shuffle mode, specified either as a bool or as an enum value. Default: `Shuffle.GLOBAL` .
If `shuffle` is False, the data is not shuffled; if `shuffle` is True, it is equivalent to setting `shuffle` to mindspore.dataset.Shuffle.GLOBAL.
The shuffle mode can be set by passing an enum value:
- **Shuffle.GLOBAL**: shuffle both the files and the samples.
- **Shuffle.FILES**: shuffle the files only.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training, default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training, default: None. Can only be specified when `num_shards` is also specified.
- **num_shards** (int, optional) - Number of shards the dataset is divided into for distributed training. Default: None. When this parameter is specified, `num_samples` denotes the maximum number of samples per shard.
- **shard_id** (int, optional) - Shard ID to use for distributed training. Default: None. Can only be specified when `num_shards` is also specified.
- **cache** (DatasetCache, optional) - Single-node data cache service used to speed up dataset processing; see `Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/zh-CN/master/dataset/cache.html>`_ for details. Default: None, no cache is used.
Raises:
- **RuntimeError** - The directory pointed to by `dataset_dir` does not exist or is missing dataset files.
- **RuntimeError** - `num_shards` is specified but `shard_id` is not.
- **RuntimeError** - `shard_id` is specified but `num_shards` is not.
- **ValueError** - `num_parallel_workers` exceeds the maximum number of system threads.
**About the PennTreebank dataset**
The Penn Treebank (PTB) dataset is widely used in machine learning research for NLP (natural language processing).
@@ -60,5 +66,5 @@ mindspore.dataset.PennTreebankDataset
year = 1990
}
.. include:: mindspore.dataset.api_list_nlp.rst
.. include:: mindspore.dataset.api_list_nlp.rst


@ -5,26 +5,26 @@ mindspore.dataset.PhotoTourDataset
读取和解析PhotoTour数据集的源数据集。
`usage` = 'train',生成的数据集有一列 `[image]` 数据类型为uint8。
`usage` ≠ 'train',生成的数据集有三列: `[image1, image2, matches]``image1``image2` 列的数据类型为uint8。 `matches` 列的数据类型为uint32。
根据给定的 `usage` 配置,生成数据集具有不同的输出列:
- `usage` = 'train',输出列: `[image, dtype=uint8]`
- `usage` ≠ 'train',输出列: `[image1, dtype=uint8]``[image2, dtype=uint8]``[matches, dtype=uint32]`
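The `usage`-dependent output columns above can be expressed as a small helper; this is purely illustrative (the function is hypothetical, not part of the MindSpore API):

```python
def phototour_columns(usage="train"):
    """Return the output column names for a given `usage`, per the docs above."""
    if usage == "train":
        return ["image"]                      # single-image column
    return ["image1", "image2", "matches"]    # patch-pair matching columns
```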
参数:
- **dataset_dir** (str) - 包含数据集文件的根目录路径。
- **name** (str) - 要加载的数据集内容名称,可以取值为'notredame' 'yosemite' 'liberty' 'notredame_harris' 'yosemite_harris' 或 'liberty_harris'。
- **name** (str) - 要加载的数据集内容名称,可以取值为'notredame'、'yosemite'、'liberty'、'notredame_harris'、'yosemite_harris' 或 'liberty_harris'。
- **usage** (str, 可选) - 指定数据集的子集,可取值为'train'或'test'。默认值None将被设置为'train'。
取值为'train'时,每个 `name` 的数据集样本数分别为{'notredame': 468159, 'yosemite': 633587, 'liberty': 450092, 'liberty_harris': 379587, 'yosemite_harris': 450912, 'notredame_harris': 325295}。
取值为'test'时将读取100,000个测试样本。
- **num_samples** (int, 可选) - 指定从数据集中读取的样本数。默认值None读取所有样本。
- **num_parallel_workers** (int, 可选) - 指定读取数据的工作线程数。默认值None使用mindspore.dataset.config中配置的线程数。
- **shuffle** (bool, 可选) - 是否混洗数据集。默认值None下表中会展示不同参数配置的预期行为。
- **sampler** (Sampler, 可选) - 指定从数据集中选取样本的采样器默认值None下表中会展示不同配置的预期行为。
- **num_shards** (int, 可选) - 指定分布式训练时将数据集进行划分的分片数默认值None。指定此参数后 `num_samples` 表示每个分片的最大样本数。
- **shard_id** (int, 可选) - 指定分布式训练时使用的分片ID号默认值None。只有当指定了 `num_shards` 时才能指定此参数。
- **sampler** (Sampler, 可选) - 指定从数据集中选取样本的采样器默认值None下表中会展示不同配置的预期行为。
- **num_shards** (int, 可选) - 指定分布式训练时将数据集进行划分的分片数默认值None。指定此参数后 `num_samples` 表示每个分片的最大样本数。
- **shard_id** (int, 可选) - 指定分布式训练时使用的分片ID号默认值None。只有当指定了 `num_shards` 时才能指定此参数。
- **cache** (DatasetCache, 可选) - 单节点数据缓存服务,用于加快数据集处理,详情请阅读 `单节点数据缓存 <https://www.mindspore.cn/tutorials/experts/zh-CN/master/dataset/cache.html>`_ 。默认值None不使用缓存。
异常:
- **RuntimeError** - `dataset_dir` 路径下不包含数据文件。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **RuntimeError** - 同时指定了 `sampler``shuffle` 参数。
- **RuntimeError** - 同时指定了 `sampler``num_shards` 参数或同时指定了 `sampler``shard_id` 参数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
@ -32,7 +32,8 @@ mindspore.dataset.PhotoTourDataset
- **ValueError** - `dataset_dir` 不存在。
- **ValueError** - `usage` 不是["train", "test"]中的任何一个。
- **ValueError** - `name` 不是["notredame", "yosemite", "liberty","notredame_harris", "yosemite_harris", "liberty_harris"]中的任何一个。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards` )。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
.. note:: 此数据集可以指定参数 `sampler` ,但参数 `sampler` 和参数 `shuffle` 的行为是互斥的。下表展示了几种合法的输入参数组合及预期的行为。
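As a rough illustration of how the `num_shards` / `shard_id` pair partitions a dataset across distributed workers (a sketch only — the real split is performed inside the dataset engine, and `shard_samples` is a hypothetical helper, not part of the API):

```python
def shard_samples(samples, num_shards, shard_id):
    """Return the subset of samples assigned to one shard.

    Mirrors the documented constraints: shard_id must lie in
    [0, num_shards), and the two arguments are given together.
    """
    if not 0 <= shard_id < num_shards:
        raise ValueError(
            f"shard_id must be in [0, {num_shards}), got {shard_id}")
    # Round-robin assignment: shard k gets samples k, k+num_shards, ...
    return samples[shard_id::num_shards]
```

Under this model, once `num_shards` is set, `num_samples` naturally caps the per-shard count rather than the global one, which is what the docs above state.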
@ -112,5 +113,5 @@ mindspore.dataset.PhotoTourDataset
doi={10.1109/CVPR.2007.382971}
}
.. include:: mindspore.dataset.api_list_vision.rst
@ -303,24 +303,6 @@ API示例所需模块的导入代码如下
返回:
bool表示是否开启watchdog Python线程。
.. py:function:: mindspore.dataset.config.set_fast_recovery(fast_recovery)
在数据集管道故障恢复时,是否开启快速恢复模式(快速恢复模式下,无法保证随机性的数据增强操作得到与故障之前相同的结果)。
@ -340,3 +322,21 @@ API示例所需模块的导入代码如下
.. automodule:: mindspore.dataset.config
:members:
.. py:function:: mindspore.dataset.config.set_multiprocessing_timeout_interval(interval)
设置在多进程/多线程下,主进程/主线程获取数据超时时,告警日志打印的默认时间间隔(秒)。
参数:
- **interval** (int) - 表示多进程/多线程下,主进程/主线程获取数据超时时,告警日志打印的时间间隔(秒)。
异常:
- **TypeError** - `interval` 不是int类型。
- **ValueError** - `interval` 小于等于0或 `interval` 大于 `INT32_MAX(2147483647)` 。
.. py:function:: mindspore.dataset.config.get_multiprocessing_timeout_interval()
获取在多进程/多线程下,主进程/主线程获取数据超时时,告警日志打印的时间间隔的全局配置。
返回:
int表示多进程/多线程下,主进程/主线程获取数据超时时告警日志打印的时间间隔默认300秒
@ -808,7 +808,7 @@ def set_fast_recovery(fast_recovery):
(yet with slightly different random augmentations).
Args:
fast_recovery (bool): Whether the dataset pipeline recovers in fast mode. Default: True.
Raises:
TypeError: If `fast_recovery` is not a boolean data type.
@ -823,10 +823,10 @@ def set_fast_recovery(fast_recovery):
def get_fast_recovery():
"""
Get whether the fast recovery mode is enabled for the current dataset pipeline.
Returns:
bool, whether the dataset recovers fast in failover reset.
Examples:
>>> is_fast_recovery = ds.config.get_fast_recovery()
@ -714,20 +714,20 @@ class Dataset:
@check_shuffle
def shuffle(self, buffer_size):
"""
Shuffle the dataset by creating a cache with the size of `buffer_size` .
1. Make a shuffle buffer that contains the first `buffer_size` rows.
2. Randomly select an element from the shuffle buffer to be the next row
propagated to the child node.
3. Get the next row (if any) from the parent node and put it in the shuffle buffer.
4. Repeat steps 2 and 3 until there are no more rows left in the shuffle buffer.
A random seed can be provided to be used on the first epoch via `dataset.config.set_seed` . In every subsequent
epoch, the seed is changed to a new, randomly generated value.
Args:
buffer_size (int): The size of the buffer (must be larger than 1) for
shuffling. Setting `buffer_size` equal to the number of rows in the entire
dataset will result in a global shuffle.
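The four-step buffer policy above can be sketched in plain Python (a simplified model of the technique, not the MindSpore implementation; the row iterator and `seed` handling are stand-ins):

```python
import random

def buffered_shuffle(rows, buffer_size, seed=None):
    """Yield rows in shuffled order using a fixed-size shuffle buffer.

    A buffer_size >= the total number of rows gives a global shuffle;
    smaller buffers trade shuffle quality for memory.
    """
    rng = random.Random(seed)
    it = iter(rows)
    buf = []
    # Step 1: fill the buffer with the first buffer_size rows.
    for row in it:
        buf.append(row)
        if len(buf) >= buffer_size:
            break
    # Steps 2-3: emit a random buffered row, refill from upstream.
    for row in it:
        idx = rng.randrange(len(buf))
        yield buf[idx]
        buf[idx] = row
    # Step 4: drain the remaining buffered rows in random order.
    rng.shuffle(buf)
    yield from buf
```

Every input row is emitted exactly once, so the output is always a permutation; only the degree of mixing depends on `buffer_size`.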
Returns:
@ -456,38 +456,38 @@ class LJSpeechDataset(MappableDataset, AudioBaseDataset):
"""
A source dataset that reads and parses LJSpeech dataset.
The generated dataset has four columns :py:obj:`[waveform, sample_rate, transcription, normalized_transcript]` .
The column :py:obj:`waveform` is a tensor of the float32 type.
The column :py:obj:`sample_rate` is a scalar of the int32 type.
The column :py:obj:`transcription` is a scalar of the string type.
The column :py:obj:`normalized_transcript` is a scalar of the string type.
Args:
dataset_dir (str): Path to the root directory that contains the dataset.
num_samples (int, optional): The number of audios to be included in the dataset.
Default: None, all audios.
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the mindspore.dataset.config.
shuffle (bool, optional): Whether to perform shuffle on the dataset. Default: None, expected
order behavior shown in the table below.
sampler (Sampler, optional): Object used to choose samples from the dataset.
Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided into.
Default: None. When this argument is specified, `num_samples` reflects
the maximum sample number per shard.
shard_id (int, optional): The shard ID within `num_shards` . Default: None. This
argument can only be specified when `num_shards` is also specified.
cache (DatasetCache, optional): Use tensor caching service to speed up dataset processing. More details:
`Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/en/master/dataset/cache.html>`_ .
Default: None, which means no cache is used.
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
RuntimeError: If `sampler` and `shuffle` are specified at the same time.
RuntimeError: If `sampler` and `num_shards`/`shard_id` are specified at the same time.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
Note:
@ -39,32 +39,38 @@ class AGNewsDataset(SourceDataset, TextBaseDataset):
"""
A source dataset that reads and parses AG News datasets.
The generated dataset has three columns: :py:obj:`[index, title, description]` ,
and the data type of three columns is string type.
Args:
dataset_dir (str): Path to the root directory that contains the dataset.
usage (str, optional): Acceptable usages include 'train', 'test' and 'all'. Default: None, all samples.
num_samples (int, optional): Number of samples (rows) to read. Default: None, reads the full dataset.
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the mindspore.dataset.config.
shuffle (Union[bool, Shuffle], optional): Perform reshuffling of the data every epoch.
Bool type and Shuffle enum are both supported to pass in. Default: `Shuffle.GLOBAL` .
If `shuffle` is False, no shuffling will be performed.
If `shuffle` is True, it is equivalent to setting `shuffle` to mindspore.dataset.Shuffle.GLOBAL.
Set the mode of data shuffling by passing in enumeration variables:
- Shuffle.GLOBAL: Shuffle both the files and samples.
- Shuffle.FILES: Shuffle files only.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
When this argument is specified, `num_samples` reflects the max sample number per shard.
shard_id (int, optional): The shard ID within `num_shards` . This
argument can only be specified when `num_shards` is also specified. Default: None.
cache (DatasetCache, optional): Use tensor caching service to speed up dataset processing. More details:
`Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/en/master/dataset/cache.html>`_ .
Default: None, which means no cache is used.
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
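The difference between the two shuffle levels above can be illustrated with a small sketch (plain Python; `load` and its mode strings are hypothetical stand-ins for what `Shuffle.FILES` and `Shuffle.GLOBAL` do to a file-based dataset):

```python
import random

def load(files, mode, seed=0):
    """Return rows from per-file row lists under a given shuffle mode.

    mode: 'none' (keep order), 'files' (shuffle file order only,
    rows stay grouped per file), or 'global' (also mix every row).
    """
    rng = random.Random(seed)
    files = list(files)
    if mode in ("files", "global"):
        rng.shuffle(files)      # the Shuffle.FILES step
    rows = [row for f in files for row in f]
    if mode == "global":
        rng.shuffle(rows)       # Shuffle.GLOBAL also mixes samples
    return rows
```

`Shuffle.FILES` keeps each file's rows contiguous, which suits formats that are read sequentially; `Shuffle.GLOBAL` (the default, and what `shuffle=True` maps to) mixes individual samples across files.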
Examples:
>>> ag_news_dataset_dir = "/path/to/ag_news_dataset_file"
@ -125,45 +131,45 @@ class AmazonReviewDataset(SourceDataset, TextBaseDataset):
"""
A source dataset that reads and parses Amazon Review Polarity and Amazon Review Full datasets.
The generated dataset has three columns: :py:obj:`[label, title, content]` ,
and the data type of three columns is string.
Args:
dataset_dir (str): Path to the root directory that contains the Amazon Review Polarity dataset
or the Amazon Review Full dataset.
usage (str, optional): Usage of this dataset, can be 'train', 'test' or 'all'.
For Polarity dataset, 'train' will read from 3,600,000 train samples,
'test' will read from 400,000 test samples,
'all' will read from all 4,000,000 samples.
For Full dataset, 'train' will read from 3,000,000 train samples,
'test' will read from 650,000 test samples,
'all' will read from all 3,650,000 samples. Default: None, all samples.
num_samples (int, optional): Number of samples (rows) to be read. Default: None, reads the full dataset.
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the mindspore.dataset.config.
shuffle (Union[bool, Shuffle], optional): Perform reshuffling of the data every epoch.
Bool type and Shuffle enum are both supported to pass in. Default: `Shuffle.GLOBAL` .
If `shuffle` is False, no shuffling will be performed.
If `shuffle` is True, it is equivalent to setting `shuffle` to mindspore.dataset.Shuffle.GLOBAL.
Set the mode of data shuffling by passing in enumeration variables:
- Shuffle.GLOBAL: Shuffle both the files and samples.
- Shuffle.FILES: Shuffle files only.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
When this argument is specified, `num_samples` reflects the max sample number per shard.
shard_id (int, optional): The shard ID within `num_shards` . Default: None. This
argument can only be specified when `num_shards` is also specified.
cache (DatasetCache, optional): Use tensor caching service to speed up dataset processing. More details:
`Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/en/master/dataset/cache.html>`_ .
Default: None, which means no cache is used.
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
Examples:
>>> amazon_review_dataset_dir = "/path/to/amazon_review_dataset_dir"
@ -545,7 +551,7 @@ class DBpediaDataset(SourceDataset, TextBaseDataset):
"""
A source dataset that reads and parses the DBpedia dataset.
The generated dataset has three columns :py:obj:`[class, title, content]` ,
and the data type of three columns is string.
Args:
@ -553,34 +559,34 @@ class DBpediaDataset(SourceDataset, TextBaseDataset):
usage (str, optional): Usage of this dataset, can be 'train', 'test' or 'all'.
'train' will read from 560,000 train samples,
'test' will read from 70,000 test samples,
'all' will read from all 630,000 samples. Default: None, all samples.
num_samples (int, optional): The number of samples to be included in the dataset.
Default: None, will include all text.
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the mindspore.dataset.config.
shuffle (Union[bool, Shuffle], optional): Perform reshuffling of the data every epoch.
Bool type and Shuffle enum are both supported to pass in. Default: `Shuffle.GLOBAL` .
If shuffle is False, no shuffling will be performed.
If shuffle is True, it is equivalent to setting `shuffle` to mindspore.dataset.Shuffle.GLOBAL.
Set the mode of data shuffling by passing in enumeration variables:
- Shuffle.GLOBAL: Shuffle both the files and samples.
- Shuffle.FILES: Shuffle files only.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
When this argument is specified, `num_samples` reflects the maximum sample number per shard.
shard_id (int, optional): The shard ID within `num_shards` . Default: None. This
argument can only be specified when `num_shards` is also specified.
cache (DatasetCache, optional): Use tensor caching service to speed up dataset processing. More details:
`Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/en/master/dataset/cache.html>`_ .
Default: None, which means no cache is used.
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
Examples:
@ -640,33 +646,42 @@ class DBpediaDataset(SourceDataset, TextBaseDataset):
class EnWik9Dataset(SourceDataset, TextBaseDataset):
"""
A source dataset that reads and parses EnWik9 dataset.
The generated dataset has one column :py:obj:`[text]` with type string.
Args:
dataset_dir (str): Path to the root directory that contains the dataset.
num_samples (int, optional): The number of samples to be included in the dataset.
Default: None, will include all samples.
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the mindspore.dataset.config.
shuffle (Union[bool, Shuffle], optional): Perform reshuffling of the data every epoch.
Bool type and Shuffle enum are both supported to pass in. Default: True.
If shuffle is False, no shuffling will be performed.
If shuffle is True, it is equivalent to setting `shuffle` to mindspore.dataset.Shuffle.GLOBAL.
Set the mode of data shuffling by passing in enumeration variables:
- Shuffle.GLOBAL: Shuffle both the files and samples.
- Shuffle.FILES: Shuffle files only.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
When this argument is specified, `num_samples` reflects the maximum sample number per shard.
shard_id (int, optional): The shard ID within `num_shards` . Default: None. This
argument can only be specified when `num_shards` is also specified.
cache (DatasetCache, optional): Use tensor caching service to speed up dataset processing. More details:
`Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/en/master/dataset/cache.html>`_ .
Default: None, which means no cache is used.
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
Examples:
>>> en_wik9_dataset_dir = "/path/to/en_wik9_dataset"
@ -719,38 +734,41 @@ class IMDBDataset(MappableDataset, TextBaseDataset):
"""
A source dataset that reads and parses Internet Movie Database (IMDb).
The generated dataset has two columns: :py:obj:`[text, label]` .
The tensor of column :py:obj:`text` is of the string type.
The column :py:obj:`label` is a scalar of the uint32 type.
Args:
dataset_dir (str): Path to the root directory that contains the dataset.
usage (str, optional): Usage of this dataset, can be 'train', 'test' or 'all'.
Default: None, will read all samples.
num_samples (int, optional): The number of images to be included in the dataset.
Default: None, will read all samples.
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the mindspore.dataset.config.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table below.
sampler (Sampler, optional): Object used to choose samples from the dataset.
Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided
into. Default: None. When this argument is specified, `num_samples` reflects
the maximum sample number per shard.
shard_id (int, optional): The shard ID within `num_shards` . Default: None. This
argument can only be specified when `num_shards` is also specified.
cache (DatasetCache, optional): Use tensor caching service to speed up dataset processing. More details:
`Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/en/master/dataset/cache.html>`_ .
Default: None, which means no cache is used.
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
RuntimeError: If `sampler` and `shuffle` are specified at the same time.
RuntimeError: If `sampler` and `num_shards`/`shard_id` are specified at the same time.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
Note:
@ -861,47 +879,48 @@ class IWSLT2016Dataset(SourceDataset, TextBaseDataset):
"""
A source dataset that reads and parses IWSLT2016 datasets.
The generated dataset has two columns: :py:obj:`[text, translation]` .
The tensor of column :py:obj:`text` is of the string type.
The column :py:obj:`translation` is of the string type.
Args:
dataset_dir (str): Path to the root directory that contains the dataset.
usage (str, optional): Acceptable usages include 'train', 'valid', 'test' and 'all'. Default: None, all samples.
language_pair (sequence, optional): Sequence containing source and target language, supported values are
('en', 'fr'), ('en', 'de'), ('en', 'cs'), ('en', 'ar'), ('fr', 'en'), ('de', 'en'), ('cs', 'en'),
('ar', 'en'). Default: ('de', 'en').
valid_set (str, optional): A string to identify validation set, when usage is valid or all, the validation set
of `valid_set` type will be read, supported values are 'dev2010', 'tst2010', 'tst2011', 'tst2012', 'tst2013'
and 'tst2014'. Default: 'tst2013'.
test_set (str, optional): A string to identify test set, when usage is test or all, the test set of `test_set`
type will be read, supported values are 'dev2010', 'tst2010', 'tst2011', 'tst2012', 'tst2013' and 'tst2014'.
Default: 'tst2014'.
num_samples (int, optional): Number of samples (rows) to read. Default: None, reads the full dataset.
shuffle (Union[bool, Shuffle], optional): Perform reshuffling of the data every epoch.
Bool type and Shuffle enum are both supported to pass in. Default: `Shuffle.GLOBAL` .
If `shuffle` is False, no shuffling will be performed.
If `shuffle` is True, it is equivalent to setting `shuffle` to mindspore.dataset.Shuffle.GLOBAL.
Set the mode of data shuffling by passing in enumeration variables:
- Shuffle.GLOBAL: Shuffle both the files and samples.
- Shuffle.FILES: Shuffle files only.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
When this argument is specified, `num_samples` reflects the max sample number per shard.
shard_id (int, optional): The shard ID within `num_shards` . Default: None. This
argument can only be specified when `num_shards` is also specified.
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the mindspore.dataset.config.
cache (DatasetCache, optional): Use tensor caching service to speed up dataset processing. More details:
`Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/en/master/dataset/cache.html>`_ .
Default: None, which means no cache is used.
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
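Since `language_pair` only accepts the eight listed combinations, a caller-side check can be sketched as follows (`check_language_pair` is a hypothetical helper for illustration, not part of the API):

```python
# The eight pairs listed in the Args section above.
SUPPORTED_PAIRS = {
    ("en", "fr"), ("en", "de"), ("en", "cs"), ("en", "ar"),
    ("fr", "en"), ("de", "en"), ("cs", "en"), ("ar", "en"),
}

def check_language_pair(language_pair=("de", "en")):
    """Validate an IWSLT2016 language_pair before building the dataset."""
    pair = tuple(language_pair)
    if pair not in SUPPORTED_PAIRS:
        raise ValueError(f"unsupported language_pair: {pair}")
    return pair
```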
Examples:
>>> iwslt2016_dataset_dir = "/path/to/iwslt2016_dataset_dir"
@ -912,8 +931,8 @@ class IWSLT2016Dataset(SourceDataset, TextBaseDataset):
IWSLT is an international oral translation conference, a major annual scientific conference dedicated to all aspects
of oral translation. The MT task of the IWSLT evaluation activity constitutes a dataset, which can be publicly
obtained through the WIT3 website `wit3 <https://wit3.fbk.eu>`_ . The IWSLT2016 dataset includes translations from
English to Arabic, Czech, French, and German, and translations from Arabic, Czech, French, and German to English.
You can unzip the original IWSLT2016 dataset files into this directory structure and read by MindSpore's API. After
decompression, you also need to decompress the dataset to be read in the specified folder. For example, if you want
@ -988,42 +1007,42 @@ class IWSLT2017Dataset(SourceDataset, TextBaseDataset):
"""
A source dataset that reads and parses IWSLT2017 datasets.
The generated dataset has two columns: :py:obj:`[text, translation]` .
The tensor of column :py:obj:`text` and :py:obj:`translation` are of the string type.
Args:
dataset_dir (str): Path to the root directory that contains the dataset.
usage (str, optional): Acceptable usages include 'train', 'valid', 'test' and 'all'. Default: None, all samples.
language_pair (sequence, optional): List containing src and tgt language, supported values are ('en', 'nl'),
('en', 'de'), ('en', 'it'), ('en', 'ro'), ('nl', 'en'), ('nl', 'de'), ('nl', 'it'), ('nl', 'ro'),
('de', 'en'), ('de', 'nl'), ('de', 'it'), ('de', 'ro'), ('it', 'en'), ('it', 'nl'), ('it', 'de'),
('it', 'ro'), ('ro', 'en'), ('ro', 'nl'), ('ro', 'de'), ('ro', 'it'). Default: ('de', 'en').
num_samples (int, optional): Number of samples (rows) to read. Default: None, reads the full dataset.
shuffle (Union[bool, Shuffle], optional): Perform reshuffling of the data every epoch.
Bool type and Shuffle enum are both supported to pass in. Default: `Shuffle.GLOBAL` .
If shuffle is False, no shuffling will be performed.
If shuffle is True, it is equivalent to setting `shuffle` to mindspore.dataset.Shuffle.GLOBAL.
Set the mode of data shuffling by passing in enumeration variables:
- Shuffle.GLOBAL: Shuffle both the files and samples.
- Shuffle.FILES: Shuffle files only.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
When this argument is specified, `num_samples` reflects the max sample number of per shard.
shard_id (int, optional): The shard ID within `num_shards` . Default: None. This
argument can only be specified when `num_shards` is also specified.
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the mindspore.dataset.config.
cache (DatasetCache, optional): Use tensor caching service to speed up dataset processing. More details:
`Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/en/master/dataset/cache.html>`_ .
Default: None, which means no cache is used.
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
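The difference between the two shuffle levels can be sketched in plain Python. This is a toy model of the documented semantics (Shuffle.FILES reorders only the file order; Shuffle.GLOBAL additionally mixes rows across files), not MindSpore's actual implementation; `demo_shuffle` and its arguments are hypothetical names for illustration:

```python
import random

def demo_shuffle(files, samples_per_file, mode, seed=0):
    """Toy model of the two shuffle levels on a small corpus.

    mode='files'  -> shuffle the file order only (like Shuffle.FILES);
    mode='global' -> shuffle the file order and then all rows (like Shuffle.GLOBAL).
    Returns a list of (file, row_index) pairs in read order.
    """
    rng = random.Random(seed)
    files = list(files)
    rng.shuffle(files)  # both modes reorder the files
    rows = [(f, i) for f in files for i in range(samples_per_file[f])]
    if mode == "global":
        rng.shuffle(rows)  # GLOBAL additionally mixes rows across files
    return rows
```

Note that in 'files' mode the rows belonging to one file stay contiguous, while 'global' mode yields the same multiset of rows in a fully mixed order.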
Examples:
>>> iwslt2017_dataset_dir = "/path/to/iwslt2017_dataset_dir"
@@ -1033,8 +1052,8 @@ class IWSLT2017Dataset(SourceDataset, TextBaseDataset):
IWSLT is an international oral translation conference, a major annual scientific conference dedicated to all aspects
of oral translation. The MT task of the IWSLT evaluation activity constitutes a dataset, which can be publicly
obtained through the WIT3 website `wit3 <https://wit3.fbk.eu>`_ . The IWSLT2017 dataset involves German, English,
Italian, Dutch, and Romanian. The dataset includes translations in any two different languages.
You can unzip the original IWSLT2017 dataset files into this directory structure and read them with MindSpore's API. You
need to decompress the dataset package in texts/DeEnItNlRo/DeEnItNlRo directory to get the DeEnItNlRo-DeEnItNlRo
@@ -1186,7 +1205,7 @@ class PennTreebankDataset(SourceDataset, TextBaseDataset):
"""
A source dataset that reads and parses PennTreebank datasets.
The generated dataset has one column :py:obj:`[text]` .
The tensor of column :py:obj:`text` is of the string type.
Args:
@@ -1195,27 +1214,33 @@ class PennTreebankDataset(SourceDataset, TextBaseDataset):
'train' will read from 42,068 train samples of string type,
'test' will read from 3,370 test samples of string type,
'valid' will read from 3,761 valid samples of string type,
'all' will read from all 49,199 samples of string type. Default: None, all samples.
num_samples (int, optional): Number of samples (rows) to read. Default: None, reads the full dataset.
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the mindspore.dataset.config.
shuffle (Union[bool, Shuffle], optional): Perform reshuffling of the data every epoch.
Bool type and Shuffle enum are both supported to pass in. Default: `Shuffle.GLOBAL` .
If shuffle is False, no shuffling will be performed.
If shuffle is True, it is equivalent to setting `shuffle` to mindspore.dataset.Shuffle.GLOBAL.
Set the mode of data shuffling by passing in enumeration variables:
- Shuffle.GLOBAL: Shuffle both the files and samples.
- Shuffle.FILES: Shuffle files only.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
When this argument is specified, `num_samples` reflects the max sample number of per shard.
shard_id (int, optional): The shard ID within `num_shards` . Default: None. This
argument can only be specified when `num_shards` is also specified.
cache (DatasetCache, optional): Use tensor caching service to speed up dataset processing. More details:
`Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/en/master/dataset/cache.html>`_ .
Default: None, which means no cache is used.
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
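The sharding arguments behave the same way across these dataset classes; a minimal pure-Python sketch of the documented contract (`num_shards` and `shard_id` must be given together, `shard_id` must be in range, and `num_samples` then caps the rows of each shard) might look like the following. The round-robin row assignment is an assumption for illustration only, not MindSpore's internal partitioning scheme:

```python
def shard_rows(rows, num_shards=None, shard_id=None, num_samples=None):
    """Toy model of `num_shards`/`shard_id`/`num_samples` semantics.

    Assumed round-robin split: shard `shard_id` takes every num_shards-th
    row starting at offset shard_id; `num_samples` caps rows *per shard*.
    """
    if (num_shards is None) != (shard_id is None):
        raise RuntimeError("`num_shards` and `shard_id` must be specified together.")
    if num_shards is not None:
        if not 0 <= shard_id < num_shards:
            raise ValueError("`shard_id` is invalid (< 0 or >= `num_shards`).")
        rows = rows[shard_id::num_shards]
    return rows if num_samples is None else rows[:num_samples]
```

For example, with 10 rows, `num_shards=2` and `shard_id=1`, the shard sees 5 rows, and `num_samples=3` would then keep only the first 3 of those.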
Examples:
>>> penn_treebank_dataset_dir = "/path/to/penn_treebank_dataset_directory"


@@ -1444,7 +1444,7 @@ class EMnistDataset(MappableDataset, VisionBaseDataset):
"""
A source dataset that reads and parses the EMNIST dataset.
The generated dataset has two columns :py:obj:`[image, label]` .
The tensor of column :py:obj:`image` is of the uint8 type.
The tensor of column :py:obj:`label` is a scalar of the uint32 type.
@@ -1452,23 +1452,24 @@ class EMnistDataset(MappableDataset, VisionBaseDataset):
dataset_dir (str): Path to the root directory that contains the dataset.
name (str): Name of splits for this dataset, can be 'byclass', 'bymerge', 'balanced', 'letters', 'digits'
or 'mnist'.
usage (str, optional): Usage of this dataset, can be 'train', 'test' or 'all'. 'train' will read the
train samples, 'test' will read the test samples and 'all' will read all samples; the number of samples
in each split depends on `name`. Default: None, will read all samples.
num_samples (int, optional): The number of images to be included in the dataset.
Default: None, will read all images.
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the mindspore.dataset.config.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table below.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
When this argument is specified, `num_samples` reflects the max sample number of per shard.
shard_id (int, optional): The shard ID within `num_shards` . Default: None. This
argument can only be specified when `num_shards` is also specified.
cache (DatasetCache, optional): Use tensor caching service to speed up dataset processing. More details:
`Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/en/master/dataset/cache.html>`_ .
Default: None, which means no cache is used.
Raises:
RuntimeError: If `sampler` and `shuffle` are specified at the same time.
@@ -1577,44 +1578,44 @@ class FakeImageDataset(MappableDataset, VisionBaseDataset):
"""
A source dataset for generating fake images.
The generated dataset has two columns :py:obj:`[image, label]` .
The tensor of column :py:obj:`image` is of the uint8 type.
The column :py:obj:`label` is a scalar of the uint32 type.
Args:
num_images (int, optional): Number of images to generate in the dataset. Default: 1000.
image_size (tuple, optional): Size of the fake image. Default: (224, 224, 3).
num_classes (int, optional): Number of classes in the dataset. Default: 10.
base_seed (int, optional): Offsets the index-based random seed used to generate each image. Default: 0.
num_samples (int, optional): The number of images to be included in the dataset.
Default: None, will read all images.
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the mindspore.dataset.config.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table below.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
When this argument is specified, `num_samples` reflects the max sample number of per shard.
shard_id (int, optional): The shard ID within `num_shards` . Default: None. This
argument can only be specified when `num_shards` is also specified.
cache (DatasetCache, optional): Use tensor caching service to speed up dataset processing. More details:
`Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/en/master/dataset/cache.html>`_ .
Default: None, which means no cache is used.
Raises:
RuntimeError: If `sampler` and `shuffle` are specified at the same time.
RuntimeError: If `sampler` and `num_shards`/`shard_id` are specified at the same time.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
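How `base_seed` offsets the index-based seed can be sketched in plain Python. This toy version only mirrors the documented determinism (the same `base_seed` and image index always yield the same image/label pair); `fake_image` and its small default size are illustrative, not the actual generator:

```python
import random

def fake_image(index, image_size=(8, 8, 3), num_classes=10, base_seed=0):
    """Toy sketch of index-based fake image generation.

    The per-image seed is base_seed + index, so generation is deterministic
    and reproducible for a given (base_seed, index) pair.
    """
    rng = random.Random(base_seed + index)
    h, w, c = image_size
    # nested lists of uint8-range values stand in for the real image tensor
    image = [[[rng.randrange(256) for _ in range(c)] for _ in range(w)]
             for _ in range(h)]
    label = rng.randrange(num_classes)  # a uint32 scalar in the real dataset
    return image, label
```

Calling it twice with the same arguments returns identical data, which is why a fixed `base_seed` makes the whole fake dataset reproducible across epochs.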
Note:
- This dataset can take in a `sampler` . `sampler` and `shuffle` are mutually exclusive.
The table below shows what input arguments are allowed and their expected behavior.
.. list-table:: Expected Order Behavior of Using `sampler` and `shuffle`
:widths: 25 25 50
:header-rows: 1
@@ -1665,40 +1666,40 @@ class FakeImageDataset(MappableDataset, VisionBaseDataset):
class FashionMnistDataset(MappableDataset, VisionBaseDataset):
"""
A source dataset that reads and parses the Fashion-MNIST dataset.
The generated dataset has two columns :py:obj:`[image, label]` .
The tensor of column :py:obj:`image` is of the uint8 type.
The column :py:obj:`label` is a scalar of the uint32 type.
Args:
dataset_dir (str): Path to the root directory that contains the dataset.
usage (str, optional): Usage of this dataset, can be 'train', 'test' or 'all'. 'train' will read from 60,000
train samples, 'test' will read from 10,000 test samples, 'all' will read from all 70,000 samples.
Default: None, will read all samples.
num_samples (int, optional): The number of images to be included in the dataset.
Default: None, will read all images.
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the mindspore.dataset.config.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table below.
sampler (Sampler, optional): Object used to choose samples from the dataset.
Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
When this argument is specified, `num_samples` reflects the maximum sample number of per shard.
shard_id (int, optional): The shard ID within `num_shards` . Default: None. This
argument can only be specified when `num_shards` is also specified.
cache (DatasetCache, optional): Use tensor caching service to speed up dataset processing. More details:
`Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/en/master/dataset/cache.html>`_ .
Default: None, which means no cache is used.
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
RuntimeError: If `sampler` and `shuffle` are specified at the same time.
RuntimeError: If `sampler` and `num_shards`/`shard_id` are specified at the same time.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
Note:
@@ -2033,43 +2034,42 @@ class Flowers102Dataset(GeneratorDataset):
"""
A source dataset that reads and parses Flowers102 dataset.
According to the given `task` configuration, the generated dataset has different output columns:
- `task` = 'Classification', output columns: `[image, dtype=uint8]` , `[label, dtype=uint32]` .
- `task` = 'Segmentation',
output columns: `[image, dtype=uint8]` , `[segmentation, dtype=uint8]` , `[label, dtype=uint32]` .
Args:
dataset_dir (str): Path to the root directory that contains the dataset.
task (str, optional): Specify the 'Classification' or 'Segmentation' task. Default: 'Classification'.
usage (str, optional): Specify the 'train', 'valid', 'test' part or 'all' parts of dataset.
Default: 'all', will read all samples.
num_samples (int, optional): The number of samples to be included in the dataset. Default: None, all images.
num_parallel_workers (int, optional): Number of subprocesses used to fetch the dataset in parallel. Default: 1.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table below.
decode (bool, optional): Whether or not to decode the images and segmentations after reading. Default: False.
sampler (Union[Sampler, Iterable], optional): Object used to choose samples from the dataset.
Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
When this argument is specified, `num_samples` reflects the max sample number of per shard.
shard_id (int, optional): The shard ID within `num_shards` . Default: None. This argument must be specified only
when `num_shards` is also specified.
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
RuntimeError: If `sampler` and `shuffle` are specified at the same time.
RuntimeError: If `sampler` and `num_shards`/`shard_id` are specified at the same time.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
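The Raises entries above encode a consistency contract between `sampler`, `shuffle`, `num_shards` and `shard_id` that is shared by these dataset classes. A pure-Python sketch of those checks follows; the function name is illustrative and this is not MindSpore's internal validation code:

```python
def validate_args(shuffle=None, sampler=None, num_shards=None, shard_id=None):
    """Toy sketch of the argument checks implied by the Raises section."""
    if sampler is not None and shuffle is not None:
        # sampler and shuffle are mutually exclusive
        raise RuntimeError("sampler and shuffle cannot be specified at the same time.")
    if sampler is not None and (num_shards is not None or shard_id is not None):
        # sampler and sharding are mutually exclusive
        raise RuntimeError("sampler and num_shards/shard_id cannot be specified "
                           "at the same time.")
    if (num_shards is None) != (shard_id is None):
        # both or neither must be given
        raise RuntimeError("num_shards and shard_id must be specified together.")
    if num_shards is not None and not 0 <= shard_id < num_shards:
        raise ValueError("shard_id is invalid (< 0 or >= num_shards).")
```

Passing only `shuffle`, only `sampler`, or a valid `num_shards`/`shard_id` pair is accepted; every combination listed in the Raises section is rejected with the corresponding exception type.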
Note:
- This dataset can take in a `sampler` . `sampler` and `shuffle` are mutually exclusive.
The table below shows what input arguments are allowed and their expected behavior.
.. list-table:: Expected Order Behavior of Using `sampler` and `shuffle`
:widths: 25 25 50
:header-rows: 1
@@ -2479,39 +2479,39 @@ class KMnistDataset(MappableDataset, VisionBaseDataset):
"""
A source dataset that reads and parses the KMNIST dataset.
The generated dataset has two columns :py:obj:`[image, label]` .
The tensor of column :py:obj:`image` is of the uint8 type.
The column :py:obj:`label` is a scalar of the uint32 type.
Args:
dataset_dir (str): Path to the root directory that contains the dataset.
usage (str, optional): Usage of this dataset, can be 'train', 'test' or 'all' . 'train' will read from 60,000
train samples, 'test' will read from 10,000 test samples, 'all' will read from all 70,000 samples.
Default: None, will read all samples.
num_samples (int, optional): The number of images to be included in the dataset.
Default: None, will read all images.
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the mindspore.dataset.config.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table below.
sampler (Sampler, optional): Object used to choose samples from the dataset.
Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
When this argument is specified, `num_samples` reflects the maximum sample number of per shard.
shard_id (int, optional): The shard ID within `num_shards` . Default: None. This
argument can only be specified when `num_shards` is also specified.
cache (DatasetCache, optional): Use tensor caching service to speed up dataset processing. More details:
`Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/en/master/dataset/cache.html>`_ .
Default: None, which means no cache is used.
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
RuntimeError: If `sampler` and `shuffle` are specified at the same time.
RuntimeError: If `sampler` and sharding are specified at the same time.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
Note:
- This dataset can take in a `sampler`. `sampler` and `shuffle` are mutually exclusive.
@@ -3259,41 +3259,38 @@ class PhotoTourDataset(MappableDataset, VisionBaseDataset):
"""
A source dataset that reads and parses the PhotoTour dataset.
According to the given `usage` configuration, the generated dataset has different output columns:
- `usage` = 'train', output columns: `[image, dtype=uint8]` .
- `usage` != 'train', output columns: `[image1, dtype=uint8]` , `[image2, dtype=uint8]` , `[matches, dtype=uint32]` .
Args:
dataset_dir (str): Path to the root directory that contains the dataset.
name (str): Name of the dataset to load,
should be one of 'notredame', 'yosemite', 'liberty', 'notredame_harris',
'yosemite_harris' or 'liberty_harris'.
usage (str, optional): Usage of the dataset, can be 'train' or 'test'. Default: None, will be set to 'train'.
When usage is 'train', number of samples for each `name` is
{'notredame': 468159, 'yosemite': 633587, 'liberty': 450092, 'liberty_harris': 379587,
'yosemite_harris': 450912, 'notredame_harris': 325295}.
When usage is 'test', will read 100,000 samples for testing.
num_samples (int, optional): The number of images to be included in the dataset.
Default: None, will read all images.
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the mindspore.dataset.config.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table below.
sampler (Sampler, optional): Object used to choose samples from the dataset.
Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
When this argument is specified, `num_samples` reflects the max sample number of per shard.
shard_id (int, optional): The shard ID within `num_shards` . Default: None. This
argument can only be specified when `num_shards` is also specified.
cache (DatasetCache, optional): Use tensor caching service to speed up dataset processing. More details:
`Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/en/master/dataset/cache.html>`_ .
Default: None, which means no cache is used.
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
RuntimeError: If `sampler` and `shuffle` are specified at the same time.
RuntimeError: If `sampler` and `num_shards`/`shard_id` are specified at the same time.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
@@ -3302,13 +3299,14 @@ class PhotoTourDataset(MappableDataset, VisionBaseDataset):
ValueError: If `usage` is not in ["train", "test"].
ValueError: If `name` is not in ["notredame", "yosemite", "liberty",
"notredame_harris", "yosemite_harris", "liberty_harris"].
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
Note:
- This dataset can take in a `sampler` . `sampler` and `shuffle` are mutually exclusive. The table
below shows what input arguments are allowed and their expected behavior.
.. list-table:: Expected Order Behavior of Using `sampler` and `shuffle`
:widths: 64 64 1
:header-rows: 1