Fix chinese api info

This commit is contained in:
shenwei41 2022-11-04 14:39:15 +08:00
parent 3008a3cee8
commit 73bea6d638
51 changed files with 394 additions and 325 deletions

View File

@ -171,9 +171,9 @@ mindspore.dataset.CLUEDataset
- **ValueError** - `task` 参数不为 'AFQMC'、'TNEWS'、'IFLYTEK'、'CMNLI'、'WSC' 或 'CSL'。
- **ValueError** - `usage` 参数不为 'train'、'test' 或 'eval'。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards` )。
**关于CLUE数据集**
@ -204,5 +204,5 @@ mindspore.dataset.CLUEDataset
howpublished = {https://github.com/CLUEbenchmark/CLUE}
}
.. include:: mindspore.dataset.api_list_nlp.rst
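For quick reference, a minimal usage sketch matching the `task` and `usage` values listed above, in the same doctest style as the docstring examples (the file path is a placeholder):

>>> import mindspore.dataset as ds
>>> clue_files = ["/path/to/clue_dataset_file"]  # placeholder path
>>> dataset = ds.CLUEDataset(clue_files, task='AFQMC', usage='train')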

View File

@ -29,7 +29,7 @@
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
.. include:: mindspore.dataset.api_list_nlp.rst

View File

@ -29,13 +29,13 @@ mindspore.dataset.Caltech101Dataset
异常:
- **RuntimeError** - `dataset_dir` 路径下不包含任何数据文件。
- **ValueError** - `target_type` 参数取值不为'category'、'annotation'或'all'。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **RuntimeError** - 同时指定了 `sampler` 和 `shuffle` 参数。
- **RuntimeError** - 同时指定了 `sampler` 和 `num_shards` 参数或同时指定了 `sampler` 和 `shard_id` 参数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `shard_id` 参数值错误小于0或者大于等于 `num_shards` )。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
- **ValueError** - `target_type` 参数取值不为'category'、'annotation'或'all'。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
.. note:: 此数据集可以指定参数 `sampler` ,但参数 `sampler` 和参数 `shuffle` 的行为是互斥的。下表展示了几种合法的输入参数组合及预期的行为。
@ -110,5 +110,5 @@ mindspore.dataset.Caltech101Dataset
url = {http://data.caltech.edu/records/20086},
}
.. include:: mindspore.dataset.api_list_vision.rst

View File

@ -20,13 +20,13 @@ mindspore.dataset.Caltech256Dataset
异常:
- **RuntimeError** - `dataset_dir` 路径下不包含任何数据文件。
- **ValueError** - `target_type` 参数取值不为'category'、'annotation'或'all'。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **RuntimeError** - 同时指定了 `sampler` 和 `shuffle` 参数。
- **RuntimeError** - 同时指定了 `sampler` 和 `num_shards` 参数或同时指定了 `sampler` 和 `shard_id` 参数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `shard_id` 参数值错误小于0或者大于等于 `num_shards` )。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
- **ValueError** - `target_type` 参数取值不为'category'、'annotation'或'all'。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
.. note:: 此数据集可以指定参数 `sampler` ,但参数 `sampler` 和参数 `shuffle` 的行为是互斥的。下表展示了几种合法的输入参数组合及预期的行为。
@ -97,5 +97,5 @@ mindspore.dataset.Caltech256Dataset
year = {2007}
}
.. include:: mindspore.dataset.api_list_vision.rst

View File

@ -23,13 +23,13 @@ mindspore.dataset.CelebADataset
异常:
- **RuntimeError** - `dataset_dir` 路径下不包含任何数据文件。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **ValueError** - `usage` 参数取值不为'train'、'valid'、'test'或'all'。
- **RuntimeError** - 同时指定了 `sampler` 和 `shuffle` 参数。
- **RuntimeError** - 同时指定了 `sampler` 和 `num_shards` 参数或同时指定了 `sampler` 和 `shard_id` 参数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `shard_id` 参数值错误小于0或者大于等于 `num_shards` )。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **ValueError** - `usage` 参数取值不为'train'、'valid'、'test'或'all'。
.. note:: 此数据集可以指定参数 `sampler` ,但参数 `sampler` 和参数 `shuffle` 的行为是互斥的。下表展示了几种合法的输入参数组合及预期的行为。
@ -120,5 +120,5 @@ mindspore.dataset.CelebADataset
howpublished = {http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html}
}
.. include:: mindspore.dataset.api_list_vision.rst

View File

@ -21,13 +21,13 @@ mindspore.dataset.Cifar100Dataset
异常:
- **RuntimeError** - `dataset_dir` 路径下不包含数据文件。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **ValueError** - `usage` 参数取值不为'train'、'test'或'all'。
- **RuntimeError** - 同时指定了 `sampler` 和 `shuffle` 参数。
- **RuntimeError** - 同时指定了 `sampler` 和 `num_shards` 参数或同时指定了 `sampler` 和 `shard_id` 参数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`)。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **ValueError** - `usage` 参数取值不为'train'、'test'或'all'。
.. note:: 此数据集可以指定参数 `sampler` ,但参数 `sampler` 和参数 `shuffle` 的行为是互斥的。下表展示了几种合法的输入参数组合及预期的行为。
@ -84,5 +84,5 @@ mindspore.dataset.Cifar100Dataset
howpublished = {http://www.cs.toronto.edu/~kriz/cifar.html}
}
.. include:: mindspore.dataset.api_list_vision.rst

View File

@ -21,13 +21,13 @@ mindspore.dataset.Cifar10Dataset
异常:
- **RuntimeError** - `dataset_dir` 路径下不包含数据文件。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **ValueError** - `usage` 参数取值不为'train'、'test'或'all'。
- **RuntimeError** - 同时指定了 `sampler` 和 `shuffle` 参数。
- **RuntimeError** - 同时指定了 `sampler` 和 `num_shards` 参数或同时指定了 `sampler` 和 `shard_id` 参数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards` )。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **ValueError** - `usage` 参数取值不为'train'、'test'或'all'。
.. note:: 此数据集可以指定参数 `sampler` ,但参数 `sampler` 和参数 `shuffle` 的行为是互斥的。下表展示了几种合法的输入参数组合及预期的行为。
@ -88,4 +88,4 @@ mindspore.dataset.Cifar10Dataset
howpublished = {http://www.cs.toronto.edu/~kriz/cifar.html}
}
.. include:: mindspore.dataset.api_list_vision.rst
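To make the `num_shards` / `shard_id` sharding used throughout these exception lists concrete, a minimal sketch (the directory is a placeholder; in a real distributed job `shard_id` comes from the process rank):

>>> import mindspore.dataset as ds
>>> cifar10_dir = "/path/to/cifar10_dataset_directory"  # placeholder path
>>> # rank 0 of a 4-way data-parallel job reads shard 0; the other ranks pass shard_id 1..3
>>> dataset = ds.Cifar10Dataset(cifar10_dir, usage='train', num_shards=4, shard_id=0)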

View File

@ -25,16 +25,16 @@ mindspore.dataset.CityscapesDataset
异常:
- **RuntimeError** - `dataset_dir` 路径下不包含任何数据文件。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **RuntimeError** - 同时指定了 `sampler` 和 `shuffle` 参数。
- **RuntimeError** - 同时指定了 `sampler` 和 `num_shards` 参数或同时指定了 `sampler` 和 `shard_id` 参数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **ValueError** - `dataset_dir` 路径非法或不存在。
- **ValueError** - `task` 参数取值不为'instance'、'semantic'、'polygon'或'color'。
- **ValueError** - `quality_mode` 参数取值不为'fine'或'coarse'。
- **ValueError** - `usage` 参数取值不在给定的字段中。
- **ValueError** - `shard_id` 参数值错误(小于0或者大于等于 `num_shards`
- **ValueError** - `shard_id` 参数错误,小于0或者大于等于 `num_shards`
.. note:: 此数据集可以指定参数 `sampler` ,但参数 `sampler` 和参数 `shuffle` 的行为是互斥的。下表展示了几种合法的输入参数组合及预期的行为。
@ -121,5 +121,5 @@ mindspore.dataset.CityscapesDataset
year = {2016}
}
.. include:: mindspore.dataset.api_list_vision.rst

View File

@ -67,7 +67,7 @@
- **ValueError** - `task` 参数取值不为 `Detection`、`Stuff`、`Panoptic` 或 `Keypoint`。
- **ValueError** - `annotation_file` 参数对应的文件不存在。
- **ValueError** - `dataset_dir` 参数路径不存在。
- **ValueError** - `shard_id` 参数值错误(小于0或者大于等于 `num_shards`
- **ValueError** - `shard_id` 参数错误,小于0或者大于等于 `num_shards`
.. note::
- 当参数 `extra_metadata` 为True时还需使用 `rename` 操作删除额外数据列'_meta-filename'的前缀'_meta-'
@ -151,5 +151,5 @@
bibsource = {dblp computer science bibliography, https://dblp.org}
}
.. include:: mindspore.dataset.api_list_vision.rst

View File

@ -29,7 +29,7 @@ mindspore.dataset.DBpediaDataset
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
**关于DBpedia数据集**

View File

@ -35,7 +35,7 @@ mindspore.dataset.DIV2KDataset
- **ValueError** - `scale` 参数取值不在给定的字段中,或与 `downgrade` 参数的值不匹配。
- **ValueError** - `scale` 参数取值为8且 `downgrade` 参数的值不为 'bicubic'。
- **ValueError** - `downgrade` 参数取值为'mild'、'difficult'或'wild',但 `scale` 参数的值不为4。
- **ValueError** - `shard_id` 参数值错误(小于0或者大于等于 `num_shards`
- **ValueError** - `shard_id` 参数错误,小于0或者大于等于 `num_shards`
.. note:: 此数据集可以指定参数 `sampler` ,但参数 `sampler` 和参数 `shuffle` 的行为是互斥的。下表展示了几种合法的输入参数组合及预期的行为。
@ -138,5 +138,5 @@ mindspore.dataset.DIV2KDataset
year = {2017}
}
.. include:: mindspore.dataset.api_list_vision.rst

View File

@ -28,7 +28,7 @@
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `annotation_file` 参数对应的文件不存在。
- **ValueError** - `dataset_dir` 参数路径不存在。
- **ValueError** - `shard_id` 参数值错误(小于0或者大于等于 `num_shards`
- **ValueError** - `shard_id` 参数错误,小于0或者大于等于 `num_shards`
.. note:: 此数据集可以指定参数 `sampler` ,但参数 `sampler` 和参数 `shuffle` 的行为是互斥的。下表展示了几种合法的输入参数组合及预期的行为。
@ -126,5 +126,5 @@
bibsource = {dblp computer science bibliography, https://dblp.org}
}
.. include:: mindspore.dataset.api_list_vision.rst

View File

@ -29,7 +29,7 @@ mindspore.dataset.Flowers102Dataset
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
.. note:: 此数据集可以指定参数 `sampler` ,但参数 `sampler` 和参数 `shuffle` 的行为是互斥的。下表展示了几种合法的输入参数组合及预期的行为。

View File

@ -14,9 +14,8 @@
- **column_names** (Union[str, list[str]],可选) - 指定数据集生成的列名。默认值None不指定。用户可以通过此参数或 `schema` 参数指定列名。
- **column_types** (list[mindspore.dtype],可选) - 指定生成数据集各个数据列的数据类型。默认值None不指定。
如果未指定该参数,则自动推断类型;如果指定了该参数,将在数据输出时做类型匹配检查。
- **schema** (Union[Schema, str],可选) - 读取模式策略,用于指定读取数据列的数据类型、数据维度等信息。
支持传入JSON文件路径或 mindspore.dataset.Schema 构造的对象。默认值None不指定。
用户可以通过提供 `column_names` 或 `schema` 指定数据集的列名,但如果同时指定两者,则将优先从 `schema` 中获取列名信息。
- **schema** (Union[str, Schema], 可选) - 数据格式策略,用于指定读取数据列的数据类型、数据维度等信息。
支持传入JSON文件路径或 mindspore.dataset.Schema 构造的对象。默认值None。
- **num_samples** (int, 可选) - 指定从数据集中读取的样本数。默认值None读取全部样本。
- **num_parallel_workers** (int, 可选) - 指定读取数据的工作进程数/线程数(由参数 `python_multiprocessing` 决定当前为多进程模式或多线程模式)。默认值1。
- **shuffle** (bool,可选) - 是否混洗数据集。只有输入的 `source` 参数带有可随机访问属性(`__getitem__`)才可以指定该参数。默认值None。下表中会展示不同配置的预期行为。
@ -34,7 +33,7 @@
- **ValueError** - 同时指定了 `sampler` 和 `num_shards` 参数或同时指定了 `sampler` 和 `shard_id` 参数。
- **ValueError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **ValueError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `shard_id` 参数值错误(小于0或者大于等于 `num_shards`
- **ValueError** - `shard_id` 参数错误,小于0或者大于等于 `num_shards`
.. note::
- `source` 参数接收用户自定义的Python函数(PyFuncs),不要将 `mindspore.nn` 和 `mindspore.ops` 目录下或其他的网络计算算子添加
@ -67,5 +66,5 @@
- False
- 不允许
.. include:: mindspore.dataset.api_list_nlp.rst
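A minimal sketch of the random-accessible `source` / `column_names` usage described above (data and names are illustrative):

>>> import numpy as np
>>> import mindspore.dataset as ds
>>> class RandomAccessSource:
...     def __init__(self):
...         self._data = np.random.rand(10, 2).astype(np.float32)
...     def __getitem__(self, index):
...         return (self._data[index],)
...     def __len__(self):
...         return len(self._data)
>>> dataset = ds.GeneratorDataset(source=RandomAccessSource(), column_names=["data"], shuffle=False)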

View File

@ -27,7 +27,7 @@ mindspore.dataset.IMDBDataset
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
.. note:: 此数据集可以指定参数 `sampler` ,但参数 `sampler` 和参数 `shuffle` 的行为是互斥的。下表展示了几种合法的输入参数组合及预期的行为。

View File

@ -29,7 +29,7 @@ mindspore.dataset.ImageFolderDataset
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **RuntimeError** - `class_indexing` 参数的类型不是dict。
- **ValueError** - `shard_id` 参数值错误(小于0或者大于等于 `num_shards`
- **ValueError** - `shard_id` 参数错误,小于0或者大于等于 `num_shards`
.. note::
- 如果 `decode` 参数的值为False则得到的 `image` 列的shape为[undecoded_image_size]如果为True则 `image` 列的shape为[H,W,C]。

View File

@ -30,7 +30,7 @@
- **ValueError** - `num_parallel_workers` 参数超过最大线程数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `shard_id` 参数值错误(小于0或者大于等于 `num_shards`
- **ValueError** - `shard_id` 参数错误,小于0或者大于等于 `num_shards`
.. note:: 此数据集可以指定参数 `sampler` ,但参数 `sampler` 和参数 `shuffle` 的行为是互斥的。下表展示了几种合法的输入参数组合及预期的行为。
@ -60,5 +60,5 @@
- False
- 不允许
.. include:: mindspore.dataset.api_list_nlp.rst

View File

@ -27,7 +27,7 @@ mindspore.dataset.MnistDataset
- **RuntimeError** - 同时指定了 `sampler` 和 `num_shards` 参数或同时指定了 `sampler` 和 `shard_id` 参数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
.. note:: 此数据集可以指定参数 `sampler` ,但参数 `sampler` 和参数 `shuffle` 的行为是互斥的。下表展示了几种合法的输入参数组合及预期的行为。
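A minimal sketch of that exclusivity: pass a `sampler` and leave `shuffle` unset (the directory is a placeholder):

>>> import mindspore.dataset as ds
>>> mnist_dir = "/path/to/mnist_dataset_directory"  # placeholder path
>>> sampler = ds.SequentialSampler(start_index=0, num_samples=64)
>>> dataset = ds.MnistDataset(mnist_dir, sampler=sampler)  # do not also pass shuffle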

View File

@ -54,7 +54,7 @@ mindspore.dataset.NumpySlicesDataset
- **ValueError** - 同时指定了 `sampler` 和 `num_shards` 参数或同时指定了 `sampler` 和 `shard_id` 参数。
- **ValueError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **ValueError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `shard_id` 参数值错误(小于0或者大于等于 `num_shards`
- **ValueError** - `shard_id` 参数错误,小于0或者大于等于 `num_shards`
.. include:: mindspore.dataset.api_list_nlp.rst
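For reference, a minimal NumpySlicesDataset sketch with in-memory data (column names are taken from the dict keys; values are illustrative):

>>> import numpy as np
>>> import mindspore.dataset as ds
>>> data = {"feature": np.arange(10, dtype=np.float32), "label": np.arange(10, dtype=np.int32)}
>>> dataset = ds.NumpySlicesDataset(data, shuffle=False)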

View File

@ -33,12 +33,12 @@
- **ValueError** - `columns_list` 参数无效。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `shard_id` 参数值错误(小于0或者大于等于 `num_shards`
- **ValueError** - `shard_id` 参数错误,小于0或者大于等于 `num_shards`
.. note::
- 需要用户提前在云存储上创建同步用的目录,然后通过 `sync_obs_path` 指定。
- 如果线下训练,建议为每次训练设置 `BATCH_JOB_ID` 环境变量。
- 分布式训练中假如使用多个节点服务器则必须使用每个节点全部的8张卡。如果只有一个节点服务器则没有这样的限制。
.. include:: mindspore.dataset.api_list_nlp.rst

View File

@ -3,7 +3,7 @@ mindspore.dataset.Places365Dataset
.. py:class:: mindspore.dataset.Places365Dataset(dataset_dir, usage=None, small=True, decode=False, num_samples=None, num_parallel_workers=None, shuffle=None, sampler=None, num_shards=None, shard_id=None, cache=None)
读取和解析PhotoTour数据集的源数据集。
读取和解析Places365数据集的源数据集。
生成的数据集有两列: `[image, label]`
`image` 列的数据类型为uint8。 `label` 列的数据类型为uint32。
@ -23,12 +23,12 @@ mindspore.dataset.Places365Dataset
异常:
- **RuntimeError** - `dataset_dir` 路径下不包含数据文件。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **RuntimeError** - 同时指定了 `sampler` 和 `shuffle` 参数。
- **RuntimeError** - 同时指定了 `sampler` 和 `num_shards` 参数或同时指定了 `sampler` 和 `shard_id` 参数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards` )。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **ValueError** - `shard_id` 参数错误参数小于0或者大于等于 `num_shards`
- **ValueError** - `usage` 不是['train-standard', 'train-challenge', 'val']中的任何一个。
.. note:: 此数据集可以指定参数 `sampler` ,但参数 `sampler` 和参数 `shuffle` 的行为是互斥的。下表展示了几种合法的输入参数组合及预期的行为。

View File

@ -21,12 +21,12 @@ mindspore.dataset.QMnistDataset
异常:
- **RuntimeError** - `dataset_dir` 路径下不包含数据文件。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **RuntimeError** - 同时指定了 `sampler` 和 `shuffle` 参数。
- **RuntimeError** - 同时指定了 `sampler` 和 `num_shards` 参数或同时指定了 `sampler` 和 `shard_id` 参数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards` )。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
.. note:: 此数据集可以指定参数 `sampler` ,但参数 `sampler` 和参数 `shuffle` 的行为是互斥的。下表展示了几种合法的输入参数组合及预期的行为。
@ -87,5 +87,5 @@ mindspore.dataset.QMnistDataset
publisher = {Curran Associates, Inc.},
}
.. include:: mindspore.dataset.api_list_vision.rst

View File

@ -7,9 +7,9 @@ mindspore.dataset.RandomDataset
参数:
- **total_rows** (int, 可选) - 随机生成样本数据的数量。默认值None生成随机数量的样本。
- **schema** (Union[str, Schema], 可选) - 读取模式策略,用于指定读取数据列的数据类型、数据维度等信息。
支持传入JSON文件路径或 mindspore.dataset.Schema 构造的对象。默认值None,不指定
- **columns_list** (list[str], 可选) - 指定生成数据集的列名。默认值None生成的数据列将以"c0""c1""c2" ... "cn"的规则命名。
- **schema** (Union[str, Schema], 可选) - 数据格式策略,用于指定读取数据列的数据类型、数据维度等信息。
支持传入JSON文件路径或 mindspore.dataset.Schema 构造的对象。默认值None。
- **columns_list** (list[str], 可选) - 指定生成数据集的列名。默认值None生成的数据列将以"c0"、"c1"、"c2" ... "cn"的规则命名。
- **num_samples** (int, 可选) - 指定从数据集中读取的样本数。默认值None读取所有样本。
- **num_parallel_workers** (int, 可选) - 指定读取数据的工作线程数。默认值None使用mindspore.dataset.config中配置的线程数。
- **cache** (DatasetCache, 可选) - 单节点数据缓存服务,用于加快数据集处理,详情请阅读 `单节点数据缓存 <https://www.mindspore.cn/tutorials/experts/zh-CN/master/dataset/cache.html>`_ 。默认值None不使用缓存。
@ -17,5 +17,16 @@ mindspore.dataset.RandomDataset
- **num_shards** (int, 可选) - 指定分布式训练时将数据集进行划分的分片数。默认值None。指定此参数后 `num_samples` 表示每个分片的最大样本数。
- **shard_id** (int, 可选) - 指定分布式训练时使用的分片ID号。默认值None。只有当指定了 `num_shards` 时才能指定此参数。
.. include:: mindspore.dataset.api_list_nlp.rst
异常:
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
- **TypeError** - `total_rows` 的类型不是int。
- **TypeError** - `num_shards` 的类型不是int。
- **TypeError** - `num_parallel_workers` 的类型不是int。
- **TypeError** - `shuffle` 的类型不是bool。
- **TypeError** - `columns_list` 的类型不是list。
.. include:: mindspore.dataset.api_list_nlp.rst

View File

@ -5,7 +5,7 @@ mindspore.dataset.SBDataset
读取和解析Semantic Boundaries数据集的源文件构建数据集。
根据给定的 `task` 配置,生成数据集具有不同的输出列:
通过配置 `task` 参数,生成的数据集具有不同的输出列:
- `task` = 'Boundaries',有两个输出列: `image` 列的数据类型为uint8`label` 列包含1个的数据类型为uint8的图像。
- `task` = 'Segmentation',有两个输出列: `image` 列的数据类型为uint8。 `label` 列包含20个的数据类型为uint8的图像。
@ -15,7 +15,7 @@ mindspore.dataset.SBDataset
- **task** (str, 可选) - 指定读取SB数据集的任务类型支持'Boundaries'和'Segmentation'。默认值:'Boundaries'。
- **usage** (str, 可选) - 指定数据集的子集,可取值为'train'、'val'、'train_noval'和'all'。默认值:'train'。
- **num_samples** (int, 可选) - 指定从数据集中读取的样本数。默认值None所有图像样本。
- **num_parallel_workers** (int, 可选) - 指定读取数据的工作线程数。默认值:None使用mindspore.dataset.config中配置的线程数。
- **num_parallel_workers** (int, 可选) - 指定读取数据的工作线程数。默认值:1使用mindspore.dataset.config中配置的线程数。
- **shuffle** (bool, 可选) - 是否混洗数据集。默认值None。下表中会展示不同参数配置的预期行为。
- **decode** (bool, 可选) - 是否对读取的图片进行解码操作。默认值False不解码。
- **sampler** (Sampler, 可选) - 指定从数据集中选取样本的采样器。默认值None。下表中会展示不同配置的预期行为。
@ -24,15 +24,15 @@ mindspore.dataset.SBDataset
异常:
- **RuntimeError** - `dataset_dir` 路径下不包含任何数据文件。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **RuntimeError** - 同时指定了 `sampler` 和 `shuffle` 参数。
- **RuntimeError** - 同时指定了 `sampler` 和 `num_shards` 参数或同时指定了 `sampler` 和 `shard_id` 参数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `dataset_dir` 不存在。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **ValueError** - `task` 不是['Boundaries', 'Segmentation']中的任何一个。
- **ValueError** - `usage` 不是['train', 'val', 'train_noval', 'all']中的任何一个。
- **ValueError** - `shard_id` 参数值错误(小于0或者大于等于 `num_shards`
- **ValueError** - `shard_id` 参数错误,小于0或者大于等于 `num_shards`
.. note:: 此数据集可以指定参数 `sampler` ,但参数 `sampler` 和参数 `shuffle` 的行为是互斥的。下表展示了几种合法的输入参数组合及预期的行为。
@ -102,5 +102,5 @@ mindspore.dataset.SBDataset
year = "2011",
}
.. include:: mindspore.dataset.api_list_vision.rst

View File

@ -9,10 +9,10 @@ mindspore.dataset.SBUDataset
参数:
- **dataset_dir** (str) - 包含数据集文件的根目录的路径。
- **decode** (bool, 可选) - 是否对读取的图片进行解码操作。默认值False不解码。
- **num_samples** (int, 可选) - 指定从数据集中读取的样本数。默认值None所有图像样本。
- **num_parallel_workers** (int, 可选) - 指定读取数据的工作线程数。默认值None使用mindspore.dataset.config中配置的线程数。
- **shuffle** (bool, 可选) - 是否混洗数据集。默认值None。下表中会展示不同参数配置的预期行为。
- **decode** (bool, 可选) - 是否对读取的图片进行解码操作。默认值False不解码。
- **sampler** (Sampler, 可选) - 指定从数据集中选取样本的采样器。默认值None。下表中会展示不同配置的预期行为。
- **num_shards** (int, 可选) - 指定分布式训练时将数据集进行划分的分片数。默认值None。指定此参数后 `num_samples` 表示每个分片的最大样本数。
- **shard_id** (int, 可选) - 指定分布式训练时使用的分片ID号。默认值None。只有当指定了 `num_shards` 时才能指定此参数。
@ -20,12 +20,12 @@ mindspore.dataset.SBUDataset
异常:
- **RuntimeError** - `dataset_dir` 路径下不包含任何数据文件。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **RuntimeError** - 同时指定了 `sampler` 和 `shuffle` 参数。
- **RuntimeError** - 同时指定了 `sampler` 和 `num_shards` 参数或同时指定了 `sampler` 和 `shard_id` 参数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `shard_id` 参数值错误小于0或者大于等于 `num_shards` )。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
.. note:: 此数据集可以指定参数 `sampler` ,但参数 `sampler` 和参数 `shuffle` 的行为是互斥的。下表展示了几种合法的输入参数组合及预期的行为。
@ -83,5 +83,5 @@ mindspore.dataset.SBUDataset
Year = {2011},
}
.. include:: mindspore.dataset.api_list_vision.rst

View File

@ -22,13 +22,13 @@ mindspore.dataset.STL10Dataset
异常:
- **RuntimeError** - `dataset_dir` 路径下不包含数据文件。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **RuntimeError** - 同时指定了 `sampler` 和 `shuffle` 参数。
- **RuntimeError** - 同时指定了 `sampler` 和 `num_shards` 参数或同时指定了 `sampler` 和 `shard_id` 参数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `usage` 参数无效。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards` )。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
.. note:: 此数据集可以指定参数 `sampler` ,但参数 `sampler` 和参数 `shuffle` 的行为是互斥的。下表展示了几种合法的输入参数组合及预期的行为。
@ -95,5 +95,5 @@ mindspore.dataset.STL10Dataset
}
}
.. include:: mindspore.dataset.api_list_vision.rst

View File

@ -11,7 +11,7 @@ mindspore.dataset.SVHNDataset
- **dataset_dir** (str) - 包含数据集文件的根目录路径。
- **usage** (str, 可选) - 指定数据集的子集,可取值为'train'、'test'、'extra'或'all'。默认值None读取全部样本图片。
- **num_samples** (int, 可选) - 指定从数据集中读取的样本数可以小于数据集总数。默认值None读取全部样本图片。
- **num_parallel_workers** (int, 可选) - 指定读取数据的工作线程数。默认值:None使用mindspore.dataset.config中配置的线程数。
- **num_parallel_workers** (int, 可选) - 指定读取数据的工作线程数。默认值:1使用mindspore.dataset.config中配置的线程数。
- **shuffle** (bool, 可选) - 是否混洗数据集。默认值None。下表中会展示不同参数配置的预期行为。
- **sampler** (Sampler, 可选) - 指定从数据集中选取样本的采样器。默认值None。下表中会展示不同配置的预期行为。
- **num_shards** (int, 可选) - 指定分布式训练时将数据集进行划分的分片数。默认值None。指定此参数后 `num_samples` 表示每个分片的最大样本数。
@ -19,13 +19,13 @@ mindspore.dataset.SVHNDataset
异常:
- **RuntimeError** - `dataset_dir` 路径下不包含数据文件。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **RuntimeError** - 同时指定了 `sampler` 和 `shuffle` 参数。
- **RuntimeError** - 同时指定了 `sampler` 和 `num_shards` 参数或同时指定了 `sampler` 和 `shard_id` 参数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `usage` 参数无效。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards` )。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
.. note:: 此数据集可以指定参数 `sampler` ,但参数 `sampler` 和参数 `shuffle` 的行为是互斥的。下表展示了几种合法的输入参数组合及预期的行为。

View File

@ -19,12 +19,12 @@ mindspore.dataset.SemeionDataset
异常:
- **RuntimeError** - `dataset_dir` 路径下不包含任何数据文件。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **RuntimeError** - 同时指定了 `sampler` 和 `shuffle` 参数。
- **RuntimeError** - 同时指定了 `sampler` 和 `num_shards` 参数或同时指定了 `sampler` 和 `shard_id` 参数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `shard_id` 参数值错误小于0或者大于等于 `num_shards` )。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
.. note:: 此数据集可以指定参数 `sampler` ,但参数 `sampler` 和参数 `shuffle` 的行为是互斥的。下表展示了几种合法的输入参数组合及预期的行为。
@ -77,5 +77,5 @@ mindspore.dataset.SemeionDataset
author={M Buscema, MetaNet},
}
.. include:: mindspore.dataset.api_list_vision.rst

View File

@ -11,9 +11,8 @@ mindspore.dataset.SogouNewsDataset
- **dataset_dir** (str) - 包含数据集文件的根目录路径。
- **usage** (str, 可选) - 指定数据集的子集,可取值为'train''test'或'all'。默认值None读取全部样本。
取值为'train'时将会读取45万个训练样本取值为'test'时将会读取6万个测试样本取值为'all'时将会读取全部51万个样本。默认值None读取全部样本。
- **num_samples** (int, 可选) - 指定从数据集中读取的样本数。
- **num_parallel_workers** (int, 可选) - 指定读取数据的工作线程数。默认值None使用mindspore.dataset.config中配置的线程数。
- **shuffle** (Union[bool, Shuffle], 可选) - 每个epoch中数据混洗的模式支持传入bool类型与枚举类型进行指定。默认值mindspore.dataset.Shuffle.GLOBAL。
- **num_samples** (int, 可选) - 指定从数据集中读取的样本数。默认值None 读取全部样本。
- **shuffle** (Union[bool, Shuffle], 可选) - 每个epoch中数据混洗的模式支持传入bool类型与枚举类型进行指定。默认值 `Shuffle.GLOBAL`
如果 `shuffle` 为False,则不混洗;如果 `shuffle` 为True,等同于将 `shuffle` 设置为mindspore.dataset.Shuffle.GLOBAL。
通过传入枚举变量设置数据混洗的模式:
@ -22,13 +21,14 @@ mindspore.dataset.SogouNewsDataset
- **num_shards** (int, 可选) - 指定分布式训练时将数据集进行划分的分片数。默认值None。指定此参数后 `num_samples` 表示每个分片的最大样本数。
- **shard_id** (int, 可选) - 指定分布式训练时使用的分片ID号。默认值None。只有当指定了 `num_shards` 时才能指定此参数。
- **num_parallel_workers** (int, 可选) - 指定读取数据的工作线程数。默认值None使用mindspore.dataset.config中配置的线程数。
- **cache** (DatasetCache, 可选) - 单节点数据缓存服务,用于加快数据集处理,详情请阅读 `单节点数据缓存 <https://www.mindspore.cn/tutorials/experts/zh-CN/master/dataset/cache.html>`_ 。默认值None不使用缓存。
异常:
- **RuntimeError** - `dataset_dir` 参数所指向的文件目录不存在或缺少数据集文件。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
**关于SogouNew数据集**
@ -60,5 +60,5 @@ mindspore.dataset.SogouNewsDataset
primaryClass={cs.LG}
}
.. include:: mindspore.dataset.api_list_nlp.rst
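To illustrate the `shuffle` modes documented above, a minimal sketch using the Shuffle enum (the directory is a placeholder):

>>> import mindspore.dataset as ds
>>> from mindspore.dataset import Shuffle
>>> sogou_news_dir = "/path/to/sogou_news_dataset_dir"  # placeholder path
>>> dataset = ds.SogouNewsDataset(sogou_news_dir, usage='train', shuffle=Shuffle.FILES)  # shuffle files only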

View File

@ -22,12 +22,12 @@ mindspore.dataset.SpeechCommandsDataset
异常:
- **RuntimeError** - `dataset_dir` 路径下不包含任何数据文件。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **RuntimeError** - 同时指定了 `sampler` 和 `shuffle` 参数。
- **RuntimeError** - 同时指定了 `sampler` 和 `num_shards` 参数或同时指定了 `sampler` 和 `shard_id` 参数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `shard_id` 参数值错误小于0或者大于等于 `num_shards` )。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
.. note:: 此数据集可以指定参数 `sampler` ,但参数 `sampler` 和参数 `shuffle` 的行为是互斥的。下表展示了几种合法的输入参数组合及预期的行为。
@ -87,5 +87,5 @@ mindspore.dataset.SpeechCommandsDataset
year={2018}
}
.. include:: mindspore.dataset.api_list_audio.rst

View File

@ -7,8 +7,8 @@ mindspore.dataset.TFRecordDataset
参数:
- **dataset_files** (Union[str, list[str]]) - 数据集文件路径支持单文件路径字符串、多文件路径字符串列表或可被glob库模式匹配的字符串文件列表将在内部进行字典排序。
- **schema** (Union[str, Schema], 可选) - 读取模式策略,用于指定读取数据列的数据类型、数据维度等信息。
支持传入JSON文件路径或 mindspore.dataset.Schema 构造的对象。默认值None,不指定
- **schema** (Union[str, Schema], 可选) - 数据格式策略,用于指定读取数据列的数据类型、数据维度等信息。
支持传入JSON文件路径或 mindspore.dataset.Schema 构造的对象。默认值None。
- **columns_list** (list[str], 可选) - 指定从TFRecord文件中读取的数据列。默认值None读取所有列。
- **num_samples** (int, 可选) - 指定从数据集中读取的样本数。默认值None读取全部样本。
@ -33,7 +33,7 @@ mindspore.dataset.TFRecordDataset
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `shard_id` 参数值错误(小于0或者大于等于 `num_shards`
- **ValueError** - `shard_id` 参数错误,小于0或者大于等于 `num_shards`
.. include:: mindspore.dataset.api_list_nlp.rst
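A minimal sketch of passing a mindspore.dataset.Schema object as the `schema` argument (the column layout and file path here are hypothetical):

>>> import mindspore.dataset as ds
>>> from mindspore import dtype as mstype
>>> schema = ds.Schema()
>>> schema.add_column(name='image', de_type=mstype.uint8)             # hypothetical column
>>> schema.add_column(name='label', de_type=mstype.int32, shape=[1])  # hypothetical column
>>> dataset = ds.TFRecordDataset(dataset_files=["/path/to/file.tfrecord"], schema=schema)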

View File

@ -25,12 +25,12 @@ mindspore.dataset.TedliumDataset
异常:
- **RuntimeError** - `dataset_dir` 路径下不包含任何数据文件。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **RuntimeError** - 同时指定了 `sampler` 和 `shuffle` 参数。
- **RuntimeError** - 同时指定了 `sampler` 和 `num_shards` 参数或同时指定了 `sampler` 和 `shard_id` 参数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `shard_id` 参数值错误小于0或者大于等于 `num_shards` )。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
.. note:: 此数据集可以指定参数 `sampler` ,但参数 `sampler` 和参数 `shuffle` 的行为是互斥的。下表展示了几种合法的输入参数组合及预期的行为。
@ -149,5 +149,5 @@ mindspore.dataset.TedliumDataset
biburl={https://www.openslr.org/51/}
}
.. include:: mindspore.dataset.api_list_audio.rst

View File

@ -25,7 +25,7 @@
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `shard_id` 参数值错误(小于0或者大于等于 `num_shards`
- **ValueError** - `shard_id` 参数错误,小于0或者大于等于 `num_shards`
.. include:: mindspore.dataset.api_list_nlp.rst

View File

@ -10,9 +10,9 @@ mindspore.dataset.UDPOSDataset
参数:
- **dataset_dir** (str) - 包含数据集文件的根目录路径。
- **usage** (str, 可选) - 指定数据集的子集,可取值为'train'、'test'、'valid'或'all'。
取值为'train'时将会读取12,543个样本取值为'test'时将会读取2,077个测试样本取值为'test'时将会读取9,981个样本取值为'valid'时将会读取2,002个样本取值为'all'时将会读取全部16,622个样本。默认值None读取全部样本。
取值为'train'时将会读取12,543个样本取值为'test'时将会读取2,077个测试样本取值为'valid'时将会读取2,002个样本取值为'all'时将会读取全部16,622个样本。默认值None读取全部样本。
- **num_samples** (int, 可选) - 指定从数据集中读取的样本数。默认值None读取全部样本。
- **shuffle** (Union[bool, Shuffle], 可选) - 每个epoch中数据混洗的模式支持传入bool类型与枚举类型进行指定。默认值mindspore.dataset.Shuffle.GLOBAL
- **shuffle** (Union[bool, Shuffle], 可选) - 每个epoch中数据混洗的模式支持传入bool类型与枚举类型进行指定。默认值 `Shuffle.GLOBAL`
如果 `shuffle` 为False,则不混洗;如果 `shuffle` 为True,等同于将 `shuffle` 设置为mindspore.dataset.Shuffle.GLOBAL。
通过传入枚举变量设置数据混洗的模式:
@ -26,9 +26,27 @@ mindspore.dataset.UDPOSDataset
异常:
- **RuntimeError** - `dataset_dir` 参数所指向的文件目录不存在或缺少数据集文件。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
.. include:: mindspore.dataset.api_list_nlp.rst
**关于UDPOS数据集**
UDPOS是一个解析的文本语料库数据集用于阐明句法或者语义句子结构。
该语料库包含254,830个单词和16,622个句子取自各种网络媒体包括博客、新闻组、电子邮件和评论。
**引用:**
.. code-block::
@inproceedings{silveira14gold,
year = {2014},
author = {Natalia Silveira and Timothy Dozat and Marie-Catherine de Marneffe and Samuel Bowman
and Miriam Connor and John Bauer and Christopher D. Manning},
title = {A Gold Standard Dependency Corpus for {E}nglish},
booktitle = {Proceedings of the Ninth International Conference on Language
Resources and Evaluation (LREC-2014)}
}
.. include:: mindspore.dataset.api_list_nlp.rst

View File

@ -3,17 +3,17 @@ mindspore.dataset.USPSDataset
.. py:class:: mindspore.dataset.USPSDataset(dataset_dir, usage=None, num_samples=None, num_parallel_workers=None, shuffle=Shuffle.GLOBAL, num_shards=None, shard_id=None, cache=None)
读取和解析UDPOS数据集的源数据集。
读取和解析USPS数据集的源数据集。
生成的数据集有两列: `[image, label]``image` 列的数据类型为uint8。 `label` 列的数据类型为uint32。
参数:
- **dataset_dir** (str) - 包含数据集文件的根目录路径。
- **usage** (str, 可选) - 指定数据集的子集,可取值为'train'、'test'、或'all'。
取值为'train'时将会读取7,291个样本取值为'test'时将会读取2,077个测试样本取值为'test'时将会读取2,007个样本,取值为'all'时将会读取全部9,298个样本。默认值None读取全部样本。
取值为'train'时将会读取7,291个样本取值为'test'时将会读取2,007个测试样本,取值为'all'时将会读取全部9,298个样本。默认值None读取全部样本。
- **num_samples** (int, 可选) - 指定从数据集中读取的样本数。
- **num_parallel_workers** (int, 可选) - 指定读取数据的工作线程数。默认值None使用mindspore.dataset.config中配置的线程数。
- **shuffle** (Union[bool, Shuffle], 可选) - 每个epoch中数据混洗的模式支持传入bool类型与枚举类型进行指定。默认值mindspore.dataset.Shuffle.GLOBAL
- **shuffle** (Union[bool, Shuffle], 可选) - 每个epoch中数据混洗的模式支持传入bool类型与枚举类型进行指定。默认值`Shuffle.GLOBAL`
如果 `shuffle` 为False,则不混洗;如果 `shuffle` 为True,等同于将 `shuffle` 设置为mindspore.dataset.Shuffle.GLOBAL。
通过传入枚举变量设置数据混洗的模式:
@ -26,11 +26,11 @@ mindspore.dataset.USPSDataset
异常:
- **RuntimeError** - `dataset_dir` 路径下不包含数据文件。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `usage` 参数无效。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards` )。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
**关于USPS数据集**
@ -61,5 +61,5 @@ mindspore.dataset.USPSDataset
publisher={IEEE}
}
.. include:: mindspore.dataset.api_list_vision.rst

View File

@ -43,7 +43,7 @@ mindspore.dataset.VOCDataset
- **ValueError** - 指定的任务不为'Segmentation'或'Detection'。
- **ValueError** - 指定任务为'Segmentation'时, `class_indexing` 参数不为None。
- **ValueError** - 与 `usage` 参数相关的txt文件不存在。
- **ValueError** - `shard_id` 参数值错误(小于0或者大于等于 `num_shards`
- **ValueError** - `shard_id` 参数错误,小于0或者大于等于 `num_shards`
.. note::
- 当参数 `extra_metadata` 为True时还需使用 `rename` 操作删除额外数据列'_meta-filename'的前缀'_meta-'
@ -125,5 +125,5 @@ mindspore.dataset.VOCDataset
howpublished = {http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html}
}
.. include:: mindspore.dataset.api_list_vision.rst
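A minimal sketch of the VOCDataset task modes referenced above (the directory is a placeholder):

>>> import mindspore.dataset as ds
>>> voc_dir = "/path/to/voc_dataset_directory"  # placeholder path
>>> dataset = ds.VOCDataset(voc_dir, task='Detection', usage='train', decode=True)  # decode images while reading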

View File

@ -11,7 +11,7 @@ mindspore.dataset.WIDERFaceDataset
参数:
- **dataset_dir** (str) - 包含数据集文件的根目录路径。
- **usage** (str, 可选) - 指定数据集的子集,可取值为'train'、'test'、'valid'或'all'。
取值为'train'时将会读取12,880个样本取值为'test'时将会读取2,077个测试样本取值为'test'时将会读取16,097个样本取值为'valid'时将会读取3,226个样本取值为'all'时将会读取全部类别样本。默认值None读取全部样本。
取值为'train'时将会读取12,880个样本取值为'test'时将会读取16,097个样本取值为'valid'时将会读取3,226个样本取值为'all'时将会读取全部类别样本。默认值None读取全部样本。
- **num_samples** (int, 可选) - 指定从数据集中读取的样本数。默认值None读取全部样本。
- **num_parallel_workers** (int, 可选) - 指定读取数据的工作线程数。默认值None使用mindspore.dataset.config中配置的线程数。
- **shuffle** (bool, 可选) - 是否混洗数据集。默认值None。下表中会展示不同参数配置的预期行为。
@ -23,13 +23,13 @@ mindspore.dataset.WIDERFaceDataset
异常:
- **RuntimeError** - `dataset_dir` 不包含任何数据文件。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **RuntimeError** - 同时指定了 `sampler` 和 `shuffle` 参数。
- **RuntimeError** - 同时指定了 `sampler` 和 `num_shards` 参数或同时指定了 `sampler` 和 `shard_id` 参数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `shard_id` 参数值错误(小于0或者大于等于 `num_shards`
- **ValueError** - `shard_id` 参数错误,小于0或者大于等于 `num_shards`
- **ValueError** - `usage` 不在['train', 'test', 'valid', 'all']中。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **ValueError** - `annotation_file` 不存在。
- **ValueError** - `dataset_dir` 不存在。
@ -109,5 +109,5 @@ mindspore.dataset.WIDERFaceDataset
year={2016},
}
.. include:: mindspore.dataset.api_list_vision.rst

View File

@ -9,10 +9,10 @@ mindspore.dataset.WikiTextDataset
参数:
- **dataset_dir** (str) - 包含数据集文件的根目录路径。
- **usage** (str, 可选) - 指定数据集的子集,可取值为'train', 'test', 'valid'或'all'。默认值None读取全部样本。
- **usage** (str, 可选) - 指定数据集的子集,可取值为'train'、'test'、'valid'或'all'。默认值None读取全部样本。
- **num_samples** (int, 可选) - 指定从数据集中读取的样本数。默认值None读取全部样本。
- **num_parallel_workers** (int, 可选) - 指定读取数据的工作线程数。默认值None使用mindspore.dataset.config中配置的线程数。
- **shuffle** (Union[bool, Shuffle], 可选) - 每个epoch中数据混洗的模式支持传入bool类型与枚举类型进行指定。默认值mindspore.dataset.Shuffle.GLOBAL
- **shuffle** (Union[bool, Shuffle], 可选) - 每个epoch中数据混洗的模式支持传入bool类型与枚举类型进行指定。默认值`Shuffle.GLOBAL`
如果 `shuffle` 为False,则不混洗;如果 `shuffle` 为True,等同于将 `shuffle` 设置为mindspore.dataset.Shuffle.GLOBAL。
通过传入枚举变量设置数据混洗的模式:
@ -22,14 +22,14 @@ mindspore.dataset.WikiTextDataset
- **num_shards** (int, 可选) - 指定分布式训练时将数据集进行划分的分片数。默认值None。指定此参数后 `num_samples` 表示每个分片的最大样本数。
- **shard_id** (int, 可选) - 指定分布式训练时使用的分片ID号。默认值None。只有当指定了 `num_shards` 时才能指定此参数。
- **cache** (DatasetCache, 可选) - 单节点数据缓存服务,用于加快数据集处理,详情请阅读 `单节点数据缓存 <https://www.mindspore.cn/tutorials/experts/zh-CN/master/dataset/cache.html>`_ 。默认值None不使用缓存。
异常:
- **RuntimeError** - `dataset_dir` 参数所指向的文件目录不存在或缺少数据集文件。
- **ValueError** - `num_samples` 参数值错误小于0
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数
- **ValueError** - `shard_id` 参数值错误小于0或者大于等于 `num_shards`
异常:
- **RuntimeError** - `dataset_dir` 参数所指向的文件目录不存在或缺少数据集文件。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards`数。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
- **ValueError** - `num_samples` 参数值错误小于0
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数
**关于WikiText数据集**
@ -59,5 +59,5 @@ mindspore.dataset.WikiTextDataset
year={2016}
}
.. include:: mindspore.dataset.api_list_nlp.rst

View File

@ -9,11 +9,11 @@ mindspore.dataset.YahooAnswersDataset
参数:
- **dataset_dir** (str) - 包含数据集文件的根目录路径。
- **usage** (str, 可选) - 指定数据集的子集,可取值为'train', 'test'或'all'。
- **usage** (str, 可选) - 指定数据集的子集,可取值为'train'、'test'或'all'。
取值为'train'时将会读取1,400,000个训练样本取值为'test'时将会读取60,000个测试样本取值为'all'时将会读取全部1,460,000个样本。默认值None读取全部样本。
- **num_samples** (int, 可选) - 指定从数据集中读取的样本数。默认值None读取全部样本。
- **num_parallel_workers** (int, 可选) - 指定读取数据的工作线程数。默认值None使用mindspore.dataset.config中配置的线程数。
- **shuffle** (Union[bool, Shuffle], 可选) - 每个epoch中数据混洗的模式支持传入bool类型与枚举类型进行指定。默认值mindspore.dataset.Shuffle.GLOBAL
- **shuffle** (Union[bool, Shuffle], 可选) - 每个epoch中数据混洗的模式支持传入bool类型与枚举类型进行指定。默认值`Shuffle.GLOBAL`
如果 `shuffle` 为False,则不混洗;如果 `shuffle` 为True,等同于将 `shuffle` 设置为mindspore.dataset.Shuffle.GLOBAL。
通过传入枚举变量设置数据混洗的模式:
@ -26,10 +26,10 @@ mindspore.dataset.YahooAnswersDataset
异常:
- **RuntimeError** - `dataset_dir` 参数所指向的文件目录不存在或缺少数据集文件。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `shard_id` 参数值错误小于0或者大于等于 `num_shards` )。
- **ValueError** - `shard_id` 参数错误小于0或者大于等于 `num_shards`
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
**关于YahooAnswers数据集**
@ -59,5 +59,5 @@ mindspore.dataset.YahooAnswersDataset
howpublished = {}
}
.. include:: mindspore.dataset.api_list_nlp.rst

View File

@ -13,7 +13,7 @@ mindspore.dataset.YelpReviewDataset
对于Polarity数据集'train'将读取560,000个训练样本'test'将读取38,000个测试样本'all'将读取所有598,000个样本。
对于Full数据集'train'将读取650,000个训练样本'test'将读取50,000个测试样本'all'将读取所有700,000个样本。默认值None读取所有样本。
- **num_samples** (int, 可选) - 指定从数据集中读取的样本数。默认值None读取全部样本。
- **shuffle** (Union[bool, Shuffle], 可选) - 每个epoch中数据混洗的模式支持传入bool类型与枚举类型进行指定。默认值mindspore.dataset.Shuffle.GLOBAL
- **shuffle** (Union[bool, Shuffle], 可选) - 每个epoch中数据混洗的模式支持传入bool类型与枚举类型进行指定。默认值`Shuffle.GLOBAL`
如果 `shuffle` 为False,则不混洗;如果 `shuffle` 为True,等同于将 `shuffle` 设置为mindspore.dataset.Shuffle.GLOBAL。
通过传入枚举变量设置数据混洗的模式:
@ -27,9 +27,9 @@ mindspore.dataset.YelpReviewDataset
异常:
- **RuntimeError** - `dataset_dir` 参数所指向的文件目录不存在或缺少数据集文件。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `num_parallel_workers` 参数超过系统最大线程数。
**关于YelpReview数据集**
@ -88,5 +88,5 @@ mindspore.dataset.YelpReviewDataset
year = {2015},
}
.. include:: mindspore.dataset.api_list_nlp.rst

View File

@ -25,7 +25,7 @@ mindspore.dataset.YesNoDataset
- **RuntimeError** - 同时指定了 `sampler` 和 `num_shards` 参数或同时指定了 `sampler` 和 `shard_id` 参数。
- **RuntimeError** - 指定了 `num_shards` 参数,但是未指定 `shard_id` 参数。
- **RuntimeError** - 指定了 `shard_id` 参数,但是未指定 `num_shards` 参数。
- **ValueError** - `shard_id` 参数值错误(小于0或者大于等于 `num_shards`
- **ValueError** - `shard_id` 参数错误,小于0或者大于等于 `num_shards`
.. note:: 此数据集可以指定参数 `sampler` ,但参数 `sampler` 和参数 `shuffle` 的行为是互斥的。下表展示了几种合法的输入参数组合及预期的行为。
@ -79,5 +79,5 @@ mindspore.dataset.YesNoDataset
url = "http://wwww.openslr.org/1/"
}
.. include:: mindspore.dataset.api_list_audio.rst

View File

@ -11,7 +11,7 @@ mindspore.dataset.vision.AdjustGamma
更多详细信息,请参见 `Gamma矫正 <https://en.wikipedia.org/wiki/Gamma_correction>`_
参数:
- **gamma** (float) - 输出图像像素值与输入图像像素值呈指数相关。 `gamma` 大于1使阴影更暗`gamma` 小于1使黑暗区域更亮。
- **gamma** (float) - 非负实数。输出图像像素值与输入图像像素值呈指数相关。 `gamma` 大于1使阴影更暗`gamma` 小于1使黑暗区域更亮。
- **gain** (float, 可选) - 常数乘数。默认值1.0。
异常:

View File

@ -11,7 +11,7 @@ mindspore.dataset.vision.PadToSize
- **offset** (Union[int, Sequence[int, int]], 可选) - 顶部和左侧要填充的长度。
如果输入整型,使用此值填充图像上侧和左侧。
如果提供了序列[int, int],则应按[top, left]的顺序排列,填充图像上侧和左侧。
默认值None表示对称填充。
默认值None表示对称填充,保持原始图像处于中心位置
- **fill_value** (Union[int, tuple[int, int, int]], 可选) - 填充的像素值,仅在 `padding_mode` 取值为Border.CONSTANT时有效。
如果是3元素元组则分别用于填充R、G、B通道。
如果是整数,则用于所有 RGB 通道。
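A minimal sketch of PadToSize used as a map operation, assuming `image_dataset` is an existing decoded image dataset (the name is illustrative):

>>> import mindspore.dataset.vision as vision
>>> pad_op = vision.PadToSize(size=[256, 256], offset=None, fill_value=0)  # offset=None pads symmetrically
>>> image_dataset = image_dataset.map(operations=pad_op, input_columns=["image"])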

View File

@ -3,7 +3,7 @@ mindspore.dataset.vision.RandomAdjustSharpness
.. py:class:: mindspore.dataset.vision.RandomAdjustSharpness(degree, prob=0.5)
以给定的概率随机调整输入图像的清晰度。
以给定的概率随机调整输入图像的度。
参数:
- **degree** (float) - 锐度调整度,必须是非负的。

View File

@ -7,7 +7,7 @@ mindspore.dataset.vision.RandomAutoContrast
参数:
- **cutoff** (float, 可选) - 输入图像直方图中最亮和最暗像素的百分比。该值必须在 [0.0, 50.0) 范围内。默认值0.0。
- **ignore** (Union[int, sequence], 可选) - 要忽略的背景像素值,忽略值必须在 [0, 255] 范围内。默认值None。
- **ignore** (Union[int, sequence], 可选) - 要忽略的背景像素值,值必须在 [0, 255] 范围内。默认值None。
- **prob** (float, 可选) - 图像被调整对比度的概率,取值范围:[0.0, 1.0]。默认值0.5。
异常:

View File

@ -52,9 +52,9 @@ class CMUArcticDataset(MappableDataset, AudioBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, will use value set in the config.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table.
Default: None, expected order behavior shown in the table below.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table.
dataset. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
When this argument is specified, `num_samples` reflects the max sample number of per shard.
shard_id (int, optional): The shard ID within `num_shards`. Default: None. This
@ -188,9 +188,9 @@ class GTZANDataset(MappableDataset, AudioBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, will use value set in the config.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table.
Default: None, expected order behavior shown in the table below.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table.
dataset. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
When this argument is specified, `num_samples` reflects the max sample number of per shard.
shard_id (int, optional): The shard ID within `num_shards`. Default: None. This
@ -324,9 +324,9 @@ class LibriTTSDataset(MappableDataset, AudioBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, will use value set in the config.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table.
Default: None, expected order behavior shown in the table below.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table.
dataset. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
When this argument is specified, `num_samples` reflects the max sample number of per shard.
shard_id (int, optional): The shard ID within `num_shards`. Default: None. This
@ -610,9 +610,9 @@ class SpeechCommandsDataset(MappableDataset, AudioBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, will use value set in the config.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table.
Default: None, expected order behavior shown in the table below.
sampler (Sampler, optional): Object used to choose samples from the dataset.
Default: None, expected order behavior shown in the table.
Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
When this argument is specified, `num_samples` reflects the maximum sample number of per shard.
shard_id (int, optional): The shard ID within `num_shards`. Default: None. This argument can only be specified
@ -623,11 +623,11 @@ class SpeechCommandsDataset(MappableDataset, AudioBaseDataset):
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
RuntimeError: If `sampler` and `shuffle` are specified at the same time.
RuntimeError: If `sampler` and `num_shards`/`shard_id` are specified at the same time.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
Note:
@ -743,9 +743,9 @@ class TedliumDataset(MappableDataset, AudioBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the config.
shuffle (bool, optional): Whether to perform shuffle on the dataset. Default: None, expected
order behavior shown in the table.
order behavior shown in the table below.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table.
dataset. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided
into. Default: None. When this argument is specified, `num_samples` reflects
the maximum sample number of per shard.
@ -757,11 +757,11 @@ class TedliumDataset(MappableDataset, AudioBaseDataset):
Raises:
RuntimeError: If `dataset_dir` does not contain stm files.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
RuntimeError: If `sampler` and `shuffle` are specified at the same time.
RuntimeError: If `sampler` and `num_shards`/`shard_id` are specified at the same time.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
Note:
@ -942,9 +942,9 @@ class YesNoDataset(MappableDataset, AudioBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, will use value set in the config.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table.
Default: None, expected order behavior shown in the table below.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table.
dataset. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
When this argument is specified, `num_samples` reflects the maximum sample number of per shard.
shard_id (int, optional): The shard ID within `num_shards`. Default: None. This argument can only

View File

@ -248,8 +248,9 @@ class TFRecordDataset(SourceDataset, UnionBaseDataset):
Args:
dataset_files (Union[str, list[str]]): String or list of files to be read or glob strings to search for a
pattern of files. The list will be sorted in a lexicographical order.
schema (Union[str, Schema], optional): Path to the JSON schema file or schema object. Default: None.
If the schema is not provided, the meta data from the TFData file is considered the schema.
schema (Union[str, Schema], optional): Data format policy, which specifies the data types and shapes of the data
column to be read. Both JSON file path and objects constructed by mindspore.dataset.Schema are acceptable.
Default: None.
columns_list (list[str], optional): List of columns to be read. Default: None, read all columns.
num_samples (int, optional): The number of samples (rows) to be included in the dataset. Default: None.
If num_samples is None and numRows(parsed from schema) does not exist, read the full dataset;

View File

@ -402,9 +402,9 @@ class CLUEDataset(SourceDataset, TextBaseDataset):
ValueError: task is not in 'AFQMC', 'TNEWS', 'IFLYTEK', 'CMNLI', 'WSC' or 'CSL'.
ValueError: usage is not in 'train', 'test' or 'eval'.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
Examples:
>>> clue_dataset_dir = ["/path/to/clue_dataset_file"] # contains 1 or multiple clue files
@ -1118,10 +1118,10 @@ class Multi30kDataset(SourceDataset, TextBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the config.
shuffle (Union[bool, Shuffle], optional): Perform reshuffling of the data every epoch.
Default: `Shuffle.GLOBAL`. Bool type and Shuffle enum are both supported to pass in.
If shuffle is False, no shuffling will be performed;
If shuffle is True, the behavior is the same as setting shuffle to be Shuffle.GLOBAL
Otherwise, there are two levels of shuffling:
Bool type and Shuffle enum are both supported to pass in. Default: `Shuffle.GLOBAL` .
If shuffle is False, no shuffling will be performed.
If shuffle is True, it is equivalent to setting `shuffle` to mindspore.dataset.Shuffle.GLOBAL.
Set the mode of data shuffling by passing in enumeration variables:
- Shuffle.GLOBAL: Shuffle both the files and samples.
@ -1312,10 +1312,10 @@ class SogouNewsDataset(SourceDataset, TextBaseDataset):
'all' will read from all 510,000 samples. Default: None, all samples.
num_samples (int, optional): Number of samples (rows) to read. Default: None, read all samples.
shuffle (Union[bool, Shuffle], optional): Perform reshuffling of the data every epoch.
Default: `Shuffle.GLOBAL`. Bool type and Shuffle enum are both supported to pass in.
Bool type and Shuffle enum are both supported to pass in. Default: `Shuffle.GLOBAL` .
If shuffle is False, no shuffling will be performed.
If shuffle is True, performs global shuffle.
There are three levels of shuffling, desired shuffle enum defined by mindspore.dataset.Shuffle.
If shuffle is True, it is equivalent to setting `shuffle` to mindspore.dataset.Shuffle.GLOBAL.
Set the mode of data shuffling by passing in enumeration variables:
- Shuffle.GLOBAL: Shuffle both the files and samples, same as setting shuffle to True.
@ -1332,9 +1332,9 @@ class SogouNewsDataset(SourceDataset, TextBaseDataset):
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
Examples:
>>> sogou_news_dataset_dir = "/path/to/sogou_news_dataset_dir"
@ -1404,10 +1404,10 @@ class SQuADDataset(SourceDataset, TextBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the config.
shuffle (Union[bool, Shuffle], optional): Perform reshuffling of the data every epoch.
Default: `Shuffle.GLOBAL`. Bool type and Shuffle enum are both supported to pass in.
If shuffle is False, no shuffling will be performed;
If shuffle is True, the behavior is the same as setting shuffle to be Shuffle.GLOBAL
Otherwise, there are two levels of shuffling:
Bool type and Shuffle enum are both supported to pass in. Default: `Shuffle.GLOBAL` .
If shuffle is False, no shuffling will be performed.
If shuffle is True, it is equivalent to setting `shuffle` to mindspore.dataset.Shuffle.GLOBAL.
Set the mode of data shuffling by passing in enumeration variables:
- Shuffle.GLOBAL: Shuffle both the files and samples.
@ -1565,10 +1565,10 @@ class UDPOSDataset(SourceDataset, TextBaseDataset):
'all' will read from all 16,622 samples. Default: None, all samples.
num_samples (int, optional): Number of samples (rows) to read. Default: None, reads the full dataset.
shuffle (Union[bool, Shuffle], optional): Perform reshuffling of the data every epoch.
Default: `Shuffle.GLOBAL`. Bool type and Shuffle enum are both supported to pass in.
If shuffle is False, no shuffling will be performed;
If shuffle is True, the behavior is the same as setting shuffle to be Shuffle.GLOBAL
Otherwise, there are two levels of shuffling:
Bool type and Shuffle enum are both supported to pass in. Default: `Shuffle.GLOBAL` .
If shuffle is False, no shuffling will be performed.
If shuffle is True, it is equivalent to setting `shuffle` to mindspore.dataset.Shuffle.GLOBAL.
Set the mode of data shuffling by passing in enumeration variables:
- Shuffle.GLOBAL: Shuffle both the files and samples.
@ -1586,13 +1586,32 @@ class UDPOSDataset(SourceDataset, TextBaseDataset):
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
Examples:
>>> udpos_dataset_dir = "/path/to/udpos_dataset_dir"
>>> dataset = ds.UDPOSDataset(dataset_dir=udpos_dataset_dir, usage='all')
About UDPOS dataset:
Text corpus dataset that clarifies syntactic or semantic sentence structure.
The corpus comprises 254,830 words and 16,622 sentences, taken from various web media including
weblogs, newsgroups, emails and reviews.
Citation:
.. code-block::
@inproceedings{silveira14gold,
year = {2014},
author = {Natalia Silveira and Timothy Dozat and Marie-Catherine de Marneffe and Samuel Bowman
and Miriam Connor and John Bauer and Christopher D. Manning},
title = {A Gold Standard Dependency Corpus for {E}nglish},
booktitle = {Proceedings of the Ninth International Conference on Language
Resources and Evaluation (LREC-2014)}
}
"""
@check_udpos_dataset
@ -1622,10 +1641,10 @@ class WikiTextDataset(SourceDataset, TextBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the config.
shuffle (Union[bool, Shuffle], optional): Perform reshuffling of the data every epoch.
Default: `Shuffle.GLOBAL`. Bool type and Shuffle enum are both supported to pass in.
If shuffle is False, no shuffling will be performed;
If shuffle is True, the behavior is the same as setting shuffle to be Shuffle.GLOBAL
Otherwise, there are two levels of shuffling:
Bool type and Shuffle enum are both supported to pass in. Default: `Shuffle.GLOBAL` .
If shuffle is False, no shuffling will be performed.
If shuffle is True, it is equivalent to setting `shuffle` to mindspore.dataset.Shuffle.GLOBAL.
Set the mode of data shuffling by passing in enumeration variables:
- Shuffle.GLOBAL: Shuffle both the files and samples.
@ -1641,11 +1660,11 @@ class WikiTextDataset(SourceDataset, TextBaseDataset):
Raises:
RuntimeError: If `dataset_dir` does not contain data files or invalid.
ValueError: If `num_samples` is invalid (< 0).
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
ValueError: If `num_samples` is invalid (< 0).
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
About WikiTextDataset dataset:
@ -1711,10 +1730,10 @@ class YahooAnswersDataset(SourceDataset, TextBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the config.
shuffle (Union[bool, Shuffle], optional): Perform reshuffling of the data every epoch.
Default: `Shuffle.GLOBAL`. Bool type and Shuffle enum are both supported to pass in.
If shuffle is False, no shuffling will be performed;
If shuffle is True, the behavior is the same as setting shuffle to be Shuffle.GLOBAL
Otherwise, there are two levels of shuffling:
Bool type and Shuffle enum are both supported to pass in. Default: `Shuffle.GLOBAL` .
If shuffle is False, no shuffling will be performed.
If shuffle is True, it is equivalent to setting `shuffle` to mindspore.dataset.Shuffle.GLOBAL.
Set the mode of data shuffling by passing in enumeration variables:
- Shuffle.GLOBAL: Shuffle both the files and samples.
@ -1730,10 +1749,10 @@ class YahooAnswersDataset(SourceDataset, TextBaseDataset):
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
Examples:
>>> yahoo_answers_dataset_dir = "/path/to/yahoo_answers_dataset_directory"
@ -1804,10 +1823,10 @@ class YelpReviewDataset(SourceDataset, TextBaseDataset):
'all' will read from all 700,000 samples. Default: None, all samples.
num_samples (int, optional): Number of samples (rows) to read. Default: None, reads all samples.
shuffle (Union[bool, Shuffle], optional): Perform reshuffling of the data every epoch.
Default: `Shuffle.GLOBAL`. Bool type and Shuffle enum are both supported to pass in.
If shuffle is False, no shuffling will be performed;
If shuffle is True, the behavior is the same as setting shuffle to be Shuffle.GLOBAL
Otherwise, there are two levels of shuffling:
Bool type and Shuffle enum are both supported to pass in. Default: `Shuffle.GLOBAL` .
If shuffle is False, no shuffling will be performed.
If shuffle is True, it is equivalent to setting `shuffle` to mindspore.dataset.Shuffle.GLOBAL.
Set the mode of data shuffling by passing in enumeration variables:
- Shuffle.GLOBAL: Shuffle both the files and samples.
@ -1824,9 +1843,9 @@ class YelpReviewDataset(SourceDataset, TextBaseDataset):
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
Examples:
>>> yelp_review_dataset_dir = "/path/to/yelp_review_dataset_dir"

View File

@ -510,15 +510,16 @@ class GeneratorDataset(MappableDataset, UnionBaseDataset):
required to provide either column_names or schema.
column_types (list[mindspore.dtype], optional): List of column data types of the dataset. Default: None.
If provided, sanity check will be performed on generator output.
schema (Union[Schema, str], optional): Path to the JSON schema file or schema object. Default: None. Users are
required to provide either column_names or schema. If both are provided, schema will be used.
schema (Union[str, Schema], optional): Data format policy, which specifies the data types and shapes of the data
columns to be read. Both a JSON file path and an object constructed by mindspore.dataset.Schema are acceptable.
Default: None.
num_samples (int, optional): The number of samples to be included in the dataset.
Default: None, all images.
num_parallel_workers (int, optional): Number of subprocesses used to fetch the dataset in parallel. Default: 1.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset. Random accessible input is required.
Default: None, expected order behavior shown in the table.
Default: None, expected order behavior shown in the table below.
sampler (Union[Sampler, Iterable], optional): Object used to choose samples from the dataset. Random accessible
input is required. Default: None, expected order behavior shown in the table.
input is required. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
Random accessible input is required. When this argument is specified, `num_samples` reflects the maximum
sample number of per shard.
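As a rough, non-authoritative illustration of the `schema` parameter described above (the generator function and column definition are invented for this sketch), a Schema object can be passed instead of `column_names`:

>>> import numpy as np
>>> import mindspore.dataset as ds
>>> from mindspore import dtype as mstype
>>>
>>> def my_generator():
...     for i in range(5):
...         yield (np.array([i], dtype=np.int64),)
>>>
>>> schema = ds.Schema()
>>> schema.add_column('data', de_type=mstype.int64, shape=[1])
>>> # the schema supplies the column name, type and shape, so column_names is omitted
>>> dataset = ds.GeneratorDataset(source=my_generator, schema=schema)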
@ -844,15 +845,14 @@ class NumpySlicesDataset(GeneratorDataset):
otherwise they will be named like column_0, column_1 ...
num_samples (int, optional): The number of samples to be included in the dataset. Default: None, all samples.
num_parallel_workers (int, optional): Number of subprocesses used to fetch the dataset in parallel. Default: 1.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset. Random accessible input is required.
Default: None, expected order behavior shown in the table.
sampler (Union[Sampler, Iterable], optional): Object used to choose samples from the dataset. Random accessible
input is required. Default: None, expected order behavior shown in the table.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table below.
sampler (Union[Sampler, Iterable], optional): Object used to choose samples from the dataset.
Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
Random accessible input is required. When this argument is specified, `num_samples` reflects the max
sample number of per shard.
When this argument is specified, `num_samples` reflects the maximum number of samples per shard.
shard_id (int, optional): The shard ID within `num_shards`. Default: None. This argument must be specified only
when num_shards is also specified. Random accessible input is required.
when num_shards is also specified.
Note:
- This dataset can take in a `sampler`. `sampler` and `shuffle` are mutually exclusive.
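A small sketch of the mutual exclusivity noted above, using invented in-memory data: either `shuffle` or a `sampler` is supplied, never both:

>>> import numpy as np
>>> import mindspore.dataset as ds
>>>
>>> data = {"x": np.arange(10), "y": np.arange(10) * 2}
>>> # option 1: let the dataset shuffle the slices
>>> ds1 = ds.NumpySlicesDataset(data, shuffle=True)
>>> # option 2: delegate ordering to a sampler and leave `shuffle` unset
>>> ds2 = ds.NumpySlicesDataset(data, sampler=ds.SequentialSampler())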

View File

@ -132,10 +132,10 @@ class Caltech101Dataset(GeneratorDataset):
Default: None, all images.
num_parallel_workers (int, optional): Number of workers to read the data. Default: 1.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table.
Default: None, expected order behavior shown in the table below.
decode (bool, optional): Whether or not to decode the images after reading. Default: False.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table.
dataset. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided
into. Default: None. When this argument is specified, `num_samples` reflects
the maximum sample number of per shard.
@ -144,13 +144,13 @@ class Caltech101Dataset(GeneratorDataset):
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
ValueError: If `target_type` is not set correctly.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
RuntimeError: If `sampler` and `shuffle` are specified at the same time.
RuntimeError: If `sampler` and `num_shards`/`shard_id` are specified at the same time.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
ValueError: If `target_type` is not set correctly.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
Note:
- This dataset can take in a `sampler`. `sampler` and `shuffle` are mutually exclusive.
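For instance (the directory below is a placeholder), `target_type` selects which annotation columns Caltech101Dataset emits, and the `sampler`/`shuffle` rule above still applies:

>>> import mindspore.dataset as ds
>>>
>>> caltech101_dataset_dir = "/path/to/caltech101_dataset_directory"  # hypothetical path
>>> # read category labels and annotations together, in sequential order
>>> dataset = ds.Caltech101Dataset(caltech101_dataset_dir, target_type="all", shuffle=False)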
@ -293,10 +293,10 @@ class Caltech256Dataset(MappableDataset, VisionBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, set in the config.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table.
Default: None, expected order behavior shown in the table below.
decode (bool, optional): Whether or not to decode the images after reading. Default: False.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table.
dataset. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided
into. Default: None. When this argument is specified, `num_samples` reflects
the maximum sample number of per shard.
@ -308,13 +308,13 @@ class Caltech256Dataset(MappableDataset, VisionBaseDataset):
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
ValueError: If `target_type` is not 'category', 'annotation' or 'all'.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
RuntimeError: If `sampler` and `shuffle` are specified at the same time.
RuntimeError: If `sampler` and `num_shards`/`shard_id` are specified at the same time.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
ValueError: If `target_type` is not 'category', 'annotation' or 'all'.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
Note:
- This dataset can take in a `sampler`. `sampler` and `shuffle` are mutually exclusive.
@ -440,13 +440,13 @@ class CelebADataset(MappableDataset, VisionBaseDataset):
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `usage` is not 'train', 'valid', 'test' or 'all'.
RuntimeError: If `sampler` and `shuffle` are specified at the same time.
RuntimeError: If `sampler` and `num_shards`/`shard_id` are specified at the same time.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `usage` is not 'train', 'valid', 'test' or 'all'.
Note:
- This dataset can take in a `sampler`. `sampler` and `shuffle` are mutually exclusive.
@ -595,9 +595,9 @@ class Cifar10Dataset(MappableDataset, VisionBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the config.
shuffle (bool, optional): Whether to perform shuffle on the dataset. Default: None, expected
order behavior shown in the table.
order behavior shown in the table below.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table.
dataset. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided
into. Default: None. When this argument is specified, `num_samples` reflects
the maximum sample number of per shard.
@ -609,13 +609,13 @@ class Cifar10Dataset(MappableDataset, VisionBaseDataset):
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `usage` is not 'train', 'test' or 'all'.
RuntimeError: If `sampler` and `shuffle` are specified at the same time.
RuntimeError: If `sampler` and `num_shards`/`shard_id` are specified at the same time.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `usage` is not 'train', 'test' or 'all'.
Note:
- This dataset can take in a `sampler`. `sampler` and `shuffle` are mutually exclusive.
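A brief sketch (path invented) combining `usage` with a sampler, in line with the rule above that a `sampler` replaces `shuffle`:

>>> import mindspore.dataset as ds
>>>
>>> cifar10_dataset_dir = "/path/to/cifar10_dataset_directory"  # hypothetical path
>>> # draw 1000 random training samples through a sampler instead of `shuffle`
>>> sampler = ds.RandomSampler(num_samples=1000)
>>> dataset = ds.Cifar10Dataset(cifar10_dataset_dir, usage='train', sampler=sampler)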
@ -727,9 +727,9 @@ class Cifar100Dataset(MappableDataset, VisionBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the config.
shuffle (bool, optional): Whether to perform shuffle on the dataset. Default: None, expected
order behavior shown in the table.
order behavior shown in the table below.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table.
dataset. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided
into. Default: None. When this argument is specified, 'num_samples' reflects
the maximum sample number of per shard.
@ -741,13 +741,13 @@ class Cifar100Dataset(MappableDataset, VisionBaseDataset):
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `usage` is not 'train', 'test' or 'all'.
RuntimeError: If `sampler` and `shuffle` are specified at the same time.
RuntimeError: If `sampler` and `num_shards`/`shard_id` are specified at the same time.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `usage` is not 'train', 'test' or 'all'.
Note:
- This dataset can take in a `sampler`. `sampler` and `shuffle` are mutually exclusive.
@ -856,10 +856,10 @@ class CityscapesDataset(MappableDataset, VisionBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the config.
shuffle (bool, optional): Whether to perform shuffle on the dataset. Default: None, expected
order behavior shown in the table.
order behavior shown in the table below.
decode (bool, optional): Decode the images after reading. Default: False.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table.
dataset. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided
into. Default: None. When this argument is specified, `num_samples` reflects
the max sample number of per shard.
@ -871,11 +871,11 @@ class CityscapesDataset(MappableDataset, VisionBaseDataset):
Raises:
RuntimeError: If `dataset_dir` is invalid or does not contain data files.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
RuntimeError: If `sampler` and `shuffle` are specified at the same time.
RuntimeError: If `sampler` and `num_shards`/`shard_id` are specified at the same time.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `dataset_dir` does not exist.
ValueError: If `task` is invalid.
ValueError: If `quality_mode` is invalid.
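A hedged illustration of the `task` and `quality_mode` values validated above (the directory is a placeholder):

>>> import mindspore.dataset as ds
>>>
>>> cityscapes_dataset_dir = "/path/to/cityscapes_dataset_directory"  # hypothetical path
>>> # fine-quality semantic segmentation split, decoding images while reading
>>> dataset = ds.CityscapesDataset(cityscapes_dataset_dir, usage="train",
...                                quality_mode="fine", task="semantic", decode=True)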
@ -1024,10 +1024,10 @@ class CocoDataset(MappableDataset, VisionBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the configuration file.
shuffle (bool, optional): Whether to perform shuffle on the dataset. Default: None, expected
order behavior shown in the table.
order behavior shown in the table below.
decode (bool, optional): Decode the images after reading. Default: False.
sampler (Sampler, optional): Object used to choose samples from the dataset.
Default: None, expected order behavior shown in the table.
Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided
into. Default: None. When this argument is specified, `num_samples` reflects
the maximum sample number of per shard.
@ -1273,10 +1273,10 @@ class DIV2KDataset(MappableDataset, VisionBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the config.
shuffle (bool, optional): Whether to perform shuffle on the dataset. Default: None, expected
order behavior shown in the table.
order behavior shown in the table below.
decode (bool, optional): Decode the images after reading. Default: False.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table.
dataset. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided
into. Default: None. When this argument is specified, `num_samples` reflects
the max sample number of per shard.
@ -1803,10 +1803,10 @@ class FlickrDataset(MappableDataset, VisionBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the config.
shuffle (bool, optional): Whether to perform shuffle on the dataset. Default: None, expected
order behavior shown in the table.
order behavior shown in the table below.
decode (bool, optional): Decode the images after reading. Default: None.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table.
dataset. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided
into. Default: None. When this argument is specified, `num_samples` reflects
the max sample number of per shard.
@ -2208,9 +2208,9 @@ class ImageFolderDataset(MappableDataset, VisionBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, set in the config.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table.
Default: None, expected order behavior shown in the table below.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table.
dataset. Default: None, expected order behavior shown in the table below.
extensions (list[str], optional): List of file extensions to be
included in the dataset. Default: None.
class_indexing (dict, optional): A str-to-int mapping from folder name to index
@ -2354,10 +2354,10 @@ class KITTIDataset(MappableDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the config.
shuffle (bool, optional): Whether to perform shuffle on the dataset. Default: None, expected
order behavior shown in the table.
order behavior shown in the table below.
decode (bool, optional): Decode the images after reading. Default: False.
sampler (Sampler, optional): Object used to choose samples from the dataset.
Default: None, expected order behavior shown in the table.
Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided
into. Default: None. When this argument is specified, 'num_samples' reflects
the max sample number of per shard.
@ -2621,10 +2621,10 @@ class LFWDataset(MappableDataset, VisionBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, set in the config.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table.
Default: None, expected order behavior shown in the table below.
decode (bool, optional): Decode the images after reading. Default: False.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table.
dataset. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided
into. Default: None. When this argument is specified, 'num_samples' reflects
the max sample number of per shard.
@ -2771,10 +2771,10 @@ class LSUNDataset(MappableDataset, VisionBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, set in the config.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table.
Default: None, expected order behavior shown in the table below.
decode (bool, optional): Decode the images after reading. Default: False.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table.
dataset. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided
into. Default: None. When this argument is specified, 'num_samples' reflects
the max sample number of per shard.
@ -2899,9 +2899,9 @@ class ManifestDataset(MappableDataset, VisionBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, will use value set in the config.
shuffle (bool, optional): Whether to perform shuffle on the dataset. Default: None, expected
order behavior shown in the table.
order behavior shown in the table below.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table.
dataset. Default: None, expected order behavior shown in the table below.
class_indexing (dict, optional): A str-to-int mapping from label name to index.
Default: None, the folder names will be sorted alphabetically and each
class will be given a unique index starting from 0.
@ -3021,9 +3021,9 @@ class MnistDataset(MappableDataset, VisionBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, will use value set in the config.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table.
Default: None, expected order behavior shown in the table below.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table.
dataset. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
When this argument is specified, `num_samples` reflects the maximum sample number of per shard.
shard_id (int, optional): The shard ID within `num_shards`. Default: None. This
@ -3142,10 +3142,10 @@ class OmniglotDataset(MappableDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, set in the config.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table.
Default: None, expected order behavior shown in the table below.
decode (bool, optional): Decode the images after reading. Default: False.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table.
dataset. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided
into. Default: None. When this argument is specified, 'num_samples' reflects
the max sample number of per shard.
@ -3414,22 +3414,22 @@ class Places365Dataset(MappableDataset, VisionBaseDataset):
The generated dataset has two columns :py:obj:`[image, label]`.
The tensor of column :py:obj:`image` is of the uint8 type.
The tensor of column :py:obj:`label` is a scalar of the uint32 type.
The tensor of column :py:obj:`label` is of the uint32 type.
Args:
dataset_dir (str): Path to the root directory that contains the dataset.
usage (str, optional): Usage of this dataset, can be 'train-standard', 'train-challenge' or 'val'.
Default: None, will be set to 'train-standard'.
small (bool, optional): Use 256 * 256 images (True) or high resolution images (False). Default: False.
decode (bool, optional): Decode the images after reading. Default: True.
decode (bool, optional): Decode the images after reading. Default: False.
num_samples (int, optional): The number of images to be included in the dataset.
Default: None, will read all images.
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, will use value set in the config.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table.
Default: None, expected order behavior shown in the table below.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table.
dataset. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
When this argument is specified, `num_samples` reflects the max sample number of per shard.
shard_id (int, optional): The shard ID within `num_shards`. Default: None. This
@ -3440,11 +3440,11 @@ class Places365Dataset(MappableDataset, VisionBaseDataset):
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
RuntimeError: If `sampler` and `shuffle` are specified at the same time.
RuntimeError: If `sampler` and `num_shards`/`shard_id` are specified at the same time.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
ValueError: If `usage` is not in ["train-standard", "train-challenge", "val"].
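A quick sketch (placeholder path) tying together the `usage`, `small` and `decode` parameters discussed above:

>>> import mindspore.dataset as ds
>>>
>>> places365_dataset_dir = "/path/to/places365_dataset_directory"  # hypothetical path
>>> # low-resolution validation split, decoded while reading
>>> dataset = ds.Places365Dataset(places365_dataset_dir, usage="val", small=True, decode=True)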
@ -3556,7 +3556,7 @@ class QMnistDataset(MappableDataset, VisionBaseDataset):
The generated dataset has two columns :py:obj:`[image, label]`.
The tensor of column :py:obj:`image` is of the uint8 type.
The tensor of column :py:obj:`label` is a scalar when `compat` is True else a tensor both of the uint32 type.
The tensor of column :py:obj:`label` is of the uint32 type.
Args:
dataset_dir (str): Path to the root directory that contains the dataset.
@ -3569,9 +3569,9 @@ class QMnistDataset(MappableDataset, VisionBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, will use value set in the config.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table.
Default: None, expected order behavior shown in the table below.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table.
dataset. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
When this argument is specified, `num_samples` reflects the maximum sample number of per shard.
shard_id (int, optional): The shard ID within `num_shards`. Default: None. This
@ -3582,12 +3582,12 @@ class QMnistDataset(MappableDataset, VisionBaseDataset):
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
RuntimeError: If `sampler` and `shuffle` are specified at the same time.
RuntimeError: If `sampler` and `num_shards`/`shard_id` are specified at the same time.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
Note:
- This dataset can take in a `sampler`. `sampler` and `shuffle` are mutually exclusive.
@ -3683,8 +3683,9 @@ class RandomDataset(SourceDataset, VisionBaseDataset):
Args:
total_rows (int, optional): Number of samples for the dataset to generate.
Default: None, number of samples is random.
schema (Union[str, Schema], optional): Path to the JSON schema file or schema object. Default: None.
If the schema is not provided, the random dataset generates a random schema.
schema (Union[str, Schema], optional): Data format policy, which specifies the data types and shapes of the data
columns to be read. Both a JSON file path and an object constructed by mindspore.dataset.Schema are acceptable.
Default: None.
columns_list (list[str], optional): List of column names of the dataset.
Default: None, the columns will be named like this "c0", "c1", "c2" etc.
num_samples (int, optional): The number of samples to be included in the dataset.
@ -3695,12 +3696,33 @@ class RandomDataset(SourceDataset, VisionBaseDataset):
`Single-Node Data Cache <https://www.mindspore.cn/tutorials/experts/en/master/dataset/cache.html>`_ .
Default: None, which means no cache is used.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table.
Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided
into. Default: None. When this argument is specified, 'num_samples' reflects
the maximum sample number of per shard.
shard_id (int, optional): The shard ID within `num_shards`. Default: None. This
argument can only be specified when `num_shards` is also specified.
Raises:
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
TypeError: If `total_rows` is not of type int.
TypeError: If `num_shards` is not of type int.
TypeError: If `num_parallel_workers` is not of type int.
TypeError: If `shuffle` is not of type bool.
TypeError: If `columns_list` is not of type list.
Examples:
>>> from mindspore import dtype as mstype
>>> import mindspore.dataset as ds
>>>
>>> schema = ds.Schema()
>>> schema.add_column('image', de_type=mstype.uint8, shape=[2])
>>> schema.add_column('label', de_type=mstype.uint8, shape=[1])
>>> # apply dataset operations
>>> ds1 = ds.RandomDataset(schema=schema, total_rows=50, num_parallel_workers=4)
"""
@check_random_dataset
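Building on the RandomDataset example above, a minimal sketch of consuming the generated rows (`ds1` and the column names come from the schema defined there):

>>> # iterate the randomly generated rows as NumPy arrays
>>> for item in ds1.create_dict_iterator(num_epochs=1, output_numpy=True):
...     image, label = item["image"], item["label"]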
@ -3788,11 +3810,12 @@ class SBDataset(GeneratorDataset):
"""
A source dataset that reads and parses Semantic Boundaries Dataset.
The generated dataset has two columns: :py:obj:`[image, task]`.
By configuring the `task` parameter, the generated dataset has different output columns.
- The tensor of column :py:obj:`image` is of the uint8 type.
- The tensor of column :py:obj:`task` contains 20 images of the uint8 type if `task` is 'Boundaries' otherwise
contains 1 image of the uint8 type.
- If `task` is 'Boundaries', there are two output columns: the 'image' column has the data type uint8 and
the 'label' column contains one image of the data type uint8.
- If `task` is 'Segmentation', there are two output columns: the 'image' column has the data type uint8 and
the 'label' column contains 20 images of the data type uint8.
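A minimal sketch (the directory is hypothetical) of selecting between the two `task` modes listed above:

>>> import mindspore.dataset as ds
>>>
>>> sb_dataset_dir = "/path/to/sb_dataset_directory"  # hypothetical path
>>> # request segmentation labels rather than boundary maps
>>> dataset = ds.SBDataset(sb_dataset_dir, task='Segmentation', usage='train')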
Args:
dataset_dir (str): Path to the root directory that contains the dataset.
@ -3800,13 +3823,12 @@ class SBDataset(GeneratorDataset):
usage (str, optional): Acceptable usages include 'train', 'val', 'train_noval' and 'all'. Default: 'all'.
num_samples (int, optional): The number of images to be included in the dataset.
Default: None, all images.
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the config.
num_parallel_workers (int, optional): Number of workers to read the data. Default: 1.
shuffle (bool, optional): Whether to perform shuffle on the dataset. Default: None, expected
order behavior shown in the table.
order behavior shown in the table below.
decode (bool, optional): Decode the images after reading. Default: None.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table.
dataset. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided
into. Default: None. When this argument is specified, `num_samples` reflects
the max sample number of per shard.
@ -3815,12 +3837,12 @@ class SBDataset(GeneratorDataset):
Raises:
RuntimeError: If `dataset_dir` is not valid or does not contain data files.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
RuntimeError: If `sampler` and `shuffle` are specified at the same time.
RuntimeError: If `sampler` and `num_shards`/`shard_id` are specified at the same time.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `dataset_dir` does not exist.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `task` is not in ['Boundaries', 'Segmentation'].
ValueError: If `usage` is not in ['train', 'val', 'train_noval', 'all'].
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
@ -3928,15 +3950,15 @@ class SBUDataset(MappableDataset, VisionBaseDataset):
Args:
dataset_dir (str): Path to the root directory that contains the dataset.
decode (bool, optional): Decode the images after reading. Default: False.
num_samples (int, optional): The number of images to be included in the dataset.
Default: None, will read all images.
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, will use value set in the config.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table.
Default: None, expected order behavior shown in the table below.
decode (bool, optional): Decode the images after reading. Default: False.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table.
dataset. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
When this argument is specified, `num_samples` reflects the max sample number of per shard.
shard_id (int, optional): The shard ID within `num_shards`. Default: None. This
@ -3947,11 +3969,11 @@ class SBUDataset(MappableDataset, VisionBaseDataset):
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
RuntimeError: If `sampler` and `shuffle` are specified at the same time.
RuntimeError: If `sampler` and `num_shards`/`shard_id` are specified at the same time.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
Note:
@ -4048,9 +4070,9 @@ class SemeionDataset(MappableDataset, VisionBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the config.
shuffle (bool, optional): Whether to perform shuffle on the dataset. Default: None, expected
order behavior shown in the table.
order behavior shown in the table below.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table.
dataset. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided
into. Default: None. When this argument is specified, `num_samples` reflects
the maximum sample number of per shard.
@ -4062,11 +4084,11 @@ class SemeionDataset(MappableDataset, VisionBaseDataset):
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
RuntimeError: If `sampler` and `shuffle` are specified at the same time.
RuntimeError: If `sampler` and `num_shards`/`shard_id` are specified at the same time.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
Note:
@ -4176,9 +4198,9 @@ class STL10Dataset(MappableDataset, VisionBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the config.
shuffle (bool, optional): Whether to perform shuffle on the dataset. Default: None, expected
order behavior shown in the table.
order behavior shown in the table below.
sampler (Sampler, optional): Object used to choose samples from the
dataset. Default: None, expected order behavior shown in the table.
dataset. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided
into. Default: None. When this argument is specified, 'num_samples' reflects
the max sample number of per shard.
@ -4190,12 +4212,12 @@ class STL10Dataset(MappableDataset, VisionBaseDataset):
Raises:
RuntimeError: If `dataset_dir` is not valid or does not exist or does not contain data files.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
RuntimeError: If `sampler` and `shuffle` are specified at the same time.
RuntimeError: If `sampler` and `num_shards`/`shard_id` are specified at the same time.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `usage` is invalid.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
Note:
@ -4341,24 +4363,23 @@ class SVHNDataset(GeneratorDataset):
Default: None, will read all samples.
num_samples (int, optional): The number of samples to be included in the dataset. Default: None, all images.
num_parallel_workers (int, optional): Number of subprocesses used to fetch the dataset in parallel. Default: 1.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset. Random accessible input is required.
Default: None, expected order behavior shown in the table.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table below.
sampler (Sampler, optional): Object used to choose samples from the dataset. Random accessible
input is required. Default: None, expected order behavior shown in the table.
input is required. Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
Random accessible input is required. When this argument is specified, 'num_samples' reflects the max
sample number of per shard.
When this argument is specified, `num_samples` reflects the maximum number of samples per shard.
shard_id (int, optional): The shard ID within `num_shards`. Default: None. This argument must be specified only
when num_shards is also specified. Random accessible input is required.
when num_shards is also specified.
Raises:
RuntimeError: If `dataset_dir` is not valid or does not exist or does not contain data files.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
RuntimeError: If `sampler` and `shuffle` are specified at the same time.
RuntimeError: If `sampler` and `num_shards`/`shard_id` are specified at the same time.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `usage` is invalid.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
Note:
@ -4443,7 +4464,7 @@ class USPSDataset(SourceDataset, VisionBaseDataset):
The generated dataset has two columns: :py:obj:`[image, label]`.
The tensor of column :py:obj:`image` is of the uint8 type.
The tensor of column :py:obj:`label` is of a scalar of uint32 type.
The tensor of column :py:obj:`label` is of the uint32 type.
Args:
dataset_dir (str): Path to the root directory that contains the dataset.
@ -4455,10 +4476,10 @@ class USPSDataset(SourceDataset, VisionBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, will use value set in the config.
shuffle (Union[bool, Shuffle], optional): Perform reshuffling of the data every epoch.
Default: Shuffle.GLOBAL. Bool type and Shuffle enum are both supported to pass in.
If shuffle is False, no shuffling will be performed;
If shuffle is True, the behavior is the same as setting shuffle to be Shuffle.GLOBAL
Otherwise, there are two levels of shuffling:
Bool type and Shuffle enum are both supported to pass in. Default: `Shuffle.GLOBAL` .
If shuffle is False, no shuffling will be performed.
If shuffle is True, it is equivalent to setting `shuffle` to mindspore.dataset.Shuffle.GLOBAL.
Set the mode of data shuffling by passing in enumeration variables:
- Shuffle.GLOBAL: Shuffle both the files and samples.
@ -4474,10 +4495,10 @@ class USPSDataset(SourceDataset, VisionBaseDataset):
Raises:
RuntimeError: If `dataset_dir` is not valid or does not exist or does not contain data files.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `usage` is invalid.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
Examples:
@ -4560,10 +4581,10 @@ class VOCDataset(MappableDataset, VisionBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, number set in the config.
shuffle (bool, optional): Whether to perform shuffle on the dataset. Default: None, expected
order behavior shown in the table.
order behavior shown in the table below.
decode (bool, optional): Decode the images after reading. Default: False.
sampler (Sampler, optional): Object used to choose samples from the dataset.
Default: None, expected order behavior shown in the table.
Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided
into. Default: None. When this argument is specified, `num_samples` reflects
the maximum sample number of per shard.
@ -4759,10 +4780,10 @@ class WIDERFaceDataset(MappableDataset, VisionBaseDataset):
num_parallel_workers (int, optional): Number of workers to read the data.
Default: None, will use value set in the config.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table.
Default: None, expected order behavior shown in the table below.
decode (bool, optional): Decode the images after reading. Default: False.
sampler (Sampler, optional): Object used to choose samples from the dataset.
Default: None, expected order behavior shown in the table.
Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
When this argument is specified, `num_samples` reflects the maximum sample number of per shard.
shard_id (int, optional): The shard ID within `num_shards`. Default: None. This argument can only be specified
@ -4773,13 +4794,13 @@ class WIDERFaceDataset(MappableDataset, VisionBaseDataset):
Raises:
RuntimeError: If `dataset_dir` does not contain data files.
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
RuntimeError: If `sampler` and `shuffle` are specified at the same time.
RuntimeError: If `sampler` and `num_shards`/`shard_id` are specified at the same time.
RuntimeError: If `num_shards` is specified but `shard_id` is None.
RuntimeError: If `shard_id` is specified but `num_shards` is None.
ValueError: If `shard_id` is invalid (< 0 or >= `num_shards`).
ValueError: If `usage` is not in ['train', 'test', 'valid', 'all'].
ValueError: If `num_parallel_workers` exceeds the max thread numbers.
ValueError: If `annotation_file` does not exist.
ValueError: If `dataset_dir` does not exist.
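A short sketch (path invented) of the `usage` values accepted above for WIDERFaceDataset:

>>> import mindspore.dataset as ds
>>>
>>> wider_face_dir = "/path/to/wider_face_dataset_directory"  # hypothetical path
>>> # read the validation split and decode images while reading
>>> dataset = ds.WIDERFaceDataset(wider_face_dir, usage='valid', decode=True)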

View File

@ -1282,13 +1282,13 @@ class InMemoryGraphDataset(GeneratorDataset):
Default: 'graph'.
num_samples (int, optional): The number of samples to be included in the dataset. Default: None, all samples.
num_parallel_workers (int, optional): Number of subprocesses used to fetch the dataset in parallel. Default: 1.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset. Random accessible input is required.
Default: None, expected order behavior shown in the table.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table below.
num_shards (int, optional): Number of shards that the dataset will be divided into. Default: None.
Random accessible input is required. When this argument is specified, `num_samples` reflects the max
When this argument is specified, `num_samples` reflects the maximum number of samples per shard.
shard_id (int, optional): The shard ID within `num_shards`. Default: None. This argument must be specified only
when num_shards is also specified. Random accessible input is required.
when num_shards is also specified.
python_multiprocessing (bool, optional): Parallelize Python operations with multiple worker process. This
option could be beneficial if the Python operation is computational heavy. Default: True.
max_rowsize(int, optional): Maximum size of row in MB that is used for shared memory allocation to copy
@ -1386,8 +1386,8 @@ class ArgoverseDataset(InMemoryGraphDataset):
recommend to specify it with
`column_names=["edge_index", "x", "y", "cluster", "valid_len", "time_step_len"]` like the following example.
num_parallel_workers (int, optional): Number of subprocesses used to fetch the dataset in parallel. Default: 1.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset. Random accessible input is required.
Default: None, expected order behavior shown in the table.
shuffle (bool, optional): Whether or not to perform shuffle on the dataset.
Default: None, expected order behavior shown in the table below.
python_multiprocessing (bool, optional): Parallelize Python operations with multiple worker process. This
option could be beneficial if the Python operation is computational heavy. Default: True.
perf_mode (bool, optional): mode for obtaining higher performance when iterating over the created dataset (will call