forked from mindspore-Ecosystem/mindspore

modify format

This commit is contained in:
parent 808e6c58a4
commit 1489cf432b
@@ -46,10 +46,10 @@
 - **params** (dict):表示没有传入参数的字典(参数派生自SentencePiece库)。

     .. code-block::

         input_sentence_size 0
         max_sentencepiece_length 16

 **返回:**
@@ -6,8 +6,8 @@
 模型训练或推理的高阶接口。 `Model` 会根据用户传入的参数封装可训练或推理的实例。

 .. note::
-    如果使用混合精度功能,需要同时设置`optimizer`参数,否则混合精度功能不生效。
+    如果使用混合精度功能,需要同时设置 `optimizer` 参数,否则混合精度功能不生效。
     当使用混合精度时,优化器中的 `global_step` 可能与模型中的 `cur_step_num` 不同。

 **参数:**
@@ -31,7 +31,7 @@ mindspore.nn.AdaSumByDeltaWeightWrapCell

 **异常:**

-    - **RuntimeError** - `parallel_mode` 使用了`stand_alone`模式, AdaSum仅支持在分布式场景下使用。
+    - **RuntimeError** - `parallel_mode` 使用了 `stand_alone` 模式, AdaSum仅支持在分布式场景下使用。
     - **RuntimeError** - 同时使用了优化器并行, 暂时不支持在优化器并行场景下使用AdaSum。
     - **RuntimeError** - 同时使用了流水线并行, 暂时不支持在流水线并行场景下使用AdaSum。
     - **RuntimeError** - `device_num` 不是2的幂,或者小于16。
@@ -31,7 +31,7 @@ mindspore.nn.AdaSumByGradWrapCell

 **异常:**

-    - **RuntimeError** - `parallel_mode` 使用了`stand_alone`模式, AdaSum仅支持在分布式场景下使用。
+    - **RuntimeError** - `parallel_mode` 使用了 `stand_alone` 模式, AdaSum仅支持在分布式场景下使用。
     - **RuntimeError** - 同时使用了优化器并行, 暂时不支持在优化器并行场景下使用AdaSum。
     - **RuntimeError** - 同时使用了流水线并行, 暂时不支持在流水线并行场景下使用AdaSum。
     - **RuntimeError** - `device_num` 不是2的幂,或者小于16。
@@ -392,7 +392,7 @@
 其中的每一个元素指定对应的输入/输出的Tensor分布策略,可参考: `mindspore.ops.Primitive.shard` 的描述,也可以设置为None,会默认以数据并行执行。
 其余算子的并行策略由输入输出指定的策略推导得到。

-.. note:: 需设置为PyNative模式,并且全自动并行(AUTO_PARALLEL),同时设置`set_auto_parallel_context`中的搜索模式(search mode)为"sharding_propagation",或半自动并行(SEMI_AUTO_PARALLEL)。
+.. note:: 需设置为PyNative模式,并且全自动并行(AUTO_PARALLEL),同时设置 `set_auto_parallel_context` 中的搜索模式(search mode)为"sharding_propagation",或半自动并行(SEMI_AUTO_PARALLEL)。

 **参数:**
@@ -28,5 +28,5 @@ mindspore.ops.NotEqual
 Tensor,输出shape与输入相同,数据类型为bool。

 **异常:**

-    - **TypeError** - `x` 和`y` 不是以下之一:Tensor、Number、bool。
-    - **TypeError** - `x` 和`y` 都不是Tensor。
+    - **TypeError** - `x` 和 `y` 不是以下之一:Tensor、Number、bool。
+    - **TypeError** - `x` 和 `y` 都不是Tensor。
@@ -57,53 +57,54 @@ class Custom(ops.PrimitiveWithInfer):
     This could be used when func_type is "aot" or "julia".

     1. for "aot":

         Currently "aot" supports GPU/CPU(linux only) platform.
         "aot" means ahead of time, in which case Custom directly launches user defined "xxx.so" file as an
         operator. Users need to compile a handwriting "xxx.cu"/"xxx.cc" file into "xxx.so" ahead of time,
         and offer the path of the file along with a function name.

         - "xxx.so" file generation:

             1) GPU Platform: Given user defined "xxx.cu" file (ex. "{path}/add.cu"), use nvcc command to compile
             it.(ex. "nvcc --shared -Xcompiler -fPIC -o add.so add.cu")

             2) CPU Platform: Given user defined "xxx.cc" file (ex. "{path}/add.cc"), use g++/gcc command to
             compile it.(ex. "g++ --shared -fPIC -o add.so add.cc")

         - Define a "xxx.cc"/"xxx.cu" file:

-            "aot" is a cross-platform identity. The functions defined in "xxx.cc" or "xxx.cu" share the same args.
-            Typically, the function should be as:
+            "aot" is a cross-platform identity. The functions defined in "xxx.cc" or "xxx.cu" share
+            the same args. Typically, the function should be as:

             .. code-block::

                 int func(int nparam, void **params, int *ndims, int64_t **shapes, const char **dtypes,
                          void *stream, void *extra)

             Parameters:

             - nparam(int): total number of inputs plus outputs; suppose the operator has 2 inputs and 3 outputs,
               then nparam=5
-            - params(void \*\*): a pointer to the array of inputs and outputs' pointer; the pointer type of inputs
-              and outputs is void \* ; suppose the operator has 2 inputs and 3 outputs, then the first input's
-              pointer is params[0] and the second output's pointer is params[3]
+            - params(void \*\*): a pointer to the array of inputs and outputs' pointer; the pointer type of
+              inputs and outputs is void \* ; suppose the operator has 2 inputs and 3 outputs, then the first
+              input's pointer is params[0] and the second output's pointer is params[3]
             - ndims(int \*): a pointer to the array of inputs and outputs' dimension num; suppose params[i] is a
               1024x1024 tensor and params[j] is a 77x83x4 tensor, then ndims[i]=2, ndims[j]=3.
             - shapes(int64_t \*\*): a pointer to the array of inputs and outputs' shapes(int64_t \*); the ith
               input's jth dimension's size is shapes[i][j](0<=j<ndims[i]); suppose params[i] is a 2x3 tensor and
               params[j] is a 3x3x4 tensor, then shapes[i][0]=2, shapes[j][2]=4.
             - dtypes(const char \*\*): a pointer to the array of inputs and outputs' types(const char \*);
               (ex. "float32", "float16", "float", "float64", "int", "int8", "int16", "int32", "int64", "uint",
               "uint8", "uint16", "uint32", "uint64", "bool")
             - stream(void \*): stream pointer, only used in cuda file
             - extra(void \*): used for further extension

             Return Value(int):

             - 0: MindSpore will continue to run if this aot kernel is successfully executed
             - others: MindSpore will raise exception and exit

             Examples: see details in tests/st/ops/graph_kernel/custom/aot_test_files/

         - Use it in Custom:
@@ -114,20 +115,21 @@ class Custom(ops.PrimitiveWithInfer):
             "aot")

     2. for "julia":

         Currently "julia" supports CPU(linux only) platform.
         For julia use JIT compiler, and julia support c api to call julia code.
         The Custom can directly launches user defined "xxx.jl" file as an
         operator. Users need to write a "xxx.jl" file which include modules and functions,
         and offer the path of the file along with a module name and function name.

         Examples: see details in tests/st/ops/graph_kernel/custom/julia_test_files/

         - Use it in Custom:

         .. code-block::

             Custom(func="{dir_path}/{file_name}:{module_name}:{func_name}",...)
             (ex. Custom(func="./add.jl:Add:add", out_shape=[1], out_dtype=mstype.float32, "julia")

     out_shape (Union[function, list, tuple]): The output shape infer function or the value of output shape of
         `func`. Default: None.
@@ -155,8 +157,8 @@ class Custom(ops.PrimitiveWithInfer):
         ["hybrid", "akg", "tbe", "aot", "pyfunc", "julia", "aicpu"].

         Each `func_type` only supports specific platforms(targets). Default: "hybrid".
         The supported platforms of `func_type`:

         - "hybrid": supports ["Ascend", "GPU"].
         - "akg": supports ["Ascend", "GPU"].