!39748 Add Warning Clear

Merge pull request !39748 from huangxinjing/fix_warning
i-robot 2022-08-08 01:08:43 +00:00 committed by Gitee
commit 23835793b3
No known key found for this signature in database
GPG Key ID: 173E9B9CA92EEF8F
15 changed files with 249 additions and 140 deletions

View File

@ -21,21 +21,24 @@ mindspore.communication
Initialize the distributed backend required by the communication service, e.g. the `HCCL` or `NCCL` service.
.. note::
The full name of HCCL is Huawei Collective Communication Library; the full name of NCCL is NVIDIA Collective Communication Library. The `init` method should be used after the `set_context` method.
- The full name of HCCL is Huawei Collective Communication Library; the full name of NCCL is NVIDIA Collective Communication Library.
- The `init` method should be used after the `set_context` method.
- Before running the following examples, the user needs to preset the communication environment variables; please see the docstring of mindspore.communication.
Parameters:
- **backend_name** (str) - Name of the distributed backend; either HCCL or NCCL. If not set, it is inferred from the hardware platform type (device_target). Default: None.
Raises:
- **TypeError** - The parameter `backend_name` is not a string.
- **RuntimeError** - 1) The hardware device type is invalid; 2) the backend service is invalid; 3) distributed computation fails to initialize; 4) the HCCL service is initialized without the environment variable `RANK_ID` or `MINDSPORE_HCCL_CONFIG_PATH` being set.
- **RuntimeError** - 1) The hardware device type is invalid; 2) the backend service is invalid; 3) distributed computation fails to initialize; 4) when the backend is HCCL, the HCCL service is initialized without the environment variable `RANK_ID` or `MINDSPORE_HCCL_CONFIG_PATH` being set.
.. py:function:: mindspore.communication.release()
Release distributed resources, e.g. the `HCCL` or `NCCL` service.
.. note::
The `release` method should be used after the `init` method. Before running the following examples, the user needs to preset the communication environment variables; please see the docstring of mindspore.communication.
- The `release` method should be used after the `init` method.
- Before running the following examples, the user needs to preset the communication environment variables; please see the docstring of mindspore.communication.
Raises:
- **RuntimeError** - Raised when releasing distributed resources fails.
@ -45,7 +48,8 @@ mindspore.communication
Get the rank ID of the current device in the specified communication group.
.. note::
The `get_rank` method should be used after the `init` method. Before running the following examples, the user needs to preset the communication environment variables; please see the docstring of mindspore.communication.
- The `get_rank` method should be used after the `init` method.
- Before running the following examples, the user needs to preset the communication environment variables; please see the docstring of mindspore.communication.
Parameters:
- **group** (str) - Name of the communication group, normally created by the `create_group` method; otherwise the default group is used. Default: `GlobalComm.WORLD_COMM_GROUP`.
@ -62,7 +66,9 @@ mindspore.communication
Get the rank_size of the specified communication group instance.
.. note:: The `get_group_size` method should be used after the `init` method. Before running test cases, the user needs to configure the communication-related environment variables in advance.
.. note::
- The `get_group_size` method should be used after the `init` method.
- Before running the following examples, the user needs to preset the communication environment variables; please see the docstring of mindspore.communication.
Parameters:
- **group** (str) - Name of the specified work group instance (created by the create_group method); the supported data type is str. Default: `WORLD_COMM_GROUP`.
@ -80,9 +86,10 @@ mindspore.communication
Get the global rank ID in the communication cluster from the rank ID of a device in the specified communication group.
.. note::
- The GPU version of MindSpore does not support this method.
- The parameter `group` cannot be `hccl_world_group`.
- The `get_world_rank_from_group_rank` method should be used after the `init` method. Before running the following examples, the user needs to preset the communication environment variables; please see the docstring of mindspore.communication.
- The GPU version of MindSpore does not support this method.
- The parameter `group` cannot be `hccl_world_group`.
- The `get_world_rank_from_group_rank` method should be used after the `init` method.
- Before running the following examples, the user needs to preset the communication environment variables; please see the docstring of mindspore.communication.
Parameters:
- **group** (str) - Name of the communication group passed in, normally created by the `create_group` method.
@ -94,16 +101,17 @@ mindspore.communication
Raises:
- **TypeError** - The parameter `group` is not a string or the parameter `group_rank_id` is not a number.
- **ValueError** - The parameter `group` is `hccl_world_group` or the backend is unavailable.
- **RuntimeError** - The `HCCL` or `NCCL` service is unavailable, or the CPU version of MindSpore is used.
- **RuntimeError** - The `HCCL` service is unavailable, or the GPU version of MindSpore is used.
.. py:function:: mindspore.communication.get_group_rank_from_world_rank(world_rank_id, group)
Get the rank ID in the specified user communication group from the global rank ID in the communication cluster.
.. note::
- The GPU version of MindSpore does not support this method.
- The parameter `group` cannot be `hccl_world_group`.
- The GPU version of MindSpore does not support this method.
- The parameter `group` cannot be `hccl_world_group`.
- The `get_group_rank_from_world_rank` method should be used after the `init` method.
- Before running the following examples, the user needs to preset the communication environment variables; please see the docstring of mindspore.communication.
Parameters:
- **world_rank_id** (`int`) - The global rank ID in the communication cluster.
@ -115,18 +123,19 @@ mindspore.communication
Raises:
- **TypeError** - Raised when the parameter `group_rank_id` is not a number or the parameter `group` is not a string.
- **ValueError** - Raised when the parameter `group` is `hccl_world_group` or the backend is unavailable.
- **RuntimeError** - Raised when the `HCCL` or `NCCL` service is unavailable, or the GPU version of MindSpore is used.
- **RuntimeError** - Raised when the `HCCL` service is unavailable, or the GPU version of MindSpore is used.
.. py:function:: mindspore.communication.create_group(group, rank_ids)
Create a user-defined communication group instance.
.. note::
- The GPU version of MindSpore does not support this method.
- The length of the list rank_ids should be greater than 1.
- The list rank_ids must not contain duplicate values.
- The `create_group` method should be used after the `init` method.
- The GPU version of MindSpore does not support this method.
- The length of the list rank_ids should be greater than 1.
- The list rank_ids must not contain duplicate values.
- The `create_group` method should be used after the `init` method.
- In PyNative mode, only a single global communication group is supported.
- Before running the following examples, the user needs to preset the communication environment variables; please see the docstring of mindspore.communication.
Parameters:
- **group** (str) - Name of the user-defined communication group instance to be created; the supported data type is str.
@ -135,15 +144,16 @@ mindspore.communication
Raises:
- **TypeError** - The parameter `group_rank_id` is not a number or the parameter `group` is not a string.
- **ValueError** - The length of the list rank_ids is less than 1, the list rank_ids contains duplicate values, or the backend is invalid.
- **RuntimeError** - The `HCCL` or `NCCL` service is unavailable, or the CPU version of MindSpore is used.
- **RuntimeError** - The `HCCL` service is unavailable, or the GPU version of MindSpore is used.
.. py:function:: mindspore.communication.get_local_rank(group=GlobalComm.WORLD_COMM_GROUP)
Get the local rank ID of the current device in the specified communication group.
.. note::
- The GPU version of MindSpore does not support this method.
- The `get_local_rank` method should be used after the `init` method. Before running the following examples, the user needs to preset the communication environment variables; please see the docstring of mindspore.communication.
- The GPU version of MindSpore does not support this method.
- The `get_local_rank` method should be used after the `init` method.
- Before running the following examples, the user needs to preset the communication environment variables; please see the docstring of mindspore.communication.
Parameters:
- **group** (`str`) - Name of the communication group, normally created by the `create_group` method; otherwise the default group name is used. Default: `WORLD_COMM_GROUP`.
@ -154,15 +164,16 @@ mindspore.communication
Raises:
- **TypeError** - Raised when the parameter `group` is not a string.
- **ValueError** - Raised when the backend is unavailable.
- **RuntimeError** - Raised when the `HCCL` or `NCCL` service is unavailable.
- **RuntimeError** - Raised when the `HCCL` service is unavailable, or the GPU version of MindSpore is used.
.. py:function:: mindspore.communication.get_local_rank_size(group=GlobalComm.WORLD_COMM_GROUP)
Get the number of local devices in the specified communication group.
.. note::
- The GPU version of MindSpore does not support this method.
- The GPU version of MindSpore does not support this method.
- The `get_local_rank_size` method should be used after the `init` method.
- Before running the following examples, the user needs to preset the communication environment variables; please see the docstring of mindspore.communication.
Parameters:
- **group** (str) - Name of the communication group passed in, normally created by the `create_group` method, or `WORLD_COMM_GROUP` by default.
@ -173,15 +184,15 @@ mindspore.communication
Raises:
- **TypeError** - Raised when the parameter `group` is not a string.
- **ValueError** - Raised when the backend is unavailable.
- **RuntimeError** - Raised when the `HCCL` or `NCCL` service is unavailable.
- **RuntimeError** - Raised when the `HCCL` service is unavailable, or the GPU version of MindSpore is used.
.. py:function:: mindspore.communication.destroy_group(group)
Destroy the user communication group.
.. note::
- The GPU version of MindSpore does not support this method.
- The parameter `group` cannot be `hccl_world_group`.
- The GPU version of MindSpore does not support this method.
- The parameter `group` cannot be `hccl_world_group`.
- The `destroy_group` method should be used after the `init` method.
Parameters:
@ -190,7 +201,7 @@ mindspore.communication
Raises:
- **TypeError** - Raised when the parameter `group` is not a string.
- **ValueError** - Raised when the parameter `group` is `hccl_world_group` or the backend is unavailable.
- **RuntimeError** - Raised when the `HCCL` or `NCCL` service is unavailable.
- **RuntimeError** - Raised when the `HCCL` service is unavailable, or the GPU version of MindSpore is used.
.. py:data:: mindspore.communication.HCCL_WORLD_COMM_GROUP
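
For context rather than as part of this diff: the notes above describe a fixed call order, which can be sketched in a few lines of Python. This is a minimal sketch assuming an Ascend/HCCL setup where the communication environment variables (for example `RANK_ID` and `MINDSPORE_HCCL_CONFIG_PATH`) are already preset as required; the group name and rank list are placeholders.

# Minimal usage sketch of the documented call order (not part of this PR).
from mindspore import set_context
from mindspore.communication import init, create_group, get_rank, get_group_size, destroy_group, release

set_context(device_target="Ascend")      # set_context must come before init
init()                                   # backend inferred from device_target (HCCL on Ascend)
create_group("group0", [0, 1])           # rank_ids: length > 1, no duplicates
print(get_rank("group0"), get_group_size("group0"))
destroy_group("group0")
release()                                # release after init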

View File

@ -116,8 +116,9 @@ double MatMulCost::GetBackwardCommCost(const std::vector<TensorInfo> &inputs, co
used_device_num *= input1_shape[i] / input1_slice_shape[i];
}
if (total_device_num != LongToSize(used_device_num))
if (total_device_num != LongToSize(used_device_num)) {
result += ListProduct(input1_slice_shape) * static_cast<double>(inputs_type_lengths_[1]);
}
}
return result;
@ -161,8 +162,9 @@ double MatMulCost::GetBackwardComputationCost(const std::vector<TensorInfo> &inp
used_device_num *= input1_shape[i] / input1_slice_shape[i];
}
if (total_device_num != LongToSize(used_device_num))
if (total_device_num != LongToSize(used_device_num)) {
result += ListProduct(input1_slice_shape) * static_cast<double>(inputs_type_lengths_[1]);
}
}
return result;
@ -197,7 +199,7 @@ void MatMulCost::CalculateInputsInMemory(const std::map<size_t, bool> &prev_outp
}
// return the per device communication cost in the forward phase.
double BatchNormCost::GetForwardCommCost(const std::vector<TensorInfo> &inputs, const std::vector<TensorInfo> &outputs,
double BatchNormCost::GetForwardCommCost(const std::vector<TensorInfo> &inputs, const std::vector<TensorInfo> &,
int64_t) const {
TensorInfo input0 = inputs[0];
Shape input0_shape = input0.shape();
@ -258,8 +260,8 @@ double BatchNormCost::GetForwardComputationCost(const std::vector<TensorInfo> &i
// Return the per device computation cost in the forward phase. The cost is calculated according to the bytes
// this operator uses
double BatchNormCost::GetBackwardComputationCost(const std::vector<TensorInfo> &inputs, const std::vector<TensorInfo> &,
int64_t stage_id) const {
double BatchNormCost::GetBackwardComputationCost(const std::vector<TensorInfo> &, const std::vector<TensorInfo> &,
int64_t) const {
return 0.0;
}
@ -831,8 +833,9 @@ double SubCost::GetBackwardComputationCost(const std::vector<TensorInfo> &inputs
used_device_num *= input_a_shape[i] / input_a_slice_shape[i];
}
if (total_device_num != LongToSize(used_device_num))
if (total_device_num != LongToSize(used_device_num)) {
result += ListProduct(input_a_slice_shape) * static_cast<double>(inputs_type_lengths_[0]);
}
}
if (is_parameter_[1]) {
@ -844,8 +847,9 @@ double SubCost::GetBackwardComputationCost(const std::vector<TensorInfo> &inputs
used_device_num *= input_b_shape[i] / input_b_slice_shape[i];
}
if (total_device_num != LongToSize(used_device_num))
if (total_device_num != LongToSize(used_device_num)) {
result += ListProduct(input_b_slice_shape) * static_cast<double>(inputs_type_lengths_[1]);
}
}
return result;
}
@ -866,8 +870,9 @@ double SubCost::GetBackwardCommCost(const std::vector<TensorInfo> &inputs, const
used_device_num *= input_a_shape[i] / input_a_slice_shape[i];
}
if (total_device_num != LongToSize(used_device_num))
if (total_device_num != LongToSize(used_device_num)) {
result += ListProduct(input_a_slice_shape) * static_cast<double>(inputs_type_lengths_[0]);
}
}
if (is_parameter_[1]) {
@ -879,8 +884,9 @@ double SubCost::GetBackwardCommCost(const std::vector<TensorInfo> &inputs, const
used_device_num *= input_b_shape[i] / input_b_slice_shape[i];
}
if (total_device_num != LongToSize(used_device_num))
if (total_device_num != LongToSize(used_device_num)) {
result += ListProduct(input_b_slice_shape) * static_cast<double>(inputs_type_lengths_[1]);
}
}
return result;
@ -1205,8 +1211,9 @@ double ReduceSumCost::GetBackwardCommCost(const std::vector<TensorInfo> &inputs,
used_device_num *= input_shape[i] / input_slice_shape[i];
}
if (total_device_num != LongToSize(used_device_num))
if (total_device_num != LongToSize(used_device_num)) {
result += ListProduct(input_slice_shape) * static_cast<double>(inputs_type_lengths_[0]);
}
}
return result;
@ -1432,7 +1439,7 @@ double DSDMatmulCost::GetForwardComputationCost(const std::vector<TensorInfo> &i
void DSDMatmulCost::CalculateOutputInMemory() {
is_output_should_in_memory_ =
(std::find(is_parameter_involve_.begin(), is_parameter_involve_.end(), true) != is_parameter_involve_.end());
(std::find(is_parameter_involve_.cbegin(), is_parameter_involve_.cend(), true) != is_parameter_involve_.cend());
}
void DSDMatmulCost::CalculateInputsInMemory(const std::map<size_t, bool> &) {
@ -1867,7 +1874,7 @@ double MatmulDDSCost::GetForwardComputationCost(const std::vector<TensorInfo> &i
// Not taking account of output
void MatmulDDSCost::CalculateOutputInMemory() {
is_output_should_in_memory_ =
(std::find(is_parameter_involve_.begin(), is_parameter_involve_.end(), true) != is_parameter_involve_.end());
(std::find(is_parameter_involve_.cbegin(), is_parameter_involve_.cend(), true) != is_parameter_involve_.cend());
}
// Taking account of input

View File

@ -24,13 +24,13 @@
namespace mindspore {
namespace parallel {
#define MAXIMUM_INPUT_NUMBER 100
#define DEFAULT_DATA_TYPE_LENGTH 4
#define DROPOUT_COST_RATE 1.125 // the DropoutGenMask need 12.5% memory
#define GATHERV2_COST_WEIGHT0 3
#define GATHERV2_COST_WEIGHT1 7
#define GATHERV2_COST_WEIGHT2 2
#define GATHERV2_COST_WEIGHT3 6
constexpr size_t MAXIMUM_INPUT_NUMBER = 100;
constexpr size_t DEFAULT_DATA_TYPE_LENGTH = 4;
constexpr double DROPOUT_COST_RATE = 1.125;  // the DropoutGenMask needs an extra 12.5% memory
constexpr size_t GATHERV2_COST_WEIGHT0 = 3;
constexpr size_t GATHERV2_COST_WEIGHT1 = 7;
constexpr size_t GATHERV2_COST_WEIGHT2 = 2;
constexpr size_t GATHERV2_COST_WEIGHT3 = 6;
class OperatorCost;
using OperatorCostPtr = std::shared_ptr<OperatorCost>;
@ -92,7 +92,7 @@ class OperatorCost {
// Contributing the output part for 'GetMemoryCost'
double GetOutputMemoryCost(const std::vector<TensorInfo> &inputs, const std::vector<TensorInfo> &outputs) const;
// per device memory cost in a inference phase
double GetMemoryCostForInference(const std::vector<TensorInfo> &, const std::vector<TensorInfo> &) const;
double GetMemoryCostForInference(const std::vector<TensorInfo> &, const std::vector<TensorInfo> &outputs) const;
protected:
// For each input in 'inputs_', a bool variable is true if the corresponding one is a parameter or a output of
@ -153,7 +153,7 @@ class BatchNormCost : public OperatorCost {
int64_t stage_id) const override {
return GetForwardCommCost(inputs, outputs, stage_id) + GetBackwardCommCost(inputs, outputs, stage_id);
}
double GetForwardCommCost(const std::vector<TensorInfo> &inputs, const std::vector<TensorInfo> &outputs,
double GetForwardCommCost(const std::vector<TensorInfo> &inputs, const std::vector<TensorInfo> &,
int64_t stage_id) const override;
double GetBackwardCommCost(const std::vector<TensorInfo> &inputs, const std::vector<TensorInfo> &outputs,
int64_t stage_id) const override;
@ -165,8 +165,8 @@ class BatchNormCost : public OperatorCost {
}
double GetForwardComputationCost(const std::vector<TensorInfo> &inputs, const std::vector<TensorInfo> &outputs,
int64_t stage_id) const override;
double GetBackwardComputationCost(const std::vector<TensorInfo> &inputs, const std::vector<TensorInfo> &outputs,
int64_t stage_id) const override;
double GetBackwardComputationCost(const std::vector<TensorInfo> &, const std::vector<TensorInfo> &,
int64_t) const override;
void CalculateOutputInMemory() override;
void CalculateInputsInMemory(const std::map<size_t, bool> &prev_output_in_mem) override;
};
@ -399,7 +399,8 @@ class BatchParallelCost : public OperatorCost {
double GetForwardCommCost(const std::vector<TensorInfo> &, const std::vector<TensorInfo> &, int64_t) const override {
return 0.0;
}
double GetBackwardCommCost(const std::vector<TensorInfo> &, const std::vector<TensorInfo> &, int64_t) const override;
double GetBackwardCommCost(const std::vector<TensorInfo> &, const std::vector<TensorInfo> &,
int64_t stage_id) const override;
double GetComputationCost(const std::vector<TensorInfo> &inputs, const std::vector<TensorInfo> &outputs,
int64_t stage_id) const override {
return GetForwardComputationCost(inputs, outputs, stage_id) + GetBackwardComputationCost(inputs, outputs, stage_id);
@ -629,7 +630,8 @@ class SubCost : public OperatorCost {
double GetForwardCommCost(const std::vector<TensorInfo> &, const std::vector<TensorInfo> &, int64_t) const override {
return 0.0;
}
double GetBackwardCommCost(const std::vector<TensorInfo> &, const std::vector<TensorInfo> &, int64_t) const override;
double GetBackwardCommCost(const std::vector<TensorInfo> &, const std::vector<TensorInfo> &,
int64_t stage_id) const override;
double GetComputationCost(const std::vector<TensorInfo> &inputs, const std::vector<TensorInfo> &outputs,
int64_t stage_id) const override {
@ -852,10 +854,12 @@ class GetNextCost : public OperatorCost {
int64_t stage_id) const override {
return GetForwardCommCost(inputs, outputs, stage_id) + GetBackwardCommCost(inputs, outputs, stage_id);
}
double GetForwardCommCost(const std::vector<TensorInfo> &, const std::vector<TensorInfo> &, int64_t) const override {
double GetForwardCommCost(const std::vector<TensorInfo> &, const std::vector<TensorInfo> &,
int64_t stage_id) const override {
return 0.0;
}
double GetBackwardCommCost(const std::vector<TensorInfo> &, const std::vector<TensorInfo> &, int64_t) const override {
double GetBackwardCommCost(const std::vector<TensorInfo> &, const std::vector<TensorInfo> &,
int64_t stage_id) const override {
return 0.0;
}
double GetComputationCost(const std::vector<TensorInfo> &inputs, const std::vector<TensorInfo> &outputs,
@ -1056,7 +1060,7 @@ class UniqueCost : public OperatorCost {
double GetForwardComputationCost(const std::vector<TensorInfo> &inputs, const std::vector<TensorInfo> &outputs,
int64_t stage_id) const override;
double GetBackwardComputationCost(const std::vector<TensorInfo> &inputs, const std::vector<TensorInfo> &outputs,
int64_t) const override;
int64_t stage_id) const override;
// Taking account of output
void CalculateOutputInMemory() override;
// Not Taking account of input

View File

@ -180,8 +180,8 @@ double CostMatMul::GetMaxCostIn(const OperatorRec &op) {
}
// Chose strategy for MatMul
StrategyRec CostMatMul::ChoseStr(const std::vector<double> &cost_op, StrategyRec str) {
uint64_t min_position = min_element(cost_op.begin(), cost_op.end()) - cost_op.begin();
StrategyRec CostMatMul::ChoseStr(const std::vector<double> &cost_op, StrategyRec str) const {
uint64_t min_position = LongToUlong(min_element(cost_op.begin(), cost_op.end()) - cost_op.begin());
if (cost_op[min_position] > (DOUBLE_MAX - 0.1)) {
return str;
}
@ -318,8 +318,8 @@ double CostConvolution::GetMinCostIn(const Graph::NodeType &node) {
}
// Chose strategy for Conv
StrategyRec CostConvolution::ChoseStr(const std::vector<double> &cost_op, StrategyRec str) {
uint64_t min_position = min_element(cost_op.begin(), cost_op.end()) - cost_op.begin();
StrategyRec CostConvolution::ChoseStr(const std::vector<double> &cost_op, StrategyRec str) const {
uint64_t min_position = LongToUlong(min_element(cost_op.begin(), cost_op.end()) - cost_op.begin());
if (cost_op[min_position] > (DOUBLE_MAX - 0.1)) {
return str;
}
@ -381,7 +381,7 @@ StrategyRec CostConvolution::ChoseStr(const std::vector<double> &cost_op, Strate
// Get optimal strategy for Pooling
StrategyRec CostPooling::GetOptimalStr(const Graph::NodeType &node,
const std::vector<std::pair<std::string, StrategyRec>> &node_name_to_strategy,
const Graph &graph) {
const Graph &graph) const {
int64_t tensor_n = static_cast<int64_t>(node.tensor_parm.tensor_shape.shape_n * node.tensor_parm.tensor_str.str_n);
int64_t tensor_c = static_cast<int64_t>(node.tensor_parm.tensor_shape.shape_c * node.tensor_parm.tensor_str.str_c);
@ -408,8 +408,8 @@ StrategyRec CostPooling::GetOptimalStr(const Graph::NodeType &node,
}
// Chose strategy for Pooling
StrategyRec CostPooling::ChoseStr(const std::vector<double> &cost_op, StrategyRec str) {
uint64_t min_position = min_element(cost_op.begin(), cost_op.end()) - cost_op.begin();
StrategyRec CostPooling::ChoseStr(const std::vector<double> &cost_op, StrategyRec str) const {
uint64_t min_position = LongToUlong(min_element(cost_op.begin(), cost_op.end()) - cost_op.begin());
if (cost_op[min_position] > (DOUBLE_MAX - 0.1)) {
return str;
}
@ -451,7 +451,7 @@ StrategyRec CostPooling::ChoseStr(const std::vector<double> &cost_op, StrategyRe
// Chose strategy for Add
StrategyRec CostTensorAdd::ChoseStr(const std::vector<double> &cost_op, StrategyRec str) {
uint64_t min_position = min_element(cost_op.begin(), cost_op.end()) - cost_op.begin();
uint64_t min_position = LongToUlong(min_element(cost_op.begin(), cost_op.end()) - cost_op.begin());
if (cost_op[min_position] > (DOUBLE_MAX - 0.1)) {
return str;
}
@ -502,7 +502,7 @@ StrategyRec CostReshape::ChoseStr(StrategyRec str) const { return str; }
// Chose strategy for BiasAdd
StrategyRec CostBiasAdd::ChoseStr(const std::vector<double> &cost_op, StrategyRec str) {
uint64_t min_position = min_element(cost_op.begin(), cost_op.end()) - cost_op.begin();
uint64_t min_position = LongToUlong(min_element(cost_op.begin(), cost_op.end()) - cost_op.begin());
if (cost_op[min_position] > (DOUBLE_MAX - 0.1)) {
return str;
}
@ -588,7 +588,7 @@ StrategyRec CostCommon::GetOptimalStr(const Graph::NodeType &node,
// Chose strategy for Common op
StrategyRec CostCommon::ChoseStr(const std::vector<double> &cost_op, StrategyRec str) {
uint64_t min_position = min_element(cost_op.begin(), cost_op.end()) - cost_op.begin();
uint64_t min_position = LongToUlong(min_element(cost_op.begin(), cost_op.end()) - cost_op.begin());
if (cost_op[min_position] > (DOUBLE_MAX - 0.1)) {
return str;
}
@ -667,7 +667,7 @@ StrategyRec CostBatchParallel::GetOptimalStr(const Graph::NodeType &node) {
// Chose strategy for BatchParallel op
StrategyRec CostBatchParallel::ChoseStr(const std::vector<double> &cost_op, StrategyRec str) {
uint64_t min_position = min_element(cost_op.begin(), cost_op.end()) - cost_op.begin();
uint64_t min_position = LongToUlong(min_element(cost_op.begin(), cost_op.end()) - cost_op.begin());
if (cost_op[min_position] > (DOUBLE_MAX - 0.1)) {
return str;
}
@ -709,7 +709,7 @@ StrategyRec CostBatchParallel::ChoseStr(const std::vector<double> &cost_op, Stra
// Chose strategy for CostSoftmaxCrossEntropyWithLogits
StrategyRec CostSoftmaxCrossEntropyWithLogits::ChoseStr(const std::vector<double> &cost_op, StrategyRec str) {
uint64_t min_position = min_element(cost_op.begin(), cost_op.end()) - cost_op.begin();
uint64_t min_position = LongToUlong(min_element(cost_op.begin(), cost_op.end()) - cost_op.begin());
if (cost_op[min_position] > (DOUBLE_MAX - 0.1)) {
return str;
}

View File

@ -25,12 +25,13 @@
#include "frontend/parallel/auto_parallel/rec_core/rec_graph.h"
#include "frontend/parallel/auto_parallel/rec_core/rec_strategy.h"
#include "utils/check_convert_utils.h"
namespace mindspore {
namespace parallel {
#define DOUBLE_MAX (std::numeric_limits<double>::max)()
#define MATMUL_MEM_COEF 0.25
#define REDIS_COEF 16
constexpr double MATMUL_MEM_COEF = 0.25;
constexpr size_t REDIS_COEF = 16;
double CostRedis(const Graph::NodeType &node,
const std::vector<std::pair<std::string, StrategyRec>> &node_name_to_strategy,
@ -69,7 +70,7 @@ class CostMatMul {
return cost_in_k_;
}
StrategyRec ChoseStr(const std::vector<double> &cost_op, StrategyRec str);
StrategyRec ChoseStr(const std::vector<double> &cost_op, StrategyRec str) const;
double cost_in_i_ = 0;
@ -130,7 +131,7 @@ class CostConvolution {
return cost_in_q_;
}
StrategyRec ChoseStr(const std::vector<double> &cost_op, StrategyRec str);
StrategyRec ChoseStr(const std::vector<double> &cost_op, StrategyRec str) const;
double cost_in_b_ = 0;
@ -152,12 +153,12 @@ class CostPooling {
public:
StrategyRec GetOptimalStr(const Graph::NodeType &node,
const std::vector<std::pair<std::string, StrategyRec>> &node_name_to_strategy,
const Graph &graph);
const Graph &graph) const;
double GetMinCostIn() const { return cost_in_; }
private:
StrategyRec ChoseStr(const std::vector<double> &cost_op, StrategyRec str);
StrategyRec ChoseStr(const std::vector<double> &cost_op, StrategyRec str) const;
double cost_in_ = 0;
}; // class CostPooling is used to compute the cost of Pooling operator.

View File

@ -129,7 +129,7 @@ Strategies PrepareStridedSlice(const std::vector<std::shared_ptr<OperatorInfo>>
}
Strategies PrepareSoftMax(const std::vector<std::shared_ptr<OperatorInfo>> &ops, const size_t iter_ops,
Dimensions basic_stra) {
const Dimensions &basic_stra) {
Strategies strategies;
strategies.push_back(basic_stra);
std::vector<int64_t> axis_list;
@ -218,7 +218,7 @@ Strategies PrepareGatherV2(const std::vector<std::shared_ptr<OperatorInfo>> &ops
return (output_shape[LongToSize(a + 1)] > output_shape[LongToSize(b + 1)]);
});
std::transform(std::begin(index), std::end(index), std::begin(index), [](int64_t x) { return x + 1; });
index.insert(index.begin(), 0);
(void)index.insert(index.cbegin(), 0);
Dimensions strategie(output_shape.size(), 1);
size_t num_device = g_device_manager->DeviceNum();
@ -282,7 +282,7 @@ Dimensions PrepareGatherV2OutputStrategy(const std::vector<std::shared_ptr<Opera
std::sort(index.begin(), index.end(),
[&output_shape](const size_t &a, const size_t &b) { return (output_shape[a + 1] > output_shape[b + 1]); });
std::transform(std::begin(index), std::end(index), std::begin(index), [](int64_t x) { return x + 1; });
index.insert(index.begin(), 0);
(void)index.insert(index.cbegin(), 0);
Dimensions strategie(output_shape.size(), 1);
size_t num_device = g_device_manager->DeviceNum();
@ -450,7 +450,7 @@ Strategies MakeDataParallelStrategy(const std::shared_ptr<Graph> &graph,
StrategyPtr origin_strategy = ops[iter_ops]->strategy();
Strategies strategies;
size_t max_device_num = g_device_manager->DeviceNum();
size_t target_tensor_batch = ops[iter_ops]->inputs_tensor_info()[0].shape()[0];
size_t target_tensor_batch = LongToUlong(ops[iter_ops]->inputs_tensor_info()[0].shape()[0]);
for (size_t iter_op_inputs = 0; iter_op_inputs < ops[iter_ops]->inputs_tensor_info().size(); iter_op_inputs++) {
if (iter_op_inputs >= origin_strategy->GetInputDim().size()) {
MS_LOG(EXCEPTION) << "Failure: Strategy's InputDim out of range.";
@ -623,7 +623,7 @@ void ModifyParamSharingOpsStrategy(const std::vector<std::shared_ptr<OperatorInf
str1 = str_j;
size_t num_device_used = 1;
for (size_t i = 0; i < str_j.size(); i++) {
num_device_used *= str_j[i];
num_device_used *= LongToSize(str_j[i]);
}
str2.push_back(g_device_manager->DeviceNum() / num_device_used);
str2.push_back(1);
@ -676,14 +676,14 @@ float CheckVirtualDatasetStrategy(const std::shared_ptr<Graph> &graph, const siz
Dimensions CopyVirtualDataset(const std::shared_ptr<Graph> &graph,
const std::vector<std::shared_ptr<OperatorInfo>> &ops, const size_t iter_ops,
const size_t iter_graph) {
const size_t iter_graph, float epsilon = 0.00005f) {
Dimensions s;
auto input_stra_dim = ops[iter_ops]->inputs_tensor_info()[0].shape().size();
auto virtual_dataset_str = CheckVirtualDatasetStrategy(graph, iter_graph);
if (input_stra_dim == 0) {
return s;
} else {
if (virtual_dataset_str == 0) {
if (std::fabs(virtual_dataset_str) < epsilon) {
s.push_back(1);
} else {
s.push_back(FloatToLong(1 / virtual_dataset_str));
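
Side note on the hunk above (not part of the PR): the change replaces an exact floating-point equality test (`virtual_dataset_str == 0`) with an epsilon comparison. A minimal Python sketch of the same pattern, with the 0.00005 threshold mirroring the diff's default and the helper name purely illustrative:

import math

EPSILON = 0.00005  # mirrors the default epsilon introduced in the diff

def is_effectively_zero(value, eps=EPSILON):
    # Exact equality on floats is fragile; compare the magnitude against a tolerance instead.
    return math.fabs(value) < eps

assert is_effectively_zero(0.0)
assert not is_effectively_zero(0.5)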
@ -805,7 +805,7 @@ Dimensions PrepareReshapeOutputStrategy(const std::vector<std::shared_ptr<Operat
for (size_t j = LongToSize(tmp_index); j < input_shape.size(); j++) {
tmp_prod *= strategy->GetInputDim()[0][j];
tmp_index++;
if (mapping[i] == (int64_t)j) {
if (mapping[i] == SizeToLong(j)) {
s.push_back(tmp_prod);
tmp_prod = 1;
break;
@ -839,12 +839,15 @@ Dimensions PrepareExpandDimsOutputStrategy(const std::vector<std::shared_ptr<Ope
// The strategy of the expanded dimesion will be assigned 1, the others take the strategies of corresponding
// dimensions.
for (size_t i = 0; i < ops[incoming_op_index]->inputs_tensor_info()[0].shape().size() + 1; i++) {
if ((int64_t)i == axis_input) {
if (UlongToLong(i) == axis_input) {
s.push_back(1);
already_expand = true;
} else if ((int64_t)i != axis_input && !already_expand) {
} else if (UlongToLong(i) != axis_input && !already_expand) {
s.push_back(strategy->GetInputDim()[0][i]);
} else {
if (i < 1) {
MS_LOG(EXCEPTION) << "The index i -1 is less than 0. Please check the situation.";
}
s.push_back(strategy->GetInputDim()[0][i - 1]);
}
}
@ -904,7 +907,7 @@ Dimensions PrepareIncomingOperatorInputStrategy(const std::vector<std::shared_pt
} else if (ops[incoming_op_index]->type() == EXPAND_DIMS) {
return PrepareExpandDimsOutputStrategy(ops, incoming_op_index);
}
for (size_t i = 0; i < (size_t)ops[incoming_op_index]->inputs_tensor_info().size(); i++) {
for (size_t i = 0; i < ops[incoming_op_index]->inputs_tensor_info().size(); i++) {
if (ops[incoming_op_index]->inputs_tensor_info()[i].shape().size() == 0) {
continue;
}
@ -955,10 +958,10 @@ Dimensions ModifyStrategyIfSqueezeIncoming(const std::vector<std::shared_ptr<Ope
if (ops[incoming_op_index]->inputs_tensor_info()[0].shape()[LongToSize(axis)] != 1) {
MS_LOG(EXCEPTION) << "Failure: Removed dimension's shape is not 1." << std::endl;
}
stra_dim_list.erase(it);
(void)stra_dim_list.erase(it);
}
for (size_t i = 0; i < (size_t)stra_dim_list.size(); i++) {
for (size_t i = 0; i < stra_dim_list.size(); i++) {
s_Squeeze.push_back(s[LongToSize(stra_dim_list[i])]);
}
return s_Squeeze;
@ -1020,10 +1023,10 @@ Dimensions ModifyStrategyIfReduceIncoming(const std::vector<std::shared_ptr<Oper
if (it == axis_list.end()) {
MS_LOG(EXCEPTION) << "Failure: Can not find dimension indexes in Axis." << std::endl;
}
axis_list.erase(it);
(void)axis_list.erase(it);
}
for (size_t i = 0; i < (size_t)axis_list.size(); i++) {
for (size_t i = 0; i < axis_list.size(); i++) {
s_Reduce.push_back(s[LongToSize(axis_list[i])]);
}
return s_Reduce;
@ -1076,10 +1079,10 @@ Dimensions ModifyStrategyIfArgIncoming(const std::vector<std::shared_ptr<Operato
if (it == axis_list.end()) {
MS_LOG(EXCEPTION) << "Failure: Can not find dimension indexes in Axis." << std::endl;
}
axis_list.erase(it);
(void)axis_list.erase(it);
}
for (size_t i = 0; i < (size_t)axis_list.size(); i++) {
for (size_t i = 0; i < axis_list.size(); i++) {
s_Arg.push_back(s[LongToSize(axis_list[i])]);
}
return s_Arg;
@ -1117,8 +1120,7 @@ Strategies GenerateStrategiesFromStrategy(const std::vector<std::shared_ptr<Oper
}
if (basic_stra.size() == 0) {
for (size_t iter_op_inputs = 0; iter_op_inputs < (size_t)ops[iter_ops]->inputs_tensor_info().size();
iter_op_inputs++) {
for (size_t iter_op_inputs = 0; iter_op_inputs < ops[iter_ops]->inputs_tensor_info().size(); iter_op_inputs++) {
stra.push_back(basic_stra);
}
return stra;
@ -1158,7 +1160,7 @@ Strategies GenerateStrategiesFromStrategy(const std::vector<std::shared_ptr<Oper
// Function to deal with ops with broadcasting, like TensorAdd/Sub/Mul/Div etc.
Strategies CheckBroadcast(const std::vector<std::shared_ptr<OperatorInfo>> &ops, const size_t iter_ops,
const Dimensions s) {
const Dimensions &s) {
Strategies stra;
size_t first_tensor_dim = ops[iter_ops]->inputs_tensor_info()[0].shape().size();
@ -1257,13 +1259,12 @@ Dimensions ApplyBroadcast(const std::vector<std::shared_ptr<OperatorInfo>> &ops,
// Check whether the operator can be divided by the current strategy.
Strategies CheckDivisible(const std::vector<std::shared_ptr<OperatorInfo>> &ops, const size_t iter_ops,
const Dimensions basic_stra) {
const Dimensions &basic_stra) {
Dimensions s_empty = {};
Strategies stra;
// For all the input tensors.
for (size_t iter_op_inputs = 0; iter_op_inputs < (size_t)ops[iter_ops]->inputs_tensor_info().size();
iter_op_inputs++) {
for (size_t iter_op_inputs = 0; iter_op_inputs < ops[iter_ops]->inputs_tensor_info().size(); iter_op_inputs++) {
// If input tensor is empty, return strategy as void.
if (ops[iter_ops]->inputs_tensor_info()[iter_op_inputs].shape().size() == 0) {
stra.push_back(s_empty);
@ -1274,7 +1275,7 @@ Strategies CheckDivisible(const std::vector<std::shared_ptr<OperatorInfo>> &ops,
bool modified = false;
// Make sure each tensor's dim shape is greater than 1. If not, push back strategy as 1 instead.
for (size_t j = 0; j < (size_t)ops[iter_ops]->inputs_tensor_info()[iter_op_inputs].shape().size(); j++) {
for (size_t j = 0; j < ops[iter_ops]->inputs_tensor_info()[iter_op_inputs].shape().size(); j++) {
if (ops[iter_ops]->inputs_tensor_info()[iter_op_inputs].shape()[j] == 1) {
tmp_stra[j] = 1;
modified = true;
@ -1336,8 +1337,8 @@ Dimensions ModifyStrategyIfSqueezeOutgoing(const std::vector<std::shared_ptr<Ope
auto axis_list = GetAxisList(ops, SizeToLong(iter_ops));
size_t s_index = 0;
size_t axis_list_index = 0;
for (size_t i = 0; i < (size_t)(s.size() + axis_list.size()); i++) {
if (i == (size_t)axis_list[axis_list_index]) {
for (size_t i = 0; i < s.size() + axis_list.size(); i++) {
if (i == LongToSize(axis_list[axis_list_index])) {
s_Squeeze.push_back(1);
axis_list_index++;
} else {

View File

@ -31,7 +31,7 @@ void GenerateStrategy(const std::shared_ptr<Graph> &graph, const std::vector<std
const std::shared_ptr<std::vector<std::vector<size_t>>> &eli_list,
const std::vector<std::vector<std::string>> &input_tensor_names,
const std::shared_ptr<std::vector<size_t>> &index_list, bool is_training,
const std::vector<std::vector<size_t>> &shared_tensors_ops);
const std::vector<std::vector<size_t>> &param_users_ops_index);
Dimensions PrepareMatMulStrategy(const std::shared_ptr<Graph> &graph, const size_t iter_graph, bool transpose_a,
bool transpose_b, size_t iter_op_inputs);
Strategies PrepareMatMul(const std::shared_ptr<Graph> &graph, const std::vector<std::shared_ptr<OperatorInfo>> &ops,
@ -40,7 +40,7 @@ Strategies PrepareBiasAdd(const std::shared_ptr<Dimensions> &s);
Strategies PrepareStridedSlice(const std::vector<std::shared_ptr<OperatorInfo>> &ops, const size_t iter_ops,
Dimensions basic_stra);
Strategies PrepareSoftMax(const std::vector<std::shared_ptr<OperatorInfo>> &ops, const size_t iter_ops,
Dimensions basic_stra);
const Dimensions &basic_stra);
Strategies PrepareOneHot(const std::vector<std::shared_ptr<OperatorInfo>> &ops, const size_t iter_ops, Dimensions s);
Strategies PrepareAxisRelatedStrategy(const std::shared_ptr<Graph> &graph,
const std::vector<std::shared_ptr<OperatorInfo>> &ops, const size_t iter_graph,
@ -53,10 +53,12 @@ Strategies PrepareL2Normalize(const std::vector<std::shared_ptr<OperatorInfo>> &
Strategies MakeRecSearchStrategy(const std::shared_ptr<Graph> &graph,
const std::vector<std::shared_ptr<OperatorInfo>> &ops, const size_t iter_graph,
const size_t iter_ops);
Strategies CheckBroadcast(const std::vector<std::shared_ptr<OperatorInfo>> &ops, const size_t iter_ops, Dimensions s);
Strategies CheckBroadcast(const std::vector<std::shared_ptr<OperatorInfo>> &ops, const size_t iter_ops,
const Dimensions &s);
Dimensions ApplyBroadcast(const std::vector<std::shared_ptr<OperatorInfo>> &ops, const size_t iter_ops, Dimensions s,
size_t first_tensor_dim, size_t second_tensor_dim, bool broadcast_first_tensor);
Strategies CheckDivisible(const std::vector<std::shared_ptr<OperatorInfo>> &ops, const size_t iter_ops, Dimensions s);
Strategies CheckDivisible(const std::vector<std::shared_ptr<OperatorInfo>> &ops, const size_t iter_ops,
const Dimensions &s);
Strategies MakeDataParallelStrategy(const std::shared_ptr<Graph> &graph,
const std::vector<std::shared_ptr<OperatorInfo>> &ops, const size_t iter_graph,
const size_t iter_ops);
@ -118,7 +120,7 @@ void GenerateRemainingOperatorStrategy(const std::shared_ptr<Graph> &graph,
const std::shared_ptr<std::vector<size_t>> &index_list,
const std::shared_ptr<std::vector<size_t>> &no_stra_op_list);
void ModifyParamSharingOpsStrategy(const std::vector<std::shared_ptr<OperatorInfo>> &ops,
const std::vector<std::vector<size_t>> &shared_tensors_ops);
const std::vector<std::vector<size_t>> &param_users_ops_index);
} // namespace parallel
} // namespace mindspore
#endif // PARALLEL_AUTO_PARALLEL_REC_GENERATE_STRATEGY_H_

View File

@ -34,8 +34,8 @@ namespace {
constexpr char INPUTS[] = "inputs";
constexpr char ATTRS[] = "attrs";
using FuncGraphNameMap = const std::unordered_map<FuncGraphPtr, std::string>;
static std::unordered_map<std::string, size_t> op_count;
static std::unordered_map<CNodePtr, std::string> name_map;
static std::unordered_map<std::string, size_t> op_count = {};
static std::unordered_map<CNodePtr, std::string> name_map = {};
// Extract the op name and the topology number of the same node in the graph
// e.g, Default/Mul-op32 -> Mul-op0, Default/Mul-op35 -> Mul-op1

View File

@ -65,8 +65,8 @@ Status BroadcastToInfo::CheckStrategy(const StrategyPtr &strategy) {
MS_LOG(ERROR) << name_ << ": Invalid strategy";
return FAILED;
}
auto stra = strategy->GetInputDim().at(0);
auto input_dim = strategy->GetInputDim();
auto stra = input_dim.at(0);
auto in_shape = inputs_shape_.at(0);
for (size_t i = 0; i < stra.size(); ++i) {
if ((in_shape[i] == 1) && (stra[i] != 1)) {

View File

@ -241,8 +241,9 @@ Status GatherInfo::CheckManualSplit(const Strategies &strategy) {
}
Status GatherInfo::CheckSplitAxisStrategy(const StrategyPtr &strategy) {
auto param_strategy = strategy->GetInputDim().at(0);
auto index_strategy = strategy->GetInputDim().at(1);
auto input_dim = strategy->GetInputDim();
auto param_strategy = input_dim.at(0);
auto index_strategy = input_dim.at(1);
// param_strategy(axis) != 1, index can't be split
auto product_i = std::accumulate(index_strategy.begin(), index_strategy.end(), 1, std::multiplies<int64_t>());
if ((param_strategy.at(LongToSize(axis_)) != 1) && (product_i != 1)) {
@ -302,7 +303,8 @@ bool GatherInfo::ShardBatchAndAxis(const Strategies &strategy) const {
}
void GatherInfo::SetAttribute(const StrategyPtr &strategy) {
auto param_strategy = strategy->GetInputDim().at(0);
auto input_dim = strategy->GetInputDim();
auto param_strategy = input_dim.at(0);
// axis=0, index_shape(0)%param_strategy(0) must be 0
Shape index_shape = inputs_shape_.at(1);
if ((axis_ == 0) && (index_shape.at(0) % param_strategy.at(0) != 0) && !dynamic_shape_indices_) {
@ -335,7 +337,8 @@ Status GatherInfo::CheckStrategy(const StrategyPtr &strategy) {
// param slice shape need 32Byte aligned
auto param_shape = inputs_shape_.at(0);
auto param_strategy = strategy->GetInputDim().at(0);
auto input_dim = strategy->GetInputDim();
auto param_strategy = input_dim.at(0);
auto slice_shape = param_shape.at(param_shape.size() - 1) / param_strategy.at(param_strategy.size() - 1);
if ((target_ != CPU) && (slice_shape % 8 != 0) && (slice_shape != 1)) {
ReportError(name_ + ": Last dim of param slice shape need 32Byte aligned.");
@ -480,8 +483,9 @@ Status GatherInfo::InferDevMatrixShape() {
dev_matrix_shape_.clear();
out_dev_matrix_shape_.clear();
// infer input dev_matrix_shape
auto param_strategy = strategy_->GetInputDim().at(0);
auto index_strategy = strategy_->GetInputDim().at(1);
auto param_stra = strategy_->GetInputDim();
auto param_strategy = param_stra.at(0);
auto index_strategy = param_stra.at(1);
if (manual_split_) {
dev_matrix_shape_ = param_strategy;
@ -545,7 +549,8 @@ void GatherInfo::InferInputsTensorMap() {
size_t total_size = param_size + index_size;
Shape tensor_map_index;
Shape tensor_map_params;
auto param_strategy = strategy_->GetInputDim().at(0);
auto input_dim = strategy_->GetInputDim();
auto param_strategy = input_dim.at(0);
if (param_strategy.at(LongToSize(axis_)) != 1) {
(void)tensor_map_index.insert(tensor_map_index.begin(), index_size, MAP_NONE);
for (size_t i = 0; i < param_size; ++i) {
@ -600,7 +605,8 @@ void GatherInfo::InferOutputsTensorMap() {
size_t index_size = inputs_shape_.at(1).size();
size_t total_size = param_size + index_size;
Shape tensor_map_out;
auto param_strategy = strategy_->GetInputDim().at(0);
auto input_dim = strategy_->GetInputDim();
auto param_strategy = input_dim.at(0);
if (param_strategy.at(LongToSize(axis_)) == 1) {
// param_strategy(axis) is 1
for (size_t i = 0; i < param_size; ++i) {
@ -835,7 +841,8 @@ Status GatherInfo::InferForwardCommunication() {
}
forward_op_.clear();
auto param_strategy = strategy_->GetInputDim().at(0);
auto input_dim = strategy_->GetInputDim();
auto param_strategy = input_dim.at(0);
// don't split axis or target is not CPU, no need forward communication
if (target_ != CPU || param_strategy.at(LongToSize(axis_)) == 1) {
return SUCCESS;
@ -934,8 +941,8 @@ ReplaceGraphPtr GatherInfo::replace_graph(const CNodePtr &cnode) {
}
return replace_graph_;
}
auto param_strategy = strategy_->GetInputDim().at(0);
auto input_dim = strategy_->GetInputDim();
auto param_strategy = input_dim.at(0);
// target_ == CPU, no need to replace graph
if (target_ == CPU) {
return nullptr;

View File

@ -85,11 +85,11 @@ def init(backend_name=None):
Initialize distributed backend, e.g. HCCL/NCCL, it is required before using the communication service.
Note:
The full name of HCCL is Huawei Collective Communication Library.
The full name of NCCL is NVIDIA Collective Communication Library.
The full name of MCCL is MindSpore Collective Communication Library.
This method should be used after set_context. The user needs to preset communication environment variables
before running the following example, please see the docstring of the mindspore.management.
- The full name of HCCL is Huawei Collective Communication Library.
- The full name of NCCL is NVIDIA Collective Communication Library.
- The full name of MCCL is MindSpore Collective Communication Library.
- This method should be used after set_context. The user needs to preset communication environment variables
before running the following example, please see the docstring of the mindspore.communication.
Args:
backend_name (str): Backend, using HCCL/NCCL/MCCL. If the `backend_name` is None, system will recognize
@ -167,7 +167,7 @@ def release():
Note:
This method should be used after init(). The user needs to preset communication environment variables
before running the following example, please see the docstring of the mindspore.managerment.
before running the following example, please see the docstring of the mindspore.communication.
Raises:
RuntimeError: If failed to release distributed resource.
@ -189,7 +189,7 @@ def get_rank(group=GlobalComm.WORLD_COMM_GROUP):
Note:
This method should be used after init(). The user needs to preset communication environment variables
before running the following example, please see the docstring of the mindspore.managerment.
before running the following example, please see the docstring of the mindspore.communication.
Args:
group (str): The communication group to work on. Normally, the group should be created by create_group,
@ -226,7 +226,7 @@ def get_local_rank(group=GlobalComm.WORLD_COMM_GROUP):
Note:
GPU version of MindSpore doesn't support this method.
This method should be used after init(). The user needs to preset communication environment variables
before running the following example, please see the docstring of the mindspore.managerment.
before running the following example, please see the docstring of the mindspore.communication.
Args:
group (str): The communication group to work on. Normally, the group should be created by create_group,
@ -266,7 +266,7 @@ def get_group_size(group=GlobalComm.WORLD_COMM_GROUP):
Note:
This method should be used after init(). The user needs to preset communication environment variables before
running the following example, please see the docstring of the mindspore.managerment.
running the following example, please see the docstring of the mindspore.communication.
Args:
group (str): The communication group to work on. Normally, the group should be created by create_group,
@ -305,7 +305,7 @@ def get_local_rank_size(group=GlobalComm.WORLD_COMM_GROUP):
Note:
GPU version of MindSpore doesn't support this method.
This method should be used after init(). The user needs to preset communication environment variables before
running the following example, please see the docstring of the mindspore.managerment.
running the following example, please see the docstring of the mindspore.communication.
Args:
group (str): The communication group to work on. The group is created by create_group
@ -347,7 +347,7 @@ def get_world_rank_from_group_rank(group, group_rank_id):
GPU version of MindSpore doesn't support this method.
The parameter group should not be "hccl_world_group".
This method should be used after init(). The user needs to preset communication environment variables
before running the following example, please see the docstring of the mindspore.managerment.
before running the following example, please see the docstring of the mindspore.communication.
Args:
group (str): The communication group to work on. The group is created by create_group.

View File

@ -607,9 +607,10 @@ class _LogActionOnce:
"""
__is_logged__ = dict()
def __init__(self, logger, key):
def __init__(self, logger, key, no_warning=False):
self.logger = logger
self.key = key
self.no_warning = no_warning
def __call__(self, func):
def wrapper(*args, **kwargs):
@ -617,7 +618,7 @@ class _LogActionOnce:
return func(*args, **kwargs)
_old_ = self.logger.warning
if self.key in _LogActionOnce.__is_logged__:
if self.no_warning or self.key in _LogActionOnce.__is_logged__:
self.logger.warning = lambda x: x
else:
_LogActionOnce.__is_logged__[self.key] = True
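
For illustration only (not part of the PR): given the wrapper above, the new `no_warning` flag suppresses `logger.warning` calls made inside the decorated function, on top of the existing once-per-key behaviour. A minimal sketch, assuming the decorator is applied the same way the PR applies it in the transformer layers; the function name is hypothetical.

from mindspore import log as logger
from mindspore.log import _LogActionOnce

@_LogActionOnce(logger=logger, key='demo', no_warning=True)
def build_model():
    # This warning is swallowed while build_model runs because no_warning=True.
    logger.warning("sharding strategy ignored in stand-alone mode")

build_model()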

View File

@ -26,6 +26,8 @@ from mindspore.nn.loss.loss import _check_is_tensor
from mindspore.parallel._utils import _get_parallel_mode, _is_sharding_propagation
from mindspore.context import ParallelMode
from mindspore.parallel._utils import _get_device_num, _get_pipeline_stages
from mindspore.log import _LogActionOnce
from mindspore import log as logger
from .layers import _check_input_dtype, _check_input_shape
from .op_parallel_config import default_dpmp_config, OpParallelConfig
@ -181,7 +183,8 @@ class CrossEntropyLoss(Cell):
>>> print(output.shape)
(1,)
"""
@_LogActionOnce(logger=logger, key='CrossEntropyLoss',
no_warning=_get_parallel_mode() in (ParallelMode.STAND_ALONE,))
def __init__(self, parallel_config=default_dpmp_config):
super(CrossEntropyLoss, self).__init__()
if not isinstance(parallel_config, OpParallelConfig):

View File

@ -31,6 +31,7 @@ from mindspore._checkparam import Validator
from mindspore import log as logger
from mindspore.parallel._utils import _get_parallel_mode, _is_sharding_propagation
from mindspore.context import ParallelMode
from mindspore.log import _LogActionOnce
from .layers import _LayerNorm, _Linear, _check_input_shape, \
_args_type_validator_check, _valid_type_checks, _valid_value_checks, \
_check_shape_equal, _check_past_none_input_none, _check_input_dtype, _check_input_shape_value
@ -390,7 +391,8 @@ class FeedForward(Cell):
>>> print(output.shape)
(2, 20, 15)
"""
@_LogActionOnce(logger=logger, key='FeedForward',
no_warning=_get_parallel_mode() in (ParallelMode.STAND_ALONE,))
@_args_type_validator_check(hidden_size=Validator.check_positive_int,
ffn_hidden_size=Validator.check_positive_int,
dropout_rate=Validator.check_non_negative_float,
@ -578,7 +580,8 @@ class AttentionMask(Cell):
[1. 1. 1. 0]
[0. 0. 0. 0]]]
"""
@_LogActionOnce(logger=logger, key='AttentionMask',
no_warning=_get_parallel_mode() in (ParallelMode.STAND_ALONE,))
@_args_type_validator_check(seq_length=Validator.check_positive_int,
parallel_config=_valid_type_checks([OpParallelConfig], "AttentionMask"))
def __init__(self, seq_length, parallel_config=default_dpmp_config):
@ -667,7 +670,8 @@ class VocabEmbedding(Cell):
>>> print(table.shape)
(30, 30)
"""
@_LogActionOnce(logger=logger, key='VocabEmbedding',
no_warning=_get_parallel_mode() in (ParallelMode.STAND_ALONE,))
@_args_type_validator_check(vocab_size=Validator.check_positive_int,
embedding_size=Validator.check_positive_int,
parallel_config=_valid_type_checks([EmbeddingOpParallelConfig], "VocabEmbedding"))
@ -821,7 +825,8 @@ class MultiHeadAttention(Cell):
>>> print(past[1].shape)
(2, 3, 20, 5)
"""
@_LogActionOnce(logger=logger, key='MultiHeadAttention',
no_warning=_get_parallel_mode() in (ParallelMode.STAND_ALONE,))
@_args_type_validator_check(batch_size=Validator.check_positive_int,
hidden_size=Validator.check_positive_int,
num_heads=Validator.check_positive_int,
@ -1420,7 +1425,8 @@ class TransformerEncoderLayer(Cell):
>>> print(past[1].shape)
(2, 2, 16, 4)
"""
@_LogActionOnce(logger=logger, key='TransformerEncoderLayer',
no_warning=_get_parallel_mode() in (ParallelMode.STAND_ALONE,))
@_args_type_validator_check(batch_size=Validator.check_positive_int,
hidden_size=Validator.check_positive_int,
num_heads=Validator.check_positive_int,
@ -1804,7 +1810,8 @@ class TransformerDecoderLayer(Cell):
>>> print(past[3].shape)
(2, 2, 20, 32)
"""
@_LogActionOnce(logger=logger, key='TransformerDecoderLayer',
no_warning=_get_parallel_mode() in (ParallelMode.STAND_ALONE,))
@_args_type_validator_check(batch_size=Validator.check_positive_int,
hidden_size=Validator.check_positive_int,
num_heads=Validator.check_positive_int,
@ -2344,7 +2351,8 @@ class TransformerEncoder(Cell):
>>> print(past[0][1].shape)
(2, 2, 16, 4)
"""
@_LogActionOnce(logger=logger, key='TransformerEncoder',
no_warning=_get_parallel_mode() in (ParallelMode.STAND_ALONE,))
@_args_type_validator_check(batch_size=Validator.check_positive_int,
hidden_size=Validator.check_positive_int,
num_heads=Validator.check_positive_int,
@ -2575,7 +2583,8 @@ class TransformerDecoder(Cell):
>>> print(past[0][3].shape)
(2, 2, 20, 32)
"""
@_LogActionOnce(logger=logger, key='TransformerDecoder',
no_warning=_get_parallel_mode() in (ParallelMode.STAND_ALONE,))
@_args_type_validator_check(batch_size=Validator.check_positive_int,
hidden_size=Validator.check_positive_int,
num_heads=Validator.check_positive_int,
@ -2842,7 +2851,8 @@ class Transformer(Cell):
>>> print(de_past[0][3].shape)
(2, 2, 20, 32)
"""
@_LogActionOnce(logger=logger, key='Transformer',
no_warning=_get_parallel_mode() in (ParallelMode.STAND_ALONE,))
@_args_type_validator_check(batch_size=Validator.check_positive_int,
hidden_size=Validator.check_positive_int,
num_heads=Validator.check_positive_int,
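
A brief note on the decorators above (an observation, not from the PR): `no_warning=_get_parallel_mode() in (ParallelMode.STAND_ALONE,)` is an ordinary decorator argument, so it is evaluated once when the class body executes, not on every construction. A minimal sketch of that evaluation, using only imports already present in the file:

from mindspore.context import ParallelMode
from mindspore.parallel._utils import _get_parallel_mode

# Evaluated at class-definition time: warning suppression is decided by the
# parallel mode in effect when the module is imported.
suppress_warnings = _get_parallel_mode() in (ParallelMode.STAND_ALONE,)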

View File

@ -13,8 +13,13 @@
# limitations under the License.
# ============================================================================
""" test transformer"""
import os
import shutil
import numpy as np
import pytest
import mindspore
from mindspore import Tensor
from mindspore.common import dtype
from mindspore.parallel.nn import MultiHeadAttention, FeedForward, TransformerEncoderLayer, TransformerEncoder, \
@ -271,3 +276,60 @@ def test_sparse_attention():
v = Tensor(np.ones((2, 1024, 512)), dtype.float16)
mask = Tensor(np.ones((2, 1024, 1024)), dtype.float32)
_cell_graph_executor.compile(model, q, k, v, mask)
class TestBasicWarningValidator:
log_envs = dict(GLOG_v=None, GLOG_logtostderr=None, GLOG_log_dir=None, logger_maxBytes=None,
logger_backupCount=None)
log_path = './TestBasicWarningValidator'
def setup_method(self):
for env in self.log_envs:
self.log_envs[env] = os.environ.get(env, None)
os.environ['GLOG_log_dir'] = self.log_path
os.environ['GLOG_v'] = '1'
os.environ['GLOG_logtostderr'] = '0'
# Force to generate the logger again
# pylint: disable=W0212
mindspore.log._global_logger = None
def teardown_method(self):
for env in self.log_envs:
if self.log_envs.get(env, False):
os.environ[env] = self.log_envs.get(env, "False")
shutil.rmtree(os.path.join(self.log_path))
def check_warning_log(self):
cmd = f'cd {self.log_path} && grep WARNING rank_0/logs/mindspore.log.* |wc -l'
file_count = os.popen(cmd).read().strip()
assert file_count == "0"
def test_cross_entory_no_warning(self):
"""
Feature: Test the warning log
Description: Test that a forward compile emits no warning
Expectation: The compile passes without warnings
"""
# Force to rebuild the logger
test_cross_entroy()
self.check_warning_log()
def test_transformer_encoder_no_warning(self):
"""
Feature: Test the warning log
Description: Test that a forward compile emits no warning
Expectation: The compile passes without warnings
"""
# Force to rebuild the logger
test_transformer_encoder_only()
self.check_warning_log()
def test_transformer_decoder_no_warning(self):
"""
Feature: Test the warning log
Description: Test that a forward compile emits no warning
Expectation: The compile passes without warnings
"""
# Force to rebuild the logger
test_transformer_decoder()
self.check_warning_log()
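
For reference (not part of the PR), `check_warning_log` shells out to `grep WARNING ... | wc -l`; the same count could be done in pure Python. A hedged sketch that assumes the `rank_0/logs/mindspore.log.*` layout produced by the GLOG settings in `setup_method`; the helper name is hypothetical.

import glob
import os

def count_warning_lines(log_path):
    # Mirror `grep WARNING rank_0/logs/mindspore.log.* | wc -l` without a shell.
    total = 0
    for log_file in glob.glob(os.path.join(log_path, 'rank_0', 'logs', 'mindspore.log.*')):
        with open(log_file, errors='ignore') as f:
            total += sum(1 for line in f if 'WARNING' in line)
    return total

# e.g. assert count_warning_lines('./TestBasicWarningValidator') == 0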