forked from mindspore-Ecosystem/mindspore
!9786 fix wide&deep readme
From: @yao_yf Reviewed-by: @zhunaipan,@stsuteng Signed-off-by: @stsuteng
Commit: 561ced751d
@@ -1888,7 +1888,7 @@ class UnsortedSegmentSum(PrimitiveWithInfer):
                 output_min_shape = list(num_segments['min_value'])
             else:
                 if isinstance(num_segments_type, type(mstype.tensor)):
-                    raise ValueError("In dynamic shape scene, the num_segments should contains max_value and min_value")
+                    raise ValueError("Num_segments only support int type when it is not a dynamic value")
                 output_max_shape = [num_segments_v]
                 output_min_shape = [num_segments_v]
         if 'max_shape' in x and 'min_shape' in x:
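The rewording above reflects the actual constraint: when no 'max_value'/'min_value' metadata is present, num_segments must be a plain Python int rather than a Tensor. A minimal usage sketch (not part of this commit) of the static-shape case:

```python
# Minimal sketch of the static-shape case the new message describes:
# num_segments is a plain int; a Tensor here (without max_value/min_value
# metadata) is what triggers the ValueError above.
import numpy as np
from mindspore import Tensor, ops

x = Tensor(np.array([1.0, 2.0, 3.0, 4.0], np.float32))
segment_ids = Tensor(np.array([0, 0, 1, 2], np.int32))
num_segments = 3  # int, not Tensor

unsorted_segment_sum = ops.UnsortedSegmentSum()
print(unsorted_segment_sum(x, segment_ids, num_segments))  # [3. 3. 4.]
```

The remaining hunks edit the Wide&Deep README.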
@@ -1,4 +1,6 @@
 # Contents

+- [Contents](#contents)
+
 - [Wide&Deep Description](#widedeep-description)
 - [Model Architecture](#model-architecture)
 - [Dataset](#dataset)
@@ -24,11 +26,12 @@
 - [Description of Random Situation](#description-of-random-situation)
 - [ModelZoo Homepage](#modelzoo-homepage)


 # [Wide&Deep Description](#contents)

 Wide&Deep model is a classical model in Recommendation and Click Prediction area. This is an implementation of Wide&Deep as described in the [Wide & Deep Learning for Recommender System](https://arxiv.org/pdf/1606.07792.pdf) paper.

 # [Model Architecture](#contents)

 Wide&Deep model jointly trained wide linear models and deep neural network, which combined the benefits of memorization and generalization for recommender systems.

 Currently we support host-device mode with column partition and parameter server mode.
@@ -38,6 +41,7 @@ Currently we support host-device mode with column partition and parameter serve
 - [1] A dataset used in Guo H , Tang R , Ye Y , et al. DeepFM: A Factorization-Machine based Neural Network for CTR Prediction[J]. 2017.

 # [Environment Requirements](#contents)

 - Hardware(Ascend or GPU)
     - Prepare hardware environment with Ascend processor. If you want to try Ascend , please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
 - Framework
@@ -46,42 +50,50 @@ Currently we support host-device mode with column partition and parameter serve
 - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
 - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)


 # [Quick Start](#contents)

 1. Clone the Code
-```
+
+```bash
 git clone https://gitee.com/mindspore/mindspore.git
 cd mindspore/model_zoo/official/recommend/wide_and_deep
 ```

 2. Download the Dataset

 > Please refer to [1] to obtain the download link

 ```bash
 mkdir -p data/origin_data && cd data/origin_data
 wget DATA_LINK
 tar -zxvf dac.tar.gz
 ```

 3. Use this script to preprocess the data. This may take about one hour and the generated mindrecord data is under data/mindrecord.

 ```bash
 python src/preprocess_data.py --data_path=./data/ --dense_dim=13 --slot_dim=26 --threshold=100 --train_line_count=45840617 --skip_id_convert=0
 ```

 4. Start Training

 Once the dataset is ready, the model can be trained and evaluated on the single device(Ascend) by the command as follows:

 ```bash
-python train_and_eval.py --data_path=./data/mindrecord --data_type=mindrecord
+python train_and_eval.py --data_path=./data/mindrecord --dataset_type=mindrecord
 ```
-To evaluate the model, command as follows:
-```bash
-python eval.py --data_path=./data/mindrecord --data_type=mindrecord
-```

+To evaluate the model, command as follows:
+
+```bash
+python eval.py --data_path=./data/mindrecord --dataset_type=mindrecord
+```
+
 # [Script Description](#contents)

 ## [Script and Sample Code](#contents)
-```
+
+```bash
 └── wide_and_deep
   ├── eval.py
   ├── README.md
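The preprocessing step in the hunk above writes mindrecord files under data/mindrecord. A hedged sanity check, not part of this commit (the file name is an assumption; list the directory for the real names):

```python
# Hypothetical spot-check of the preprocessed output; the mindrecord file
# name below is an assumption -- inspect data/mindrecord for the real one.
import mindspore.dataset as ds

data_set = ds.MindDataset(dataset_file="./data/mindrecord/train_input_part.mindrecord0")
print("dataset size:", data_set.get_dataset_size())
for row in data_set.create_dict_iterator(num_epochs=1, output_numpy=True):
    print(sorted(row.keys()))  # inspect the column schema
    break
```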
@@ -121,8 +133,7 @@ python eval.py --data_path=./data/mindrecord --data_type=mindrecord

 The parameters is same for ``train.py``,``train_and_eval.py`` ,``train_and_eval_distribute.py`` and ``train_and_eval_auto_parallel.py``

-```
+```python
 usage: train.py [-h] [--device_target {Ascend,GPU}] [--data_path DATA_PATH]
                 [--epochs EPOCHS] [--full_batch FULL_BATCH]
                 [--batch_size BATCH_SIZE] [--eval_batch_size EVAL_BATCH_SIZE]
@@ -164,8 +175,10 @@ optional arguments:
   --dataset_type The data type of the training files, chosen from tfrecord/mindrecord/hd5.(Default:tfrecord)
   --parameter_server Open parameter server of not.(Default:0)
 ```

 ### [Preprocess Script Parameters](#contents)
-```
+
+```python
 usage: generate_synthetic_data.py [-h] [--output_file OUTPUT_FILE]
                                   [--label_dim LABEL_DIM]
                                   [--number_examples NUMBER_EXAMPLES]
@@ -183,7 +196,7 @@ optional arguments:
   --random_slot_values 0 or 1. If 1, the id is generated by the random. If 0, the id is set by the row_index mod part_size, where part_size is the vocab size for each slot
 ```

-```
+```python
 usage: preprocess_data.py [-h]
                           [--data_path DATA_PATH] [--dense_dim DENSE_DIM]
                           [--slot_dim SLOT_DIM] [--threshold THRESHOLD]
@@ -203,28 +216,35 @@ usage: preprocess_data.py [-h]
 ### [Process the Real World Data](#content)

 1. Download the Dataset and place the raw dataset under a certain path, such as: ./data/origin_data

 ```bash
 mkdir -p data/origin_data && cd data/origin_data
 wget DATA_LINK
 tar -zxvf dac.tar.gz
 ```

 > Please refer to [1] to obtain the download link

 2. Use this script to preprocess the data

 ```bash
 python src/preprocess_data.py --data_path=./data/ --dense_dim=13 --slot_dim=26 --threshold=100 --train_line_count=45840617 --skip_id_convert=0
 ```

 ### [Generate and Process the Synthetic Data](#content)

 1. The following command will generate 40 million lines of click data, in the format of

 > "label\tdense_feature[0]\tdense_feature[1]...\tsparse_feature[0]\tsparse_feature[1]...".
-```
+
+```bash
 mkdir -p syn_data/origin_data
 python src/generate_synthetic_data.py --output_file=syn_data/origin_data/train.txt --number_examples=40000000 --dense_dim=13 --slot_dim=51 --vocabulary_size=2000000000 --random_slot_values=0
 ```

 2. Preprocess the generated data
-```
+
+```python
 python src/preprocess_data.py --data_path=./syn_data/ --dense_dim=13 --slot_dim=51 --threshold=0 --train_line_count=40000000 --skip_id_convert=1
 ```
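To make the quoted tab-separated format concrete, a small illustration (not repo code) that prints one synthetic row with the same dimensions as the command above:

```python
# Illustration only (not repo code): one row in the
# "label\tdense_feature...\tsparse_feature..." format described above.
import random

dense_dim, slot_dim, vocabulary_size = 13, 51, 2000000000
label = random.randint(0, 1)
dense = [f"{random.random():.6f}" for _ in range(dense_dim)]
sparse = [str(random.randrange(vocabulary_size)) for _ in range(slot_dim)]
print("\t".join([str(label)] + dense + sparse))
```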
@@ -233,25 +253,30 @@ python src/preprocess_data.py --data_path=./syn_data/ --dense_dim=13 --slot_dim
 ### [SingleDevice](#contents)

 To train and evaluate the model, command as follows:
-```
+
+```python
 python train_and_eval.py
 ```


 ### [Distribute Training](#contents)

 To train the model in data distributed training, command as follows:
-```
+
+```bash
 # configure environment path before training
 bash run_multinpu_train.sh RANK_SIZE EPOCHS DATASET RANK_TABLE_FILE
 ```

 To train the model in model parallel training, commands as follows:
-```
+
+```bash
 # configure environment path before training
 bash run_auto_parallel_train.sh RANK_SIZE EPOCHS DATASET RANK_TABLE_FILE
 ```

 To train the model in clusters, command as follows:'''
-```
+
+```bash
 # deploy wide&deep script in clusters
 # CLUSTER_CONFIG is a json file, the sample is in script/.
 # EXECUTE_PATH is the scripts path after the deploy.
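For orientation, the launch scripts above prepare the device topology (RANK_TABLE_FILE) and each training process then initializes communication. A hedged sketch of that initialization, with the caveat that exact context flags vary across MindSpore versions:

```python
# Hedged sketch of the per-process initialization behind the distributed
# launch scripts; flag names vary across MindSpore versions.
from mindspore import context
from mindspore.communication.management import init, get_rank
from mindspore.context import ParallelMode

context.set_context(mode=context.GRAPH_MODE, device_target="Ascend")
init()  # consumes the rank table / environment set up by the launch script
context.set_auto_parallel_context(
    parallel_mode=ParallelMode.DATA_PARALLEL,  # the run_multinpu_train.sh case
    gradients_mean=True,
)
print("initialized rank:", get_rank())
```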
@@ -262,9 +287,12 @@ bash deploy_cluster.sh CLUSTER_CONFIG_PATH EXECUTE_PATH
 bash start_cluster.sh CLUSTER_CONFIG_PATH EPOCH_SIZE VOCAB_SIZE EMB_DIM
 DATASET ENV_SH RANK_TABLE_FILE MODE
 ```

 ### [Parameter Server](#contents)

 To train and evaluate the model in parameter server mode, command as follows:'''
-```
+
+```bash
 # SERVER_NUM is the number of parameter servers for this task.
 # SCHED_HOST is the IP address of scheduler.
 # SCHED_PORT is the port of scheduler.
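Parameter-server mode is driven by MindSpore's MS_* environment variables, which the launch script wires from its SERVER_NUM/SCHED_HOST/SCHED_PORT arguments. A hedged per-process sketch (values are placeholders):

```python
# Hedged sketch of the per-process environment the PS launch script arranges;
# the values are placeholders mirroring SERVER_NUM / SCHED_HOST / SCHED_PORT.
import os
from mindspore import context

os.environ["MS_SERVER_NUM"] = "1"          # SERVER_NUM
os.environ["MS_SCHED_HOST"] = "127.0.0.1"  # SCHED_HOST
os.environ["MS_SCHED_PORT"] = "8081"       # SCHED_PORT
os.environ["MS_ROLE"] = "MS_WORKER"        # or MS_PSERVER / MS_SCHED per process

context.set_ps_context(enable_ps=True)     # must precede communication init
```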
@@ -272,11 +300,11 @@ To train and evaluate the model in parameter server mode, command as follows:'''
 bash run_parameter_server_train.sh RANK_SIZE EPOCHS DATASET RANK_TABLE_FILE SERVER_NUM SCHED_HOST SCHED_PORT
 ```


 ## [Evaluation Process](#contents)

 To evaluate the model, command as follows:
-```
+
+```python
 python eval.py
 ```
@@ -301,8 +329,6 @@ python eval.py
 | Parms(M) | 75.84 | 75.84 | 75.84 | 75.84 |
 | Checkpoint for inference | 233MB(.ckpt file) | 230MB(.ckpt) | 233MB(.ckpt file) | 233MB(.ckpt file) |
-
-

 All executable scripts can be found in [here](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/recommend/wide_and_deep/script)

 Note: The result of GPU is tested under the master version. The parameter server mode of the Wide&Deep model is still under development.
@@ -322,11 +348,11 @@ Note: The result of GPU is tested under the master version. The parameter server
 # [Description of Random Situation](#contents)

 There are three random situations:

 - Shuffle of the dataset.
 - Initialization of some model weights.
 - Dropout operations.


 # [ModelZoo Homepage](#contents)

 Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).