Recommendation Model
Overview
This is an implementation of WideDeep as described in the paper Wide & Deep Learning for Recommender Systems.
The WideDeep model jointly trains a wide linear model and a deep neural network, combining the benefits of memorization and generalization for recommender systems.
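To make the "wide plus deep" idea concrete, here is a minimal pure-Python sketch of a forward pass for one sample. It is not the code in src/wide_and_deep.py: the helper names (`init_params`, `wide_deep_forward`), the layer sizes, and the random initialization are all assumptions chosen for illustration.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relu(v):
    return [max(0.0, x) for x in v]

def dense(v, weights, bias):
    # weights holds one row per output unit
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]

def init_params(vocab_size, field_size, emb_dim, hidden_dims, seed=0):
    # Hypothetical initializer: small uniform weights, zero biases.
    rng = random.Random(seed)
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    dims = [field_size * emb_dim] + list(hidden_dims)
    hidden = [(mat(dims[k + 1], dims[k]), [0.0] * dims[k + 1])
              for k in range(len(hidden_dims))]
    return {
        "wide_w": [rng.uniform(-0.1, 0.1) for _ in range(vocab_size)],
        "wide_b": 0.0,
        "emb": mat(vocab_size, emb_dim),
        "hidden": hidden,
        "out_w": mat(1, dims[-1]),
        "out_b": [0.0],
    }

def wide_deep_forward(feature_ids, params):
    """Return a click probability for one sample given its sparse feature ids."""
    # Wide part: a linear model with one scalar weight per feature id (memorization).
    wide_logit = sum(params["wide_w"][i] for i in feature_ids) + params["wide_b"]
    # Deep part: embedding lookup, concatenation, then an MLP (generalization).
    x = [e for i in feature_ids for e in params["emb"][i]]
    for w, b in params["hidden"]:
        x = relu(dense(x, w, b))
    deep_logit = dense(x, params["out_w"], params["out_b"])[0]
    # Joint training: both logits are summed and fed to a single sigmoid/loss.
    return sigmoid(wide_logit + deep_logit)

params = init_params(vocab_size=100, field_size=3, emb_dim=4, hidden_dims=[8, 4])
prob = wide_deep_forward([5, 42, 77], params)
```

Because the two logits are added before the sigmoid, a single loss would update the linear weights and the network weights together, which is what "jointly trains" refers to.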
Requirements
- Install MindSpore.
- Download the dataset and convert it to MindRecord format with the following command:
  python src/preprocess_data.py
  Arguments:
  --data_path: Dataset storage path (Default: ./criteo_data/).
Dataset
The Criteo datasets are used for model training and evaluation.
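As a rough illustration of what the raw data looks like before conversion, the sketch below parses one record. It assumes the commonly distributed Criteo layout (tab-separated: one label, 13 integer features, 26 categorical features, empty strings for missing values). The function name `parse_criteo_line` and the placeholder values are hypothetical and are not taken from src/preprocess_data.py.

```python
NUM_DENSE = 13   # integer (count) features in the assumed Criteo layout
NUM_SPARSE = 26  # hashed categorical features in the assumed Criteo layout

def parse_criteo_line(line):
    """Split one tab-separated record into (label, dense, sparse)."""
    fields = line.rstrip("\n").split("\t")
    expected = 1 + NUM_DENSE + NUM_SPARSE
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")
    label = int(fields[0])
    # Missing values appear as empty strings; map them to explicit placeholders.
    dense = [float(v) if v else 0.0 for v in fields[1:1 + NUM_DENSE]]
    sparse = [v if v else "<missing>" for v in fields[1 + NUM_DENSE:]]
    return label, dense, sparse

# A synthetic record with one missing dense value and one missing sparse value.
sample = "\t".join(["1"] + ["3"] * 12 + [""] + ["a1b2c3d4"] * 25 + [""])
label, dense, sparse = parse_criteo_line(sample)
```

Under this assumed layout each sample has 13 + 26 = 39 feature fields.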
Running Code
Code Structure
The code is structured as follows:
|--- wide_and_deep/
    train_and_eval.py                  "Entrance of Wide&Deep model training and evaluation"
    eval.py                            "Entrance of Wide&Deep model evaluation"
    train.py                           "Entrance of Wide&Deep model training"
    train_and_eval_multinpu.py         "Entrance of Wide&Deep model data parallel training and evaluation"
    train_and_eval_auto_parallel.py    "Entrance of Wide&Deep model auto parallel training and evaluation"
    |--- src/                          "Source modules used by training and evaluation"
        config.py                      "Parameters configuration"
        dataset.py                     "Dataset loader class"
        process_data.py                "Process dataset"
        preprocess_data.py             "Preprocess dataset"
        wide_and_deep.py               "Model structure"
        callbacks.py                   "Callback class for training and evaluation"
        metrics.py                     "Metric class"
    |--- script/                       "Run shell dir"
        run_multinpu_train.sh          "Run data parallel"
        run_auto_parallel_train.sh     "Run auto parallel"
Train and evaluate model
To train and evaluate the model, run:
python train_and_eval.py
Arguments:
--device_target: Device on which the code will run (Default: Ascend).
--data_path: This should be set to the same directory given to the data_download's data_dir argument.
--epochs: Total number of training epochs.
--batch_size: Training batch size.
--eval_batch_size: Eval batch size.
--field_size: The number of feature fields.
--vocab_size: The total number of features in the dataset.
--emb_dim: The dense embedding dimension of each sparse feature.
--deep_layers_dim: The dimensions of the deep layers.
--deep_layers_act: The activation function of the deep layers.
--dropout_flag: Whether to apply dropout.
--keep_prob: The keep probability of the dropout layer.
--ckpt_path: The location of the checkpoint file.
--eval_file_name: Eval output file.
--loss_file_name: Loss output file.
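For readers who want to see how flags like these are typically wired up, here is a small `argparse` sketch covering a subset of them. Only the `Ascend` default comes from this README; every other default and type below is a placeholder assumption, and the real definitions live in src/config.py.

```python
import argparse

def build_parser():
    """Illustrative parser for a subset of the flags listed above."""
    parser = argparse.ArgumentParser(description="Wide&Deep options (illustrative)")
    parser.add_argument("--device_target", type=str, default="Ascend",
                        help="Device on which the code will run")
    # All defaults below are hypothetical placeholders, not the repository's values.
    parser.add_argument("--data_path", type=str, default="./criteo_data/",
                        help="Directory holding the converted dataset")
    parser.add_argument("--epochs", type=int, default=1,
                        help="Total number of training epochs")
    parser.add_argument("--batch_size", type=int, default=1000,
                        help="Training batch size")
    parser.add_argument("--emb_dim", type=int, default=16,
                        help="Embedding dimension of each sparse feature")
    parser.add_argument("--deep_layers_dim", type=int, nargs="+", default=[64, 32],
                        help="Dimensions of the deep layers")
    parser.add_argument("--keep_prob", type=float, default=1.0,
                        help="Keep probability of the dropout layer")
    return parser

# Parse an explicit argument list instead of sys.argv so the sketch is self-contained.
args = build_parser().parse_args(["--epochs", "5", "--deep_layers_dim", "256", "128"])
```

Passing `-h` or `--help` to a parser built this way prints every flag with its help text.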
To train the model on a single device, run:
python train.py
Arguments:
--device_target: Device on which the code will run (Default: Ascend).
--data_path: This should be set to the same directory given to the data_download's data_dir argument.
--epochs: Total number of training epochs.
--batch_size: Training batch size.
--eval_batch_size: Eval batch size.
--field_size: The number of feature fields.
--vocab_size: The total number of features in the dataset.
--emb_dim: The dense embedding dimension of each sparse feature.
--deep_layers_dim: The dimensions of the deep layers.
--deep_layers_act: The activation function of the deep layers.
--dropout_flag: Whether to apply dropout.
--keep_prob: The keep probability of the dropout layer.
--ckpt_path: The location of the checkpoint file.
--eval_file_name: Eval output file.
--loss_file_name: Loss output file.
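The dropout flags above control how many activations survive in the deep layers during training. The sketch below shows one common formulation, inverted dropout, in which kept values are rescaled by 1 / keep_prob so the expected activation is unchanged. Whether the model in this repository rescales in exactly this way is an assumption; the function name `dropout` is illustrative.

```python
import random

def dropout(values, keep_prob, rng):
    """Inverted dropout: keep each value with probability keep_prob and rescale it."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(0)
activations = [1.0] * 10000
dropped = dropout(activations, keep_prob=0.8, rng=rng)

# Roughly 80% of the values survive, and the mean stays close to the original 1.0.
kept_fraction = sum(1 for v in dropped if v != 0.0) / len(dropped)
mean_after = sum(dropped) / len(dropped)
```

With `keep_prob=1.0` the function returns its input unchanged, which is the behaviour one would expect when dropout is switched off.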
To train the model in distributed mode, run:
# configure environment path before training (data parallel)
bash run_multinpu_train.sh RANK_SIZE EPOCHS DATASET RANK_TABLE_FILE
# configure environment path before training (auto parallel)
bash run_auto_parallel_train.sh RANK_SIZE EPOCHS DATASET RANK_TABLE_FILE
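In data parallel training, each device usually processes its own slice of the dataset. The sketch below illustrates that idea only: it assumes RANK_SIZE is the number of participating devices and that samples are assigned round-robin by rank. The function `shard_indices` is hypothetical; in practice the framework's dataset loader handles sharding.

```python
def shard_indices(num_samples, rank_size, rank_id):
    """Return the sample indices one device would process under data parallelism."""
    if not 0 <= rank_id < rank_size:
        raise ValueError("rank_id must be in [0, rank_size)")
    # Round-robin assignment: device r takes samples r, r + rank_size, ...
    return list(range(rank_id, num_samples, rank_size))

# Ten samples split across four devices; every sample lands on exactly one device.
shards = [shard_indices(10, rank_size=4, rank_id=r) for r in range(4)]
```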
To evaluate the model, run:
python eval.py
Arguments:
--device_target: Device on which the code will run (Default: Ascend).
--data_path: This should be set to the same directory given to the data_download's data_dir argument.
--epochs: Total number of training epochs.
--batch_size: Training batch size.
--eval_batch_size: Eval batch size.
--field_size: The number of feature fields.
--vocab_size: The total number of features in the dataset.
--emb_dim: The dense embedding dimension of each sparse feature.
--deep_layers_dim: The dimensions of the deep layers.
--deep_layers_act: The activation function of the deep layers.
--keep_prob: The keep probability of the dropout layer.
--ckpt_path: The location of the checkpoint file.
--eval_file_name: Eval output file.
--loss_file_name: Loss output file.
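This README does not say which metric src/metrics.py computes. AUC is a common choice for click-through prediction, so the sketch below shows how it could be computed from labels and predicted scores using the rank-sum formulation. Treat the choice of metric and the function name `auc` as assumptions.

```python
def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) formulation."""
    pairs = sorted(zip(scores, labels))
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")
    rank_sum = 0.0
    i = 0
    while i < len(pairs):
        # Find the run of tied scores starting at position i.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average 1-based rank of positions i..j-1
        rank_sum += avg_rank * sum(label for _, label in pairs[i:j])
        i = j
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

score = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of positives above negatives.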
There are other arguments for the model and the training process. Use the --help or -h flag to get a full list of possible arguments with detailed descriptions.