Recommendation Model

Overview

This is an implementation of Wide&Deep as described in the paper Wide & Deep Learning for Recommender Systems.

The Wide&Deep model jointly trains a wide linear model and a deep neural network, combining the benefits of memorization and generalization for recommender systems.
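As a rough sketch of this idea (plain NumPy for illustration only; the actual MindSpore implementation lives in src/wide_and_deep.py, and every name and shape below is made up), the wide part scores the sparse features with a linear model, the deep part embeds the same features and runs them through an MLP, and the two logits are summed before the sigmoid:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def wide_deep_predict(ids, vals, wide_w, emb_table, mlp_layers):
    # Wide part: linear model over the sparse feature ids (memorization).
    wide_logit = np.dot(wide_w[ids], vals)
    # Deep part: embed the same ids, flatten, run an MLP (generalization).
    h = (emb_table[ids] * vals[:, None]).reshape(-1)
    for w, b in mlp_layers[:-1]:
        h = np.maximum(w @ h + b, 0.0)    # ReLU hidden layers
    w, b = mlp_layers[-1]                 # last layer maps to a single logit
    deep_logit = (w @ h + b).item()
    # Joint prediction: the two logits are simply added, and both parts
    # are trained together against the same logistic loss.
    return sigmoid(wide_logit + deep_logit)

Training updates wide_w, emb_table, and mlp_layers from the same gradient signal, which is what distinguishes Wide&Deep from an ensemble of separately trained wide and deep models.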

Requirements

  • Install MindSpore.

  • Download the dataset and convert it to MindRecord format with the following command:

python src/preprocess_data.py

Arguments:

  • --data_path: Dataset storage path (Default: ./criteo_data/).
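For example, to read the raw Criteo files from the default location and write the converted MindRecord data there (the path shown is just the documented default):

python src/preprocess_data.py --data_path=./criteo_data/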

Dataset

The Criteo datasets are used for model training and evaluation.

Running Code

Code Structure

The entire code structure is as follows:

|--- wide_and_deep/
    train_and_eval.py                "Entrance of Wide&Deep model training and evaluation"
    eval.py                          "Entrance of Wide&Deep model evaluation"
    train.py                         "Entrance of Wide&Deep model training"
    train_and_eval_multinpu.py       "Entrance of Wide&Deep model data parallel training and evaluation"
    train_and_eval_auto_parallel.py  "Entrance of Wide&Deep model auto parallel training and evaluation"
    |--- src/                        "Source code of the model, dataset, and utilities"
        config.py                    "Parameters configuration"
        dataset.py                   "Dataset loader class"
        process_data.py              "Process dataset"
        preprocess_data.py           "Preprocess dataset"
        wide_and_deep.py             "Model structure"
        callbacks.py                 "Callback class for training and evaluation"
        metrics.py                   "Metric class"
    |--- script/                     "Shell scripts directory"
        run_multinpu_train.sh        "Run data parallel"
        run_auto_parallel_train.sh   "Run auto parallel"

Train and evaluate model

To train and evaluate the model, run the following command (an example invocation is shown after the argument list):

python train_and_eval.py

Arguments:

  • --device_target: Device where the code will be implemented (Default: Ascend).
  • --data_path: This should be set to the directory holding the converted MindRecord data (the same path given to src/preprocess_data.py's --data_path argument).
  • --epochs: Total train epochs.
  • --batch_size: Training batch size.
  • --eval_batch_size: Eval batch size.
  • --field_size: The number of features.
  • --vocab_size: The total feature size (vocabulary size) of the dataset.
  • --emb_dim: The dense embedding dimension of the sparse features.
  • --deep_layers_dim: The dimensions of the deep layers.
  • --deep_layers_act: The activation function of the deep layers.
  • --dropout_flag: Whether to apply dropout.
  • --keep_prob: The keep rate of the dropout layer.
  • --ckpt_path: The location of the checkpoint file.
  • --eval_file_name: Eval output file.
  • --loss_file_name: Loss output file.
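A minimal example invocation might look like the following; the flag values are illustrative, and any argument left out falls back to its default in src/config.py:

python train_and_eval.py --device_target=Ascend --data_path=./criteo_data/ --epochs=5 --eval_file_name=eval.log --loss_file_name=loss.log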

To train the model on a single device, run the following command (an example invocation is shown after the argument list):

python train.py

Arguments:

  • --device_target: Device where the code will be implemented (Default: Ascend).
  • --data_path: This should be set to the directory holding the converted MindRecord data (the same path given to src/preprocess_data.py's --data_path argument).
  • --epochs: Total train epochs.
  • --batch_size: Training batch size.
  • --eval_batch_size: Eval batch size.
  • --field_size: The number of features.
  • --vocab_size: The total feature size (vocabulary size) of the dataset.
  • --emb_dim: The dense embedding dimension of the sparse features.
  • --deep_layers_dim: The dimensions of the deep layers.
  • --deep_layers_act: The activation function of the deep layers.
  • --dropout_flag: Whether to apply dropout.
  • --keep_prob: The keep rate of the dropout layer.
  • --ckpt_path: The location of the checkpoint file.
  • --eval_file_name: Eval output file.
  • --loss_file_name: Loss output file.
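For example, a single-device training run might look like this (the values are illustrative, not required):

python train.py --device_target=Ascend --data_path=./criteo_data/ --epochs=15 --batch_size=16000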

To train the model in distributed mode, run one of the following commands:

# configure environment path before training
bash run_multinpu_train.sh RANK_SIZE EPOCHS DATASET RANK_TABLE_FILE 
# configure environment path before training
bash run_auto_parallel_train.sh RANK_SIZE EPOCHS DATASET RANK_TABLE_FILE 
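For example, an 8-device data parallel run for 5 epochs might look like this; the dataset path and rank table file are placeholders for your own environment:

bash run_multinpu_train.sh 8 5 ./criteo_data/ /path/to/rank_table_8p.json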

To evaluate the model, run the following command (an example invocation is shown after the argument list):

python eval.py

Arguments:

  • --device_target: Device where the code will be implemented (Default: Ascend).
  • --data_path: This should be set to the directory holding the converted MindRecord data (the same path given to src/preprocess_data.py's --data_path argument).
  • --epochs: Total train epochs.
  • --batch_size: Training batch size.
  • --eval_batch_size: Eval batch size.
  • --field_size: The number of features.
  • --vocab_size: The total feature size (vocabulary size) of the dataset.
  • --emb_dim: The dense embedding dimension of the sparse features.
  • --deep_layers_dim: The dimensions of the deep layers.
  • --deep_layers_act: The activation function of the deep layers.
  • --keep_prob: The keep rate of the dropout layer.
  • --ckpt_path: The location of the checkpoint file.
  • --eval_file_name: Eval output file.
  • --loss_file_name: Loss output file.
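For example, to evaluate a trained model (the checkpoint path is a placeholder for a file produced by training):

python eval.py --data_path=./criteo_data/ --ckpt_path=./ckpt/widedeep.ckpt --eval_file_name=eval.log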

There are other arguments related to the model and the training process. Use the --help or -h flag to get a full list of available arguments with detailed descriptions.
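For example, to list every option of the joint training script:

python train_and_eval.py --help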