modify code formats for master

This commit is contained in:
lvmingfu 2021-01-12 17:48:37 +08:00
parent bc276774e3
commit 27848587a3
31 changed files with 771 additions and 573 deletions

View File

@ -120,7 +120,7 @@ The following optimizers add the target interface: Adam, FTRL, LazyAdam, Proxim
</tr>
</table>
###### `export` Modify the input parameters and export's file name ([!7385](https://gitee.com/mind_spore/dashboard/projects/mindspore/mindspore/pulls/7385?tab=diffs) [!9057](https://gitee.com/mindspore/mindspore/pulls/9057/files))
###### `export` Modify the input parameters and export's file name ([!7385](https://gitee.com/mindspore/mindspore/pulls/7385) [!9057](https://gitee.com/mindspore/mindspore/pulls/9057/files))
Export the MindSpore prediction model to a file in the specified format.
@ -227,7 +227,7 @@ However, from a user's perspective, tensor.size and tensor.ndim (methods -> prop
</tr>
</table>
###### `EmbeddingLookup` add a config in the interface: sparse ([!8202](https://gitee.com/mind_spore/dashboard/projects/mindspore/mindspore/pulls/8202?tab=diffs))
###### `EmbeddingLookup` add a config in the interface: sparse ([!8202](https://gitee.com/mindspore/mindspore/pulls/8202))
sparse (bool): Using sparse mode. When 'target' is set to 'CPU', 'sparse' has to be true. Default: True.
@ -878,7 +878,7 @@ Contributions of any kind are welcome!
- Fix bug of list cannot be used as input in pynative mode([!1765](https://gitee.com/mindspore/mindspore/pulls/1765))
- Fix bug of kernel select ([!2103](https://gitee.com/mindspore/mindspore/pulls/2103))
- Fix bug of pattern matching for batchnorm fusion in the case of auto mix precision.([!1851](https://gitee.com/mindspore/mindspore/pulls/1851))
- Fix bug of generate hccl's kernel info.([!2393](https://gitee.com/mindspore/mindspore/mindspore/pulls/2393))
- Fix bug of generate hccl's kernel info.([!2393](https://gitee.com/mindspore/mindspore/pulls/2393))
- GPU platform
- Fix bug of summary feature invalid([!2173](https://gitee.com/mindspore/mindspore/pulls/2173))
- Data processing

View File

@ -1,22 +1,31 @@
This folder contains miscellaneous utilities used by the dataset code. We will describe a couple of the important classes in this file.
## Thread Management
This picture summarizes a few important classes that we will cover in the next few sections.
![Thread management](https://images.gitee.com/uploads/images/2020/0601/220111_9b07c8fa_7342120.jpeg "task_manager.JPG")
## Task
A Task object corresponds to an instance of std::future returned by std::async. In general, a user will not create a Task object directly. Most work goes through TaskManager's TaskGroup interface, which we will cover later in this document. Here are some important members and functions of the Task class.
```cpp
std::function<Status()> fnc_obj_;
```
This is the entry function executed when the thread is spawned. The function does not take any input and returns a Status object. The returned Status object is saved in this member.
```cpp
Status rc_;
```
To retrieve the result of executing the entry function, call the following function.
```cpp
Status Task::GetTaskErrorIfAny();
```
Here is rough pseudocode for the lifetime of a Task. Some extra work needed to spawn the thread is omitted for simplicity. As mentioned previously, a user never spawns a thread directly through the Task class without a helper.
```cpp
@ -27,12 +36,14 @@ Here is roughly the pseudo code of a lifetime of a Task. Some extra works needed
5 RETURN_IF_NOT_OK(tk.Join();)
6 RETURN_IF_NOT_OK(tk.GetTaskErrorIfAny());
```
In the above example, lines 1 to 3 use the Task constructor to prepare the thread we are going to create and what it will run. We also assign a name to this thread; the name is only an eye-catcher. The second parameter is the actual job for this thread to run.
<br/>Line 4 spawns the thread. In this example, the thread executes the lambda function, which does nothing but return an OK Status object.
<br/>Line 5 waits for the thread to complete.
<br/>Line 6 retrieves the result of running the thread, which should be the OK Status object.
Another purpose of the Task object is to wrap the entry function and capture any exception that is thrown while running the entry function but not caught within it.
```cpp
try {
rc_ = fnc_obj_();
@ -42,23 +53,30 @@ Another purpose of Task object is to wrap around the entry function and capture
rc_ = Status(StatusCode::kUnexpectedError, __LINE__, __FILE__, e.what());
}
```
Note that
```cpp
Status Task::Run();
```
does not return the Status of running the entry function fnc_obj_. It merely indicates whether the spawn was successful, and it returns immediately.
Another thing to point out is that Task::Run() is not designed to re-run the thread, say after it has returned. The result is unexpected if a Task object is re-run.
For the function
```cpp
Status Task::Join(WaitFlag wf = WaitFlag::kBlocking);
```
where
```cpp
enum class WaitFlag : int { kBlocking, kNonBlocking };
```
also does not return the Status of running the entry function fnc_obj_, just like Run(). It can, however, return some other unexpected error encountered while waiting for the thread to return.
This function blocks (kBlocking) by default until the spawned thread returns.
@ -71,37 +89,49 @@ while (thrd_.wait_for(std::chrono::seconds(1)) != std::future_status::ready) {
// Do something if the thread is blocked on a conditional variable
}
```
The main use of this form of Join() is after we have interrupted the thread.
A design alternative is to use
```cpp
std::future<Status>
```
to spawn the thread asynchronously and retrieve the result using std::future::get(). But get() can only be called once, so it is more convenient to save the returned result in the rc_ member for an unlimited number of retrievals. As we shall see later, the value of rc_ is propagated to higher-level classes such as TaskGroup and the master thread.
Currently, this is how the thread is defined in the Task class:
```cpp
std::future<void> thrd_;
```
and spawned by this line of code.
```cpp
thrd_ = std::async(std::launch::async, std::ref(*this));
```
Every thread can access its own Task object using the FindMe() function.
```cpp
Task * TaskManager::FindMe();
```
There are other attributes of Task, such as interrupt, which we will cover later in this document.
## TaskGroup
The first helper for managing Task objects is TaskGroup. Technically speaking, a TaskGroup is a collection of related Tasks. As of this writing, every Task must belong to a TaskGroup. We spawn a thread using the following function
```cpp
Status TaskGroup::CreateAsyncTask(const std::string &my_name, const std::function<Status()> &f, Task **pTask = nullptr);
```
The created Task object is added to the TaskGroup object. In many cases, users do not need a reference to the newly created Task object, but CreateAsyncTask can return one if requested.
There is no way to add a Task object to a TaskGroup other than by calling TaskGroup::CreateAsyncTask. As a result, no Task object can belong to multiple TaskGroups by design. Every Task object has a back pointer to the TaskGroup it belongs to:
```cpp
TaskGroup *Task::MyTaskGroup();
```
@ -110,48 +140,64 @@ Task objects in the same TaskGroup will form a linked list with newly created Ta
Globally we support multiple TaskGroups running concurrently. TaskManager (discussed in the next section) chains all Task objects from all TaskGroups into a single LRU linked list.
###### HandShaking
### HandShaking
As of this writing, the following handshaking logic is required. Suppose a thread T1 creates another thread, say T2, by calling TaskGroup::CreateAsyncTask. T1 will block on a WaitPost until T2 posts back, signalling that T1 can resume.
```cpp
// Entry logic of T2
auto *myTask = TaskManager::FindMe();
myTask->Post();
```
If T2 is going to spawn more threads, say T3 and T4, it is *highly recommended* that T2 wait for T3 and T4 to post before it posts back to T1.
The purpose of the handshake is to provide a way for T2 to synchronize with T1 if necessary.
TaskGroup provides functions similar to Task's, but at the group level.
```cpp
void TaskGroup::interrupt_all() noexcept;
```
This interrupts all the threads currently running in the TaskGroup. The function returns immediately. We will cover the interrupt mechanism in more detail later in this document.
```cpp
Status TaskGroup::join_all(Task::WaitFlag wf = Task::WaitFlag::kBlocking);
```
This performs Task::Join() on all the threads in the group. This is a blocking call by default.
```cpp
Status TaskGroup::GetTaskErrorIfAny();
```
A TaskGroup does not record Task::rc_ for every thread in the group; only the first error is saved. For example, if thread T1 reports error rc1 and later T2 reports error rc2, only rc1 is saved in the TaskGroup and rc2 is ignored. TaskGroup::GetTaskErrorIfAny() will return rc1 in this case.
```cpp
int size() const noexcept;
```
This returns the size of the TaskGroup.
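As a rough illustration (not taken from the code base), the pieces above can be combined as follows. The RunWorkers helper and the worker body are hypothetical; only the signatures documented in this file are used.
```cpp
// Hypothetical sketch: spawn two workers in a TaskGroup, wait for them,
// and surface the first error (if any). Only the interfaces shown above are assumed.
Status RunWorkers(TaskGroup *vg) {
  auto worker = []() -> Status {
    TaskManager::FindMe()->Post();   // handshake back to the creating thread
    // ... do the real work here ...
    return Status::OK();
  };
  RETURN_IF_NOT_OK(vg->CreateAsyncTask("worker 1", worker));
  RETURN_IF_NOT_OK(vg->CreateAsyncTask("worker 2", worker));
  RETURN_IF_NOT_OK(vg->join_all());  // blocking join on every Task in the group
  return vg->GetTaskErrorIfAny();    // first error reported by any worker, or OK
}
```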
## TaskManager
TaskManager is a singleton, meaning there is only one such object. It is created by the Services singleton object, which we will cover in a later section.
```cpp
TaskManager &TaskManager::GetInstance()
```
provides the method to access the singleton.
TaskManager manages all the TaskGroups and all the Task objects ever created.
```cpp
List<Task> lru_;
List<Task> free_lst_;
std::set<TaskGroup *> grp_list_;
```
As mentioned previously, all the Tasks in the same TaskGroup are linked in a linked list local to this TaskGroup. At the TaskManager level, all Task objects from all the TaskGroups are linked in the lru_ list.
When a thread has finished its job and returned, its corresponding Task object is saved in the free_lst_ for reuse. When a new thread is created, TaskManager first looks into the free_lst_ before allocating memory for a new Task object.
@ -159,23 +205,29 @@ When a thread finished its job and returned, its corresponding Task object is sa
```cpp
std::shared_ptr<Task> master_;
```
The master thread itself also has a corresponding **fake** Task object in the TaskManager singleton object, but this fake Task is not in any of the List<Task> lists.
###### Passing error to the master thread
### Passing error to the master thread
```cpp
void TaskManager::InterruptGroup(Task &);
void TaskManager::InterruptMaster(const Status &);
Status TaskManager::GetMasterThreadRc();
```
When a thread encounters an unexpected error, it performs the following actions before returning:
* It saves the error rc in the TaskGroup it belongs to (assuming it is the first error reported in the TaskGroup).
* It interrupts every other thread in the TaskGroup by calling TaskManager::InterruptGroup.
* It interrupts the master thread and copies the error rc to TaskManager::master_::rc_ by calling TaskManager::InterruptMaster(rc). However, because many TaskGroups can run in parallel or back to back, if TaskManager::master_::rc_ is already set to an error from an earlier TaskGroup run that has not yet been retrieved, the old error code will **not** be overwritten by the new one.
The master thread can query the result using TaskGroup::GetTaskErrorIfAny or TaskManager::GetMasterThreadRc; the first form is the *preferred* method. For the second form, TaskManager::master_::rc_ is reset to OK() once retrieved, so that a future call to TaskManager::InterruptMaster() can propagate an error to the master thread again.
###### WatchDog
### WatchDog
TaskManager spawns an additional thread named "Watchdog" (the name is just an eye-catcher). It executes the following function on startup:
```cpp
Status TaskManager::WatchDog() {
TaskManager::FindMe()->Post();
@ -190,45 +242,57 @@ Status TaskManager::WatchDog() {
return Status::OK();
}
```
Its main purpose is to handle Control-C and stop all the threads by interrupting them. We will cover the ServiceStop() call in more detail when we reach the section about the Service class.
WatchDog has its own TaskGroup so that it follows the protocol, but that TaskGroup is not in the set of all TaskGroups.
## Interrupt
C++ std::thread and std::async do not provide a way to stop a thread, so we implement an interrupt mechanism to stop a running thread and make it exit.
The initial design can be considered a polling method. A bit or flag is set in some globally shared area, and the running thread periodically checks this bit/flag. If it is set, an interrupt has been sent and the thread will quit. This method has the requirement that even if the thread is waiting on a std::condition_variable, it cannot do an unconditional wait() call; that is, it must do a wait_for() with a timeout. Once returned from the wait_for() call, the thread must check whether it was woken up due to the timeout or because the condition was satisfied.
The con of this approach is the performance cost, so we also designed a push-based approach.
To begin with, we define an abstract class that describes objects that are interruptible.
```cpp
class IntrpResource { ... };
```
It has two states:
```cpp
enum class State : int { kRunning, kInterrupted };
```
that is, it is either running or has been interrupted.
There are two virtual functions that any derived class can override:
```cpp
virtual Status Interrupt();
virtual void ResetIntrpState();
```
Interrupt() in the base class changes the state of the object to kInterrupted. ResetIntrpState() does the opposite and resets the state. Any class that inherits from the base class can implement its own Interrupt(); for example, we will later see how the CondVar class (a wrapper for std::condition_variable) deals with interrupts on its own.
All related IntrpResource objects can register with an
```cpp
class IntrpService {...}
```
It provides the public method
```cpp
void InterruptAll() noexcept;
```
which goes through all registered IntrpResource objects and calls the corresponding Interrupt().
An IntrpResource is always associated with a TaskGroup:
```cpp
class TaskGroup {
...
@ -240,45 +304,62 @@ class TaskGroup {
As of this writing, both push and poll methods are used. There are still a few places (e.g. a busy while loop) where a thread must periodically check for interrupt.
## CondVar
The CondVar class is a wrapper around std::condition_variable
```cpp
std::condition_variable cv_;
```
and is interruptible:
```cpp
class CondVar : public IntrpResource { ... }
```
It overrides the Interrupt() method with its own
```cpp
void CondVar::Interrupt() {
IntrpResource::Interrupt();
cv_.notify_all();
}
```
It provides a Wait() method that is equivalent to std::condition_variable::wait.
```cpp
Status Wait(std::unique_lock<std::mutex> *lck, const std::function<bool()> &pred);
```
The main difference is that Wait() is interruptible. A thread returning from Wait() must check the Status return code to see whether it was interrupted.
Note that once a CondVar is interrupted, its state remains interrupted until it is reset.
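As a minimal sketch, a caller typically waits like this and propagates the Status. The surrounding names are hypothetical; only the CondVar::Wait signature shown above is used.
```cpp
// Hypothetical sketch: wait until a flag becomes true, honoring interrupts.
// cv, mux and ready are assumed to be shared with the thread that sets the flag.
Status WaitUntilReady(CondVar *cv, std::mutex *mux, const bool *ready) {
  std::unique_lock<std::mutex> lck(*mux);
  // Interruptible wait: a non-OK Status means the thread was interrupted,
  // so the caller must check the return code instead of assuming the predicate holds.
  return cv->Wait(&lck, [ready]() { return *ready; });
}
```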
## WaitPost
A WaitPost is an implementation of <a href="https://en.wikipedia.org/wiki/Event_(synchronization_primitive)">Event</a>. In brief, it consists of a boolean state and provides methods to synchronize running threads.
* Wait(). If the boolean state is false, the calling thread blocks until the boolean state becomes true or an interrupt occurs.
* Set(). Changes the boolean state to true. All blocked threads are released.
* Clear(). Resets the boolean state back to false.
WaitPost is implemented on top of CondVar and hence is interruptible; that is, a caller of
```cpp
Status Wait();
```
must check the return Status for interrupt.
The initial boolean state is false when a WaitPost object is created. Note that once a Set() call is invoked, the boolean state remains true until it is reset.
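A rough sketch of the event pattern follows. The function names are hypothetical, and the return types of Set() and Clear() are assumed to be void (only Wait() is documented above as returning Status).
```cpp
// Hypothetical sketch of synchronizing two threads with a shared WaitPost.
// The waiting side: blocks until the event is set or an interrupt occurs.
Status WaitForWorkerReady(WaitPost *wp) {
  RETURN_IF_NOT_OK(wp->Wait());  // interruptible; always check the Status
  return Status::OK();
}

// The signalling side: flips the boolean state to true and releases all waiters.
void SignalWorkerReady(WaitPost *wp) {
  wp->Set();
}
```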
## List
A List is an implementation of a doubly linked list. It is not thread safe, so the user must serialize access to the list.
The main feature of List is that it allows an element to be inserted into multiple Lists. Take the Task class as an example: it can be in its TaskGroup list and at the same time linked into the global TaskManager task list. When a Task is done, it goes into the free list.
```cpp
class Task {
...
@ -299,7 +380,9 @@ class TaskManager {
...
};
```
where Node<T> is defined as
```cpp
template <typename T>
struct Node {
@ -314,10 +397,13 @@ struct Node {
}
};
```
The List constructor takes a Node<> member as input so that it can follow this Node element to form a doubly linked chain. For example, List<Task> lru_ takes Task::node in its constructor while TaskGroup::grp_list_ takes Task::group in its constructor. This way a Task can appear in two distinct linked lists.
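A rough sketch of the idea is shown below. The exact List constructor signature is not documented here, so the pointer-to-member form is an assumption, and the Job type is purely illustrative.
```cpp
// Hypothetical sketch of an element that can live in two lists at once.
struct Job {
  int id;
  Node<Job> node;    // hook used by a global list
  Node<Job> group;   // hook used by a per-group list
};

// Each List follows a different Node<> member, so the same Job object can be
// linked into both lists simultaneously (constructor form assumed).
List<Job> global_list(&Job::node);
List<Job> group_list(&Job::group);
```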
## Queue
A Queue is a thread-safe solution to the producer-consumer problem. Every queue has a finite capacity, and its size must be provided to the constructor of the Queue. A few methods are provided:
* Add(). Appends an element to the queue; it blocks if the queue is full or an interrupt has occurred.
* EmplaceBack(). Same as Add() but constructs the element in place.
* PopFront(). Removes the first element from the queue; it blocks if the queue is empty or an interrupt has occurred.
@ -325,16 +411,21 @@ A Queue is a thread safe solution to producer-consumer problem. Every queue is o
Queue is implemented on top of the CondVar class and hence is interruptible, so callers of the above functions must check the Status return code for interrupt.
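A rough producer/consumer sketch follows. The capacity-taking constructor and the exact Add()/PopFront() signatures are assumptions based on the description above.
```cpp
// Hypothetical producer/consumer pair on a Queue<int> with capacity 32.
Queue<int> q(32);  // capacity is supplied at construction (assumed form)

Status Producer() {
  for (int i = 0; i < 100; ++i) {
    RETURN_IF_NOT_OK(q.Add(i));        // blocks while the queue is full; interruptible
  }
  return Status::OK();
}

Status Consumer() {
  for (int i = 0; i < 100; ++i) {
    int v = 0;
    RETURN_IF_NOT_OK(q.PopFront(&v));  // blocks while the queue is empty; interruptible
    // ... process v ...
  }
  return Status::OK();
}
```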
## Locking
C++11 does not provide shared lock support, so we implement some simple locking classes for our own use.
###### SpinLock
### SpinLock
It is a simple exclusive lock based on CAS (compare and swap). The caller repeatedly tries (hence the name spinning) to acquire the lock until successful. It is best used when the critical section is very short.
SpinLock is not interruptible.
There is a helper class LockGuard to ensure the lock is released once it has been acquired.
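A short sketch of the intended usage follows; the LockGuard constructor form and the variable names are assumptions.
```cpp
// Hypothetical sketch: protect a very short critical section with a SpinLock.
SpinLock counter_lock;
int counter = 0;

void Bump() {
  LockGuard lg(&counter_lock);  // acquires the lock; released when lg goes out of scope (assumed form)
  ++counter;                    // keep the critical section short: the lock spins rather than sleeps
}
```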
###### RWLock
### RWLock
It is a simple read-write lock whose implementation favors writers. A reader acquires the lock in S (shared) mode while a writer acquires it in X (exclusive) mode. X mode is not compatible with S or X; S is compatible with S but not with X. In addition, we provide the following functions:
* Upgrade(). Upgrade an S lock to an X lock.
* Downgrade(). Downgrade an X lock to an S lock.
@ -343,15 +434,19 @@ RWLock is not interruptible.
Like the LockGuard helper class, there are helper classes SharedLock and UniqueLock to release the lock when it goes out of scope.
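A sketch of typical reader/writer usage follows; the SharedLock/UniqueLock constructor forms and the variable names are assumptions.
```cpp
// Hypothetical sketch of guarding a shared value with an RWLock.
RWLock value_lock;
int shared_value = 0;

int ReadValue() {
  SharedLock lk(&value_lock);   // S mode: many readers may hold the lock concurrently (assumed form)
  return shared_value;
}

void WriteValue(int v) {
  UniqueLock lk(&value_lock);   // X mode: exclusive with both readers and writers (assumed form)
  shared_value = v;
}
```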
## Treap
A Treap is the combination of a BST (binary search tree) and a heap. Each key is given a priority, and the priority of any non-leaf node is greater than or equal to the priority of its children.
Treap supports the following basic operations:
* Search for a given key. The standard binary search algorithm is applied, ignoring the priorities.
* Insert a new key X into the treap. The heap property of the tree is maintained by tree rotations.
* Delete a key from the treap. The heap property of the tree is maintained by tree rotations.
## MemoryPool
A MemoryPool is an abstract class that allows memory blocks to be dynamically allocated from a designated memory region. Any class that implements MemoryPool must provide the following implementations.
```cpp
// Allocate a block of size n
virtual Status Allocate(size_t, void **) = 0;
@ -362,59 +457,83 @@ A MemoryPool is an abstract class to allow memory blocks to be dynamically alloc
// Free a pointer
virtual void Deallocate(void *) = 0;
```
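Before looking at the concrete implementations, here is a rough sketch of how a caller uses this interface. Only the documented pure virtual methods are relied upon; the helper function is hypothetical.
```cpp
// Hypothetical sketch of allocating scratch space through the MemoryPool interface.
Status UseScratchBuffer(MemoryPool *pool) {
  void *blk = nullptr;
  RETURN_IF_NOT_OK(pool->Allocate(1024, &blk));  // sub-allocate 1K from the pool's region
  // ... use blk as scratch space ...
  pool->Deallocate(blk);                         // hand the block back to the pool
  return Status::OK();
}
```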
There are several implementations of MemoryPool:
###### Arena
### Arena
Arena is a fixed-size memory region that is allocated up front. Each Allocate() sub-allocates a block from this region.
Internally, free blocks are organized into a Treap where the address of the block is the key and its block size is the priority, so the top of the tree is the biggest free block available. Memory allocation is always fast and has a constant cost. Contiguous free blocks are merged into a single free block, and a similar algorithm is used to enlarge a block in order to avoid memory copies.
The main advantage of Arena is that we do not need to free individual memory blocks; we simply free the whole region instead.
###### CircularPool
### CircularPool
It is still an experimental class. It consists of one or more Arenas. To allocate memory, we cycle through the Arenas before a new Arena is added. It assumes that memory is not kept for too long and will be released at some point in the future; the memory allocation strategy is based on this assumption.
## B+ tree
We also provide B+ tree support. Compared to std::map, we provide the following additional features:
* Thread safe
* Concurrent insert/update/search support.
As of this writing, no delete support has been implemented yet.
## Service
Many of the internal classes inherit from the Service abstract class. A Service class, simply speaking, provides a service. A Service object has four states:
```cpp
enum class STATE : int { kStartInProg = 1, kRunning, kStopInProg, kStopped };
```
Any class that inherits from the Service class must implement the following two methods.
```cpp
virtual Status DoServiceStart() = 0;
virtual Status DoServiceStop() = 0;
```
###### Service::ServiceStart()
### Service::ServiceStart()
This function brings up the service and moves the state to kRunning. This function is thread safe. If another thread is bringing up the same service at the same time, only one of them will drive the service up. ServiceStart() will call DoServiceStart() provided by the child class when the state reaches kStartInProg.
An example is TaskManager, which inherits from Service. Its implementation of DoServiceStart spawns the WatchDog thread.
###### Service::ServiceStop()
### Service::ServiceStop()
This function shuts down the service and moves the state to kStopped. It is thread safe: if another thread is bringing down the same service at the same time, only one of them will drive the service down. ServiceStop() will call DoServiceStop() provided by the child class when the state reaches kStopInProg.
As an example, both TaskManager and TaskGroup generate interrupts to all their threads during service shutdown.
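As a sketch, a Service implementation looks roughly like this. The class name is hypothetical, and ServiceStart()/ServiceStop() are assumed to return Status like the other functions in this file.
```cpp
// Hypothetical sketch of a class that plugs into the Service state machine.
class MyCacheService : public Service {
 public:
  Status DoServiceStart() override {
    // bring up whatever the service needs (threads, pools, ...); state is kStartInProg
    return Status::OK();
  }

  Status DoServiceStop() override {
    // tear everything down; called while the state is kStopInProg
    return Status::OK();
  }
};

// Callers drive it through the thread-safe base class entry points:
//   MyCacheService svc;
//   RETURN_IF_NOT_OK(svc.ServiceStart());  // state -> kRunning
//   ...
//   RETURN_IF_NOT_OK(svc.ServiceStop());   // state -> kStopped
```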
###### State checking
### State checking
Another important use of Service is to synchronize operations. For example, TaskGroup::CreateAsyncTask will return an interrupt error if the current state of the TaskGroup is not kRunning. This way we ensure that no new thread can be created and added to a TaskGroup while the TaskGroup is going out of scope. Without this state check, a Task could end up running without its TaskGroup and may get blocked on a CondVar and never return.
## Services
Services is a singleton and is the first singleton created as a result of calling
```cpp
mindspore::dataset::GlobalInit();
```
The first thing the Services singleton does is create a small 16M circular memory pool. This pool is used by many important classes to ensure that basic operations will not fail due to running out of memory. The most important example is TaskManager: each Task's memory is allocated from this memory pool.
The next thing Services does is spawn the other singletons in a specific order. One of the problems with multiple singletons is that we have very limited control over the order of their creation and destruction, yet sometimes we need to control which singleton is allocated first and which one is deallocated last. One good example is the logger, which is usually the last one to shut down.
The Services singleton has a requirement on the list of singletons it brings up: they must inherit from the Service class. The Services singleton brings each one up by calling the corresponding ServiceStart() function, and its destructor calls ServiceStop() to bring these singletons down. TaskManager is a good example; it is brought up by the Services singleton.
The Services singleton also provides other utility services, such as:
* return the current hostname
* return the current username
* generate a random string
## Path
The Path class provides many operating-system-specific functions so that the user does not need to write separate functions for different platforms. As of this writing, the following functions are provided.
```cpp
bool Exists();
bool IsDirectory();
@ -423,4 +542,5 @@ Path class provides many operating system specific functions to shield the user
std::string Extension() const;
std::string ParentPath();
```
Simple "/" operators are also provided to allow folders and/or files to be concatenated and work on all platforms including Windows.

View File

@ -2,11 +2,15 @@
<!-- TOC -->
- [目录](#目录)
- [概述](#概述)
- [数据集](#数据集)
- [环境要求](#环境要求)
- [快速入门](#快速入门)
- [脚本详述](#脚本详述)
- [模型准备](#模型准备)
- [模型训练](#模型训练)
- [工程目录](#工程目录)
<!-- /TOC -->
@ -14,7 +18,7 @@
本文主要讲解如何在端侧进行LeNet模型训练。首先在服务器或个人笔记本上进行模型转换然后在安卓设备上训练模型。LeNet由2层卷积和3层全连接层组成模型结构简单因此可以在设备上快速训练。
# Dataset
# 数据集
本例使用[MNIST手写字数据集](http://yann.lecun.com/exdb/mnist/)
@ -40,8 +44,9 @@ mnist/
# 环境要求
- 服务器或个人笔记本
- [MindSpore Framework](https://www.mindspore.cn/install/en): 建议使用Docker安装
- [MindSpore ToD Framework](https://www.mindspore.cn/tutorial/tod/en/use/prparation.html)
- [MindSpore Framework](https://www.mindspore.cn/install): 建议使用Docker安装
- [MindSpore ToD Download](https://www.mindspore.cn/tutorial/lite/zh-CN/master/use/downloads.html)
- [MindSpore ToD Build](https://www.mindspore.cn/tutorial/lite/zh-CN/master/use/build.html)
- [Android NDK r20b](https://dl.google.com/android/repository/android-ndk-r20b-linux-x86_64.zip)
- [Android SDK](https://developer.android.com/studio?hl=zh-cn#cmdline-tools)
- Android移动设备
@ -116,4 +121,4 @@ train_lenet/
│   ├── model
│   │   └── lenet_tod.ms # model to train
│   └── train.sh # on-device script that load the initial model and train it
```
```

View File

@ -2,10 +2,14 @@
<!-- TOC -->
- [目录](#目录)
- [概述](#概述)
- [数据集](#环境要求)
- [数据集](#数据集)
- [环境要求](#环境要求)
- [快速入门](#快速入门)
- [脚本详述](#脚本详述)
- [模型准备](#模型准备)
- [模型训练](#模型训练)
- [工程目录](#工程目录)
<!-- /TOC -->
@ -22,6 +26,7 @@
- 数据格式jpeg
> 注意
>
> - 当前发布版本中数据通过dataset.cc中自定义的`DataSet`类加载。我们使用[ImageMagick convert tool](https://imagemagick.org/)进行数据预处理包括图像裁剪、转换为BMP格式。
> - 本例将使用10分类而不是365类。
> - 训练、验证和测试数据集的比例分别是3:1:1。
@ -42,7 +47,8 @@ places
- 服务端
- [MindSpore Framework](https://www.mindspore.cn/install/en) - 建议使用安装docker环境
- [MindSpore ToD Framework](https://www.mindspore.cn/tutorial/tod/en/use/prparation.html)
- [MindSpore ToD Download](https://www.mindspore.cn/tutorial/lite/zh-CN/master/use/downloads.html)
- [MindSpore ToD Build](https://www.mindspore.cn/tutorial/lite/zh-CN/master/use/build.html)
- [Android NDK r20b](https://dl.google.com/android/repository/android-ndk-r20b-linux-x86_64.zip)
- [Android SDK](https://developer.android.com/studio?hl=zh-cn#cmdline-tools)
- [ImageMagick convert tool](https://imagemagick.org/)

View File

@ -1,6 +1,9 @@
# Contents
- [CenterFace Description](#CenterFace-description)
<!-- TOC -->
- [Contents](#contents)
- [CenterFace Description](#centerface-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Environment Requirements](#environment-requirements)
@ -11,7 +14,7 @@
- [Training Process](#training-process)
- [Training](#training)
- [Testing Process](#testing-process)
- [Evaluation](#testing)
- [Testing](#testing)
- [Evaluation Process](#evaluation-process)
- [Evaluation](#evaluation)
- [Convert Process](#convert-process)
@ -20,8 +23,11 @@
- [Performance](#performance)
- [Evaluation Performance](#evaluation-performance)
- [Inference Performance](#inference-performance)
- [Description of Random Situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)
<!-- /TOC -->
# [CenterFace Description](#contents)
CenterFace is a practical anchor-free face detection and alignment method for edge devices; we support training and evaluation on Ascend 910.
@ -80,8 +86,8 @@ other datasets need to use the same format as WiderFace.
- Framework
- [MindSpore](https://cmc-szv.clouddragon.huawei.com/cmcversion/index/search?searchKey=Do-MindSpore%20V100R001C00B622)
- For more information, please check the resources below
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/zh-CN/master/index.html)
- [MindSpore API](https://www.mindspore.cn/api/zh-CN/master/index.html)
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Quick Start](#contents)
@ -226,7 +232,7 @@ sh eval_all.sh
the command is: python train.py [train parameters]
Major parameters of train.py are as follows:
```python
```text
--lr: learning rate
--per_batch_size: batch size on each device
--is_distributed: multi-device or not

View File

@ -22,12 +22,12 @@
- [How to use](#how-to-use)
- [Inference](#inference)
- [Continue Training on the Pretrained Model](#continue-training-on-the-pretrained-model)
- [Transfer Learning](#transfer-learning)
- [Transfer Learning](#transfer-learning)
- [Description of Random Situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)
# [CNNCTC Description](#contents)
This paper proposes three major contributions to address scene text recognition (STR).
First, we examine the inconsistencies of training and evaluation datasets, and the performance gap that results from these inconsistencies.
Second, we introduce a unified four-stage STR framework that most existing STR models fit into.
@ -38,10 +38,9 @@ comparisons to understand the performance gain of the existing modules.
[Paper](https://arxiv.org/abs/1904.01906): J. Baek, G. Kim, J. Lee, S. Park, D. Han, S. Yun, S. J. Oh, and H. Lee, “What is wrong with scene text recognition model comparisons? dataset and model analysis,” ArXiv, vol. abs/1904.01906, 2019.
# [Model Architecture](#contents)
This is an example of training CNN+CTC model for text recognition on MJSynth and SynthText dataset with MindSpore.
# [Dataset](#contents)
Note that you can run the scripts based on the dataset mentioned in original paper or widely used in relevant domain/network architecture. In the following sections, we will introduce how to run the scripts using the related dataset below.
@ -49,14 +48,18 @@ Note that you can run the scripts based on the dataset mentioned in original pap
The [MJSynth](https://www.robots.ox.ac.uk/~vgg/data/text/) and [SynthText](https://github.com/ankush-me/SynthText) dataset are used for model training. The [The IIIT 5K-word dataset](https://cvit.iiit.ac.in/research/projects/cvit-projects/the-iiit-5k-word-dataset) dataset is used for evaluation.
- step 1:
All the datasets have been preprocessed and stored in .lmdb format and can be downloaded [**HERE**](https://drive.google.com/drive/folders/192UfE9agQUMNq6AgU3_E05_FcPZK4hyt).
- step 2:
Uncompress the downloaded file, rename the MJSynth dataset as MJ, the SynthText dataset as ST and the IIIT dataset as IIIT.
- step 3:
Move above mentioned three datasets into `cnnctc_data` folder, and the structure should be as below:
```
```text
|--- CNNCTC/
|--- cnnctc_data/
|--- ST/
@ -68,13 +71,15 @@ Move above mentioned three datasets into `cnnctc_data` folder, and the structure
|--- IIIT/
data.mdb
lock.mdb
......
```
- step 4:
Preprocess the dataset by running:
```
```bash
python src/preprocess_dataset.py
```
@ -84,31 +89,27 @@ This takes around 75 minutes.
## Mixed Precision
The [mixed precision](https://www.mindspore.cn/tutorial/zh-CN/master/advanced_use/mixed_precision.html) training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data formats, and maintains the network precision achieved by the single-precision training at the same time. Mixed precision training can accelerate the computation process, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware.
The [mixed precision](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/enable_mixed_precision.html) training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data formats, and maintains the network precision achieved by the single-precision training at the same time. Mixed precision training can accelerate the computation process, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware.
For FP16 operators, if the input data type is FP32, the backend of MindSpore will automatically handle it with reduced precision. Users could check the reduced-precision operators by enabling INFO log and then searching reduce precision.
# [Environment Requirements](#contents)
- HardwareAscend
- Prepare hardware environment with Ascend processor. If you want to try Ascend , please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
- Prepare hardware environment with Ascend processor. If you want to try Ascend , please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
- Framework
- [MindSpore](https://www.mindspore.cn/install/en)
- [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/zh-CN/master/index.html)
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
- [MindSpore API](https://www.mindspore.cn/api/zh-CN/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Quick Start](#contents)
- Install dependencies:
```
```bash
pip install lmdb
pip install Pillow
pip install tqdm
@ -116,25 +117,30 @@ pip install six
```
- Standalone Training:
```
```bash
bash scripts/run_standalone_train_ascend.sh $PRETRAINED_CKPT
```
- Distributed Training:
```
```bash
bash scripts/run_distribute_train_ascend.sh $RANK_TABLE_FILE $PRETRAINED_CKPT
```
- Evaluation:
```
```bash
bash scripts/run_eval_ascend.sh $TRAINED_CKPT
```
# [Script Description](#contents)
## [Script and Sample Code](#contents)
The entire code structure is as following:
```
```text
|--- CNNCTC/
|---README.md // descriptions about cnnctc
|---train.py // train scripts
@ -154,39 +160,41 @@ The entire code structure is as following:
```
## [Script Parameters](#contents)
Parameters for both training and evaluation can be set in `config.py`.
Arguments:
* `--CHARACTER`: Character labels.
* `--NUM_CLASS`: The number of classes including all character labels and the <blank> label for CTCLoss.
* `--HIDDEN_SIZE`: Model hidden size.
* `--FINAL_FEATURE_WIDTH`: The number of features.
* `--IMG_H` The height of input image.
* `--IMG_W` The width of input image.
* `--TRAIN_DATASET_PATH` The path to training dataset.
* `--TRAIN_DATASET_INDEX_PATH` The path to training dataset index file which determines the order .
* `--TRAIN_BATCH_SIZE` Training batch size. The batch size and index file must ensure input data is in fixed shape.
* `--TRAIN_DATASET_SIZE` Training dataset size.
* `--TEST_DATASET_PATH` The path to test dataset.
* `--TEST_BATCH_SIZE` Test batch size.
* `--TRAIN_EPOCHS`Total training epochs.
* `--CKPT_PATH`The path to model checkpoint file, can be used to resume training and evaluation.
* `--SAVE_PATH`The path to save model checkpoint file.
* `--LR`Learning rate for standalone training.
* `--LR_PARA`Learning rate for distributed training.
* `--MOMENTUM`Momentum.
* `--LOSS_SCALE`Loss scale to prevent gradient underflow.
* `--SAVE_CKPT_PER_N_STEP`Save model checkpoint file per N steps.
* `--KEEP_CKPT_MAX_NUM`The maximum number of saved model checkpoint file.
- `--CHARACTER`: Character labels.
- `--NUM_CLASS`: The number of classes including all character labels and the <blank> label for CTCLoss.
- `--HIDDEN_SIZE`: Model hidden size.
- `--FINAL_FEATURE_WIDTH`: The number of features.
- `--IMG_H` The height of input image.
- `--IMG_W` The width of input image.
- `--TRAIN_DATASET_PATH` The path to training dataset.
- `--TRAIN_DATASET_INDEX_PATH` The path to training dataset index file which determines the order .
- `--TRAIN_BATCH_SIZE` Training batch size. The batch size and index file must ensure input data is in fixed shape.
- `--TRAIN_DATASET_SIZE` Training dataset size.
- `--TEST_DATASET_PATH` The path to test dataset.
- `--TEST_BATCH_SIZE` Test batch size.
- `--TRAIN_EPOCHS`Total training epochs.
- `--CKPT_PATH`The path to model checkpoint file, can be used to resume training and evaluation.
- `--SAVE_PATH`The path to save model checkpoint file.
- `--LR`Learning rate for standalone training.
- `--LR_PARA`Learning rate for distributed training.
- `--MOMENTUM`Momentum.
- `--LOSS_SCALE`Loss scale to prevent gradient underflow.
- `--SAVE_CKPT_PER_N_STEP`Save model checkpoint file per N steps.
- `--KEEP_CKPT_MAX_NUM`The maximum number of saved model checkpoint file.
## [Training Process](#contents)
### Training
- Standalone Training:
```
```bash
bash scripts/run_standalone_train_ascend.sh $PRETRAINED_CKPT
```
@ -195,22 +203,22 @@ Results and checkpoints are written to `./train` folder. Log can be found in `./
`$PRETRAINED_CKPT` is the path to model checkpoint and it is **optional**. If none is given the model will be trained from scratch.
- Distributed Training:
```
```bash
bash scripts/run_distribute_train_ascend.sh $RANK_TABLE_FILE $PRETRAINED_CKPT
```
Results and checkpoints are written to `./train_parallel_{i}` folder for device `i` respectively.
Log can be found in `./train_parallel_{i}/log_{i}.log` and loss values are recorded in `./train_parallel_{i}/loss.log`.
`$RANK_TABLE_FILE` is needed when you are running a distribute task on ascend.
`$RANK_TABLE_FILE` is needed when you are running a distribute task on ascend.
`$PATH_TO_CHECKPOINT` is the path to model checkpoint and it is **optional**. If none is given the model will be trained from scratch.
### Training Result
Training result will be stored in the example path, whose folder name begins with "train" or "train_parallel". You can find checkpoint file together with result like the followings in loss.log.
```
```text
# distribute training result(8p)
epoch: 1 step: 1 , loss is 76.25, average time per step is 0.235177839748392712
epoch: 1 step: 2 , loss is 73.46875, average time per step is 0.25798572540283203
@ -234,18 +242,20 @@ epoch: 1 step: 8698 , loss is 9.708542263610315, average time per step is 0.2184
## [Evaluation Process](#contents)
### Evaluation
- Evaluation:
```
```bash
bash scripts/run_eval_ascend.sh $TRAINED_CKPT
```
The model will be evaluated on the IIIT dataset, sample results and overall accuracy will be printed.
# [Model Description](#contents)
## [Performance](#contents)
### Training Performance
### Training Performance
| Parameters | CNNCTC |
| -------------------------- | ----------------------------------------------------------- |
@ -260,8 +270,7 @@ The model will be evaluated on the IIIT dataset, sample results and overall accu
| Speed | 1pc: 250 ms/step; 8pcs: 260 ms/step |
| Total time | 1pc: 15 hours; 8pcs: 1.92 hours |
| Parameters (M) | 177 |
| Scripts | https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/cnnctc |
| Scripts | <https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/cnnctc> |
### Evaluation Performance
@ -278,13 +287,14 @@ The model will be evaluated on the IIIT dataset, sample results and overall accu
| Model for inference | 675M (.ckpt file) |
## [How to use](#contents)
### Inference
If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/tutorial/zh-CN/master/advanced_use/network_migration.html). Following the steps below, this is a simple example:
If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/migrate_3rd_scripts.html). Following the steps below, this is a simple example:
- Running on Ascend
```
```python
# Set context
context.set_context(mode=context.GRAPH_HOME, device_target=cfg.device_target)
context.set_context(device_id=cfg.device_id)
@ -315,7 +325,7 @@ If you need to use the trained model to perform inference on multiple hardware p
- running on Ascend
```
```python
# Load dataset
dataset = create_dataset(cfg.data_path, 1)
batch_num = dataset.get_dataset_size()
@ -349,6 +359,6 @@ If you need to use the trained model to perform inference on multiple hardware p
print("train success")
```
# [ModelZoo Homepage](#contents)
Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).

View File

@ -11,18 +11,18 @@
- [环境要求](#环境要求)
- [快速入门](#快速入门)
- [脚本说明](#脚本说明)
- [脚本及样例代码](#脚本及样例代码)
- [脚本参数](#脚本参数)
- [训练过程](#训练过程)
- [脚本及样例代码](#脚本及样例代码)
- [脚本参数](#脚本参数)
- [训练过程](#训练过程)
- [训练](#训练)
- [训练结果](#训练结果)
- [评估过程](#评估过程)
- [评估过程](#评估过程)
- [评估](#评估)
- [模型描述](#模型描述)
- [性能](#性能)
- [性能](#性能)
- [训练性能](#训练性能)
- [评估性能](#评估性能)
- [用法](#用法)
- [用法](#用法)
- [推理](#推理)
- [在预训练模型上继续训练](#在预训练模型上继续训练)
- [ModelZoo主页](#modelzoo主页)
@ -101,12 +101,12 @@ python src/preprocess_dataset.py
- 框架
- [MindSpore](https://www.mindspore.cn/install)
- [MindSpore](https://www.mindspore.cn/install)
- 如需查看详情,请参见如下资源:
- [MindSpore教程](https://www.mindspore.cn/tutorial/zh-CN/master/index.html)
- [MindSpore教程](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html)
# 快速入门

View File

@ -67,7 +67,7 @@ All the models in this repository are trained and validated on ImageNet-1K. The
## [Mixed Precision](#contents)
The [mixed precision](https://www.mindspore.cn/tutorial/zh-CN/master/advanced_use/mixed_precision.html) training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data formats, and maintains the network precision achieved by the single-precision training at the same time. Mixed precision training can accelerate the computation process, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware. For FP16 operators, if the input data type is FP32, the backend of MindSpore will automatically handle it with reduced precision. Users could check the reduced-precision operators by enabling INFO log and then searching reduce precision.
The [mixed precision](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/enable_mixed_precision.html) training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data formats, and maintains the network precision achieved by the single-precision training at the same time. Mixed precision training can accelerate the computation process, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware. For FP16 operators, if the input data type is FP32, the backend of MindSpore will automatically handle it with reduced precision. Users could check the reduced-precision operators by enabling INFO log and then searching reduce precision.
# [Environment Requirements](#contents)
@ -81,8 +81,8 @@ To run the python scripts in the repository, you need to prepare the environment
- Easydict
- MXNet 1.6.0 if running the script `param_convert.py`
- For more information, please check the resources below
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/zh-CN/master/index.html)
- [MindSpore API](https://www.mindspore.cn/api/zh-CN/master/index.html)
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Quick Start](#contents)

View File

@ -50,7 +50,7 @@ MaskRCNN是一个两级目标检测网络作为FasterRCNN的扩展模型
- 注释241M包括实例、字幕、人物关键点等
- 数据格式图像及JSON文件
- 注:数据在[dataset.py](http://dataset.py/)中处理。
- 注:数据在`dataset.py`中处理。
# 环境要求
@ -583,7 +583,7 @@ Accumulating evaluation results...
# 随机情况说明
[dataset.py](http://dataset.py/)中设置了“create_dataset”函数内的种子同时还使用[train.py](http://train.py/)中的随机种子进行权重初始化。
`dataset.py`中设置了“create_dataset”函数内的种子同时还使用`train.py`中的随机种子进行权重初始化。
# ModelZoo主页

View File

@ -58,7 +58,7 @@ MobileNetV2总体网络架构如下
- 硬件Ascend/GPU/CPU
- 使用Ascend、GPU或CPU处理器来搭建硬件环境。如需试用Ascend处理器请发送[申请表](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx)至ascend@huawei.com审核通过即可获得资源。
- 框架
- [MindSpore](https://www.mindspore.cn/install/en)
- [MindSpore](https://www.mindspore.cn/install)
- 如需查看详情,请参见如下资源:
- [MindSpore教程](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html)
@ -222,7 +222,7 @@ python export.py --platform [PLATFORM] --ckpt_file [CKPT_PATH] --file_format [EX
# 随机情况说明
<!-- [dataset.py](http://dataset.py/)中设置了“create_dataset”函数内的种子同时还使用了train.py中的随机种子。-->
<!-- `dataset.py`中设置了“create_dataset”函数内的种子同时还使用了train.py中的随机种子。-->
在train.py中设置了numpy.random、minspore.common.Initializer、minspore.ops.composite.random_ops和minspore.nn.probability.distribution所使用的种子。
# ModelZoo主页

View File

@ -1,4 +1,5 @@
# 目录
<!-- TOC -->
- [目录](#目录)
@ -30,7 +31,6 @@
# MobileNetV2描述
MobileNetV2结合硬件感知神经网络架构搜索NAS和NetAdapt算法已经可以移植到手机CPU上运行后续随新架构进一步优化改进。2019年11月20日
[论文](https://arxiv.org/pdf/1905.02244)Howard, Andrew, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang et al."Searching for MobileNetV2."In Proceedings of the IEEE International Conference on Computer Vision, pp. 1314-1324.2019.
@ -47,12 +47,13 @@ MobileNetV2总体网络架构如下
使用的数据集:[imagenet](http://www.image-net.org/)
-数据集大小125G共1000个类、1.2万张彩色图像
- 训练集: 120G共1.2万张图像
- 测试集5G共5万张图像
- 数据格式RGB
- 注数据在src/dataset.py中处理。
- 数据集大小125G共1000个类、1.2万张彩色图像
- 训练集: 120G共1.2万张图像
- 测试集5G共5万张图像
- 数据格式RGB
- 注数据在src/dataset.py中处理。
# 特性
@ -64,13 +65,12 @@ MobileNetV2总体网络架构如下
# 环境要求
- 硬件昇腾处理器Ascend
- 使用昇腾处理器来搭建硬件环境。如需试用昇腾处理器,请发送[申请表](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx)至ascend@huawei.com审核通过即可获得资源。
- 使用昇腾处理器来搭建硬件环境。如需试用昇腾处理器,请发送[申请表](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx)至ascend@huawei.com审核通过即可获得资源。
- 框架
- [MindSpore](https://www.mindspore.cn/install/en)
- [MindSpore](https://www.mindspore.cn/install)
- 如需查看详情,请参见如下资源
- [MindSpore教程](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html)
- [MindSpore教程](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html)
# 脚本说明
@ -94,7 +94,6 @@ MobileNetV2总体网络架构如下
├── export.py # 导出检查点文件到air/onnx中
```
## 脚本参数
在config.py中可以同时配置训练参数和评估参数。
@ -123,13 +122,11 @@ MobileNetV2总体网络架构如下
### 用法
使用python或shell脚本开始训练。shell脚本的使用方法如下
- bash run_train.sh [Ascend] [RANK_TABLE_FILE] [DATASET_PATH] [PRETRAINED_CKPT_PATH]\(可选)
- bash run_train.sh [GPU] [DEVICE_ID_LIST] [DATASET_PATH] [PRETRAINED_CKPT_PATH]\(可选)
### 启动
``` bash
@ -143,7 +140,7 @@ MobileNetV2总体网络架构如下
训练结果保存在示例路径中。`Ascend`处理器训练的检查点默认保存在`./train/device$i/checkpoint`,训练日志重定向到`./train/device$i/train.log`。`GPU`处理器训练的检查点默认保存在`./train/checkpointckpt_$i`中,训练日志重定向到`./train/train.log`中。
`train.log`内容如下:
```
```text
epoch:[ 0/200], step:[ 624/ 625], loss:[5.258/5.258], time:[140412.236], lr:[0.100]
epoch time:140522.500, per step time:224.836, avg loss:5.258
epoch:[ 1/200], step:[ 624/ 625], loss:[3.917/3.917], time:[138221.250], lr:[0.200]
@ -160,7 +157,7 @@ epoch time:138331.250, per step time:221.330, avg loss:3.917
### 启动
```
```bash
# 推理示例
shell:
Ascend: sh run_infer_quant.sh Ascend ~/imagenet/val/ ~/train/mobilenet-60_1601.ckpt
@ -172,7 +169,7 @@ epoch time:138331.250, per step time:221.330, avg loss:3.917
推理结果保存在示例路径,可以在`./val/infer.log`中找到如下结果:
```
```text
result:{'acc':0.71976314102564111}
```
@ -218,7 +215,7 @@ result:{'acc':0.71976314102564111}
# 随机情况说明
[dataset.py](http://dataset.py/)中设置了“create_dataset”函数内的种子同时还使用了train.py中的随机种子。
`dataset.py`中设置了“create_dataset”函数内的种子同时还使用了train.py中的随机种子。
# ModelZoo主页

View File

@ -1,4 +1,5 @@
# 目录
<!-- TOC -->
- [目录](#目录)
@ -27,7 +28,6 @@
# MobileNetV3描述
MobileNetV3结合硬件感知神经网络架构搜索NAS和NetAdapt算法已经可以移植到手机CPU上运行后续随新架构进一步优化改进。2019年11月20日
[论文](https://arxiv.org/pdf/1905.02244)Howard, Andrew, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang et al."Searching for mobilenetv3."In Proceedings of the IEEE International Conference on Computer Vision, pp. 1314-1324.2019.
@ -43,38 +43,36 @@ MobileNetV3总体网络架构如下
使用的数据集:[imagenet](http://www.image-net.org/)
- 数据集大小125G共1000个类、1.2万张彩色图像
- 训练集120G共1.2万张图像
- 测试集5G共5万张图像
- 训练集120G共1.2万张图像
- 测试集5G共5万张图像
- 数据格式RGB
- 注数据在src/dataset.py中处理。
- 注数据在src/dataset.py中处理。
# 环境要求
- 硬件GPU
- 准备GPU处理器搭建硬件环境。
- 准备GPU处理器搭建硬件环境。
- 框架
- [MindSpore](https://www.mindspore.cn/install/en)
- [MindSpore](https://www.mindspore.cn/install)
- 如需查看详情,请参见如下资源:
- [MindSpore教程](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html)
- [MindSpore教程](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html)
# 脚本说明
## 脚本和样例代码
```python
├── MobileNetV3
├── Readme.md # MobileNetV3相关描述
├── scripts
│ ├──run_train.sh # 用于训练的shell脚本
│ ├──run_eval.sh # 用于评估的shell脚本
├── src
│ ├──config.py # 参数配置
├── MobileNetV3
├── Readme.md # MobileNetV3相关描述
├── scripts
│ ├──run_train.sh # 用于训练的shell脚本
│ ├──run_eval.sh # 用于评估的shell脚本
├── src
│ ├──config.py # 参数配置
│ ├──dataset.py # 创建数据集
│ ├──launch.py # 启动python脚本
│ ├──lr_generator.py # 配置学习率
│ ├──lr_generator.py # 配置学习率
│ ├──mobilenetV3.py # MobileNetV3架构
├── train.py # 训练脚本
├── eval.py # 评估脚本
@ -91,7 +89,7 @@ MobileNetV3总体网络架构如下
### 启动
```
```bash
# 训练示例
python:
GPU: python train.py --dataset_path ~/imagenet/train/ --device_targe GPU
@ -101,9 +99,9 @@ MobileNetV3总体网络架构如下
### 结果
训练结果保存在示例路径中。检查点默认保存在`./checkpoint`中,训练日志重定向到`./train/train.log`,如下所示:
训练结果保存在示例路径中。检查点默认保存在`./checkpoint`中,训练日志重定向到`./train/train.log`,如下所示:
```
```text
epoch:[ 0/200], step:[ 624/ 625], loss:[5.258/5.258], time:[140412.236], lr:[0.100]
epoch time:140522.500, per step time:224.836, avg loss:5.258
epoch:[ 1/200], step:[ 624/ 625], loss:[3.917/3.917], time:[138221.250], lr:[0.200]
@ -120,7 +118,7 @@ epoch time:138331.250, per step time:221.330, avg loss:3.917
### 启动
```
```bash
# 推理示例
python:
GPU: python eval.py --dataset_path ~/imagenet/val/ --checkpoint_path mobilenet_199.ckpt --device_targe GPU
@ -129,13 +127,13 @@ epoch time:138331.250, per step time:221.330, avg loss:3.917
GPU: sh run_infer.sh GPU ~/imagenet/val/ ~/train/mobilenet-200_625.ckpt
```
> 训练过程中可以生成检查点。
> 训练过程中可以生成检查点。
### 结果
推理结果保存示例路径中,可以在`val.log`中找到如下结果:
推理结果保存示例路径中,可以在`val.log`中找到如下结果:
```
```text
result:{'acc':0.71976314102564111} ckpt=/path/to/checkpoint/mobilenet-200_625.ckpt
```
@ -143,7 +141,7 @@ result:{'acc':0.71976314102564111} ckpt=/path/to/checkpoint/mobilenet-200_625.ck
修改`src/config.py`文件中的`export_mode`和`export_file`, 运行`export.py`。
```
```bash
python export.py --device_target [PLATFORM] --checkpoint_path [CKPT_PATH]
```
@ -173,8 +171,8 @@ python export.py --device_target [PLATFORM] --checkpoint_path [CKPT_PATH]
# 随机情况说明
[dataset.py](http://dataset.py/)中设置了“create_dataset”函数内的种子同时还使用了train.py中的随机种子。
`dataset.py`中设置了“create_dataset”函数内的种子同时还使用了train.py中的随机种子。
# ModelZoo主页
请浏览官网[主页](https://gitee.com/mindspore/mindspore/tree/master/model_zoo)。
请浏览官网[主页](https://gitee.com/mindspore/mindspore/tree/master/model_zoo)。

View File

@ -52,7 +52,7 @@
- 框架
- [MindSpore](https://www.mindspore.cn/install)
- 如需查看详情,请参见如下资源:
- [MindSpore教程](https://www.mindspore.cn/tutory/training/en/master/index.html)
- [MindSpore教程](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html)
- 安装Mindspore
- 安装[pyblind11](https://github.com/pybind/pybind11)

View File

@ -491,7 +491,7 @@ result:{'top_5_accuracy':0.9342589628681178, 'top_1_accuracy':0.768065781049936}
# 随机情况说明
[dataset.py](http://dataset.py/)中设置了“create_dataset”函数内的种子同时还使用了train.py中的随机种子。
`dataset.py`中设置了“create_dataset”函数内的种子同时还使用了train.py中的随机种子。
# ModelZoo主页

View File

@ -5,7 +5,7 @@
- [Pretrain Model](#pretrain-model)
- [Dataset](#dataset)
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
- [Quick Start](#quick-start)
- [Script Description](#script-description)
- [Script and Sample Code](#script-and-sample-code)
- [Script Parameters](#script-parameters)
@ -22,10 +22,9 @@
- [Description of Random Situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)
# [RetinaFace Description](#contents)
RetinaFace is a face detection model that was proposed in 2019 and achieved the best results on the WiderFace dataset at that time. The full title of the paper is "RetinaFace: Single-stage Dense Face Localisation in the Wild". Compared with S3FD and MTCNN, it is a significant improvement and has a higher recall rate for small faces, but it is not good at multi-scale face detection. To solve these problems, the RetinaFace feature pyramid structure is used for feature fusion between different scales, and an SSH module is added.
[Paper](https://arxiv.org/abs/1905.00641v2): Jiankang Deng, Jia Guo, Yuxiang Zhou, Jinke Yu, Irene Kotsia, Stefanos Zafeiriou. "RetinaFace: Single-stage Dense Face Localisation in the Wild". 2019.
@ -33,6 +32,7 @@ Retinaface is a face detection model, which was proposed in 2019 and achieved th
RetinaFace needs a ResNet50 backbone to extract image features for detection. You can get the ResNet50 training script from our ModelZoo, modify the pad structure of ResNet50 according to the ResNet in ./src/network.py, and finally train it on ImageNet2012 to get the ResNet50 pretrained model.
Steps:
1. Get resnet50 train script from our modelzoo.
2. Modify the resnet50 architecture according to the resnet in ```./src/network.py```. (You can also leave the structure unchanged, but the accuracy will be 2-3 percentage points lower.)
3. Train resnet50 on imagenet2012.
@ -41,47 +41,44 @@ Steps:
Specifically, the RetinaFace network is based on RetinaNet. The feature pyramid structure of RetinaNet is used in the network, and an SSH structure is added. Besides the traditional detection branch, a key-point prediction branch and a self-supervised branch are added to the network. The paper indicates that these two branches can improve the performance of the model. Here we do not implement the self-supervised branch.
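As a rough illustration of how these branches enter the training objective (the weight values below are illustrative assumptions, not the repository's tuned settings; the actual loss is the `MultiBoxLoss` used in training, see `src/loss.py`):

```python
# Sketch only: the multi-task detection objective combines classification,
# weighted box regression and weighted landmark (key point) regression.
# The weight values are assumptions for illustration.
def retinaface_objective(loss_cls, loss_box, loss_landmark,
                         loc_weight=2.0, landmark_weight=1.0):
    return loss_cls + loc_weight * loss_box + landmark_weight * loss_landmark

# example usage with dummy per-batch loss values
print(retinaface_objective(0.8, 0.5, 0.3))
```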
# [Dataset](#contents)
Dataset used: [WIDERFACE](<http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/WiderFace_Results.html>)
Dataset used: [WIDERFACE](http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/WiderFace_Results.html)
Dataset acquisition:
1. Get the dataset and annotations from [here](<https://github.com/peteryuX/retinaface-tf2>).
2. Get the eval ground truth label from [here](<https://github.com/peteryuX/retinaface-tf2/tree/master/widerface_evaluate/ground_truth>).
Dataset acquisition:
1. Get the dataset and annotations from [here](https://github.com/peteryuX/retinaface-tf2).
2. Get the eval ground truth label from [here](https://github.com/peteryuX/retinaface-tf2/tree/master/widerface_evaluate/ground_truth).
- Dataset size: 3.42G, 32,203 colorful images
- Train: 1.36G, 12,800 images
- Val: 345.95M, 3,226 images
- Test: 1.72G, 16,177 images
# [Environment Requirements](#contents)
- Hardware (GPU)
- Prepare hardware environment with GPU processor.
- Prepare hardware environment with GPU processor.
- Framework
- [MindSpore](https://www.mindspore.cn/install/en)
- [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/zh-CN/master/index.html)
- [MindSpore API](https://www.mindspore.cn/api/zh-CN/master/index.html)
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Quick Start](#contents)
After installing MindSpore via the official website and downloading the dataset, you can start training and evaluation as follows:
- running on GPU
```python
# run training example
export CUDA_VISIBLE_DEVICES=0
python train.py > train.log 2>&1 &
python train.py > train.log 2>&1 &
# run distributed training example
bash scripts/run_distribute_gpu_train.sh 4 0,1,2,3
# run evaluation example
export CUDA_VISIBLE_DEVICES=0
python eval.py > eval.log 2>&1 &
@ -89,34 +86,32 @@ After installing MindSpore via the official website and download the dataset, yo
bash run_standalone_gpu_eval.sh 0
```
# [Script Description](#contents)
## [Script and Sample Code](#contents)
```
```text
├── model_zoo
├── README.md // descriptions about all the models
├── retinaface
├── retinaface
├── README.md // descriptions about googlenet
├── scripts
├── scripts
│ ├──run_distribute_gpu_train.sh // shell script for distributed on GPU
│ ├──run_standalone_gpu_eval.sh // shell script for evaluation on GPU
├── src
├── src
│ ├──dataset.py // creating dataset
│ ├──network.py // retinaface architecture
│ ├──config.py // parameter configuration
│ ├──augmentation.py // data augment method
│ ├──loss.py // loss function
│ ├──config.py // parameter configuration
│ ├──augmentation.py // data augment method
│ ├──loss.py // loss function
│ ├──utils.py // data preprocessing
│ ├──lr_schedule.py // learning rate schedule
├── data
├── data
│ ├──widerface // dataset data
│ ├──resnet50_pretrain.ckpt // resnet50 imagenet pretrain model
│ ├──ground_truth // eval label
├── train.py // training script
├── eval.py // evaluation script
├── train.py // training script
├── eval.py // evaluation script
```
## [Script Parameters](#contents)
@ -163,39 +158,36 @@ Parameters for both training and evaluation can be set in config.py
'val_nms_threshold': 0.4, # Threshold for val NMS
'val_iou_threshold': 0.5, # Threshold for val IOU
'val_save_result': False, # Whether to save the results
'val_predict_save_folder': './widerface_result', # Result save path
'val_predict_save_folder': './widerface_result', # Result save path
'val_gt_dir': './data/ground_truth/', # Path of val set ground_truth
```
## [Training Process](#contents)
### Training
### Training
- running on GPU
```
```bash
export CUDA_VISIBLE_DEVICES=0
python train.py > train.log 2>&1 &
python train.py > train.log 2>&1 &
```
The python command above will run in the background, you can view the results through the file `train.log`.
After training, you'll get some checkpoint files under the folder `./checkpoint/` by default.
After training, you'll get some checkpoint files under the folder `./checkpoint/` by default.
### Distributed Training
- running on GPU
```
```bash
bash scripts/run_distribute_gpu_train.sh 4 0,1,2,3
```
The above shell script will run distribute training in the background. You can view the results through the file `train/train.log`.
After training, you'll get some checkpoint files under the folder `./checkpoint/ckpt_0/` by default.
The above shell script will run distribute training in the background. You can view the results through the file `train/train.log`.
After training, you'll get some checkpoint files under the folder `./checkpoint/ckpt_0/` by default.
## [Evaluation Process](#contents)
@ -204,15 +196,15 @@ Parameters for both training and evaluation can be set in config.py
- evaluation on WIDERFACE dataset when running on GPU
Before running the command below, please check the checkpoint path used for evaluation. Please set the checkpoint path to be the absolute full path in src/config.py, e.g., "username/retinaface/checkpoint/ckpt_0/RetinaFace-100_402.ckpt".
```
```bash
export CUDA_VISIBLE_DEVICES=0
python eval.py > eval.log 2>&1 &
```
The above python command will run in the background. You can view the results through the file "eval.log". The result of the test dataset will be as follows:
```
```text
# grep "Val AP" eval.log
Easy Val AP : 0.9422
Medium Val AP : 0.9325
@ -221,28 +213,26 @@ Parameters for both training and evaluation can be set in config.py
OR,
```
```bash
bash run_standalone_gpu_eval.sh 0
```
The above python command will run in the background. You can view the results through the file "eval/eval.log". The result of the test dataset will be as follows:
```
```text
# grep "Val AP" eval.log
Easy Val AP : 0.9422
Medium Val AP : 0.9325
Hard Val AP : 0.8900
```
# [Model Description](#contents)
## [Performance](#contents)
### Evaluation Performance
### Evaluation Performance
| Parameters | GPU |
| Parameters | GPU |
| -------------------------- | -------------------------------------------------------------|
| Model Version | RetinaFace + Resnet50 |
| Resource | NV SMX2 V100-16G |
@ -260,17 +250,16 @@ Parameters for both training and evaluation can be set in config.py
| Checkpoint for Fine tuning | 336.3M (.ckpt file) |
| Scripts | [retinaface script](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/retinaface) |
## [How to use](#contents)
### Continue Training on the Pretrained Model
### Continue Training on the Pretrained Model
- running on GPU
```
```python
# Load dataset
ds_train = create_dataset(training_dataset, cfg, batch_size, multiprocessing=True, num_worker=cfg['num_workers'])
# Define model
multibox_loss = MultiBoxLoss(num_classes, cfg['num_anchor'], negative_ratio, cfg['batch_size'])
lr = adjust_learning_rate(initial_lr, gamma, stepvalues, steps_per_epoch, max_epoch, warmup_epoch=cfg['warmup_epoch'])
@ -278,24 +267,24 @@ Parameters for both training and evaluation can be set in config.py
weight_decay=weight_decay, loss_scale=1)
backbone = resnet50(1001)
net = RetinaFace(phase='train', backbone=backbone)
# Continue training if resume_net is not None
pretrain_model_path = cfg['resume_net']
param_dict_retinaface = load_checkpoint(pretrain_model_path)
load_param_into_net(net, param_dict_retinaface)
net = RetinaFaceWithLossCell(net, multibox_loss, cfg)
net = TrainingWrapper(net, opt)
model = Model(net)
# Set callbacks
# Set callbacks
config_ck = CheckpointConfig(save_checkpoint_steps=cfg['save_checkpoint_steps'],
keep_checkpoint_max=cfg['keep_checkpoint_max'])
ckpoint_cb = ModelCheckpoint(prefix="RetinaFace", directory=cfg['ckpt_path'], config=config_ck)
time_cb = TimeMonitor(data_size=ds_train.get_dataset_size())
callback_list = [LossMonitor(), time_cb, ckpoint_cb]
# Start training
model.train(max_epoch, ds_train, callbacks=callback_list,
dataset_sink_mode=False)
@ -305,6 +294,6 @@ Parameters for both training and evaluation can be set in config.py
In train.py, we set the seed with setup_seed function.
# [ModelZoo Homepage](#contents)
# [ModelZoo Homepage](#contents)
Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).
View File
@ -67,8 +67,8 @@ RetinaFace uses a ResNet50 backbone to extract image features for detection. Obtain from ModelZoo
- Framework
- [MindSpore](https://www.mindspore.cn/install)
- For more information, please check the resources below:
- [MindSpore Tutorials](https://www.mindspore.cn/tutorial/zh-CN/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/api/zh-CN/master/index.html)
- [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html)
# Quick Start
View File
@ -53,7 +53,7 @@ Dataset used: COCO2017
## [Mixed Precision](#contents)
The [mixed precision](https://www.mindspore.cn/tutorial/zh-CN/master/advanced_use/mixed_precision.html) training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data formats, and maintains the network precision achieved by the single-precision training at the same time. Mixed precision training can accelerate the computation process, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware. For FP16 operators, if the input data type is FP32, the backend of MindSpore will automatically handle it with reduced precision. Users could check the reduced-precision operators by enabling INFO log and then searching reduce precision.
The [mixed precision](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/enable_mixed_precision.html) training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data formats, and maintains the network precision achieved by the single-precision training at the same time. Mixed precision training can accelerate the computation process, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware. For FP16 operators, if the input data type is FP32, the backend of MindSpore will automatically handle it with reduced precision. Users could check the reduced-precision operators by enabling INFO log and then searching reduce precision.
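For reference, a minimal, hedged sketch of how mixed precision is typically enabled in a MindSpore training script through the `Model` wrapper; the tiny network, loss and optimizer below are placeholders rather than this model's real ones:

```python
# Sketch only: enable O2 mixed precision plus static loss scaling.
import mindspore.nn as nn
from mindspore import Model
from mindspore.train.loss_scale_manager import FixedLossScaleManager

net = nn.Dense(16, 2)                                   # placeholder backbone
loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
opt = nn.Momentum(net.trainable_params(), learning_rate=0.01, momentum=0.9)

loss_scale = FixedLossScaleManager(1024, drop_overflow_update=False)
model = Model(net, loss_fn=loss, optimizer=opt,
              amp_level="O2",                  # run most ops in FP16, keep BatchNorm and the loss in FP32
              loss_scale_manager=loss_scale)   # static loss scaling guards against FP16 underflow
```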
# [Environment Requirements](#contents)
@ -68,8 +68,8 @@ To run the python scripts in the repository, you need to prepare the environment
- opencv-python 4.3.0.36
- pycocotools 2.0
- For more information, please check the resources below
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/zh-CN/master/index.html)
- [MindSpore API](https://www.mindspore.cn/api/zh-CN/master/index.html)
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Quick Start](#contents)
View File
@ -20,10 +20,10 @@
- [Inference Performance](#inference-performance)
- [ModelZoo Homepage](#modelzoo-homepage)
# [YOLOv4 Description](#contents)
YOLOv4 is a state-of-the-art detector which is faster (FPS) and more accurate (MS COCO AP50...95 and AP50) than all available alternative detectors.
YOLOv4 has verified a large number of features, and selected some of them to improve the accuracy of both the classifier and the detector.
These features can be used as best-practice for future studies and developments.
[Paper](https://arxiv.org/pdf/2004.10934.pdf):
@ -39,7 +39,8 @@ Dataset support: [MS COCO] or datasetd with the same format as MS COCO
Annotation support: [MS COCO] or annotations in the same format as MS COCO
- The directory structure is as follows; the directory and file names are user-defined:
```
```text
├── dataset
├── YOLOv4
├── annotations
@ -55,23 +56,25 @@ Annotation support: [MS COCO] or annotation as the same format as MS COCO
└─picturen.jpg
```
we suggest user to use MS COCO dataset to experience our model,
other datasets need to use the same format as MS COCO.
# [Environment Requirements](#contents)
- Hardware (Ascend)
- Prepare hardware environment with Ascend processor. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
- Prepare hardware environment with Ascend processor. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
- Framework
- [MindSpore](https://www.mindspore.cn/)
- [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html)
- [MindSpore API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html)
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Quick Start](#contents)
After installing MindSpore via the official website, you can start training and evaluation as follows:
```
```text
# The cspdarknet53_backbone.ckpt in the following script is obtained by training cspdarknet53 as in the paper.
# The parameter training_shape defines the image shape for the network; the default is
[416, 416],
@ -88,7 +91,7 @@ After installing MindSpore via the official website, you can start training and
# It means 11 kinds of shapes are used as the input shape, or it can be set to one specific shape.
```
```
```bash
#run training example(1p) by python command
python train.py \
--data_dir=./dataset/xxx \
@ -102,17 +105,17 @@ python train.py \
--lr_scheduler=cosine_annealing > log.txt 2>&1 &
```
```
```bash
# standalone training example(1p) by shell script
sh run_standalone_train.sh dataset/xxx cspdarknet53_backbone.ckpt
```
```
```bash
# For Ascend device, distributed training example(8p) by shell script
sh run_distribute_train.sh dataset/xxx cspdarknet53_backbone.ckpt rank_table_8p.json
```
```
```bash
# run evaluation by python command
python eval.py \
--data_dir=./dataset/xxx \
@ -120,7 +123,7 @@ python eval.py \
--testing_shape=416 > log.txt 2>&1 &
```
```
```bash
# run evaluation by shell script
sh run_eval.sh dataset/xxx checkpoint/xxx.ckpt
```
@ -128,11 +131,12 @@ sh run_eval.sh dataset/xxx checkpoint/xxx.ckpt
# [Script Description](#contents)
## [Script and Sample Code](#contents)
```
└─yolov4
```text
└─yolov4
├─README.md
├─mindspore_hub_conf.py # config for mindspore hub
├─scripts
├─scripts
├─run_standalone_train.sh # launch standalone training(1p) in ascend
├─run_distribute_train.sh # launch distributed training(8p) in ascend
└─run_eval.sh # launch evaluating in ascend
@ -151,15 +155,17 @@ sh run_eval.sh dataset/xxx checkpoint/xxx.ckpt
├─util.py # util function
├─yolo.py # yolov4 network
├─yolo_dataset.py # create dataset for YOLOV4
├─eval.py # evaluate val results
├─test.py # evaluate test results
└─train.py # train net
```
## [Script Parameters](#contents)
Major parameters in train.py are as follows:
```
```text
optional arguments:
-h, --help show this help message and exit
--device_target device where the code will be implemented: "Ascend", default is "Ascend"
@ -219,16 +225,21 @@ optional arguments:
```
## [Training Process](#contents)
YOLOv4 can be trained from scratch or with the backbone named cspdarknet53.
Cspdarknet53 is a classifier which can be trained on a dataset such as ImageNet (ILSVRC2012).
It is easy for users to train cspdarknet53: just replace the backbone of the ResNet50 classifier with cspdarknet53.
ResNet50 is easy to get in the MindSpore model zoo.
### Training
For Ascend device, standalone training example(1p) by shell script
```
```bash
sh run_standalone_train.sh dataset/coco2017 cspdarknet53_backbone.ckpt
```
```
```text
python train.py \
--data_dir=/dataset/xxx \
--pretrained_backbone=cspdarknet53_backbone.ckpt \
@ -240,10 +251,12 @@ python train.py \
--training_shape=416 \
--lr_scheduler=cosine_annealing > log.txt 2>&1 &
```
The python command above will run in the background, you can view the results through the file log.txt.
After training, you'll get some checkpoint files under the outputs folder by default. The loss value will be achieved as follows:
```
```text
# grep "loss:" train/log.txt
2020-10-16 15:00:37,483:INFO:epoch[0], iter[0], loss:8248.610352, 0.03 imgs/sec, lr:2.0466639227834094e-07
@ -259,13 +272,16 @@ After training, you'll get some checkpoint files under the outputs folder by def
```
### Distributed Training
For Ascend device, distributed training example(8p) by shell script
```
```bash
sh run_distribute_train.sh dataset/coco2017 cspdarknet53_backbone.ckpt rank_table_8p.json
```
The above shell script will run distribute training in the background. You can view the results through the file train_parallel[X]/log.txt. The loss value will be achieved as follows:
```
```text
# distribute training result(8p, shape=416)
...
2020-10-16 14:58:25,142:INFO:epoch[0], iter[1000], loss:242.509259, 388.73 imgs/sec, lr:0.00032783843926154077
@ -286,7 +302,7 @@ The above shell script will run distribute training in the background. You can v
```
```
```text
# distribute training result(8p, dynamic shape)
...
2020-10-16 20:40:17,148:INFO:epoch[0], iter[800], loss:283.765033, 248.93 imgs/sec, lr:0.00026233625249005854
@ -305,12 +321,11 @@ The above shell script will run distribute training in the background. You can v
...
```
## [Evaluation Process](#contents)
### Valid
```
```bash
python eval.py \
--data_dir=./dataset/coco2017 \
--pretrained=yolov4.ckpt \
@ -320,7 +335,8 @@ sh run_eval.sh dataset/coco2017 checkpoint/yolov4.ckpt
```
The above python command will run in the background. You can view the results through the file "log.txt". The mAP of the test dataset will be as follows:
```
```text
# log.txt
=============coco eval reulst=========
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.442
@ -336,8 +352,10 @@ The above python command will run in the background. You can view the results th
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.638
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.717
```
### Test-dev
```
```bash
python test.py \
--data_dir=./dataset/coco2017 \
--pretrained=yolov4.ckpt \
@ -345,11 +363,13 @@ python test.py \
OR
sh run_test.sh dataset/coco2017 checkpoint/yolov4.ckpt
```
The predict_xxx.json will be found in test/outputs/%Y-%m-%d_time_%H_%M_%S/.
Rename the file predict_xxx.json to detections_test-dev2017_yolov4_results.json and compress it to detections_test-dev2017_yolov4_results.zip
Submit file detections_test-dev2017_yolov4_results.zip to the MS COCO evaluation server for the test-dev2019 (bbox) https://competitions.codalab.org/competitions/20794#participate
Submit file detections_test-dev2017_yolov4_results.zip to the MS COCO evaluation server for the test-dev2019 (bbox) <https://competitions.codalab.org/competitions/20794#participate>
You will get the following results at the end of the "View scoring output log" file.
```
```text
overall performance
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.447
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.642
@ -364,9 +384,11 @@ overall performance
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.627
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.711
```
## [Convert Process](#contents)
### Convert
If you want to infer the network on Ascend 310, you should convert the model to AIR:
```python
@ -378,6 +400,7 @@ python src/export.py --pretrained=[PRETRAINED_BACKBONE] --batch_size=[BATCH_SIZE
## [Performance](#contents)
### Evaluation Performance
YOLOv4 on 118K images(The annotation and data format must be the same as coco2017)
| Parameters | YOLOv4 |
@ -394,9 +417,10 @@ YOLOv4 on 118K images(The annotation and data format must be the same as coco201
| Speed | 1p 53FPS 8p 390FPS(shape=416) 220FPS(dynamic shape) |
| Total time | 48h(dynamic shape) |
| Checkpoint for Fine tuning | about 500M (.ckpt file) |
| Scripts | https://gitee.com/mindspore/mindspore/tree/master/model_zoo/ |
| Scripts | <https://gitee.com/mindspore/mindspore/tree/master/model_zoo/> |
### Inference Performance
YOLOv4 on 20K images(The annotation and data format must be the same as coco test2017 )
| Parameters | YOLOv4 |
@ -416,4 +440,5 @@ In dataset.py, we set the seed inside ```create_dataset``` function.
In var_init.py, we set the seed for weight initialization.
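For reference, a minimal sketch of how such a dataset-level seed is usually fixed in MindSpore (the value 1 is only an example; the real call sits inside ```create_dataset```):

```python
# Sketch only: fix the dataset pipeline seed so shuffling and random
# augmentation are reproducible across runs.
import mindspore.dataset as ds

ds.config.set_seed(1)        # example seed value
print(ds.config.get_seed())  # -> 1
```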
# [ModelZoo Homepage](#contents)
Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).
View File
@ -52,7 +52,7 @@ Note that you can run the scripts based on the dataset mentioned in original pap
- Install [MindSpore](https://www.mindspore.cn/install/en).
- For more information, please check the resources below:
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
- [MindSpore API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
## Software
View File
@ -550,8 +550,8 @@ The comparisons between MASS and other baseline methods in terms of PPL on Corne
- Framework
- [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/zh-CN/master/index.html)
- [MindSpore API](https://www.mindspore.cn/api/zh-CN/master/index.html)
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
## Requirements
@ -562,7 +562,7 @@ subword-nmt
rouge
```
<https://www.mindspore.cn/tutorial/zh-CN/master/advanced_use/network_migration.html>
<https://www.mindspore.cn/tutorial/training/en/master/advanced_use/migrate_3rd_scripts.html>
# Get started
@ -624,7 +624,7 @@ Get the log and output files under the path `./train_mass_*/`, and the model fil
## Inference
If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/tutorial/zh-CN/master/advanced_use/network_migration.html).
If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/migrate_3rd_scripts.html).
For inference, config the options in `config.json` firstly:
- Assign the `test_dataset` under `dataset_config` node to the dataset path.
View File
@ -1,4 +1,5 @@
# Contents
- [Contents](#contents)
- [TinyBERT Description](#tinybert-description)
- [Model Architecture](#model-architecture)
@ -6,58 +7,64 @@
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
- [Script Description](#script-description)
- [Script and Sample Code](#script-and-sample-code)
- [Script Parameters](#script-parameters)
- [General Distill](#general-distill)
- [Task Distill](#task-distill)
- [Options and Parameters](#options-and-parameters)
- [Options:](#options)
- [Parameters:](#parameters)
- [Training Process](#training-process)
- [Training](#training)
- [running on Ascend](#running-on-ascend)
- [running on GPU](#running-on-gpu)
- [Distributed Training](#distributed-training)
- [running on Ascend](#running-on-ascend-1)
- [running on GPU](#running-on-gpu-1)
- [Evaluation Process](#evaluation-process)
- [Evaluation](#evaluation)
- [evaluation on SST-2 dataset](#evaluation-on-sst-2-dataset)
- [evaluation on MNLI dataset](#evaluation-on-mnli-dataset)
- [evaluation on QNLI dataset](#evaluation-on-qnli-dataset)
- [Model Description](#model-description)
- [Performance](#performance)
- [training Performance](#training-performance)
- [Inference Performance](#inference-performance)
- [Script and Sample Code](#script-and-sample-code)
- [Script Parameters](#script-parameters)
- [General Distill](#general-distill)
- [Task Distill](#task-distill)
- [Options and Parameters](#options-and-parameters)
- [Options:](#options)
- [Parameters:](#parameters)
- [Training Process](#training-process)
- [Training](#training)
- [running on Ascend](#running-on-ascend)
- [running on GPU](#running-on-gpu)
- [Distributed Training](#distributed-training)
- [running on Ascend](#running-on-ascend-1)
- [running on GPU](#running-on-gpu-1)
- [Evaluation Process](#evaluation-process)
- [Evaluation](#evaluation)
- [evaluation on SST-2 dataset](#evaluation-on-sst-2-dataset)
- [evaluation on MNLI dataset](#evaluation-on-mnli-dataset)
- [evaluation on QNLI dataset](#evaluation-on-qnli-dataset)
- [Model Description](#model-description)
- [Performance](#performance)
- [training Performance](#training-performance)
- [Inference Performance](#inference-performance)
- [Description of Random Situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)
# [TinyBERT Description](#contents)
[TinyBERT](https://github.com/huawei-noah/Pretrained-Language-Model/tree/master/TinyBERT) is 7.5x smaller and 9.4x faster on inference than [BERT-base](https://github.com/google-research/bert) (the base version of BERT model) and achieves competitive performances in the tasks of natural language understanding. It performs a novel transformer distillation at both the pre-training and task-specific learning stages.
[Paper](https://arxiv.org/abs/1909.10351): Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, Fang Wang, Qun Liu. [TinyBERT: Distilling BERT for Natural Language Understanding](https://arxiv.org/abs/1909.10351). arXiv preprint arXiv:1909.10351.
[Paper](https://arxiv.org/abs/1909.10351): Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, Fang Wang, Qun Liu. [TinyBERT: Distilling BERT for Natural Language Understanding](https://arxiv.org/abs/1909.10351). arXiv preprint arXiv:1909.10351.
# [Model Architecture](#contents)
The backbone structure of TinyBERT is the transformer. The transformer contains four encoder modules; each encoder contains a self-attention module, and each self-attention module contains an attention module.
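As a rough, hedged sketch of the transformer distillation idea (layer-to-layer matching with MSE; the shapes, layer mapping and function names here are illustrative and not the repository's actual implementation):

```python
# Sketch only: the student mimics the teacher's hidden states and attention maps.
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor

mse = nn.MSELoss()

def transformer_distill_loss(student_hidden, teacher_hidden, student_att, teacher_att):
    pairs = list(zip(student_hidden, teacher_hidden)) + list(zip(student_att, teacher_att))
    total = mse(*pairs[0])
    for s, t in pairs[1:]:
        total = total + mse(s, t)   # accumulate MSE over all matched layers
    return total

# toy usage with a single matched layer pair
s = [Tensor(np.zeros((2, 8, 16), np.float32))]
t = [Tensor(np.zeros((2, 8, 16), np.float32))]
print(transformer_distill_loss(s, t, s, t))
```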
# [Dataset](#contents)
- Download the zhwiki or enwiki dataset for general distillation. Extract and clean text in the dataset with [WikiExtractor](https://github.com/attardi/wikiextractor). To convert the dataset to TFRecord format, please refer to create_pretraining_data.py in the [BERT](https://github.com/google-research/bert) repository.
- Download the GLUE dataset for task distillation. To convert the dataset files from JSON format to TFRecord format, please refer to run_classifier.py in the [BERT](https://github.com/google-research/bert) repository.
# [Environment Requirements](#contents)
- Hardware (Ascend/GPU)
- Prepare hardware environment with Ascend or GPU processor. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
- Prepare hardware environment with Ascend or GPU processor. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
- Framework
- [MindSpore](https://gitee.com/mindspore/mindspore)
- [MindSpore](https://gitee.com/mindspore/mindspore)
- For more information, please check the resources below
- [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
- [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Quick Start](#contents)
After installing MindSpore via the official website, you can start general distill, task distill and evaluation as follows:
```bash
```text
# run standalone general distill example
bash scripts/run_standalone_gd.sh
bash scripts/run_standalone_gd.sh
Before running the shell script, please set the `load_teacher_ckpt_path`, `data_dir`, `schema_dir` and `dataset_type` in the run_standalone_gd.sh file first. If running on GPU, please set the `device_target=GPU`.
@ -70,7 +77,7 @@ Before running the shell script, please set the `load_teacher_ckpt_path`, `data_
bash scripts/run_distributed_gd_gpu.sh 8 1 /path/data/ /path/schema.json /path/teacher.ckpt
# run task distill and evaluation example
bash scripts/run_standalone_td.sh
bash scripts/run_standalone_td.sh
Before running the shell script, please set the `task_name`, `load_teacher_ckpt_path`, `load_gd_ckpt_path`, `train_data_dir`, `eval_data_dir`, `schema_dir` and `dataset_type` in the run_standalone_td.sh file first.
If running on GPU, please set the `device_target=GPU`.
@ -80,39 +87,41 @@ For distributed training on Ascend, a hccl configuration file with JSON format n
Please follow the instructions in the link below:
https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools.
For dataset, if you want to set the format and parameters, a schema configuration file with JSON format needs to be created, please refer to [tfrecord](https://www.mindspore.cn/doc/programming_guide/zh-CN/master/dataset_loading.html#tfrecord) format.
```
For dataset, if you want to set the format and parameters, a schema configuration file with JSON format needs to be created, please refer to [tfrecord](https://www.mindspore.cn/doc/programming_guide/en/master/dataset_loading.html#tfrecord) format.
```text
For general task, schema file contains ["input_ids", "input_mask", "segment_ids"].
For task distill and eval phase, schema file contains ["input_ids", "input_mask", "segment_ids", "label_ids"].
For task distill and eval phase, schema file contains ["input_ids", "input_mask", "segment_ids", "label_ids"].
`numRows` is the only option that can be set by the user; the other values must be set according to the dataset.
For example, the dataset is cn-wiki-128, the schema file for general distill phase as following:
{
"datasetType": "TF",
"numRows": 7680,
"columns": {
"input_ids": {
"type": "int64",
"rank": 1,
"shape": [256]
},
"input_mask": {
"type": "int64",
"rank": 1,
"shape": [256]
},
"segment_ids": {
"type": "int64",
"rank": 1,
"shape": [256]
}
}
"datasetType": "TF",
"numRows": 7680,
"columns": {
"input_ids": {
"type": "int64",
"rank": 1,
"shape": [256]
},
"input_mask": {
"type": "int64",
"rank": 1,
"shape": [256]
},
"segment_ids": {
"type": "int64",
"rank": 1,
"shape": [256]
}
}
}
```
# [Script Description](#contents)
## [Script and Sample Code](#contents)
```shell
@ -134,19 +143,21 @@ For example, the dataset is cn-wiki-128, the schema file for general distill pha
├─tinybert_model.py # backbone code of network
├─utils.py # util function
├─__init__.py
├─run_general_distill.py # train net for general distillation
├─run_task_distill.py # train and eval net for task distillation
├─run_general_distill.py # train net for general distillation
├─run_task_distill.py # train and eval net for task distillation
```
## [Script Parameters](#contents)
### General Distill
```
usage: run_general_distill.py [--distribute DISTRIBUTE] [--epoch_size N] [--device_num N] [--device_id N]
```text
usage: run_general_distill.py [--distribute DISTRIBUTE] [--epoch_size N] [--device_num N] [--device_id N]
[--device_target DEVICE_TARGET] [--do_shuffle DO_SHUFFLE]
[--enable_data_sink ENABLE_DATA_SINK] [--data_sink_steps N]
[--enable_data_sink ENABLE_DATA_SINK] [--data_sink_steps N]
[--save_ckpt_path SAVE_CKPT_PATH]
[--load_teacher_ckpt_path LOAD_TEACHER_CKPT_PATH]
[--save_checkpoint_step N] [--max_ckpt_num N]
[--save_checkpoint_step N] [--max_ckpt_num N]
[--data_dir DATA_DIR] [--schema_dir SCHEMA_DIR] [--dataset_type DATASET_TYPE] [train_steps N]
options:
@ -155,7 +166,7 @@ options:
--epoch_size epoch size: N, default is 1
--device_id device id: N, default is 0
--device_num number of used devices: N, default is 1
--save_ckpt_path path to save checkpoint files: PATH, default is ""
--save_ckpt_path path to save checkpoint files: PATH, default is ""
--max_ckpt_num max number for saving checkpoint files: N, default is 1
--do_shuffle enable shuffle: "true" | "false", default is "true"
--enable_data_sink enable data sink: "true" | "false", default is "true"
@ -166,14 +177,15 @@ options:
--schema_dir path to schema.json file, PATH, default is ""
--dataset_type the dataset type which can be tfrecord/mindrecord, default is tfrecord
```
### Task Distill
```
usage: run_general_task.py [--device_target DEVICE_TARGET] [--do_train DO_TRAIN] [--do_eval DO_EVAL]
[--td_phase1_epoch_size N] [--td_phase2_epoch_size N]
```text
usage: run_general_task.py [--device_target DEVICE_TARGET] [--do_train DO_TRAIN] [--do_eval DO_EVAL]
[--td_phase1_epoch_size N] [--td_phase2_epoch_size N]
[--device_id N] [--do_shuffle DO_SHUFFLE]
[--enable_data_sink ENABLE_DATA_SINK] [--save_ckpt_step N]
[--max_ckpt_num N] [--data_sink_steps N]
[--enable_data_sink ENABLE_DATA_SINK] [--save_ckpt_step N]
[--max_ckpt_num N] [--data_sink_steps N]
[--load_teacher_ckpt_path LOAD_TEACHER_CKPT_PATH]
[--load_gd_ckpt_path LOAD_GD_CKPT_PATH]
[--load_td1_ckpt_path LOAD_TD1_CKPT_PATH]
@ -188,8 +200,8 @@ options:
--td_phase1_epoch_size epoch size for td phase1: N, default is 10
--td_phase2_epoch_size epoch size for td phase2: N, default is 3
--device_id device id: N, default is 0
--do_shuffle enable shuffle: "true" | "false", default is "true"
--enable_data_sink enable data sink: "true" | "false", default is "true"
--do_shuffle enable shuffle: "true" | "false", default is "true"
--enable_data_sink enable data sink: "true" | "false", default is "true"
--save_ckpt_step steps for saving checkpoint files: N, default is 1000
--max_ckpt_num max number for saving checkpoint files: N, default is 1
--data_sink_steps set data sink steps: N, default is 1
@ -204,14 +216,17 @@ options:
```
## Options and Parameters
`gd_config.py` and `td_config.py` contain parameters of BERT model and options for optimizer and lossscale.
### Options:
```
### Options
```text
batch_size batch size of input dataset: N, default is 16
Parameters for lossscale:
loss_scale_value initial value of loss scale: N, default is 2^8
scale_factor factor used to update loss scale: N, default is 2
scale_window steps for one update of loss scale: N, default is 50
Parameters for optimizer:
learning_rate value of learning rate: Q
@ -221,8 +236,9 @@ Parameters for optimizer:
eps term added to the denominator to improve numerical stability: Q
```
### Parameters:
```
### Parameters
```text
Parameters for bert network:
seq_length length of input sequence: N, default is 128
vocab_size size of each embedding vector: N, must be consistent with the dataset you use. Default is 30522
@ -242,15 +258,22 @@ Parameters for bert network:
dtype data type of input: mstype.float16 | mstype.float32, default is mstype.float32
compute_type compute type in BertTransformer: mstype.float16 | mstype.float32, default is mstype.float16
```
## [Training Process](#contents)
### Training
#### running on Ascend
Before running the command below, please check that `load_teacher_ckpt_path`, `data_dir` and `schema_dir` have been set. Please set the path to be the absolute full path, e.g. "/username/checkpoint_100_300.ckpt".
```
```bash
bash scripts/run_standalone_gd.sh
```
The command above will run in the background, and you can view the results in the file log.txt. After training, you will get some checkpoint files under the script folder by default. The loss value will be achieved as follows:
```
```text
# grep "epoch" log.txt
epoch: 1, step: 100, outpus are (Tensor(shape=[1], dtype=Float32, 28.2093), Tensor(shape=[], dtype=Bool, False), Tensor(shape=[], dtype=Float32, 65536))
epoch: 2, step: 200, outpus are (Tensor(shape=[1], dtype=Float32, 30.1724), Tensor(shape=[], dtype=Bool, False), Tensor(shape=[], dtype=Float32, 65536))
@ -260,25 +283,34 @@ epoch: 2, step: 200, outpus are (Tensor(shape=[1], dtype=Float32, 30.1724), Tens
> **Attention** This will bind the processor cores according to the `device_num` and total processor numbers. If you don't expect to run pretraining with binding processor cores, remove the operations about `taskset` in `scripts/run_distributed_gd_ascend.sh`
#### running on GPU
Before running the command below, please check that `load_teacher_ckpt_path`, `data_dir`, `schema_dir` and `device_target=GPU` have been set. Please set the path to be the absolute full path, e.g. "/username/checkpoint_100_300.ckpt".
```
```bash
bash scripts/run_standalone_gd.sh
```
The command above will run in the background, and you can view the results in the file log.txt. After training, you will get some checkpoint files under the script folder by default. The loss value will be achieved as follows:
```
```text
# grep "epoch" log.txt
epoch: 1, step: 100, outpus are 28.2093
...
```
### Distributed Training
#### running on Ascend
Before running the command below, please check that `load_teacher_ckpt_path`, `data_dir` and `schema_dir` have been set. Please set the path to be the absolute full path, e.g. "/username/checkpoint_100_300.ckpt".
```
```bash
bash scripts/run_distributed_gd_ascend.sh 8 1 /path/hccl.json
```
The command above will run in the background, and you can view the results in the file log.txt. After training, you will get some checkpoint files under the LOG* folder by default. The loss value will be achieved as follows:
```
```text
# grep "epoch" LOG*/log.txt
epoch: 1, step: 100, outpus are (Tensor(shape=[1], dtype=Float32, 28.1478), Tensor(shape=[], dtype=Bool, False), Tensor(shape=[], dtype=Float32, 65536))
...
@ -287,25 +319,35 @@ epoch: 1, step: 100, outpus are (Tensor(shape=[1], dtype=Float32, 30.5901), Tens
```
#### running on GPU
Please input the path to be the absolute full path, e.g:"/username/checkpoint_100_300.ckpt".
```
```bash
bash scripts/run_distributed_gd_gpu.sh 8 1 /path/data/ /path/schema.json /path/teacher.ckpt
```
The command above will run in the background, and you can view the results in the file log.txt. After training, you will get some checkpoint files under the LOG* folder by default. The loss value will be achieved as follows:
```
```text
# grep "epoch" LOG*/log.txt
epoch: 1, step: 1, outpus are 63.4098
...
```
## [Evaluation Process](#contents)
### Evaluation
If you want to run evaluation right after training, please set `do_train=true` and `do_eval=true`; if you want to run evaluation alone, please set `do_train=false` and `do_eval=true`. If running on GPU, please set `device_target=GPU`.
#### evaluation on SST-2 dataset
```
```bash
bash scripts/run_standalone_td.sh
```
The command above will run in the background, and you can view the results in the file log.txt. The accuracy of the test dataset will be as follows:
```bash
# grep "The best acc" log.txt
The best acc is 0.872685
@ -315,13 +357,18 @@ The best acc is 0.899305
The best acc is 0.902777
...
```
#### evaluation on MNLI dataset
Before running the command below, please check the load pretrain checkpoint path has been set. Please set the checkpoint path to be the absolute full path, e.g:"/username/pretrain/checkpoint_100_300.ckpt".
```
```bash
bash scripts/run_standalone_td.sh
```
The command above will run in the background, and you can view the results in the file log.txt. The accuracy of the test dataset will be as follows:
```
The command above will run in the background, and you can view the results in the file log.txt. The accuracy of the test dataset will be as follows:
```text
# grep "The best acc" log.txt
The best acc is 0.803206
The best acc is 0.803308
@ -330,13 +377,18 @@ The best acc is 0.810355
The best acc is 0.813929
...
```
#### evaluation on QNLI dataset
Before running the command below, please check the load pretrain checkpoint path has been set. Please set the checkpoint path to be the absolute full path, e.g:"/username/pretrain/checkpoint_100_300.ckpt".
```
```bash
bash scripts/run_standalone_td.sh
```
The command above will run in the background, and you can view the results in the file log.txt. The accuracy of the test dataset will be as follows:
```
The command above will run in the background, and you can view the results in the file log.txt. The accuracy of the test dataset will be as follows:
```text
# grep "The best acc" log.txt
The best acc is 0.870772
The best acc is 0.871691
@ -345,10 +397,13 @@ The best acc is 0.875183
The best acc is 0.891176
...
```
## [Model Description](#contents)
## [Performance](#contents)
### training Performance
| Parameters | Ascend | GPU |
| -------------------------- | ---------------------------------------------------------- | ------------------------- |
| Model Version | TinyBERT | TinyBERT |
@ -364,13 +419,13 @@ The best acc is 0.891176
| Speed | 35.4ms/step | 98.654ms/step |
| Total time | 17.3h (3 epochs, 8p) | 48h (3 epochs, 8p) |
| Params (M) | 15M | 15M |
| Checkpoint for task distill| 74M(.ckpt file) | 74M(.ckpt file) |
| Checkpoint for task distill| 74M(.ckpt file) | 74M(.ckpt file) |
| Scripts | [TinyBERT](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/nlp/tinybert) | |
#### Inference Performance
| Parameters | Ascend | GPU |
| -------------------------- | ----------------------------- | ------------------------- |
| -------------------------- | ----------------------------- | ------------------------- |
| Model Version | | |
| Resource | Ascend 910 | NV SMX2 V100-32G |
| uploaded Date | 08/20/2020 | 08/24/2020 |
@ -384,12 +439,12 @@ The best acc is 0.891176
# [Description of Random Situation](#contents)
In run_standalone_td.sh, we set do_shuffle to shuffle the dataset.
In gd_config.py and td_config.py, we set hidden_dropout_prob and attention_probs_dropout_prob to drop out some network nodes.
In run_general_distill.py, we set the random seed to make sure distribute training has the same init weight.
# [ModelZoo Homepage](#contents)
Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).
Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).
View File
@ -6,7 +6,7 @@
- [Features](#features)
- [Mixed Precision](#mixed-precision)
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
- [Quick Start](#quick-start)
- [Script Description](#script-description)
- [Script and Sample Code](#script-and-sample-code)
- [Script Parameters](#script-parameters)
@ -20,46 +20,48 @@
- [Evaluation Performance](#evaluation-performance)
- [Inference Performance](#evaluation-performance)
- [How to use](#how-to-use)
- [Inference](#inference)
- [Inference](#inference)
- [Continue Training on the Pretrained Model](#continue-training-on-the-pretrained-model)
- [Transfer Learning](#transfer-learning)
- [Transfer Learning](#transfer-learning)
- [Description of Random Situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)
# [NCF Description](#contents)
NCF is a general framework for collaborative filtering of recommendations in which a neural network architecture is used to model user-item interactions. Unlike traditional models, NCF does not resort to Matrix Factorization (MF) with an inner product on latent features of users and items. It replaces the inner product with a multi-layer perceptron that can learn an arbitrary function from data.
[Paper](https://arxiv.org/abs/1708.05031): He X, Liao L, Zhang H, et al. Neural collaborative filtering[C]//Proceedings of the 26th international conference on world wide web. 2017: 173-182.
# [Model Architecture](#contents)
Two instantiations of NCF are Generalized Matrix Factorization (GMF) and Multi-Layer Perceptron (MLP). GMF applies a linear kernel to model the latent feature interactions, and MLP uses a nonlinear kernel to learn the interaction function from data. NeuMF is a fused model of GMF and MLP to better model the complex user-item interactions, and unifies the strengths of linearity of MF and non-linearity of MLP for modeling the user-item latent structures. NeuMF allows GMF and MLP to learn separate embeddings, and combines the two models by concatenating their last hidden layer. [neumf_model.py](neumf_model.py) defines the architecture details.
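As a rough illustration of that fusion (the class name and dimensions below are made up for the sketch; see `src/ncf.py` for the actual layers):

```python
# Sketch only: NeuMF concatenates the GMF and MLP branch outputs and maps
# them to a single interaction score.
import numpy as np
import mindspore.nn as nn
import mindspore.ops as ops
from mindspore import Tensor

class NeuMFHead(nn.Cell):
    def __init__(self, gmf_dim, mlp_dim):
        super().__init__()
        self.concat = ops.Concat(axis=1)
        self.predict = nn.Dense(gmf_dim + mlp_dim, 1)   # final prediction layer

    def construct(self, gmf_vector, mlp_vector):
        # gmf_vector: element-wise product of user/item embeddings (linear kernel)
        # mlp_vector: last hidden layer of the MLP branch (nonlinear kernel)
        return self.predict(self.concat((gmf_vector, mlp_vector)))

head = NeuMFHead(gmf_dim=16, mlp_dim=32)
score = head(Tensor(np.ones((4, 16), np.float32)), Tensor(np.ones((4, 32), np.float32)))
print(score.shape)  # (4, 1)
```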
# [Dataset](#contents)
The [MovieLens datasets](http://files.grouplens.org/datasets/movielens/) are used for model training and evaluation. Specifically, we use two datasets: **ml-1m** (short for MovieLens 1 million) and **ml-20m** (short for MovieLens 20 million).
### ml-1m
## ml-1m
ml-1m dataset contains 1,000,209 anonymous ratings of approximately 3,706 movies made by 6,040 users who joined MovieLens in 2000. All ratings are contained in the file "ratings.dat" without header row, and are in the following format:
```
```cpp
UserID::MovieID::Rating::Timestamp
```
- UserIDs range between 1 and 6040.
- MovieIDs range between 1 and 3952.
- Ratings are made on a 5-star scale (whole-star ratings only).
### ml-20m
- UserIDs range between 1 and 6040.
- MovieIDs range between 1 and 3952.
- Ratings are made on a 5-star scale (whole-star ratings only).
## ml-20m
ml-20m dataset contains 20,000,263 ratings of 26,744 movies by 138493 users. All ratings are contained in the file "ratings.csv". Each line of this file after the header row represents one rating of one movie by one user, and has the following format:
```
```text
userId,movieId,rating,timestamp
```
- The lines within this file are ordered first by userId, then, within user, by movieId.
- Ratings are made on a 5-star scale, with half-star increments (0.5 stars - 5.0 stars).
- The lines within this file are ordered first by userId, then, within user, by movieId.
- Ratings are made on a 5-star scale, with half-star increments (0.5 stars - 5.0 stars).
In both datasets, the timestamp is represented in seconds since midnight Coordinated Universal Time (UTC) of January 1, 1970. Each user has at least 20 ratings.
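For instance, a quick way to sanity-check a rating's timestamp (the value below is only an illustrative example, not taken from a specific row):

```python
# Sketch only: convert a MovieLens Unix timestamp to a readable UTC datetime.
from datetime import datetime, timezone

ts = 978300760  # illustrative example value
print(datetime.fromtimestamp(ts, tz=timezone.utc))  # 2000-12-31 22:12:40+00:00
```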
@ -67,26 +69,22 @@ In both datasets, the timestamp is represented in seconds since midnight Coordin
## Mixed Precision
The [mixed precision](https://www.mindspore.cn/tutorial/zh-CN/master/advanced_use/mixed_precision.html) training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data formats, and maintains the network precision achieved by the single-precision training at the same time. Mixed precision training can accelerate the computation process, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware.
The [mixed precision](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/enable_mixed_precision.html) training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data formats, and maintains the network precision achieved by the single-precision training at the same time. Mixed precision training can accelerate the computation process, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware.
For FP16 operators, if the input data type is FP32, the backend of MindSpore will automatically handle it with reduced precision. Users could check the reduced-precision operators by enabling INFO log and then searching reduce precision.
# [Environment Requirements](#contents)
- Hardware (Ascend/GPU)
- Prepare hardware environment with Ascend or GPU processor. If you want to try Ascend , please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
- Prepare hardware environment with Ascend or GPU processor. If you want to try Ascend , please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
- Framework
- [MindSpore](https://www.mindspore.cn/install/en)
- [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/zh-CN/master/index.html)
- [MindSpore API](https://www.mindspore.cn/api/zh-CN/master/index.html)
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Quick Start](#contents)
After installing MindSpore via the official website, you can start training and evaluation as follows:
After installing MindSpore via the official website, you can start training and evaluation as follows:
```python
#run data process
@ -102,34 +100,31 @@ sh scripts/run_train.sh rank_table.json
sh run_eval.sh
```
# [Script Description](#contents)
## [Script and Sample Code](#contents)
```
├── ModelZoo_NCF_ME
```text
├── ModelZoo_NCF_ME
├── README.md // descriptions about NCF
├── scripts
│ ├──run_train.sh // shell script for train
│ ├──run_distribute_train.sh // shell script for distribute train
│ ├──run_eval.sh // shell script for evaluation
│ ├──run_download_dataset.sh // shell script for dataget and process
│ ├──run_transfer_ckpt_to_air.sh // shell script for transfer model style
├── src
├── scripts
│ ├──run_train.sh // shell script for train
│ ├──run_distribute_train.sh // shell script for distribute train
│ ├──run_eval.sh // shell script for evaluation
│ ├──run_download_dataset.sh // shell script for dataget and process
│ ├──run_transfer_ckpt_to_air.sh // shell script for transfer model style
├── src
│ ├──dataset.py // creating dataset
│ ├──ncf.py // ncf architecture
│ ├──config.py // parameter configuration
│ ├──movielens.py // data download file
│ ├──callbacks.py // model loss and eval callback file
│ ├──constants.py // the constants of model
│ ├──export.py // export checkpoint files into geir/onnx
│ ├──config.py // parameter configuration
│ ├──movielens.py // data download file
│ ├──callbacks.py // model loss and eval callback file
│ ├──constants.py // the constants of model
│ ├──export.py // export checkpoint files into geir/onnx
│ ├──metrics.py // the file for auc compute
│ ├──stat_utils.py // the file for data process functions
├── train.py // training script
├── eval.py // evaluation script
├── train.py // training script
├── eval.py // evaluation script
```
## [Script Parameters](#contents)
@ -149,15 +144,15 @@ Parameters for both training and evaluation can be set in config.py.
* `--num_factors`: The embedding size of the MF model.
* `--output_path`: The location of the output file.
* `--eval_file_name` : Eval output file.
* `--loss_file_name` : Loss output file.
* `--loss_file_name` : Loss output file.
```
## [Training Process](#contents)
### Training
### Training
```python
bash scripts/run_train.sh
bash scripts/run_train.sh
```
The python command above will run in the background, you can view the results through the file `train.log`. After training, you'll get some checkpoint files under the script folder by default. The loss value will be achieved as follows:
@ -171,7 +166,7 @@ Parameters for both training and evaluation can be set in config.py.
...
```
The model checkpoint will be saved in the current directory.
The model checkpoint will be saved in the current directory.
## [Evaluation Process](#contents)
@ -182,7 +177,7 @@ Parameters for both training and evaluation can be set in config.py.
Before running the command below, please check the checkpoint path used for evaluation. Please set the checkpoint path to be the absolute full path, e.g., "checkpoint/ncf-125_390.ckpt".
```python
sh scripts/run_eval.sh
sh scripts/run_eval.sh
```
The above python command will run in the background. You can view the results through the file "eval.log". The accuracy of the test dataset will be as follows:
@ -192,12 +187,11 @@ Parameters for both training and evaluation can be set in config.py.
HR:0.6846,NDCG:0.410
```
# [Model Description](#contents)
## [Performance](#contents)
### Evaluation Performance
### Evaluation Performance
| Parameters | Ascend |
| -------------------------- | ------------------------------------------------------------ |
@ -213,90 +207,86 @@ Parameters for both training and evaluation can be set in config.py.
| Speed | 1pc: 0.575 ms/step |
| Total time | 1pc: 5 mins |
### Inference Performance
| Parameters | Ascend |
| ------------------- | --------------------------- |
| Model Version | NCF |
| Resource | Ascend 910 |
| Uploaded Date | 10/23/2020 (month/day/year) |
| Parameters | Ascend |
| ------------------- | --------------------------- |
| Model Version | NCF |
| Resource | Ascend 910 |
| Uploaded Date | 10/23/2020 (month/day/year) |
| MindSpore Version | 1.0.0 |
| Dataset | ml-1m |
| batch_size | 256 |
| outputs | probability |
| Accuracy | HR:0.6846,NDCG:0.410 |
| Dataset | ml-1m |
| batch_size | 256 |
| outputs | probability |
| Accuracy | HR:0.6846,NDCG:0.410 |
## [How to use](#contents)
### Inference
If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/tutorial/zh-CN/master/advanced_use/network_migration.html). Following the steps below, this is a simple example:
If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/migrate_3rd_scripts.html). Following the steps below, this is a simple example:
https://www.mindspore.cn/tutorial/zh-CN/master/use/multi_platform_inference.html
<https://www.mindspore.cn/tutorial/inference/en/master/multi_platform_inference.html>
```
```python
# Load unseen dataset for inference
dataset = dataset.create_dataset(cfg.data_path, 1, False)
# Define model
# Define model
net = GoogleNet(num_classes=cfg.num_classes)
opt = Momentum(filter(lambda x: x.requires_grad, net.get_parameters()), 0.01,
cfg.momentum, weight_decay=cfg.weight_decay)
loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
model = Model(net, loss_fn=loss, optimizer=opt, metrics={'acc'})
# Load pre-trained model
param_dict = load_checkpoint(cfg.checkpoint_path)
load_param_into_net(net, param_dict)
net.set_train(False)
# Make predictions on the unseen dataset
acc = model.eval(dataset)
print("accuracy: ", acc)
```
### Continue Training on the Pretrained Model
```python
# Load dataset
dataset = create_dataset(cfg.data_path, cfg.epoch_size)
batch_num = dataset.get_dataset_size()
# Define model
net = GoogleNet(num_classes=cfg.num_classes)
# Continue training if set pre_trained to be True
if cfg.pre_trained:
param_dict = load_checkpoint(cfg.checkpoint_path)
load_param_into_net(net, param_dict)
lr = lr_steps(0, lr_max=cfg.lr_init, total_epochs=cfg.epoch_size,
              steps_per_epoch=batch_num)
opt = Momentum(filter(lambda x: x.requires_grad, net.get_parameters()),
               Tensor(lr), cfg.momentum, weight_decay=cfg.weight_decay)
loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
model = Model(net, loss_fn=loss, optimizer=opt, metrics={'acc'},
              amp_level="O2", keep_batchnorm_fp32=False, loss_scale_manager=None)
# Set callbacks
config_ck = CheckpointConfig(save_checkpoint_steps=batch_num * 5,
                             keep_checkpoint_max=cfg.keep_checkpoint_max)
time_cb = TimeMonitor(data_size=batch_num)
ckpoint_cb = ModelCheckpoint(prefix="train_googlenet_cifar10", directory="./",
                             config=config_ck)
loss_cb = LossMonitor()
# Start training
model.train(cfg.epoch_size, dataset, callbacks=[time_cb, ckpoint_cb, loss_cb])
print("train success")
```
# [Description of Random Situation](#contents)
In dataset.py, we set the seed inside the "create_dataset" function. We also use a random seed in train.py.
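If full reproducibility is needed, the relevant seeds can be fixed at the top of a script. The snippet below is only a minimal sketch (the seed value 1 is arbitrary), not the exact code used in train.py:

```python
import mindspore.dataset as ds
from mindspore.common import set_seed

set_seed(1)            # seeds weight initialization and other framework-level randomness
ds.config.set_seed(1)  # seeds dataset shuffling and random data augmentation
```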
# [ModelZoo Homepage](#contents)
Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).
View File
@ -32,7 +32,7 @@ FCN-4 is a convolutional neural network architecture, its name FCN-4 comes from
### Mixed Precision
The [mixed precision](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/enable_mixed_precision.html) training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data formats, and maintains the network precision achieved by the single-precision training at the same time. Mixed precision training can accelerate the computation process, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware.
For FP16 operators, if the input data type is FP32, the backend of MindSpore will automatically handle it with reduced precision. Users can check the reduced-precision operators by enabling the INFO log and then searching for "reduce precision".
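As a minimal sketch (the small stand-in network and optimizer settings below are illustrative, not this model's actual code), mixed precision is typically switched on through the `amp_level` argument of `Model`:

```python
# Illustrative only: "O2" casts the network to FP16 while keeping BatchNorm in FP32.
import mindspore.nn as nn
from mindspore.train.model import Model

net = nn.Dense(20, 10)  # stand-in network
loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
opt = nn.Momentum(net.trainable_params(), learning_rate=0.01, momentum=0.9)
model = Model(net, loss_fn=loss, optimizer=opt, amp_level="O2")
```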
## [Environment Requirements](#contents)
@ -42,8 +42,8 @@ For FP16 operators, if the input data type is FP32, the backend of MindSpore wil
- Framework
- [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
## [Quick Start](#contents)
View File
@ -90,8 +90,8 @@ We use about 91K face images as training dataset and 11K as evaluating dataset i
- Framework
- [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below:
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Script Description](#contents)
View File
@ -74,8 +74,8 @@ We use about 13K images as training dataset and 3K as evaluating dataset in this
- Framework
- [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Script Description](#contents)
View File
@ -72,8 +72,8 @@ We use about 122K face images as training dataset and 2K as evaluating dataset i
- Framework
- [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below:
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Script Description](#contents)
View File
@ -60,8 +60,8 @@ The directory structure is as follows:
- Framework
- [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Script Description](#contents)
@ -241,4 +241,4 @@ sh run_export.sh 16 0 ./0-1_1.ckpt
# [ModelZoo Homepage](#contents)
Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).
View File
@ -60,8 +60,8 @@ The directory structure is as follows:
- Framework
- [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below:
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Script Description](#contents)
View File
@ -37,7 +37,7 @@ In the current model, we use CenterNet to estimate multi-person pose. The DLA(De
Note that you can run the scripts with the dataset mentioned in the original paper or with one that is widely used in the relevant domain/network architecture. In the following sections, we will introduce how to run the scripts using the related dataset below.
Dataset used: [COCO2017](https://cocodataset.org/)
- Dataset size: 26G
- Train: 19G, 118000 images
@ -81,8 +81,8 @@ Dataset used: [COCO2017](<https://cocodataset.org/>)
- Framework
- [MindSpore](https://cmc-szv.clouddragon.huawei.com/cmcversion/index/search?searchKey=Do-MindSpore%20V100R001C00B622)
- For more information, please check the resources below
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
- Download the dataset COCO2017.
- We use COCO2017 as the training dataset in this example by default, and you can also use your own datasets.
View File
@ -4,7 +4,7 @@
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
- [Script Description](#script-description)
- [Script and Sample Code](#script-and-sample-code)
- [Script Parameters](#script-parameters)
@ -17,20 +17,18 @@
- [Evaluation Performance](#evaluation-performance)
- [Inference Performance](#inference-performance)
- [How to use](#how-to-use)
- [Inference](#inference)
- [Continue Training on the Pretrained Model](#continue-training-on-the-pretrained-model)
- [Transfer Learning](#transfer-learning)
- [Description of Random Situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)
# [DS-CNN Description](#contents)
DS-CNN, a depthwise separable convolutional neural network, was first used for keyword spotting (KWS) in 2017. KWS applications have a highly constrained power budget and typically run on tiny microcontrollers with limited memory and compute capability. Depthwise separable convolutions are more efficient in both the number of parameters and operations, which makes deeper and wider architectures possible even on resource-constrained microcontroller devices (see the sketch after the paper reference below).
[Paper](https://arxiv.org/abs/1711.07128): Zhang, Yundong, Naveen Suda, Liangzhen Lai, and Vikas Chandra. "Hello edge: Keyword spotting on microcontrollers." arXiv preprint arXiv:1711.07128 (2017).
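As a rough sketch of the building block (illustrative only; the real layer definitions live in the model code under src/), a depthwise separable convolution factorizes a standard convolution into a per-channel depthwise convolution followed by a 1x1 pointwise convolution:

```python
import mindspore.nn as nn

def ds_conv_block(in_channels, out_channels, stride=1):
    """One depthwise separable block: depthwise 3x3 conv + pointwise 1x1 conv."""
    return nn.SequentialCell([
        nn.Conv2d(in_channels, in_channels, kernel_size=3, stride=stride,
                  group=in_channels, pad_mode='same', has_bias=False),  # depthwise
        nn.BatchNorm2d(in_channels),
        nn.ReLU(),
        nn.Conv2d(in_channels, out_channels, kernel_size=1, has_bias=False),  # pointwise
        nn.BatchNorm2d(out_channels),
        nn.ReLU(),
    ])
```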
# [Model Architecture](#contents)
The overall network architecture of DS-CNN is shown below:
@ -38,49 +36,47 @@ The overall network architecture of DS-CNN is show below:
# [Dataset](#contents)
Dataset used: [Speech commands dataset version 1](https://ai.googleblog.com/2017/08/launching-speech-commands-dataset.html)
- Dataset size: 2.02 GiB, 65,000 one-second long utterances of 30 short words, by thousands of different people
- Train: 80%
- Val: 10%
- Test: 10%
- Data format: WAVE format file, with the sample data encoded as linear 16-bit single-channel PCM values, at a 16 kHz rate
- Note: Data will be processed in download_process_data.py
Dataset used: [Speech commands dataset version 2](https://arxiv.org/abs/1804.03209)
- Dataset size: 8.17 GiB, 105,829 one-second (or less) long utterances of 35 words by 2,618 speakers
- Train: 80%
- Val: 10%
- Test: 10%
- Data format: WAVE format file, with the sample data encoded as linear 16-bit single-channel PCM values, at a 16 kHz rate
- Note: Data will be processed in download_process_data.py
# [Environment Requirements](#contents)
- Hardware: Ascend/GPU
- Prepare hardware environment with Ascend or GPU processor. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
- Framework
- [MindSpore](https://www.mindspore.cn/install/en)
- Third-party open source packages (if any)
- numpy
- soundfile
- python_speech_features
- For more information, please check the resources below
- [MindSpore tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Quick Start](#contents)
After installing MindSpore via the official website, you can start training and evaluation as follows:
First, set the config for data, training and evaluation in src/config.py.
- download and process dataset
```bash
python src/download_process_data.py
```
@ -88,8 +84,8 @@ First set the config for data, train, eval in src/config.py
```bash
# run training example
python train.py
# run evaluation example
# if you want to eval a specific model, you should specify model_dir to the ckpt path:
python eval.py --model_dir your_ckpt_path
@ -102,14 +98,14 @@ First set the config for data, train, eval in src/config.py
## [Script and Sample Code](#contents)
```text
├── dscnn
├── README.md // descriptions about ds-cnn
├── scripts
│ ├──run_download_process_data.sh // shell script for download dataset and prepare feature and label
│ ├──run_train_ascend.sh // shell script for train on ascend
│ ├──run_eval_ascend.sh // shell script for evaluation on ascend
├── src
│ ├──callback.py // callbacks
│ ├──config.py // parameter configuration of data, train and eval
│ ├──dataset.py // creating dataset
@ -118,10 +114,10 @@ First set the config for data, train, eval in src/config.py
│ ├──log.py // logging class
│ ├──loss.py // loss function
│ ├──lr_scheduler.py // lr_scheduler
│ ├──models.py // load ckpt
│ ├──utils.py // some function for prepare data
├── train.py // training script
├── eval.py // evaluation script
├── export.py // export checkpoint files into air/geir
├── requirements.txt // Third party open source package
```
@ -130,21 +126,21 @@ First set the config for data, train, eval in src/config.py
Parameters for both training and evaluation can be set in config.py.
- config for the Speech commands dataset version 1
```python
'data_url': 'http://download.tensorflow.org/data/speech_commands_v0.01.tar.gz'
# Location of speech training data archive on the web
'data_dir': 'data' # Where to download the dataset
'feat_dir': 'feat' # Where to save the feature and label of audios
'background_volume': 0.1 # How loud the background noise should be, between 0 and 1.
'background_frequency': 0.8 # How many of the training samples have background noise mixed in.
'silence_percentage': 10.0 # How much of the training data should be silence.
'unknown_percentage': 10.0 # How much of the training data should be unknown words
'time_shift_ms': 100.0 # Range to randomly shift the training audio by in time
'testing_percentage': 10 # What percentage of wavs to use as a test set
'validation_percentage': 10 # What percentage of wavs to use as a validation set
'wanted_words': 'yes,no,up,down,left,right,on,off,stop,go'
# Words to use (others will be added to an unknown label)
'sample_rate': 16000 # Expected sample rate of the wavs
'device_id': 1000 # device ID used to train or evaluate the dataset.
@ -153,25 +149,25 @@ Parameters for both training and evaluation can be set in config.py.
'window_stride_ms': 20.0 # How long each spectrogram timeslice is
'dct_coefficient_count': 20 # How many bins to use for the MFCC fingerprint
```
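The snippet below is a rough illustration of how these parameters could be turned into MFCC features with the third-party packages listed under Environment Requirements (soundfile, python_speech_features). The file path is hypothetical, and the repository's own pipeline in download_process_data.py may differ in detail:

```python
import soundfile as sf
from python_speech_features import mfcc

audio, sample_rate = sf.read('data/yes/0a7c2a8d_nohash_0.wav')  # hypothetical path
features = mfcc(audio,
                samplerate=sample_rate,  # 'sample_rate': 16000
                winlen=40.0 / 1000,      # 'window_size_ms': 40.0
                winstep=20.0 / 1000,     # 'window_stride_ms': 20.0
                numcep=20,               # 'dct_coefficient_count': 20
                nfft=1024)               # FFT size large enough for a 40 ms window at 16 kHz
# 'features' is a (num_frames, 20) array of MFCC coefficients for the clip
```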
- config for DS-CNN and training parameters of Speech commands dataset version 1
```python
'model_size_info': [6, 276, 10, 4, 2, 1, 276, 3, 3, 2, 2, 276, 3, 3, 1, 1, 276, 3, 3, 1, 1, 276, 3, 3, 1, 1, 276, 3, 3, 1, 1]
# Model dimensions - different for various models
'drop': 0.9 # dropout
'pretrained': '' # model_path, local pretrained model to load
'use_graph_mode': 1 # use graph mode or feed mode
'val_interval': 1 # validate interval
'per_batch_size': 100 # batch size for per gpu
'lr_scheduler': 'multistep' # lr-scheduler, option type: multistep, cosine_annealing
'lr': 0.1 # learning rate of the training
'lr_epochs': '20,40,60,80' # epoch of lr changing
'lr_gamma': 0.1 # decrease lr by a factor of exponential lr_scheduler
'eta_min': 0 # eta_min in cosine_annealing scheduler
'T_max': 80 # T-max in cosine_annealing scheduler
'max_epoch': 80 # max epoch num to train the model
'warmup_epochs': 0 # warmup epoch
'weight_decay': 0.001 # weight decay
'momentum': 0.98 # momentum of the optimizer
'log_interval': 100 # logging interval
@ -179,12 +175,12 @@ Parameters for both training and evaluation can be set in config.py.
'ckpt_interval': 100 # save ckpt_interval
```
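The helper below is an illustrative sketch (not the repository's src/lr_scheduler.py) of what the 'multistep' settings above mean: the learning rate starts at 'lr' and is multiplied by 'lr_gamma' once the epoch index passes each entry in 'lr_epochs'. The steps_per_epoch default is only an example value:

```python
def multistep_lr(lr=0.1, lr_epochs=(20, 40, 60, 80), lr_gamma=0.1,
                 max_epoch=80, steps_per_epoch=443):
    """Return one learning-rate value per training step."""
    lr_each_step = []
    for epoch in range(max_epoch):
        decays = sum(1 for boundary in lr_epochs if epoch >= boundary)
        lr_each_step.extend([lr * (lr_gamma ** decays)] * steps_per_epoch)
    return lr_each_step  # e.g. passed to the optimizer as a per-step learning-rate list
```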
- config for DS-CNN and evaluation parameters of Speech commands dataset version 1
```python
'feat_dir': 'feat' # Where to save the feature of audios
'model_dir': '' # which folder the models are saved in or specific path of one model
'wanted_words': 'yes,no,up,down,left,right,on,off,stop,go'
# Words to use (others will be added to an unknown label)
'sample_rate': 16000 # Expected sample rate of the wavs
'device_id': 1000 # device ID used to train or evaluate the dataset.
@ -192,33 +188,35 @@ Parameters for both training and evaluation can be set in config.py.
'window_size_ms': 40.0 # How long each spectrogram timeslice is
'window_stride_ms': 20.0 # How long each spectrogram timeslice is
'dct_coefficient_count': 20 # How many bins to use for the MFCC fingerprint
'model_size_info': [6, 276, 10, 4, 2, 1, 276, 3, 3, 2, 2, 276, 3, 3, 1, 1, 276, 3, 3, 1, 1, 276, 3, 3, 1, 1, 276, 3, 3, 1, 1]
# Model dimensions - different for various models
'pre_batch_size': 100 # batch size for eval
'drop': 0.9 # dropout in train
'log_path': 'eval_outputs' # path to save eval log
```
## [Training Process](#contents)
### Training
- running on Ascend
for shell script:
```bash
# sh srcipts/run_train_ascend.sh [device_id]
sh srcipts/run_train_ascend.sh 0
```
for python script:
```bash
# python train.py --device_id [device_id]
python train.py --device_id 0
```
You can see the args, loss and accuracy info on your screen; you can also view the results in the train_outputs folder.
```python
epoch[1], iter[443], loss:0.73811543, mean_wps:12102.26 wavs/sec
Eval: top1_cor:737, top5_cor:1699, tot:3000, acc@1=24.57%, acc@5=56.63%
@ -229,9 +227,7 @@ Parameters for both training and evaluation can be set in config.py.
Best epoch:41 acc:93.73%
```
The checkpoints and logs will be saved in train_outputs.
## [Evaluation Process](#contents)
@ -242,17 +238,20 @@ Parameters for both training and evaluation can be set in config.py.
Before running the command below, please check the checkpoint path used for evaluation. Set model_dir in config.py or pass model_dir on the command line.
for shell scripts:
```bash
# sh scripts/run_eval_ascend.sh device_id model_dir
sh scripts/run_eval_ascend.sh 0 train_outputs/*/*.ckpt
or
sh scripts/run_eval_ascend.sh 0 train_outputs/*/
```
for python scripts:
```bash
# python eval.py --device_id device_id --model_dir model_dir
python eval.py --device_id 0 --model_dir train_outputs/*/*.ckpt
or
python eval.py --device_id 0 --model_dir train_outputs/*
```
@ -264,51 +263,49 @@ Parameters for both training and evaluation can be set in config.py.
```
# [Model Description](#contents)
## [Performance](#contents)
### Train Performance
| Parameters | Ascend |
| -------------------------- | ------------------------------------------------------------ |
| Model Version | DS-CNN |
| Resource | Ascend 910; CPU 2.60 GHz, 56 cores; Memory 314G |
| uploaded Date | 27/09/2020 (month/day/year) |
| MindSpore Version | 1.0.0 |
| Dataset | Speech commands dataset version 1 |
| Training Parameters | epoch=80, batch_size = 100, lr=0.1 |
| Optimizer | Momentum |
| Loss Function | Softmax Cross Entropy |
| outputs | probability |
| Loss | 0.0019 |
| Speed | 2s/epoch |
| Total time | 4 mins |
| Parameters (K) | 500K |
| Checkpoint for Fine tuning | 3.3M (.ckpt file) |
| Script | [Link]() |
### Inference Performance
| Parameters | Ascend |
| ------------------- | --------------------------- |
| Model Version | DS-CNN |
| Resource | Ascend 910 |
| Uploaded Date | 09/27/2020 |
| MindSpore Version | 1.0.0 |
| Dataset | Speech commands dataset version 1 |
| Training Parameters | src/config.py |
| outputs | probability |
| Accuracy | 93.96% |
| Total time | 3 min |
| Params (K) | 500K |
| Checkpoint for Fine tuning (M) | 3.3M |
# [Description of Random Situation](#contents)
In download_process_data.py, we set the seed for splitting the train, val and test sets.
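A toy illustration of what a seeded 80/10/10 split looks like (the actual splitting logic in download_process_data.py may differ):

```python
import random

def split_dataset(wav_files, seed=0):
    """Shuffle deterministically, then split into 80% train, 10% val, 10% test."""
    files = list(wav_files)
    random.Random(seed).shuffle(files)   # same seed -> same split on every run
    n = len(files)
    return files[:int(0.8 * n)], files[int(0.8 * n):int(0.9 * n)], files[int(0.9 * n):]
```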
# [ModelZoo Homepage](#contents)
Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).
View File
@ -27,7 +27,7 @@ Specifically, the TextRCNN is mainly composed of three parts: a recurrent struct
## [Dataset](#contents)
Dataset used: [Sentence polarity dataset v1.0](http://www.cs.cornell.edu/people/pabo/movie-review-data/)
- Dataset size: 10662 movie comments in 2 classes, 9596 comments for the train set, 1066 comments for the test set.
- Data format: text files. The processed data is in `./data/`
@ -36,7 +36,7 @@ Dataset used: [Sentence polarity dataset v1.0](<http://www.cs.cornell.edu/people
- Hardware: Ascend
- Framework: [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below: [MindSpore tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html), [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html).
## [Quick Start](#contents)