Commit Graph

5915 Commits

Author SHA1 Message Date
mindspore-ci-bot e6c596a9d9 !3653 add epoch_num description master
Merge pull request !3653 from panfengfeng/add_epoch_num_description_master
2020-07-30 09:53:56 +08:00
mindspore-ci-bot 6ea2aa4e73 !3672 fix serving input numbers
Merge pull request !3672 from hexia/fix_input_check
2020-07-30 09:17:16 +08:00
mindspore-ci-bot 389cb35740 !3661 Alarm modification
Merge pull request !3661 from shenwei41/sw_master
2020-07-30 09:10:01 +08:00
mindspore-ci-bot 8f35d2ed29 !3664 Modify the order of init and open of TDT
Merge pull request !3664 from hanjun996/master
2020-07-30 09:07:19 +08:00
mindspore-ci-bot b73ea6a7aa !3668 Modify collecting graph and dataset graph to step end stage
Merge pull request !3668 from ougongchang/fix_collect_dataset
2020-07-30 09:06:42 +08:00
mindspore-ci-bot 567509affc !3522 add tinybert scripts
Merge pull request !3522 from wanghua/c75_tinybert
2020-07-30 09:05:38 +08:00
mindspore-ci-bot 6f70146153 !3660 modify readme for maskrcnn
Merge pull request !3660 from meixiaowei/master
2020-07-29 22:39:57 +08:00
mindspore-ci-bot d66e6b33bf !3665 support multi-node training in deeplabv3
Merge pull request !3665 from zhouyaqiang0/master
2020-07-29 21:54:24 +08:00
mindspore-ci-bot afce1c3a40 !3341 GPU maxpool with argmax op
Merge pull request !3341 from tom_chen/maxpool_with_argmax
2020-07-29 21:24:38 +08:00
ougongchang 1dafb2c6f5 Modify collecting graph and dataset graph to step end stage
Previously, the graph and dataset graph were collected in the begin stage.
If graph compilation failed on GPU, they were still written to the
summary dir, which could confuse users.

Now the graph and dataset graph are collected in the step end stage,
so they are not collected if graph compilation fails.
2020-07-29 20:23:08 +08:00
shenwei41 051c290d8b Modify patches and alerts 2020-07-29 19:26:06 +08:00
zhouyaqiang b0004a1791 support multi-node training and remove code 2020-07-29 19:10:28 +08:00
mindspore-ci-bot 387dac5832 !3651 change num_samples definition
Merge pull request !3651 from jiangzhiwen/dataset/change_num_samples
2020-07-29 19:03:54 +08:00
hanjun996 20ccf83826 modify tdt 2020-07-29 19:03:12 +08:00
mindspore-ci-bot a3e7c4c754 !3625 Optimize tensor data
Merge pull request !3625 from hewei/optimize_tensor_data
2020-07-29 18:50:20 +08:00
mindspore-ci-bot fe514bd1cc !3644 [MD] fix minddataset core dump when file list size is greater than 1000.
Merge pull request !3644 from liyong126/fix_mindrecord_bug
2020-07-29 18:28:49 +08:00
hexia 3100824703 fix input 2020-07-29 18:25:35 +08:00
mindspore-ci-bot a337a02732 !3638 fix codex and support akg op profiling
Merge pull request !3638 from geekun/yjk_master
2020-07-29 17:30:35 +08:00
meixiaowei 8950952fe3 modify readme 2020-07-29 16:43:28 +08:00
mindspore-ci-bot 1b69923472 !3643 Throw exception if different communication ops which are divided to the same segment share the same input
Merge pull request !3643 from huanghui/communication-op-fusion
2020-07-29 16:10:59 +08:00
mindspore-ci-bot d4b52ac59f !3489 use kernelruntime::mem_manager to reduce rtMalloc and rtFree time in trans data format
Merge pull request !3489 from lvchangquan/master
2020-07-29 16:10:21 +08:00
panfengfeng 4644085e92 add epoch_num 2020-07-29 16:10:06 +08:00
wanghua 7dd5e78fde add tinybert scripts 2020-07-29 16:07:04 +08:00
jiangzhiwen 1eda0ef071 change num_samples definition 2020-07-29 15:37:59 +08:00
mindspore-ci-bot fcdad59ce6 !3594 fix batchnorm issue under mix precision in pynative mode
Merge pull request !3594 from wangqiuliang/fix-batchnorm-under-mix-precision-in-pynative
2020-07-29 15:27:23 +08:00
mindspore-ci-bot 12a150bb5d !3630 not reuse refnode input's memory
Merge pull request !3630 from laiyongqiang/refnode_input_fix
2020-07-29 14:59:44 +08:00
mindspore-ci-bot c57ad1528f !3635 fix dataset & train gil lock of gpu process master
Merge pull request !3635 from panfengfeng/fix_dataset_train_gil_of_gpu_master
2020-07-29 14:41:32 +08:00
mindspore-ci-bot 44e739ae31 !3627 fix: device occupied tdt hung
Merge pull request !3627 from guozhijian/fix_device_occupied_tdt_hung
2020-07-29 14:40:35 +08:00
geekun 17d71280b8 fix codex and support akg op profiling 2020-07-29 14:36:14 +08:00
mindspore-ci-bot 9ccc6889eb !3624 fix GeneratorDataset time out
Merge pull request !3624 from yanghaitao/yht_generator_timeout
2020-07-29 14:27:18 +08:00
mindspore-ci-bot 4834a6b347 !3574 Rename AnfNode::user_data related functions to follow naming rule
Merge pull request !3574 from hewei/rename_user_data_func
2020-07-29 14:26:21 +08:00
liyong ed70de8070 fix coredump when the file list has more than 1000 entries. 2020-07-29 14:21:19 +08:00
huanghui 311d8ea1f9 add exception when different communication ops in one segment share the same input 2020-07-29 14:15:43 +08:00
mindspore-ci-bot e4a7ca7f08 !3637 Lowering value checking threshold to support training with very small steps or
Merge pull request !3637 from thlinh/dev_Jul28_lower_checking_threshold
2020-07-29 14:10:54 +08:00
lvchangquan fdbe4c19ba use kernel_runtime::mem_manager to reduce rtMalloc and rtFree time in trans data format 2020-07-29 12:36:51 +08:00
He Wei db6aa862d5 Optimize tensor data
Replace std::vector<T> with std::unique_ptr<T[]> for tensor data storage,
which prevents unintended data initialization when the data is lazily allocated.
2020-07-29 11:56:15 +08:00
mindspore-ci-bot d06da1d270 !3603 Check that the number of columns in names and defaults matches
Merge pull request !3603 from jiangzhiwen/fix_column_names_exceeded
2020-07-29 11:40:06 +08:00
kingfo fab9fac109 fix batchnorm under mix precision in pynative mode 2020-07-29 11:27:18 +08:00
Hoai Linh Tran b4c57295f7 Lowering value checking threshold to support training with very small steps 2020-07-28 23:26:04 -04:00
mindspore-ci-bot b75943f220 !3620 add mindspore lite
Merge pull request !3620 from 张学同/to_merge
2020-07-29 10:54:41 +08:00
panfengfeng 48ab208148 fix dataset train gil of gpu 2020-07-29 10:42:08 +08:00
mindspore-ci-bot 800b9dc596 !3270 New optimization pass to remove redundant Select ops
Merge pull request !3270 from thlinh/dev_Jul17_removeSelect
2020-07-29 10:19:14 +08:00
mindspore-ci-bot f48ef43647 !3611 dataset: repair problem in vgg cifar(version)
Merge pull request !3611 from ms_yan/vgg_repair
2020-07-29 10:18:04 +08:00
mindspore-ci-bot fa3a1f4a16 !3556 add desc about sink_size
Merge pull request !3556 from jinyaohui/master
2020-07-29 10:13:28 +08:00
mindspore-ci-bot 2f956d7cc2 !3612 modify case
Merge pull request !3612 from changzherui/mod_case
2020-07-29 10:12:17 +08:00
mindspore-ci-bot cf6e13cc48 !3563 fix a bug that causes failure when running multi-p from origin dataset, not from MR
Merge pull request !3563 from zhouyuanshen/master
2020-07-29 10:08:26 +08:00
mindspore-ci-bot c2385e2ede !3615 Move nn/distribution to nn/probability/distribution
Merge pull request !3615 from XunDeng/pp_poc_v3
2020-07-29 10:02:11 +08:00
jonyguo b9d855cbca fix: device occupied tdt hung 2020-07-29 09:45:54 +08:00
mindspore-ci-bot 9e1244934c !3614 Update Convert Switch to use PCNode
Merge pull request !3614 from Giancarlo/update_convert_sw
2020-07-29 09:43:26 +08:00
mindspore-ci-bot 980b67d1c4 !3578 fix maskrcnn dataset rescale bug
Merge pull request !3578 from meixiaowei/master
2020-07-29 09:37:08 +08:00