i-robot
a4f081ca2b
!36350 Fix the stream_id error for async dump
...
Merge pull request !36350 from maning202007/fix_async_dump_stream_id
2022-06-24 08:44:17 +00:00
liuruotao
84223375ce
takedown ascend testcases for better gate performance
2022-06-24 09:47:21 +08:00
maning202007
d3f2d391a7
Fix the stream_id error for async dump
...
Revert "takedown dump related testcases to ensure gate stability"
This reverts commit 24321ded4d.
2022-06-23 22:07:39 +08:00
yanghaoran
24321ded4d
takedown dump related testcases to ensure gate stability
2022-06-15 19:42:40 +08:00
yanghaoran
7943239558
takedown test_ascend_async_multi_root_graph_dump to ensure gate stability
2022-06-15 16:34:04 +08:00
yanghaoran
e1d3e6edac
take down testcases that are passing all the time
2022-05-28 09:53:29 +08:00
i-robot
84dcf74bbd
!32938 optimize global step for optimizer
...
Merge pull request !32938 from zhangbuxue/optimize_global_step_for_optimizer
2022-04-16 03:53:04 +00:00
buxue
8b64e54432
optimize global step for optimizer
2022-04-16 09:39:00 +08:00
Parastoo Ashtari
0e7cc35549
Fix graph history issue for CPU and GPU
2022-04-11 14:34:31 -04:00
TinaMengtingZhang
67c98923d5
Add st for dump
2022-04-07 16:00:02 -04:00
TinaMengtingZhang
438e261bbc
wait until dumping finished
2022-03-18 11:44:09 -04:00
i-robot
1863766fad
!31063 Fix a+m dump probabilistic failure in ci
...
Merge pull request !31063 from TinaMengtingZhang/fix_st_error2
2022-03-16 07:19:48 +00:00
huanghui
4482acf586
Support API set_dump for more type of ops
2022-03-15 10:24:15 +08:00
TinaMengtingZhang
646909d3f4
fix a+m dump probabilistic failure in ci: 1. do not dump statistics in parallel and remove file lock. 2. when tensor size is small, do it in a single thread
2022-03-14 17:11:55 -04:00
i-robot
dcb5cd670c
!30953 Dynamic Weight Decay
...
Merge pull request !30953 from wanyiming/dynamic_wd
2022-03-14 06:03:24 +00:00
wanyiming
a124ec4de7
add dynamic_decay
2022-03-10 11:02:27 +08:00
TinaMengtingZhang
78449afc49
modify st level for stat dump
2022-03-07 21:56:29 -05:00
TinaMengtingZhang
4a8d3defe7
Enhance the performance of A+M dump: Parallelize ConverFormatForTensorAndDump and refactoring
2022-03-01 23:08:38 -05:00
TinaMengtingZhang
e576292862
dump statistic file in kernel by kernel mode
2022-02-15 16:58:00 -05:00
Parastoo Ashtari
f6bebc7d97
use origin_parameter_order to load and dump params mindRT
...
Refactor mindRT code
Fix DumpConstantData issue
2022-02-10 14:18:08 -05:00
i-robot
8d648e0f1e
!19673 Separate constant from pb file
...
Merge pull request !19673 from sabrinasun_59ee/constant
2021-12-15 23:52:33 +00:00
sabrinasun
542d715217
separate constant from pb
2021-12-14 13:19:14 -05:00
Jimmy Qi
6f2b70c21f
Print null for min, max, avg values for NaN tensor
2021-12-12 19:31:21 +00:00
Jimmy Qi
b9d1a4920c
Add statistic dump for ascend
2021-12-06 20:11:44 +00:00
TinaMengtingZhang
16a19be56f
Enable cann api callback register function
...
Add testcases for cann api feature
2021-12-03 17:36:15 -05:00
sabrinasun
c095908772
fix enable=false still dump issue for cell dump
2021-11-30 22:10:20 -05:00
TinaMengtingZhang
07b653103e
Support Cann callback api for ascend async dump
2021-11-24 10:35:29 -05:00
sabrinasun
99a434672f
support cell dump when dump_mode=2
2021-11-16 16:01:13 -05:00
Jimmy Qi
b21c099767
Add option to dump tensor statistics in csv format
2021-11-15 19:00:59 +00:00
i-robot
257bb3500d
!25785 change dump path to avoid invalid symbol in ci path
...
Merge pull request !25785 from yelihua/dev
2021-11-04 13:08:01 +00:00
parastooashtari
1a59dc37bf
add graph execution order history to dump
2021-11-02 11:00:38 -04:00
yelihua
f9f2058055
change dump path to avoid invalid symbol in path
2021-11-02 17:29:35 +08:00
sabrinasun
332e0dbb0f
add overflow st test and checkdumpstructure test
2021-10-22 06:28:58 +08:00
sabrinasun
220245f592
add security isolation to online and offline debugger
2021-09-12 23:01:15 -04:00
yelihua
7c3994e48e
use Common::CreatePrefixPath instead of Common::GetRealPath
2021-09-09 20:34:25 +08:00
TinaMengtingZhang
424e94a7d4
Adapt async dump conversion tool with new run package (Sep03 version)
2021-09-07 13:41:26 -04:00
yanghaoran
1fd655787a
upgrade Ascend packages 04 Sep 21
2021-09-06 21:24:07 +08:00
TinaMengtingZhang
b17b2bc687
Fix convert async dump files failed issue and refactor convert_async.py
2021-09-01 15:10:55 -04:00
Xiaoda Zhang
b2703879c6
fix the scope setting error when cloning nodes
2021-08-26 10:25:38 +08:00
ms_yan
36a8886ca2
Revert "[feat] [assistant] [I3T96T] add new Dataset operator CMUARCTICDataset"
...
This reverts commit b077aa1cab.
Revert "[feat] [assistant] [I3T96X] add new Dataset operator LibriSpeechDataset"
This reverts commit 4e6f7dc97d.
delete pass_registry_test.cc
comment hiai_nlu_model_multi.pb related line
2021-08-23 01:46:38 +08:00
djc
b077aa1cab
[feat] [assistant] [I3T96T] add new Dataset operator CMUARCTICDataset
2021-08-22 16:26:45 +08:00
djc
4e6f7dc97d
[feat] [assistant] [I3T96X] add new Dataset operator LibriSpeechDataset
2021-08-22 13:39:37 +08:00
yelihua
a3cba3857e
use temporary dir as dump dir
2021-08-17 20:11:33 +08:00
yelihua
a6dc9a0a07
get rank id when set hccl env for single card train
2021-08-16 13:44:02 +08:00
yelihua
ca835a7d2d
change the async dump testcase level to level1
2021-08-05 19:52:35 +08:00
TinaMengtingZhang
5b8e846fe9
Support MS_DIAGNOSTIC_DATA_PATH in configuring dump path
2021-07-22 13:34:59 -04:00
limingqi107
06a6e8d186
fix the test case of CPU dump
2021-07-12 22:58:34 +08:00
limingqi107
36b1ff25b4
Enable mindRT on CPU device
2021-07-07 13:19:37 +08:00
limingqi107
e761655a42
actor runtime support CPU dump
2021-07-04 11:19:09 +08:00
TinaMengtingZhang
2fa05b66a1
change device_id to rank_id in dump path
2021-06-06 21:27:20 -04:00