Commit Graph

62 Commits

Author SHA1 Message Date
i-robot a4f081ca2b
!36350 Fix the stream_id error for async dump
Merge pull request !36350 from maning202007/fix_async_dump_stream_id
2022-06-24 08:44:17 +00:00
liuruotao 84223375ce takedown ascend testcases for better gate performance 2022-06-24 09:47:21 +08:00
maning202007 d3f2d391a7 Fix the stream_id error for async dump
Revert "takedown dump related testcases to ensure gate stability"

This reverts commit 24321ded4d.
2022-06-23 22:07:39 +08:00
yanghaoran 24321ded4d takedown dump related testcases to ensure gate stability 2022-06-15 19:42:40 +08:00
yanghaoran 7943239558 takedown test_ascend_async_multi_root_graph_dump to ensure gate stability 2022-06-15 16:34:04 +08:00
yanghaoran e1d3e6edac take down testcases that are passing all the time 2022-05-28 09:53:29 +08:00
i-robot 84dcf74bbd
!32938 optimize global step for optimizer
Merge pull request !32938 from zhangbuxue/optimize_global_step_for_optimizer
2022-04-16 03:53:04 +00:00
buxue 8b64e54432 optimize global step for optimizer 2022-04-16 09:39:00 +08:00
Parastoo Ashtari 0e7cc35549 Fix graph history issue for CPU and GPU 2022-04-11 14:34:31 -04:00
TinaMengtingZhang 67c98923d5 Add st for dump 2022-04-07 16:00:02 -04:00
TinaMengtingZhang 438e261bbc wait until dumping finished 2022-03-18 11:44:09 -04:00
i-robot 1863766fad
!31063 Fix a+m dump probabilistic failure in ci
Merge pull request !31063 from TinaMengtingZhang/fix_st_error2
2022-03-16 07:19:48 +00:00
huanghui 4482acf586 Support API set_dump for more type of ops 2022-03-15 10:24:15 +08:00
TinaMengtingZhang 646909d3f4 fix a+m dump probabilistic failure in ci: 1. do not parallel to dump statistics and remove file lock. 2. when tensor size is small, do it in single thread 2022-03-14 17:11:55 -04:00
i-robot dcb5cd670c
!30953 Dynamic Weight Decay
Merge pull request !30953 from wanyiming/dynamic_wd
2022-03-14 06:03:24 +00:00
wanyiming a124ec4de7 add dynamic_decay 2022-03-10 11:02:27 +08:00
TinaMengtingZhang 78449afc49 modify st level for stat dump 2022-03-07 21:56:29 -05:00
TinaMengtingZhang 4a8d3defe7 Enhance the performance of A+M dump: Parallelize ConverFormatForTensorAndDump and refactoring 2022-03-01 23:08:38 -05:00
TinaMengtingZhang e576292862 dump statistic file in kernel by kernel mode 2022-02-15 16:58:00 -05:00
Parastoo Ashtari f6bebc7d97 use origin_parameter_order to load and dump params mindRT
Refactor mindRT code
Fix DumpConstantData issue
2022-02-10 14:18:08 -05:00
i-robot 8d648e0f1e !19673 Separate constant from pb file
Merge pull request !19673 from sabrinasun_59ee/constant
2021-12-15 23:52:33 +00:00
sabrinasun 542d715217 separate constant from pb 2021-12-14 13:19:14 -05:00
Jimmy Qi 6f2b70c21f Print null for min, max, avg values for NaN tensor 2021-12-12 19:31:21 +00:00
Jimmy Qi b9d1a4920c Add statistic dump for ascend 2021-12-06 20:11:44 +00:00
TinaMengtingZhang 16a19be56f Enable cann api callback register function
Add testcases for cann api feature
2021-12-03 17:36:15 -05:00
sabrinasun c095908772 fix enable=false still dump issue for cell dump 2021-11-30 22:10:20 -05:00
TinaMengtingZhang 07b653103e Support Cann callback api for ascend async dump 2021-11-24 10:35:29 -05:00
sabrinasun 99a434672f support cell dump when dump_mode=2 2021-11-16 16:01:13 -05:00
Jimmy Qi b21c099767 Add option to dump tensor statistics in csv format 2021-11-15 19:00:59 +00:00
i-robot 257bb3500d !25785 change dump path to avoid invalid symbol in ci path
Merge pull request !25785 from yelihua/dev
2021-11-04 13:08:01 +00:00
parastooashtari 1a59dc37bf add graph execution order history to dump 2021-11-02 11:00:38 -04:00
yelihua f9f2058055 change dump path to avoid invalid symbol in path 2021-11-02 17:29:35 +08:00
sabrinasun 332e0dbb0f add overflow st test and checkdumpstructure test 2021-10-22 06:28:58 +08:00
sabrinasun 220245f592 add security isolation to online and offline debugger 2021-09-12 23:01:15 -04:00
yelihua 7c3994e48e use Common::CreatePrefixPath instead of Common::GetRealPath 2021-09-09 20:34:25 +08:00
TinaMengtingZhang 424e94a7d4 Adapt async dump conversion tool with new run package (Sep03 version) 2021-09-07 13:41:26 -04:00
yanghaoran 1fd655787a upgrade Ascend packages 04 Sep 21 2021-09-06 21:24:07 +08:00
TinaMengtingZhang b17b2bc687 Fix convert async dump files failed issue and refactor convert_async.py 2021-09-01 15:10:55 -04:00
Xiaoda Zhang b2703879c6 fix the scope setting error when cloning nodes 2021-08-26 10:25:38 +08:00
ms_yan 36a8886ca2 Revert "[feat] [assistant] [I3T96T] add new Dataset operator CMUARCTICDataset"
This reverts commit b077aa1cab.

Revert "[feat] [assistant] [I3T96X] add new Dataset operator LibriSpeechDataset"

This reverts commit 4e6f7dc97d.

delete pass_registry_test.cc

comment hiai_nlu_model_multi.pb related line
2021-08-23 01:46:38 +08:00
djc b077aa1cab [feat] [assistant] [I3T96T] add new Dataset operator CMUARCTICDataset 2021-08-22 16:26:45 +08:00
djc 4e6f7dc97d [feat] [assistant] [I3T96X] add new Dataset operator LibriSpeechDataset 2021-08-22 13:39:37 +08:00
yelihua a3cba3857e use temporary dir as dump dir 2021-08-17 20:11:33 +08:00
yelihua a6dc9a0a07 get rank id when set hccl env for single card train 2021-08-16 13:44:02 +08:00
yelihua ca835a7d2d change the async dump testcase level to level1 2021-08-05 19:52:35 +08:00
TinaMengtingZhang 5b8e846fe9 Support MS_DIAGNOSTIC_DATA_PATH in configuring dump path 2021-07-22 13:34:59 -04:00
limingqi107 06a6e8d186 fix the test case of CPU dump 2021-07-12 22:58:34 +08:00
limingqi107 36b1ff25b4 Enable mindRT on CPU device 2021-07-07 13:19:37 +08:00
limingqi107 e761655a42 actor runtime support CPU dump 2021-07-04 11:19:09 +08:00
TinaMengtingZhang 2fa05b66a1 change device_id to rank_id in dump path 2021-06-06 21:27:20 -04:00