mindspore/tests/ut/data/dataset/testTokenizerData
Xiao Tianci 0659473535 add TruncateSequencePair, ToNumber C++ API and enable three test cases 2020-12-08 11:12:16 +08:00
1.txt Add UnicodeCharTokenizer for nlp 2020-05-21 09:22:45 +08:00
basic_tokenizer.txt Add WhitespaceTokenizer and UnicodeScriptTokenizer for nlp 2020-06-17 15:47:04 +08:00
bert_tokenizer.txt BasicTokenizer not case fold on preserved words 2020-06-28 16:28:00 +08:00
normalize.txt Add WhitespaceTokenizer and UnicodeScriptTokenizer for nlp 2020-06-17 15:47:04 +08:00
regex_replace.txt Add WhitespaceTokenizer and UnicodeScriptTokenizer for nlp 2020-06-17 15:47:04 +08:00
regex_tokenizer.txt Add WhitespaceTokenizer and UnicodeScriptTokenizer for nlp 2020-06-17 15:47:04 +08:00
sentencepiece_tokenizer.txt add sentence piece 2020-07-20 15:50:35 +08:00
to_number.txt add TruncateSequencePair, ToNumber C++ API and enable three test cases 2020-12-08 11:12:16 +08:00
wordpiece_tokenizer.txt Add WhitespaceTokenizer and UnicodeScriptTokenizer for nlp 2020-06-17 15:47:04 +08:00