Commit Graph

5 Commits

Author SHA1 Message Date
Xiao Tianci 0659473535 add TruncateSequencePair, ToNumber C++ API and enable three test cases 2020-12-08 11:12:16 +08:00
xulei2020 18b519ae0f add sentence piece 2020-07-20 15:50:35 +08:00
qianlong cae77c0c22 BasicTokenizer not case fold on preserved words 2020-06-28 16:28:00 +08:00
qianlong 4f16f036be Add WhitespaceTokenizer and UnicodeScriptTokenizer for nlp; add CaseFold, NormalizeUTF8; add RegexReplace; add RegexTokenizer; add BasicTokenizer; add WordpieceTokenizer; add BertTokenizer 2020-06-17 15:47:04 +08:00
qianlong 451c20a6f5 Add UnicodeCharTokenizer for nlp 2020-05-21 09:22:45 +08:00