Remaining characters should be discarded once sync() called. If don’t, garbage
characters can be inserted to the front of external buffer in underflow().
Because underflow() copies remaining characters in external buffer to it’s
front. This results wrong characters insertion when seekpos() or seekoff() is
called.
this line should be inserted in sync() just before return:
__extbufnext_ = __extbufend_ = __extbuf_;
2. sync() should use length() rather than out() to calculate offset.
Reversing iterators and calling out() to calculate offset from behind is
working fine in stateless character encoding. However, in stateful encoding,
escape sequences could differ in length. As a result, out() could return wrong
length.
For example, if we have internal buffer converted from this external sequence:
(capital letters mean escape sequence)
… a a a a B b b b b
out() produces this sequence.
b b b b A a a a a
Because out() inserts escape sequence A rather than B, result sequence doesn't
match to external sequence. A and B could have different lengths, result offset
could be wrong value too.
length() method in codecvt is right for calculating offset, but it counts
offset from the beginning of buffer. So it requires another state member
variable to hold state before conversion.
Fixes http://llvm.org/bugs/show_bug.cgi?id=13667
llvm-svn: 162601