Summary:
Detect whether the file is valid UTF-8 and, if so, count code points instead
of bytes in (hopefully) all places where the number of columns is needed. In
particular, use the new FormatToken.CodePointCount instead of TokenLength where
appropriate. Changed the BreakableToken implementations to respect UTF-8
character boundaries when in UTF-8 mode.
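The idea of counting code points and respecting character boundaries can be
illustrated with a minimal standalone sketch. It is not the code from this
patch: the helper names isContinuationByte, countCodePoints and
alignToCharBoundary are hypothetical, and the input is assumed to have been
validated as UTF-8 already.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if Byte is a UTF-8 continuation byte (10xxxxxx).
inline bool isContinuationByte(unsigned char Byte) {
  return (Byte & 0xC0) == 0x80;
}

// Counts code points by counting every byte that starts a character.
// Assumption: Text is already known to be valid UTF-8.
inline std::size_t countCodePoints(const std::string &Text) {
  std::size_t Count = 0;
  for (unsigned char Byte : Text)
    if (!isContinuationByte(Byte))
      ++Count;
  return Count;
}

// Moves a proposed split offset backwards until it no longer falls inside a
// multi-byte character, so that a break never cuts a character in half.
inline std::size_t alignToCharBoundary(const std::string &Text,
                                       std::size_t Offset) {
  if (Offset >= Text.size())
    return Text.size();
  while (Offset > 0 &&
         isContinuationByte(static_cast<unsigned char>(Text[Offset])))
    --Offset;
  return Offset;
}
```

Note that a code point count is only an approximation of the column width; it
does not account for combining or double-width characters.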
Reviewers: klimek, djasper
Reviewed By: djasper
CC: cfe-commits, rsmith, gribozavr
Differential Revision: http://llvm-reviews.chandlerc.com/D918
llvm-svn: 183312
Unify handling of whitespace when breaking protruding tokens with other
whitespace replacements.
As a side effect, the BreakableToken structure changed significantly:
- have a common base class for single-line breakable tokens, as they are
much more similar
- revamp handling of multi-line comments; we now calculate the
information about lines in multi-line comments similarly to normal
tokens, and always issue replacements
As a result, we were able to get rid of the special-casing of trailing
whitespace deletion for comments in the WhitespaceManager and the
BreakableToken, and fixed bugs related to tab handling and escaped
newlines.
llvm-svn: 182738
Instead of selectively storing some changes and directly generating
replacements for others, we now notify the WhitespaceManager of the
whitespace before every token (and optionally with more changes inside
tokens).
Then, we run over all whitespace at the very end in original source
order, where we have all information available to correctly align
comments and escaped newlines.
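The record-first, emit-last approach described above could be sketched roughly
as follows. This is an illustration under stated assumptions, not the actual
WhitespaceManager: the Change struct, the WhitespaceCollector class and its
notify/apply methods are hypothetical names, changes are assumed not to
overlap, and the alignment step itself is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record of one whitespace change (not clang-format's own type).
struct Change {
  std::size_t Offset;  // Byte offset of the original whitespace.
  std::size_t Length;  // Length of the original whitespace.
  std::string NewText; // Whitespace that should replace it.
};

class WhitespaceCollector {
public:
  // Called for the whitespace before every token, in any order.
  void notify(std::size_t Offset, std::size_t Length, std::string NewText) {
    Changes.push_back({Offset, Length, std::move(NewText)});
  }

  // Sorts the recorded changes into original source order and applies them
  // in one pass. Assumption: the changes do not overlap.
  std::string apply(const std::string &Source) {
    std::stable_sort(Changes.begin(), Changes.end(),
                     [](const Change &A, const Change &B) {
                       return A.Offset < B.Offset;
                     });
    std::string Result;
    std::size_t Pos = 0;
    for (const Change &C : Changes) {
      Result.append(Source, Pos, C.Offset - Pos); // Untouched text.
      Result += C.NewText;                        // Replacement whitespace.
      Pos = C.Offset + C.Length;
    }
    Result.append(Source, Pos, std::string::npos); // Remaining tail.
    return Result;
  }

private:
  std::vector<Change> Changes;
};
```

Because every change is known before anything is emitted, a pass over the
sorted list is the natural place to compute alignment across neighboring
lines, even if the lines were formatted out of order.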
The future direction is to pull more of the comment alignment
implementation that is now in the BreakableToken into the
WhitespaceManager.
This fixes a bug when aligning comments or escaped newlines in unwrapped
lines that are handled out of order:
  #define A \
    f({ \
      g(); \
    });
... is now laid out correctly.
llvm-svn: 182467
Summary:
Added BreakableLineComment and moved common code from BreakableBlockComment to
the newly added BreakableComment. As a side effect of the rewrite, found another
problem with escaped newlines and had to change the code that removes trailing
whitespace from line comments so that it does not break after this patch.
Reviewers: klimek, djasper
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D682
llvm-svn: 179693
Summary:
Both strings and block comments are broken into lines in
breakProtrudingToken. Logic specific to strings or block comments is abstracted
in implementations of the BreakableToken interface. Among other improvements,
this change fixes the placement of backslashes after a block comment inside a
preprocessor directive (see the removed FIXMEs in the unit tests).
The code is far from polished, and some parts of it will change to support
line comments.
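The split between a generic breaking loop and token-specific logic might look
roughly like the sketch below. It is a simplified illustration, not the
interface introduced by this change: BreakableText, BreakAtSpace, findSplit
and breakProtruding are hypothetical names, columns are approximated by bytes,
and the only splitting rule shown is "break at the last space that fits".

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical interface: each token kind decides where it may be split.
class BreakableText {
public:
  virtual ~BreakableText() = default;
  // Returns the offset of the character to break at so that the first part
  // fits into Limit columns, or std::string::npos if no break is needed or
  // possible.
  virtual std::size_t findSplit(const std::string &Text,
                                std::size_t Limit) const = 0;
};

// Example implementation: break at the last space within the limit.
class BreakAtSpace : public BreakableText {
public:
  std::size_t findSplit(const std::string &Text,
                        std::size_t Limit) const override {
    if (Text.size() <= Limit)
      return std::string::npos;
    std::size_t Space = Text.rfind(' ', Limit);
    if (Space == 0) // Avoid producing an empty first line.
      return std::string::npos;
    return Space;
  }
};

// Generic loop: knows nothing about strings or comments, only asks the
// token where to split until the remainder fits or cannot be split.
inline std::vector<std::string> breakProtruding(const BreakableText &Token,
                                                std::string Text,
                                                std::size_t Limit) {
  std::vector<std::string> Lines;
  for (;;) {
    std::size_t Split = Token.findSplit(Text, Limit);
    if (Split == std::string::npos)
      break;
    Lines.push_back(Text.substr(0, Split));
    Text = Text.substr(Split + 1); // Drop the space that was broken at.
  }
  Lines.push_back(Text);
  return Lines;
}
```

With this shape, supporting a new token kind such as line comments would only
require another implementation of the interface, while the loop stays the same.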
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D665
llvm-svn: 179526