Summary:
This patch enables comment reflowing of lines not matching the comment pragma regex
in multiline comments containing comment pragma lines. Previously, these comments
were dumped without being reindented to the result.
Reviewers: djasper, mprobst
Reviewed By: mprobst
Subscribers: klimek, mprobst, cfe-commits
Differential Revision: https://reviews.llvm.org/D30697
llvm-svn: 297261
Summary:
The comment reflower wasn't taking comment pragmas as reflow stoppers. This patch fixes that.
source:
```
// long long long long
// IWYU pragma:
```
format with column limit = 20 before:
```
// long long long
// long IWYU pragma:
```
format with column limit = 20 after:
```
// long long long
// long
// IWYU pragma:
```
Reviewers: djasper
Reviewed By: djasper
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D29450
llvm-svn: 293898
Summary:
The breaking of line comment sections was misaligning the case where the first comment line is on an unwrapped line containing newlines. In this case, the breaking column must be based on the source column of the last token that is preceded by a newline, not on the first token of the unwrapped line.
source:
```
enum A {
a, // line 1
// line 2
};
```
format before:
```
enum A {
a, // line 1
// line 2
};
```
format after:
```
enum A {
a, // line 1
// line 2
};
```
Reviewers: djasper
Reviewed By: djasper
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D29444
llvm-svn: 293891
Summary:
Comment reflower was adding untouchable tokens in case two consecutive comment lines are aligned in the source code. This disallows the whitespace manager to re-indent them later.
source:
```
int i = f(abc, // line 1
d, // line 2
// line 3
b);
```
Since line 2 and line 3 are aligned, the reflower was marking line 3 as untouchable; however the three comment lines need to be re-aligned.
output before:
```
int i = f(abc, // line 1
d, // line 2
// line 3
b);
```
output after:
```
int i = f(abc, // line 1
d, // line 2
// line 3
b);
```
Reviewers: djasper
Reviewed By: djasper
Subscribers: sammccall, cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D29383
llvm-svn: 293755
Summary:
The reflower didn't measure precisely the line column of a line in the middle of
a line comment section that has a prefix that needs to be adapted.
source:
```
/// a
//b
```
format before:
```
/// a
//b
```
format after:
```
/// a
// b
```
Reviewers: djasper
Reviewed By: djasper
Subscribers: sammccall, cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D29329
llvm-svn: 293641
Summary:
The reflower was not taking into account the additional leading whitespace in block comment lines.
source:
```
{
/*
* long long long long
* long
* long long long long
*/
}
```
format (with column limit 20) before:
```
{
/*
* long long long
* long long long long
* long long
*/
}
```
format after:
```
{
/*
* long long long
* long long long
* long long long
*/
}
```
Reviewers: djasper, klimek
Reviewed By: djasper
Subscribers: cfe-commits, sammccall, klimek
Differential Revision: https://reviews.llvm.org/D29326
llvm-svn: 293633
Summary:
This fixes a regression that causes example:
```
enum A {
a, // line a
// line b
b
};
```
to be formatted as follows:
```
enum A {
a, // line a
// line b
b
};
```
Reviewers: djasper, klimek
Reviewed By: klimek
Subscribers: cfe-commits, sammccall, djasper, klimek
Differential Revision: https://reviews.llvm.org/D29322
llvm-svn: 293624
Summary:
This patch stops reflowing comment lines starting with '@', since they commonly
have a special meaning.
Reviewers: djasper
Reviewed By: djasper
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D29323
llvm-svn: 293617
The main motivation behind this is to cleanup the WhitespaceManager and
make it more extensible for future alignment etc. features.
Specifically, WhitespaceManager has started to copy more and more code
that is already present in FormatToken. Instead, I think it makes more
sense to actually store a reference to each FormatToken for each change.
This has as a consequence led to a change in the calculation of indent
levels. Now, we actually compute them for each Token ahead of time,
which should be more efficient as it removes an unsigned value for the
ParenState, which is used during the combinatorial exploration of the
solution space.
No functional changes intended.
Review: https://reviews.llvm.org/D29300
llvm-svn: 293616
Summary:
Consider formatting the following code fragment with column limit 20:
```
{
// line 1
// line 2\
// long long long line
}
```
Before this fix the output is:
```
{
// line 1
// line 2\
// long long
long line
}
```
This patch fixes a regression that breaks the last comment line without
adding the '//' prefix.
Reviewers: djasper
Reviewed By: djasper
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D29298
llvm-svn: 293548
Summary:
This presents a version of the comment reflowing with less mutable state inside
the comment breakable token subclasses. The state has been pushed into the
driving breakProtrudingToken method. For this, the API of BreakableToken is enriched
by the methods getSplitBefore and getLineLengthAfterSplitBefore.
Reviewers: klimek
Reviewed By: klimek
Subscribers: djasper, klimek, mgorny, cfe-commits, ioeric
Differential Revision: https://reviews.llvm.org/D28764
llvm-svn: 293055
Summary:
Introduces a separate target for comment manipulation.
Currently, comment manipulation is in BreakableComment.cpp.
Towards implementing comment reflowing, we want to factor out the
comment-related functionality, so it can be reused.
Start simple by just moving out getLineCommentIndentPrefix.
Patch by Krasimir Georgiev!
Reviewers: djasper
Subscribers: klimek, beanz, mgorny, modocache
Differential Revision: https://reviews.llvm.org/D25725
llvm-svn: 284573
Splitting:
/**
* multiline block comment
*
*/
Before:
/**
* multiline block
*comment
*
*/
After:
/**
* multiline block
* comment
*
*/
The reason was that the empty line inside the comment (with just the "*") was
confusing the comment breaking logic.
llvm-svn: 236573
definition below all of the header #include lines, clang edition.
If you want more details about this, you can see some of the commits to
Debug.h in LLVM recently. This is just the clang section of a cleanup
I've done for all uses of DEBUG_TYPE in LLVM.
llvm-svn: 206849
Summary:
This patch ensures that the lines of the block comments retain relative
column offsets. In order to do this WhitespaceManager::Changes representing
continuation of block comments keep a pointer on the change representing the
whitespace change before the block comment, and a relative column offset to this
change, so that the correct column can be reconstructed at the end of alignment
process.
Fixes http://llvm.org/PR19325
Reviewers: djasper
Reviewed By: djasper
CC: cfe-commits, klimek
Differential Revision: http://reviews.llvm.org/D3408
llvm-svn: 206472
We cannot simply change the start column to accomodate for the @ in an
ObjC string literal as that will make clang-format happily violate the
column limit.
Use a different workaround instead. However, a better long-term
solution might be to join the @ and the rest of the literal into a
single token.
llvm-svn: 199198
While it is allowed to not have an @ on subsequent lines, it seems
general practice to add them. If undesired, the code author can easily
remove them again and clang-format won't re-add them.
llvm-svn: 198871
Summary:
getStringSplit used to crash, when trying to split a long string
literal containing both printable and unprintable multi-byte UTF-8 characters.
Reviewers: djasper, klimek
Reviewed By: djasper
CC: cfe-commits, klimek
Differential Revision: http://llvm-reviews.chandlerc.com/D2268
llvm-svn: 195728
Summary:
Changed UseTab to be a enum with three options: Never, Always,
ForIndentation (true/false are still supported when reading .clang-format).
IndentLevel should currently be propagated correctly for all tokens, except for
block comments. Please take a look at the general idea before I start dealing
with block comments.
Reviewers: klimek, djasper
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1770
llvm-svn: 191527
Summary:
This fixes various issues with mixed tabs and spaces handling, e.g.
when realigning block comments.
Reviewers: klimek, djasper
Reviewed By: djasper
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1608
llvm-svn: 190395
Summary:
Count column width instead of the number of code points. This also
includes correct handling of tabs inside string literals and comments (with an
exception of multiline string literals/comments, where tabs are present before
the first escaped newline).
Reviewers: djasper, klimek
Reviewed By: klimek
CC: cfe-commits, klimek
Differential Revision: http://llvm-reviews.chandlerc.com/D1601
llvm-svn: 190052
Summary:
Fixes problems that lead to incorrect formatting of these and similar snippets:
/*
**
*/
/*
**/
/*
* */
/*
*test
*/
Clang-format used to think that all the cases above use "* " decoration, and
failed to calculate insertion position properly. It also used to remove leading
"* " in the last line.
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1113
llvm-svn: 185818
Summary:
Fixes a problem where \t,\v or \f could lead to a crash when placed as
a first character in a line comment. The cause is that rtrim and ltrim handle
these characters, but our code didn't, so some invariants could be broken.
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1013
llvm-svn: 184425
Summary: Split strings at word boundaries, when there are no spaces and slashes.
Reviewers: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1003
llvm-svn: 184304
Summary:
Don't remove backslashes from block comments. Previously this
/* \ \ \ \ \ \
*/
would be turned to this:
/*
*/
which spoils some kinds of ASCII-art, people use in their comments. The behavior
was related to handling escaped newlines in block comments inside preprocessor
directives. This patch makes handling it in a more civilized way.
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D979
llvm-svn: 183978
Summary:
"//Test" becomes "// Test". This change is aimed to improve code
readability and conformance to certain coding styles. If a comment starts with a
non-alphanumeric character, the space isn't added, e.g. "//-*-c++-*-" stays
unchanged.
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D949
llvm-svn: 183750
Summary:
Introduced two new style parameters: PenaltyBreakComment and
PenaltyBreakString. Add penalty for each character of a breakable token beyond
the column limit (this relates mainly to comments, as they are broken only on
whitespace). Tuned PenaltyBreakComment to prefer comment breaking over breaking
inside most binary expressions.
Fixed a bug that prevented *, & and && from being considered TT_BinaryOperator
in the presense of adjacent comments.
Reviewers: klimek, djasper
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D933
llvm-svn: 183530
Summary:
Detect if the file is valid UTF-8, and if this is the case, count code
points instead of just using number of bytes in all (hopefully) places, where
number of columns is needed. In particular, use the new
FormatToken.CodePointCount instead of TokenLength where appropriate.
Changed BreakableToken implementations to respect utf-8 character boundaries
when in utf-8 mode.
Reviewers: klimek, djasper
Reviewed By: djasper
CC: cfe-commits, rsmith, gribozavr
Differential Revision: http://llvm-reviews.chandlerc.com/D918
llvm-svn: 183312
When trying to fall back to search from the end onwards, we
would still find leading whitespace if the leading whitespace
went on after the end of the line.
llvm-svn: 182886
To fully support this, we also need to expand tabs in the text before
the block comment. This patch breaks indentation when there was a
non-standard mixture of spaces and tabs used for indentation, but
fixes a regression in the simple case:
{
/*
* Comment.
*/
int i;
}
Is now formatted correctly, if there were tabs used for indentation
before.
llvm-svn: 182760
Block comment indentation of empty lines regressed, as we did not
have a test for it.
/* Comment with...
*
* empty line. */
is now formatted correctly again.
llvm-svn: 182757
Unify handling of whitespace when breaking protruding tokens with other
whitespace replacements.
As a side effect, the BreakableToken structure changed significantly:
- have a common base class for single-line breakable tokens, as they are
much more similar
- revamp handling of multi-line comments; we now calculate the
information about lines in multi-line comments similar to normal
tokens, and always issue replacements
As a result, we were able to get rid of special casing of trailing
whitespace deletion for comments in the whitespace manager and the
BreakableToken and fixed bugs related to tab handling and escaped
newlines.
llvm-svn: 182738
Instead of selectively storing some changes and directly generating
replacements for others, we now notify the WhitespaceManager of the
whitespace before every token (and optionally with more changes inside
tokens).
Then, we run over all whitespace in the very end in original source
order, where we have all information available to correctly align
comments and escaped newlines.
The future direction is to pull more of the comment alignment
implementation that is now in the BreakableToken into the
WhitespaceManager.
This fixes a bug when aligning comments or escaped newlines in unwrapped
lines that are handled out of order:
#define A \
f({ \
g(); \
});
... now gets correctly layouted.
llvm-svn: 182467
Summary:
Added BreakableLineComment, moved common code from
BreakableBlockComment to newly added BreakableComment. As a side-effect of the
rewrite, found another problem with escaped newlines and had to change
code which removes trailing whitespace from line comments not to break after
this patch.
Reviewers: klimek, djasper
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D682
llvm-svn: 179693
Summary:
Both strings and block comments are broken into lines in
breakProtrudingToken. Logic specific for strings or block comments is abstracted
in implementations of the BreakToken interface. Among other goodness, this
change fixes placement of backslashes after a block comment inside a
preprocessor directive (see removed FIXMEs in unit tests).
The code is far from being polished, and some parts of it will be changed for
line comments support.
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D665
llvm-svn: 179526