Summary:
Count column width instead of the number of code points. This also
includes correct handling of tabs inside string literals and comments (with an
exception of multiline string literals/comments, where tabs are present before
the first escaped newline).
Reviewers: djasper, klimek
Reviewed By: klimek
CC: cfe-commits, klimek
Differential Revision: http://llvm-reviews.chandlerc.com/D1601
llvm-svn: 190052
This patch makes sure we produce the right number of unwrapped lines,
a follow-up patch will make the whitespace formatting consistent.
Before:
void f() {
int i = {[operation setCompletionBlock : ^{ [self onOperationDone];
}]
}
;
}
After:
void f() {
int i = {[operation setCompletionBlock : ^{
[self onOperationDone];
}] };
}
llvm-svn: 189932
Implements parsing of lambdas in the UnwrappedLineParser.
This introduces the correct line breaks; the formatting of
lambda captures are still incorrect, and the braces are also
still formatted as if they were braced init lists instead of
blocks.
llvm-svn: 189818
Summary:
Store first and last newline position in the token text for string literals and
comments to avoid doing .find('\n') for each possible solution.
Reviewers: djasper
Reviewed By: djasper
CC: cfe-commits, klimek
Differential Revision: http://llvm-reviews.chandlerc.com/D1556
llvm-svn: 189758
Almost by accident, clang-format seems to be able to format protocol
buffer definitions (https://code.google.com/p/protobuf/).
The only change is that a space is required between numeric constants
and opening square brackets (for default values). While this might in
theory be used for array subscripts (int val = 4[MyArray]), I have not
seen this pattern in practice much. If this is wrong, we can make this
smarter in the future.
llvm-svn: 189663
Before:
aaaaaaaaaa(bbbbbbbbbbbbbbbbbbbbbbbbb.ccccccccccccccccc(
dddddddddddddddddddddddddddddd));
aaaaaaaaaaa(bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb(
dddddddddddddddddddddddddddddd));
After:
aaaaaaaaaaa(bbbbbbbbbbbbbbbbbbbbbbbbb.ccccccccccccccccc(
dddddddddddddddddddddddddddddd));
aaaaaaaaaaa(bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb(
dddddddddddddddddddddddddddddd));
This was overlooked when interducing the new builder-type call
detection in r189337. Also, some minor reorganization of a test.
llvm-svn: 189658
While this looks kind of nice, it wastes horizontal space and does not
seem to be common in the LLVM codebase.
Before:
return llvm::StringSwitch<Reference::Kind>(name)
.StartsWith(".eh_frame_hdr", ORDER_EH_FRAMEHDR)
.StartsWith(".eh_frame", ORDER_EH_FRAME)
.StartsWith(".init", ORDER_INIT)
.StartsWith(".fini", ORDER_FINI)
.StartsWith(".hash", ORDER_HASH)
.Default(ORDER_TEXT);
After:
return llvm::StringSwitch<Reference::Kind>(name)
.StartsWith(".eh_frame_hdr", ORDER_EH_FRAMEHDR)
.StartsWith(".eh_frame", ORDER_EH_FRAME)
.StartsWith(".init", ORDER_INIT)
.StartsWith(".fini", ORDER_FINI)
.StartsWith(".hash", ORDER_HASH)
.Default(ORDER_TEXT);
llvm-svn: 189657
Summary:
Calculate characters in the first and the last line correctly so that
we only break before the literal when needed.
Reviewers: djasper
Reviewed By: djasper
CC: cfe-commits, klimek
Differential Revision: http://llvm-reviews.chandlerc.com/D1544
llvm-svn: 189595
We now count the original token's column directly when lexing the
tokens, where we already have all knowledge about where lines start.
Before this patch, formatting:
void f() {
\tg();
\th();
}
would incorrectly count the \t's as 1 character if only the line
containing h() was reformatted, and thus indent h() at offset 1.
llvm-svn: 189585
Two changes:
* Don't add an extra penalty on breaking the same token multiple times.
Generally, we should prefer not to break, but once we break, the
normal line breaking penalties apply.
* Slightly increase the penalty for breaking comments. In general, the
author has put some thought into how to break the comment and we
should not overwrite this unnecessarily.
With a 40-column column limit, formatting
aaaaaa("aaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaa");
Leads to:
Before:
aaaaaa(
"aaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaa "
"aaaaaaaaaaaaaaaa");
After:
aaaaaa("aaaaaaaaaaaaaaaa "
"aaaaaaaaaaaaaaaa "
"aaaaaaaaaaaaaaaa");
llvm-svn: 189466
The code leading to a segfault was:
#pragma omp threadprivate(y)), // long comment leading to a line break
This fixes llvm.org/PR16513.
llvm-svn: 189460
If escaped newlines are aligned right
(FormatStyle.AlignEscapedNewlinesLeft == false), and a line contained
too many characters to fit into the column limit, this would result in
a (virtually) endless loop creating a negative number of spaces.
Instead, allow the escaped newlines to be pushed past the column limit
in this case.
This fixes llvm.org/PR16515.
llvm-svn: 189459
In
@implementation ObjcClass
- (void)method;
{
}
@end
the ObjC compiler seems to accept the superfluous comma after "method",
but clang-format used to assert on the subsequent "{".
This fixes llvm.org/PR16604.
llvm-svn: 189453
Specific arrangements of comments after trailing commas could confuse
the column width calculation, e.g. in:
vector<int> x = { a, b,
/* some */ /* comment */ };
llvm-svn: 189211
This should be done, only if we are still in the unary expression's
scope.
Before:
bool aaaa = !aaaaaaaa( // break
aaaaaaaaaaa);
*aaaaaa = aaaaaaa( // break
aaaaaaaaaaaaaaaa);
After:
bool aaaa = !aaaaaaaa( // break
aaaaaaaaaaa); // <- (unchanged)
*aaaaaa = aaaaaaa( // break
aaaaaaaaaaaaaaaa); // <- (no longer indented relative to "*")
llvm-svn: 189108
.. in conjunction with Style.AlwaysBreakBeforeMultilineStrings. Also,
simplify the implementation by handling newly split strings and already
split strings by the same code.
llvm-svn: 189102
Before, this was causing errors.
Also exit early in breakProtrudingToken() (before the expensive call to
SourceManager::getSpellingColumnNumber()). This makes formatting huge
(100k+-item) braced lists possible.
llvm-svn: 189094
With this patch, braced lists (with more than 3 elements are formatted in a
column layout if possible). E.g.:
static const uint16_t CallerSavedRegs64Bit[] = {
X86::RAX, X86::RDX, X86::RCX, X86::RSI, X86::RDI,
X86::R8, X86::R9, X86::R10, X86::R11, 0
};
Required other changes:
- FormatTokens can now have a special role that contains extra data and can do
special formattings. A comma separated list is currently the only
implementation.
- Move penalty calculation entirely into ContinuationIndenter (there was a last
piece still in UnwrappedLineFormatter).
Review: http://llvm-reviews.chandlerc.com/D1457
llvm-svn: 189018
Before:
if (!aaaaaaaaaa( // break
aaaaa)) {
}
After:
if (!aaaaaaaaaa( // break
aaaaa)) {
}
Also cleaned up formatting using clang-format.
llvm-svn: 188891
This patch adds four new options to control:
- Spaces after control keyworks (if(..) vs if (..))
- Spaces in empty parentheses (f( ) vs f())
- Spaces in c-style casts (( int )1.0 vs (int)1.0)
- Spaces in other parentheses (f(a) vs f( a ))
Patch by Joe Hermaszewski. Thank you for working on this!
llvm-svn: 188793
Goals: Structure code better and make components easier to use for
future features (e.g. column layout for long braced initializers).
No functional changes intended.
llvm-svn: 188543
Some coding styles use a different indent for constructor initializers.
Patch by Klemens Baum. Thank you.
Review: http://llvm-reviews.chandlerc.com/D1360
Post review changes: Changed data type to unsigned as a negative indent
width does not make sense and added test for configuration parsing.
llvm-svn: 188260
Previously these were formatting as catch (E & e) because the inner parenthesis
was being marked as an expression.
Patch by Thomas Gibson-Robinson.
llvm-svn: 188153
In particular, left braces after an enum declaration now occur on their
own line. Further, when short ifs/whiles are allowed these no longer
cause the left brace to be on the same line as the if/while when a
brace is included.
Patch by Thomas Gibson-Robinson.
llvm-svn: 187901
This removes a formatting choice that was added at one point, but is
not generally liked by users. Specifically, in builder-type calls, do
(easily) break if the object before the ./-> is either a field or a
parameter-less function call. I.e., don't break after "aa.aa.aa" or
"aa.aa.aa()". In general, these sequences in builder-type calls are
seen as a single entity and thus breaking them up is a bad idea.
llvm-svn: 187865
Before, clang-format would not break overly long string literals
following a "<<" with FormatStyle::AlwaysBreakBeforeMultilineStrings
being set.
llvm-svn: 187650
Before:
template <bool B, bool C>
class A {
static_assert(B &&C, "Something is wrong");
};
After:
template <bool B, bool C>
class A {
static_assert(B && C, "Something is wrong");
};
(Note the spacing around '&&'). Also change the identifier table to always
understand all C++11 keywords (which seems like the right thing to do).
llvm-svn: 187589
With this patch, clang-format can be configured to:
* not indent in namespace at all (former behavior).
* indent in namespace as in other blocks.
* indent only in inner namespaces (as required by WebKit style).
Also fix alignment of access specifiers in WebKit style.
Patch started by Marek Kurdej. Thank you!
llvm-svn: 187540
New options:
* Break before the commas of constructor initializers and align
the commas with the colon.
* Break before binary operators
Additionally, for styles without column limit, don't just accept
linebreaks done by the user, but instead remove 'invalid' (according
to the current style) linebreaks and add 'required' ones.
llvm-svn: 187210
This is far from implementing all the rules given by
http://www.webkit.org/coding/coding-style.html
The important new feature is the support for styles that don't have a
column limit. For such styles, clang-format will (at the moment) simply
respect the input's formatting decisions within statements.
llvm-svn: 187033
The AlwaysBreakBeforeMultilineStrings rule does not really make sense
if it does not a column gain.
Before (in Google style):
f(
"aaaa"
"bbbb");
After:
f("aaaa"
"bbbb");
llvm-svn: 186515
Motivating example:
// column limit ------------------->
void ffffffffffff(int aaaaaa /* test */);
Formatting before the patch:
void ffffffffffff(int aaaaaa /* test
*/);
Formatting after the patch:
void
ffffffffffff(int aaaaaa /* test */);
llvm-svn: 186471
Summary:
These can appear when comments contain command lines with quoted line
breaks. As the text (including escaped newlines and '//' from consecutive lines)
is a single line comment, we used to break it even when it didn't exceed column
limit. This is a temporary solution, in the future we may want to support this
case completely - at least adjust leading whitespace when changing indentation
of the first line.
Reviewers: djasper
Reviewed By: djasper
CC: cfe-commits, klimek
Differential Revision: http://llvm-reviews.chandlerc.com/D1146
llvm-svn: 186456
The fundamental concept is:
Format as if the braced init list was a function call (with parentheses
replaced by braces). If there is no name/type before the opening brace
(e.g. if the braced list is nested), assume a zero-length identifier
just before the opening brace.
This behavior is gated on a new style flag, which for now replaces the
SpacesInBracedLists style flag. Activate this style flag for Google
style to reflect recent style guide changes.
llvm-svn: 186433
This fixes an incorrect detection that led to a formatting error.
Before:
some_var = function (*some_pointer_var)[0];
After:
some_var = function(*some_pointer_var)[0];
llvm-svn: 186402
Fixed a test that by now passed for the wrong reason.
Before:
llvm::outs() << "aaaaaaaaaaaaaaaaaaa: " << aaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaaaaaa);
After:
llvm::outs() << "aaaaaaaaaaaaaaaaaaa: "
<< aaaaaaaaaaaaa(aaaaaaaaaaaaaaaaaaaaaaaaaaaa);
Also reformatted Format.cpp with the latest changes (1 formatting fix
and 1 layout change of a <<-chain).
llvm-svn: 186322
Before this patch, it did not cooperate with
Style::AlwaysBreakBeforeMultilineStrings. Thus, it would turn
aaaaaaaaaaaa(aaaaaaaaaaaaa, "aaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaa");
into:
aaaaaaaaaaaa(aaaaaaaaaaaaa, "aaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaa "
"aaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaa");
and only a second format step would lead to the desired (with that
option):
aaaaaaaaaaaa(aaaaaaaaaaaaa,
"aaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaa "
"aaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaa");
This could even lead to clang-format breaking the string at a different
character and thus leading to a completely different end result.
llvm-svn: 186154
clang-format used to treat array subscript expressions much like
function call (just replacing () with []). However, this is not really
appropriate especially for expressions with multiple subscripts.
Although it might seem counter-intuitive, the most consistent solution
seems to be to always (if necessary) break before a square bracket,
never after it. Also, multiple subscripts of the same expression should
be aligned if they are on subsequent lines.
Before:
aaaaaaaaaaaaaaaaaaaaaaaaa[aaaaaaaaaaaaaaaaaaaaaaaaa][
bbbbbbbbbbbbbbbbbbbbbbbbb] = c;
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa[
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa][
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb] = ccccccccccc;
After:
aaaaaaaaaaaaaaaaaaaaaaaaa[aaaaaaaaaaaaaaaaaaaaaaaaa]
[bbbbbbbbbbbbbbbbbbbbbbbbb] = c;
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
[aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa]
[bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb] = ccccccccccc;
llvm-svn: 186153
Before:
int i; // indented 2 space more than clang-format would use.
SomeReturnType // clang-format invoked on this line.
SomeFunctionMakingLBraceEndInColumn80() {
} // This is the indent clang-format would prefer.
After:
int i; // indented 2 space more than clang-format would use.
SomeReturnType // clang-format invoked on this line.
SomeFunctionMakingLBraceEndInColumn80() {
}
llvm-svn: 186120
(if they are not function-like).
Before:
SomeFunction(aaaaaaaaaaaaaaaaaaaaaa, aaaaaaaaaaaaaaaaaaaaaaaa)
OVERRIDE;
After:
SomeFunction(aaaaaaaaaaaaaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaaaaaa) OVERRIDE;
llvm-svn: 186117
This puts a slight penalty on the linebreak before the first "<<", so
that clang-format generally tries to keep things on the first line.
User feedback has shown that this is generally desirable.
Before:
llvm::outs()
<< "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa =" << aaaaaaaaaaaaaaaaaaaaaaaaaaa;
After:
llvm::outs() << "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ="
<< aaaaaaaaaaaaaaaaaaaaaaaaaaa;
llvm-svn: 186115
Trailing return types can only occur in declaration contexts.
Before:
void f() { auto a = b -> c(); }
After:
void f() { auto a = b->c(); }
llvm-svn: 186087
This is not activated for any style, might change or go away
completely.
For those that want to play around with it, set
ExperimentalAutoDetectBinPacking to true.
clang-format will then:
Look at whether function calls/declarations/definitions are currently
formatted with one parameter per line (on a case-by-case basis). If so,
clang-format will avoid bin-packing the parameters. If all parameters
are on one line (thus that line is "inconclusive"), clang-format will
make the choice dependent on whether there are other bin-packed
calls/declarations in the same file.
The reason for this change is that bin-packing in some situations can be
really bad and an author might opt to put one parameter on each line. If
the author does that, he might want clang-format not to mess with that.
If the author is unhappy with the one-per-line formatting, clang-format
can easily be convinced to bin-pack by putting any two parameters on the
same line.
llvm-svn: 186003
This fixes llvm.org/PR15170.
For now, the basic formatting rules are (based on the C++11 standard):
* Surround the "->" with spaces.
* Break before "->".
Also fix typo.
llvm-svn: 185938
Basically treat a function with a trailing call similar to a function
with multiple parameters.
Before:
aaaaaaaaaaaaaaaaaaaa(aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaa, aaaaaaaaaaaaaaaaa))
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa();
After:
aaaaaaaaaaaaaaaaaaaa(aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaa, aaaaaaaaaaaaaaaaa))
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa();
Also fix typo.
llvm-svn: 185930
Before:
someFunction(OtherParam, BracedList{
// comment 1 (Forcing intersting break)
param1, param2,
// comment 2
param3, param4
});
After:
someFunction(OtherParam, BracedList{
// comment 1 (Forcing intersting break)
param1, param2,
// comment 2
param3, param4
});
To do so, the UnwrappedLineParser now stores the information about the
kind of brace in the FormatToken.
llvm-svn: 185914
This adds a penalty for clang-format for each break that occurs in
a set of parentheses (including fake parenthesis that determine
the range of certain operator precendences) that have not yet been
broken. Thereby, clang-format prefers similar line breaks.
This fixes llvm.org/PR15506.
Before:
const int kTrackingOptions =
NSTrackingMouseMoved | NSTrackingMouseEnteredAndExited |
NSTrackingActiveAlways;
After:
const int kTrackingOptions = NSTrackingMouseMoved |
NSTrackingMouseEnteredAndExited |
NSTrackingActiveAlways;
Also removed ParenState::ForFakeParenthesis which has become unused.
llvm-svn: 185822
Summary:
Fixes problems that lead to incorrect formatting of these and similar snippets:
/*
**
*/
/*
**/
/*
* */
/*
*test
*/
Clang-format used to think that all the cases above use "* " decoration, and
failed to calculate insertion position properly. It also used to remove leading
"* " in the last line.
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1113
llvm-svn: 185818
This is a better implementation of r183097. The main purpose is to
prevent certain constructs to be formatted "like a block of text".
Before:
aaaaaaaaaaaaa<
aaaaaaaaaa, aaaaaaaaaaa, aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa>* aaaa = new aaaaaaaaaaaaa<
aaaaaaaaaa, aaaaaaaaaaa, aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa>(bbbbbbbbbbbbbbbbbbbbbbbb);
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa[
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb] = (*cccccccccccccccc)[
dddddddddddddddddddddddddddddddddddddddddddddddddddddddd];
After:
aaaaaaaaaaaaa<aaaaaaaaaa, aaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa>* aaaa =
new aaaaaaaaaaaaa<aaaaaaaaaa, aaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa>(
bbbbbbbbbbbbbbbbbbbbbbbb);
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa[
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb] =
(*cccccccccccccccc)[
dddddddddddddddddddddddddddddddddddddddddddddddddddddddd];
llvm-svn: 185687
Additionally, allow breaking after c-style casts, but with a high
penalty.
Before:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa *foo = (
aaaaaaaaaaaaaaaaa *)bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb;
After:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa *foo = (aaaaaaaaaaaaaaaaa *)
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb;
This fixes llvm.org/PR16049.
llvm-svn: 185685
Summary:
Always breaking before multiline strings can help format complex
expressions containing multiline strings more consistently, and avoid consuming
too much horizontal space.
Reviewers: djasper
Reviewed By: djasper
CC: cfe-commits, klimek
Differential Revision: http://llvm-reviews.chandlerc.com/D1097
llvm-svn: 185622
In general, clang-format breaks after an operator if the LHS spans
multiple lines. Otherwise, this can lead to confusing effects and
effectively hide the operator precendence, e.g. in
if (aaaaaaaaaaaaaa ==
bbbbbbbbbbbbbb && c) { ...
This patch removes this rule for comparisons, if the LHS is not a binary
expression itself as many users were wondering why clang-format inserts
an unnecessary linebreak.
Before:
if (aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa) >
5) { ...
After:
if (aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa) > 5) { ...
In the long run, we might:
- Want to do this for other binary expressions as well.
- Do this only if the RHS is short or even only if it is a literal.
llvm-svn: 185530
This lead to weird formatting.
Before:
DoSomethingWithVector({ {} /* No data */ }, {
{ 1, 2 }
});
After:
DoSomethingWithVector({ {} /* No data */ }, { { 1, 2 } });
llvm-svn: 185346
Summary:
Add penalty when an excessively long line in a block comment can not be
broken on a leading whitespace. Lack of this addition can lead to severe column
width violations when they can be easily avoided.
Reviewers: djasper
Reviewed By: djasper
CC: cfe-commits, klimek
Differential Revision: http://llvm-reviews.chandlerc.com/D1071
llvm-svn: 185337
This is not all bad, but people are often surprised by it.
Before:
namespace {
int SomeVariable = 0; // comment
} // namespace
After:
namespace {
int SomeVariable = 0; // comment
} // namespace
llvm-svn: 185327
Before: void f(int */* unused */) {}
After: void f(int * /* unused */) {}
The previous version seems to be valid C++ code but confuses many syntax
highlighters.
llvm-svn: 185320
Summary:
Some valid pre-C++11 constructs change meaning when lexed in C++11
mode, e.g.
#define x(_a) printf("foo"_a);
(example from http://llvm.org/bugs/show_bug.cgi?id=16342). "foo"_a is treated as
a user-defined string literal when parsed in C++11 mode.
In order to deal with this correctly, we need to set lexing mode according to
which standard the code conforms to. We already have a configuration value for
this (FormatStyle.Standard), which seems to be appropriate to use in this case
as well.
Reviewers: klimek
CC: cfe-commits, gribozavr
Differential Revision: http://llvm-reviews.chandlerc.com/D1028
llvm-svn: 185149
Summary:
Fixes a problem where \t,\v or \f could lead to a crash when placed as
a first character in a line comment. The cause is that rtrim and ltrim handle
these characters, but our code didn't, so some invariants could be broken.
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1013
llvm-svn: 184425
Summary:
A trailing block comment having multiple lines would cause extremely
high penalties if the summary length of its lines is more than the column limit.
Fixed by always considering only the last line of a multi-line block comment.
Removed a long-standing FIXME from relevant tests and added a motivating test
modelled after problem cases from real code.
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1010
llvm-svn: 184340
Summary: Split strings at word boundaries, when there are no spaces and slashes.
Reviewers: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1003
llvm-svn: 184304
Summary:
E.g. the second line in
return aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa +
b; //
is indented 4 characters more than in
return aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa +
b;
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D984
llvm-svn: 184078
Summary:
Selectively propagate the information about token kind in
WhitespaceManager::replaceWhitespaceInToken.For correct alignment of new
segments of line comments in order to align them correctly. Don't set
BreakBeforeParameter in breakProtrudingToken for line comments, as it introduces
a break after the _next_ parameter. Added tests for related functions.
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D980
llvm-svn: 184076
Summary:
Don't remove backslashes from block comments. Previously this
/* \ \ \ \ \ \
*/
would be turned to this:
/*
*/
which spoils some kinds of ASCII-art, people use in their comments. The behavior
was related to handling escaped newlines in block comments inside preprocessor
directives. This patch makes handling it in a more civilized way.
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D979
llvm-svn: 183978
Summary:
Basically, don't special-case line comments in this regard. And fixed
an incorrect test, that relied on the wrong behavior.
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D962
llvm-svn: 183851
Summary:
"//Test" becomes "// Test". This change is aimed to improve code
readability and conformance to certain coding styles. If a comment starts with a
non-alphanumeric character, the space isn't added, e.g. "//-*-c++-*-" stays
unchanged.
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D949
llvm-svn: 183750
Summary: Remove them from the TokenText as well.
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D935
llvm-svn: 183536
Summary:
Introduced two new style parameters: PenaltyBreakComment and
PenaltyBreakString. Add penalty for each character of a breakable token beyond
the column limit (this relates mainly to comments, as they are broken only on
whitespace). Tuned PenaltyBreakComment to prefer comment breaking over breaking
inside most binary expressions.
Fixed a bug that prevented *, & and && from being considered TT_BinaryOperator
in the presense of adjacent comments.
Reviewers: klimek, djasper
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D933
llvm-svn: 183530
Before, clang-format would happily move a trailing block comment to a
new line, which normally changes the perceived binding of that comment.
E.g., it would move:
void f() { /* comment */
...
}
to:
void f() {
/* comment */
...
}
llvm-svn: 183420
The leading "}" in the construct "} else if (..) {" was confusing the
expression parser. Thus, no fake parentheses were generated and the
indentation was broken in some cases.
llvm-svn: 183393
Summary:
Detect if the file is valid UTF-8, and if this is the case, count code
points instead of just using number of bytes in all (hopefully) places, where
number of columns is needed. In particular, use the new
FormatToken.CodePointCount instead of TokenLength where appropriate.
Changed BreakableToken implementations to respect utf-8 character boundaries
when in utf-8 mode.
Reviewers: klimek, djasper
Reviewed By: djasper
CC: cfe-commits, rsmith, gribozavr
Differential Revision: http://llvm-reviews.chandlerc.com/D918
llvm-svn: 183312
An oversight in this detection made clang-format unable to format
the following nicely:
void aaaaaaaaaaaaaaaaaaa<aaaaaaaaaaaaaaaaaaaaaaaaaaa,
bbbbbbbbbbbbbbbbbbbbbbbbbb>(
cccccccccccccccccccccccccccc);
llvm-svn: 183097
Before, clang-format would not find a solution for formatting:
if ((aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ||
bbbbbbbbbbbbbbbbbb) && // aaaaaaaaaaaaaaaa
cccccc) {
}
llvm-svn: 183096
If a "}" is found inside parenthesis, this is probably a case of
missing parenthesis. This enables continuing to format after stuff code
like:
class A {
void f(
};
..
llvm-svn: 183009
With this patch, the simplified rule is:
If the block is part of a declaration (class, namespace, function,
enum, ..), merge an empty block onto a single line. Otherwise
(specifically for the compound statements of if, for, while, ...),
keep the braces on two separate lines.
The reasons are:
- Mostly the formatting of empty blocks does not matter much.
- Empty compound statements are really rare and are usually just
inserted while still working on the code. If they are on two lines,
inserting code is easier. Also, overlooking the "{}" of an
"if (...) {}" can be really bad.
- Empty declarations are not uncommon, e.g. empty constructors. Putting
them on one line saves vertical space at no loss of readability.
llvm-svn: 183008
When trying to fall back to search from the end onwards, we
would still find leading whitespace if the leading whitespace
went on after the end of the line.
llvm-svn: 182886
Now that the TokenAnnotator does not require stack space anymore,
reconstructing the lines has become the limiting factor. This patch
fixes that problem, allowing large files with multiple megabytes of
single unwrapped lines to be formatted.
llvm-svn: 182861
Gets rid of AnnotatedToken, putting everything into FormatToken.
FormatTokens are created once, and only referenced by pointer. This
enables multiple future features, like having tokens shared between
multiple UnwrappedLines (while there's still work to do to fully enable
that).
llvm-svn: 182859
If an identifier is on its own line and it is all upper case, it is highly
likely that this is a macro that is meant to stand on a line by itself.
Before:
class A : public QObject {
Q_OBJECT A() {}
};
Ater:
class A : public QObject {
Q_OBJECT
A() {}
};
llvm-svn: 182855
With option enabled (e.g. in Google-style):
template <typename T>
void f() {}
With option disabled:
template <typename T> void f() {}
Enabling this for Google-style and Chromium-style, not sure which other
styles would prefer that.
llvm-svn: 182849
This made it necessary to remove an error detection which would let us
bail out of braced lists in certain situations of missing "}". However,
as we always entirely escape from the braced list on finding ";", this
should not be a big problem.
With this, we can no format braced lists with uniformat inits:
return { arg1, SomeType { parameter } };
llvm-svn: 182788
With this patch, we create all tokens in one go before parsing and pass
an ArrayRef<FormatToken*> to the UnwrappedLineParser. The
UnwrappedLineParser is switched to use pointer-to-token internally.
The UnwrappedLineParser still copies the tokens into the UnwrappedLines.
This will be fixed in an upcoming patch.
llvm-svn: 182768
To fully support this, we also need to expand tabs in the text before
the block comment. This patch breaks indentation when there was a
non-standard mixture of spaces and tabs used for indentation, but
fixes a regression in the simple case:
{
/*
* Comment.
*/
int i;
}
Is now formatted correctly, if there were tabs used for indentation
before.
llvm-svn: 182760
Block comment indentation of empty lines regressed, as we did not
have a test for it.
/* Comment with...
*
* empty line. */
is now formatted correctly again.
llvm-svn: 182757
Before:
int (*func)(void*);
void f() { int(*func)(void*); }
After (consistent space after "int"):
int (*func)(void*);
void f() { int (*func)(void*); }
llvm-svn: 182756
This gets turned into two ">" operators at the beginning in order to
simplify template parameter handling. Thus, we need a special case to
handle those two binary operators correctly.
With this patch, clang-format can now correctly handle cases like:
aaaaaa = aaaaaaa(aaaaaaa, // break
aaaaaa) >>
bbbbbb;
llvm-svn: 182754
Unify handling of whitespace when breaking protruding tokens with other
whitespace replacements.
As a side effect, the BreakableToken structure changed significantly:
- have a common base class for single-line breakable tokens, as they are
much more similar
- revamp handling of multi-line comments; we now calculate the
information about lines in multi-line comments similar to normal
tokens, and always issue replacements
As a result, we were able to get rid of special casing of trailing
whitespace deletion for comments in the whitespace manager and the
BreakableToken and fixed bugs related to tab handling and escaped
newlines.
llvm-svn: 182738
In general, we like to avoid line breaks like:
...
SomeParameter, OtherParameter).DoSomething(
...
as they tend to make code really hard to read (how would you even indent the
next line?). Previously we have implemented this in a hacky way, which has now
shown to lead to problems. This fixes a few weird looking formattings, such as:
Before:
aaaaa(
aaaaa(aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa, aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa)
.aaaaa(aaaaa),
aaaaaaaaaaaaaaaaaaaaa);
After:
aaaaa(aaaaa(aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa).aaaaa(aaaaa),
aaaaaaaaaaaaaaaaaaaaa);
llvm-svn: 182731
Before:
@{ NSFontAttributeNameeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
: regularFont, };
Now:
@{ NSFontAttributeNameeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee :
regularFont, };
':'s in dictionary literals (and the corresponding {}s) are now marked as
TT_ObjCDictLiteral too, which makes further improvements to dict literal
layout possible.
llvm-svn: 182716
Previously we started sequences to align for single line comments when
the previous line had a trailing comment, but the sequence was broken
for other reasons.
Now we re-format:
// a
// b
f(); // c
to:
// a
// b
f(); // c
llvm-svn: 182608
Now correctly leaves:
f(); // comment
// comment
g(); // comment
... alone if the middle comment was aligned with g() before formatting.
llvm-svn: 182605
Previously we would align:
f(); // comment
// other comment
g();
Even if // other comment was at the start of the line. Now we do not
align trailing comments if they have been already aligned correctly
with the next line.
Thus,
f(); // comment
// other comment
g();
will not be changed, while:
f(); // comment
// other commment
g();
will lead to the two trailing comments being aligned.
llvm-svn: 182577
Replaces the use of WhitespaceStart + WhitspaceLength.
This made a bug in the formatter obvous where we would incorrectly
calculate the next column.
FIXME: There's a similar bug left regarding TokenLength. We should
probably also move to have a TokenRange instead.
llvm-svn: 182572
Before:
vector<int> x { 1, 2, 3 };
After:
vector<int> x{ 1, 2, 3 };
Also add a style option to remove the spaces inside braced lists,
so that the above becomes:
std::vector<int> v{1, 2, 3};
llvm-svn: 182570
Allows formatting of C++11 braced init list constructs, like:
vector<int> v { 1, 2, 3 };
f({ 1, 2 });
This involves some changes of how tokens are handled in the
UnwrappedLineFormatter. Note that we have a plan to evolve the
design of the token flow into one where we create all tokens
up-front and then annotate them in the various layers (as we
currently already have to create all tokens at once anyway, the
current abstraction does not help). Thus, this introduces
FIXMEs towards that goal.
llvm-svn: 182568
Instead of selectively storing some changes and directly generating
replacements for others, we now notify the WhitespaceManager of the
whitespace before every token (and optionally with more changes inside
tokens).
Then, we run over all whitespace in the very end in original source
order, where we have all information available to correctly align
comments and escaped newlines.
The future direction is to pull more of the comment alignment
implementation that is now in the BreakableToken into the
WhitespaceManager.
This fixes a bug when aligning comments or escaped newlines in unwrapped
lines that are handled out of order:
#define A \
f({ \
g(); \
});
... now gets correctly layouted.
llvm-svn: 182467
clang-format was a bit too aggressive when trying to keep labels and
values on the same line.
Before:
llvm::outs()
<< "aaaaaaaaaaaaaaaaaaa: " << aaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaaaaaa);
After:
llvm::outs() << "aaaaaaaaaaaaaaaaaaa: "
<< aaaaaaaaaaaaa(aaaaaaaaaaaaaaaaaaaaaaaaaaaa);
llvm-svn: 182458
This only affects styles that prevent bin packing. There, a break after
a template declaration also forced a line break after the function name.
Before:
template <class SomeType, class SomeOtherType>
SomeType
SomeFunction(SomeType Type, SomeOtherType OtherType) {}
After:
template <class SomeType, class SomeOtherType>
SomeType SomeFunction(SomeType Type, SomeOtherType OtherType) {}
This fixes llvm.org/PR16072.
llvm-svn: 182457
If clang-format is confronted with long and deeply nested lines (e.g.
complex static initializers or function calls), it can currently try too
hard to find the optimal solution and never finish. The reason is that
the memoization does not work effectively for deeply nested lines.
This patch removes an earlier workaround and instead opts for
accepting a non-optimal solution in rare cases. However, it only does
so only in cases where it would have to analyze an excessive number of
states (currently set to 10000 - the most complex line in Format.cpp
requires ~800 states) so this should not change the behavior in a
relevant way.
llvm-svn: 182449
Basically, the new rule is: The opening "{" always has to be on the
same line as the first element if the braced list is nested
(e.g. in another braced list or in a function).
The solution that clang-format produces almost always adheres to this
rule anyway and this makes clang-format significantly faster for larger
lists. Added a test cases for the only exception I could find
(which doesn't seem to be very important at first sight).
llvm-svn: 182082
It turns out that several implementations go through the trouble of
setting up a SourceManager and Lexer and abstracting this into a
function makes usage easier.
Also abstracts SourceManager-independent ranges out of
tooling::Refactoring and provides a convenience function to create them
from line ranges.
llvm-svn: 181997
Before:
namespace abc { class SomeClass; }
namespace def { void someFunction() {} }
After:
namespace abc {
class Def;
}
namespace def {
void someFunction() {}
}
Rationale:
a) Having anything other than forward declaration on the same line
as a namespace looks confusing.
b) Formatting namespace-forward-declaration-combinations different
from other stuff is inconsistent.
c) Wasting vertical space close to such forward declarations really
does not affect readability.
llvm-svn: 181887
The function type detection in r181438 and r181764 detected function
types too eagerly. This led to inconsistent formatting of inline
assembly and (together with r181687) to an incorrect formatting of calls
in macros.
Before: #define DEREF_AND_CALL_F(parameter) f (*parameter)
After: #define DEREF_AND_CALL_F(parameter) f(*parameter)
llvm-svn: 181870
We have been assuming that CharSourceRange::getTokenRange() by itself
expands a range until the end of a token, but in fact it only sets
IsTokenRange to true. Thus, we have so far only considered the first
character of the last token to belong to an unwrapped line. This
did not really manifest in symptoms as all edit integrations
expand ranges to fully lines.
llvm-svn: 181778
Before (in styles that allow it), clang-format would not merge an
if statement onto a single line, if only the second line was format
(e.g. in an editor integration):
if (a)
return; // clang-format invoked on this line.
With this patch, this gets properly merged to:
if (a) return; // ...
llvm-svn: 181770
This fixes indentation where there are for example multiple closing
parentheses after a string literal, and where those parentheses
run over the end of the line.
During testing this revealed a bug in the implementation of
breakProtrudingToken: we don't want to change the state if we didn't
actually do anything.
llvm-svn: 181767
We now support "Linux" and "Stroustrup" brace breaking styles, which
gets us one step closer to support formatting WebKit, KDE & Linux code.
Linux brace breaking style:
namespace a
{
class A
{
void f()
{
if (x) {
f();
} else {
g();
}
}
}
}
Stroustrup brace breaking style:
namespace a {
class A {
void f()
{
if (x) {
f();
} else {
g();
}
}
}
}
llvm-svn: 181700
Fake parentheses (i.e. emulated parentheses used to correctly handle
binary expressions) used to prevent the optimization implemented in
r180264.
llvm-svn: 181692
This seems to be the vastly more common case. If we find enough
examples to the contrary, we can make it smarter.
Before: #define MACRO void f(int * a)
After: #define MACRO void f(int *a)
llvm-svn: 181687
Otherwise (when indenting from the wrapped -> or .), this looks
like a confusing indent.
Before:
aaaaaaa //
.aaaaaaa( //
aaaaaaa);
After:
aaaaaaa //
.aaaaaaa( //
aaaaaaa);
llvm-svn: 181595
Thereby, the macro is consistently formatted (including the trailing
escaped newlines) even if clang-format is invoked only on single lines
of the macro.
llvm-svn: 181590
Summary:
Adds actual config file reading to the clang-format utility.
Configuration file name is .clang-format. It is looked up for each input file
in its parent directories starting from immediate one. First found .clang-format
file is used. When using standard input, .clang-format is searched starting from
the current directory.
Added -dump-config option to easily create configuration files.
Reviewers: djasper, klimek
Reviewed By: klimek
CC: cfe-commits, jordan_rose, kimgr
Differential Revision: http://llvm-reviews.chandlerc.com/D758
llvm-svn: 181589
Before, the actual operator of an overloaded operator declaration was
handled as a binary operator an thus, clang-format could not find valid
formattings for many examples, e.g.:
template <typename AAAAAAA, typename BBBBBBB>
AAAAAAA operator/(const AAAAAAA &a, BBBBBBB &b);
llvm-svn: 181585
With style where the *s go with the type:
Before: typedef bool* (Class:: *Member)() const;
After: typedef bool* (Class::*Member)() const;
llvm-svn: 181439
If the LHS of a binary expression is broken, clang-format should also
break after the operator as otherwise:
- The RHS can be easy to miss
- It can look as if clang-format doesn't understand operator precedence
Before:
bool aaaaaaaaaaaaaaaaaaaaa = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa !=
bbbbbbbbbbbbbbbbbb && ccccccccc == ddddddddddd;
After:
bool aaaaaaaaaaaaaaaaaaaaa =
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa != bbbbbbbbbbbbbbbbbb &&
ccccccccc == ddddddddddd;
As an additional note, clang-format would also be ok with the following
formatting, it just has a higher penalty (IMO correctly so).
bool aaaaaaaaaaaaaaaaaaaaa = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa !=
bbbbbbbbbbbbbbbbbb &&
ccccccccc == ddddddddddd;
llvm-svn: 181430
Before:
aaaaaaaa::
aaaaaaaa::
aaaaaaaa();
After:
aaaaaaaa::
aaaaaaaa::
aaaaaaaa();
The reason for the change is that:
a) we are not sure which is better
b) it is a really rare edge case
c) it simplifies the code
d) it currently causes problems with memoization
llvm-svn: 181421
Summary:
Added parseConfiguration method, which reads FormatStyle from YAML
string. This supports all FormatStyle fields and an additional BasedOnStyle
field, which can be used to specify base style.
Reviewers: djasper, klimek
Reviewed By: djasper
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D754
llvm-svn: 181326
LLVM/Clang basically don't use such comments and for Google-style,
include-lines are explicitly exempt from the column limit. Also, for
most cases, where the column limit is violated, the "better" solution
would be to move the comment to before the include, which clang-format
cannot do (yet).
llvm-svn: 181191
clang-format did not indent any declarations/definitions when breaking
after the type. With this change, it indents for all declarations but
does not indent for function definitions, i.e.:
Before:
const SomeLongTypeName&
some_long_variable_name;
typedef SomeLongTypeName
SomeLongTypeAlias;
const SomeLongReturnType*
SomeLongFunctionName();
const SomeLongReturnType*
SomeLongFunctionName() { ... }
After:
const SomeLongTypeName&
some_long_variable_name;
typedef SomeLongTypeName
SomeLongTypeAlias;
const SomeLongReturnType*
SomeLongFunctionName();
const SomeLongReturnType*
SomeLongFunctionName() { ... }
While it might seem inconsistent to indent function declarations, but
not definitions, there are two reasons for that:
- Function declarations are very similar to declarations of function
type variables, so there is another side to consistency to consider.
- There can be many function declarations on subsequent lines and not
indenting can make them harder to identify. Function definitions
are already separated by their body and not indenting
makes the function name slighly easier to find.
llvm-svn: 181187
This seems to be more common in LLVM, Google and Chromium.
Before:
class AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA :
public BBBBBBBBBBBBBBBBBBBBBBBBBBBBBB,
public CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC {
};
After:
class AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
: public BBBBBBBBBBBBBBBBBBBBBBBBBBBBBB,
public CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC {
};
llvm-svn: 181183
Deeply nested expressions basically break clang-format's memoization.
This patch slightly improves the situations and makes expressions like
aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(
aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(
aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(
aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(
aaaaa(aaaaa())))))))))))))))))))))))))))))))))))))));
work.
llvm-svn: 180264
This enables formattings like:
#define A \
int aaaa; \
int b; \
int ccc; \
int dddddddddd;
Enabling this for Google/Chromium styles only as I don't know whether it
is desired for Clang/LLVM.
llvm-svn: 180253
In the following snippet, clang-format incorrectly aligned the
trailing comment, when only the last line was formatted:
int aaaaaa; // comment
int b;
int c; // Formatting only this line moved this comment.
llvm-svn: 180173
In Google style, constructor initializers need to be all on one line or
one initializer per line if that does not fit. Without this patch, this
non-bin-packing-behavior incorrectly extends to the parameters of the
initializers.
Before:
Constructor()
: aaaaa(aaaaaaaaaaaaaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaaaa) {}
After:
Constructor()
: aaaaa(aaaaaaaaaaaaaaaaaaaaaa, aaaaaaaaaaaaaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaaaa) {}
llvm-svn: 180001
Summary:
Added BreakableLineComment, moved common code from
BreakableBlockComment to newly added BreakableComment. As a side-effect of the
rewrite, found another problem with escaped newlines and had to change
code which removes trailing whitespace from line comments not to break after
this patch.
Reviewers: klimek, djasper
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D682
llvm-svn: 179693
We do this in general, but missed a few cases.
Before:
void aaaaaaaaaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa, bbbb bbbb);
After:
void aaaaaaaaaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa,
bbbb bbbb);
llvm-svn: 179570
Summary:
Both strings and block comments are broken into lines in
breakProtrudingToken. Logic specific for strings or block comments is abstracted
in implementations of the BreakToken interface. Among other goodness, this
change fixes placement of backslashes after a block comment inside a
preprocessor directive (see removed FIXMEs in unit tests).
The code is far from being polished, and some parts of it will be changed for
line comments support.
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D665
llvm-svn: 179526
Previously we'd only detect structural errors on the very first level.
This leads to incorrectly balanced braces not being discovered, and thus
incorrect indentation.
This change fixes the problem by:
- changing the parser to use an error state that can be detected
anywhere inside the productions, for example if we get an eof on
SOME_MACRO({ some block <eof>
- previously we'd never break lines when we discovered a structural
error; now we break even in the case of a structural error if there
are two unwrapped lines within the same line; thus,
void f() { while (true) { g(); y(); } }
will still be re-formatted, even if there's missing braces somewhere
in the file
- still exclude macro definitions from generating structural error;
macro definitions are inbalanced snippets
llvm-svn: 179379
Function declarations are now broken with the following preferences:
1) break amongst arguments.
2) break after return type.
3) break after (.
4) break before after nested name specifiers.
Options #2 or #3 are preferred over #1 only if a substantial number of
lines can be saved by that.
llvm-svn: 179287
Before:
class A {
public : // test
};
After:
class A {
public: // test
};
Also remove duplicate methods calculating properties of AnnotatedTokens
and make them members of AnnotatedTokens so that they are in a common
place.
llvm-svn: 179167
Summary:
Some codebases use these kinds of macros in functions, e.g. Chromium's
IPC_BEGIN_MESSAGE_MAP, IPC_BEGIN_MESSAGE_HANDLER, etc.
Reviewers: djasper, klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D645
llvm-svn: 179099
The idea is to indent according to operator precedence and pretty much
identical to how stuff would be indented with parenthesis.
Before:
bool value = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa +
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa +
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ==
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa *
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb +
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb &&
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa *
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa >
ccccccccccccccccccccccccccccccccccccccccc;
After:
bool value = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa +
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa +
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ==
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa *
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb +
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb &&
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa *
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa >
ccccccccccccccccccccccccccccccccccccccccc;
llvm-svn: 179049
This combines several related changes:
a) Don't break before after the variable types in for loops with a
single variable.
b) Better indent DeclStmts defining multiple variables.
Before:
bool aaaaaaaaaaaaaaaaaaaaaaaaa =
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(aaaaaaaaaaaaaaaa),
bbbbbbbbbbbbbbbbbbbbbbbbb =
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb(bbbbbbbbbbbbbbbb);
for (aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaa = aaaaaaaaaaaaaaaa.aaaaaaaaaaaaaaa;
aaaaaaaaaaa != aaaaaaaaaaaaaaaaaaa; ++aaaaaaaaaaa) {
}
After:
bool aaaaaaaaaaaaaaaaaaaaaaaaa =
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(aaaaaaaaaaaaaaaa),
bbbbbbbbbbbbbbbbbbbbbbbbb =
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb(bbbbbbbbbbbbbbbb);
for (aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaa =
aaaaaaaaaaaaaaaa.aaaaaaaaaaaaaaa;
aaaaaaaaaaa != aaaaaaaaaaaaaaaaaaa; ++aaaaaaaaaaa) {
}
llvm-svn: 178641
Summary:
It turns out that we don't need to store CommentsBeforeNextToken in the
line state, but rather flush them before we start parsing preprocessor
directives. This fixes wrong comment indentation in code blocks in macro calls
(the test is included).
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D617
llvm-svn: 178638
Basically we have always special-cased the top-level statement of an
unwrapped line (the one with ParenLevel == 0) and that lead to several
inconsistencies. All added tests were formatted in a strange way, for
example:
Before:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa();
if (aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa()) {
}
After:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa();
if (aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa()) {
}
llvm-svn: 178542
Comments before preprocessor directives used to be stored with InPPDirective
flag set, which prevented correct comment splitting in this case. Fixed by
flushing comments before switching on InPPDirective. Added a new test and fixed
one of the existing tests.
llvm-svn: 178261
It turns out that
-foo;
can be an objective C method declaration. So instead of the previous
solution, recognize objective C methods only if we are in a declaration
scope.
llvm-svn: 177740
Before:
int a; // not formatted
// formatting this line only
After:
int a; // not formatted
// formatting this line only
This makes clang-format stable independent of whether the whole
file or single lines are formatted in most cases.
llvm-svn: 177739
Apparently one needs to set LangOptions.LineComment.
Before "//* */" got reformatted to "/ /* */" as the lexer was returning
the token sequence (slash, comment). This could also lead to weird other
stuff, e.g. for people that like to using comments like:
//****************
llvm-svn: 177720
Summary:
1. When splitting one-line block comment, use indentation and *s.
2. Remove trailing whitespace from all lines of a comment, not only the ones being splitted.
3. Add backslashes for all lines if a comment is used insed a preprocessor directive.
Reviewers: djasper
Reviewed By: djasper
CC: cfe-commits, klimek
Differential Revision: http://llvm-reviews.chandlerc.com/D557
llvm-svn: 177635
Summary: Added support for pointers-to-members usage via .* and a few tests.
Reviewers: djasper
Reviewed By: djasper
CC: cfe-commits, klimek
Differential Revision: http://llvm-reviews.chandlerc.com/D556
llvm-svn: 177537
Before (when only reformatting "int b"):
int a; // comment
// comment
int b;
After:
int a; // comment
// comment
int b;
This also fixes llvm.org/PR15433.
llvm-svn: 177524
This seems to be generally more desired.
Before:
if (aaaaaaaa &&
bbbbbbbb >
cccccccc) {}
After:
if (aaaaaaaa &&
bbbbbbbb >
cccccccc) {}
Also: Some formatting cleanup on clang-format's files.
llvm-svn: 177514
clang-format already prevented sequences like:
...
SomeParameter).someFunction(
...
as those are quite confusing. This failed on:
...
SomeParameter).someFunction(otherFunction(
...
Fixed in this patch.
llvm-svn: 177157
Summary:
Do this to avoid spoling nicely formatted multi-line comments (e.g.
with code examples or similar stuff).
Reviewers: djasper
Reviewed By: djasper
CC: cfe-commits, klimek
Differential Revision: http://llvm-reviews.chandlerc.com/D544
llvm-svn: 177153
Summary:
Aligns continuation lines of multi-line comments to the base
indentation level +1:
class A {
/*
* test
*/
void f() {}
};
The first revision is work in progress. The implementation is not yet complete.
Reviewers: djasper
Reviewed By: djasper
CC: cfe-commits, klimek
Differential Revision: http://llvm-reviews.chandlerc.com/D541
llvm-svn: 177080
The stronger binding of a string ending in :/= does not really make
sense if it is the only character.
Before:
llvm::outs() << aaaaaaaaaaaaaaaaaaaaaaaa
<< "=" << bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb;
After:
llvm::outs() << aaaaaaaaaaaaaaaaaaaaaaaa << "="
<< bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb;
llvm-svn: 177075
Before:
A = new SomeType * [Length];
A = new SomeType *[Length]();
After:
A = new SomeType *[Length];
A = new SomeType *[Length]();
Small formatting cleanups with clang-format.
llvm-svn: 176936
1. We now ignore all non-default string literals, including raw
literals.
2. We do not break inside escape sequences any more.
FIXME: We still break in trigraphs.
llvm-svn: 176710
With the cursor located at "I", clang-format would not do anything to:
int a;
I
int b;
With this patch, it reduces the number of empty lines as necessary, and
removes unnecessary whitespace. It does not change/reformat "int a;" or
"int b;".
llvm-svn: 176650
With [] marking the selected range, clang-format invoked on
[ ] int a;
Would so far not reformat anything. With this patch, it formats a
line if its leading whitespace is touched.
llvm-svn: 176435
In builder type call, we indent to the laster function calls.
However, for the last element of such a call, we don't need to do
so, as that normally just wastes space and does not increase
readability.
Before:
aaaaaa->aaaaaa->aaaaaa( // break
aaaaaa);
aaaaaaaaaaaaaaaaaaaaa->aaaaaaaaaaaaaaaaaaaaaa
->aaaaaaaaaaaaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa);
After:
aaaaaa->aaaaaa->aaaaaa( // break
aaaaaa);
aaaaaaaaaaaaaaaaaaaaa->aaaaaaaaaaaaaaaaaaaaaa->aaaaaaaaaaaaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa);
llvm-svn: 176352
We now break at a slash if we do not find a space to break on.
Also fixes a bug where we would go over the limit when breaking the
second line.
llvm-svn: 176350
If we don't find a natural split point (currently space) in a string
literal protruding over the line, we just split at the last possible
point.
llvm-svn: 176349
Two improvements:
1) Always leave at least one space before "\". Otherwise is can look bad
and there is a risk of unwillingly joining to characters to a different
token.
2) Use the full column limit for single-line #defines.
Fixes llvm.org/PR15148
llvm-svn: 176245
After some discussions, it seems that this is the better path in
the long run. Does not change Chromium style, as there, bin packing
is forbidden by the style guide.
Also fix two minor bugs wrt. formatting:
1. If a call parameter is a function call itself and is split before
the "." or "->", split before the next parameter.
2. If a call parameter is string literal that has to be split onto
two lines, split before the next parameter.
llvm-svn: 176177
Empty lines followed by line comments are often used to highlight the
comment. Empty lines somewhere else are usually left over from manual or
automatic formatting and should probably be removed.
Before (clang-format would keep):
S s = {
a,
b
};
After:
S s = { a, b };
llvm-svn: 176086
We might want to move towards doing this if the formatting can be
significantly improved, but we need to carefully evaluate the different
situations first.
Before (the string literal was split by clang-format here):
aaaaaaaaaaaaaaaaaaaa(aaaaaaaaaaaaaaaaaaaa, aaaaaa("aaa aaaaa aaa aaa aaaaa aaa "
"aaaaa aaa aaa aaaaaa"));
After:
aaaaaaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaa,
aaaaaa("aaa aaaaa aaa aaa aaaaa aaa aaaaa aaa aaa aaaaaa"));
llvm-svn: 176084
This fixes llvm.org/PR15350.
Before:
Constructor(int Parameter = 0)
: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(aaaaaaaaaaaaaaaaa),
aaaaaaaaaaaa(aaaaaaaaaaaaaaaaa) {}
After:
Constructor(int Parameter = 0)
: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(aaaaaaaaaaaaaaaaa),
aaaaaaaaaaaa(aaaaaaaaaaaaaaaaa) {}
I think the correct solution is to put the VariablePos into
ParenState, not LineState. Added FIXME.
llvm-svn: 176027
In conditional expressions, if the condition is split over multiple
lines, also break before both operands.
This prevents formattings like:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ==
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ? b : c;
Which are bad, because they suggestion incorrect operator precedence:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ==
(aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ? b : c);
This lead to the discovery that the expression parser incorrectly
handled conditional operators and that it could also handle semicolons
(which in turn reduced the amount of special casing for for-loops). As a
side-effect, we can now apply the bin-packing configuration to the
sections of for-loops.
llvm-svn: 175973
Also don't break in long include directives as that is not desired.
We can now format:
#include "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
#define LL_FORMAT "ll"
printf("aaaaa: %d, bbbbbbbbb: %" LL_FORMAT "d, cccccccc: %" LL_FORMAT
"d, ddddddddd: %" LL_FORMAT "d\n");
Before, this led to weird results.
llvm-svn: 175959
This fixes llvm.org/PR15033.
Also: Always break before a parameter, if the previous parameter was
split over multiple lines. This was necessary to make the right
decisions in for-loops, almost always makes the code more readable and
also fixes llvm.org/PR14873.
Before:
for (llvm::ArrayRef<NamedDecl *>::iterator I = FD->getDeclsInPrototypeScope()
.begin(), E = FD->getDeclsInPrototypeScope().end();
I != E; ++I) {
}
foo(bar(bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb,
ccccccccccccccccccccccccccccc), d, bar(e, f));
After:
for (llvm::ArrayRef<NamedDecl *>::iterator
I = FD->getDeclsInPrototypeScope().begin(),
E = FD->getDeclsInPrototypeScope().end();
I != E; ++I) {
}
foo(bar(bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb,
ccccccccccccccccccccccccccccc),
d, bar(e, f));
llvm-svn: 175741
We now indent the following correctly:
1. some + "literal" /* comment */
"literal";
2. breaking string literals after which we have another string literal.
llvm-svn: 175628
If the code author decides to put empty lines anywhere into the code we
should treat them equally, i.e. reduce them to the configured
MaxEmptyLinesToKeep.
With this change, we e.g. keep the newline in:
SomeType ST = {
// First value
a,
// Second value
b
};
llvm-svn: 175620
An alternative strategy to calculating the break on demand when hitting
a token that would need to be broken would be to put all possible breaks
inside the token into the optimizer.
Currently only supports breaking at spaces; more break points to come.
llvm-svn: 175613
The key bug was
if (Other.StartOfLineLevel < StartOfLineLevel) ..
instead of
if (Other.StartOfLineLevel != StartOfLineLevel) ..
Also cleaned up the function to be more consistent in the comparisons.
llvm-svn: 175500
In builder-type calls, it can be very confusing to just indent
parameters from the start of the line. Instead, indent 4 from the
correct function call.
Before:
aaaaaaaaaaaaaaaaaaa()->aaaaaa(bbbbb)->aaaaaaaaaaaaaaaaaaa( // break
aaaaaaaaaaaaaa);
aaaaaaaaaaaaaaaaaaaaaaa *aaaaaaaaa = aaaaaa->aaaaaaaaaaaa()->aaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa)
->aaaaaaaaaaaaaaaaa();
After:
aaaaaaaaaaaaaaaaaaa()->aaaaaa(bbbbb)->aaaaaaaaaaaaaaaaaaa( // break
aaaaaaaaaaaaaa);
aaaaaaaaaaaaaaaaaaaaaaa *aaaaaaaaa = aaaaaa->aaaaaaaaaaaa()
->aaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa)
->aaaaaaaaaaaaaaaaa();
llvm-svn: 175444
An unwrapped line can get moved around if there is no newline before
it and the previous line was formatted.
Example:
template<typename T> // Cursor is on this line when hitting "format"
T *getFETokenInfo() const { return static_cast<T*>(FETokenInfo); }
"return .." is the second unwrapped line in this scenario. I does not
touch any reformatted region. Thus, the result of formatting is:
template <typename T> T *getFETokenInfo() const { return static_cast<T *>(FETokenInfo); }
After second format (and arguably desired end-result):
template <typename T> T *getFETokenInfo() const {
return static_cast<T *>(FETokenInfo);
}
This fixes: llvm.org/PR15060.
llvm-svn: 175440
This fixes llvm.org/PR15248.
Before:
Test::Test(int b) : a(b *b) {}
for (int i = 0; i < a *a; ++i) {}
After:
Test::Test(int b) : a(b * b) {}
for (int i = 0; i < a * a; ++i) {}
llvm-svn: 175439
The current heuristic assumes that there can't be binary operators in
builder-type calls (excluding assigments). However, it also excluded
< and > in general, which is wrong. Now they are only excluded if they
are template parameters.
Before:
return aaaaaaaaaaaaaaaaa->aaaaa().aaaaaaaaaaaaa()i
.aaaaaa() < aaaaaaaaaaaaaaaaaaa->aaaaa().aaaaaaaaaaaaa().aaaaaa();
After:
return aaaaaaaaaaaaaaaaa->aaaaa().aaaaaaaaaaaaa().aaaaaa() <
aaaaaaaaaaaaaaaaaaa->aaaaa().aaaaaaaaaaaaa().aaaaaa();
llvm-svn: 175291