Summary: Remove them from the TokenText as well.
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D935
llvm-svn: 183536
Summary:
Introduced two new style parameters: PenaltyBreakComment and
PenaltyBreakString. Add penalty for each character of a breakable token beyond
the column limit (this relates mainly to comments, as they are broken only on
whitespace). Tuned PenaltyBreakComment to prefer comment breaking over breaking
inside most binary expressions.
Fixed a bug that prevented *, & and && from being considered TT_BinaryOperator
in the presense of adjacent comments.
Reviewers: klimek, djasper
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D933
llvm-svn: 183530
Summary:
Detect if the file is valid UTF-8, and if this is the case, count code
points instead of just using number of bytes in all (hopefully) places, where
number of columns is needed. In particular, use the new
FormatToken.CodePointCount instead of TokenLength where appropriate.
Changed BreakableToken implementations to respect utf-8 character boundaries
when in utf-8 mode.
Reviewers: klimek, djasper
Reviewed By: djasper
CC: cfe-commits, rsmith, gribozavr
Differential Revision: http://llvm-reviews.chandlerc.com/D918
llvm-svn: 183312
An oversight in this detection made clang-format unable to format
the following nicely:
void aaaaaaaaaaaaaaaaaaa<aaaaaaaaaaaaaaaaaaaaaaaaaaa,
bbbbbbbbbbbbbbbbbbbbbbbbbb>(
cccccccccccccccccccccccccccc);
llvm-svn: 183097
Before, clang-format would not find a solution for formatting:
if ((aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ||
bbbbbbbbbbbbbbbbbb) && // aaaaaaaaaaaaaaaa
cccccc) {
}
llvm-svn: 183096
With this patch, the simplified rule is:
If the block is part of a declaration (class, namespace, function,
enum, ..), merge an empty block onto a single line. Otherwise
(specifically for the compound statements of if, for, while, ...),
keep the braces on two separate lines.
The reasons are:
- Mostly the formatting of empty blocks does not matter much.
- Empty compound statements are really rare and are usually just
inserted while still working on the code. If they are on two lines,
inserting code is easier. Also, overlooking the "{}" of an
"if (...) {}" can be really bad.
- Empty declarations are not uncommon, e.g. empty constructors. Putting
them on one line saves vertical space at no loss of readability.
llvm-svn: 183008
Now that the TokenAnnotator does not require stack space anymore,
reconstructing the lines has become the limiting factor. This patch
fixes that problem, allowing large files with multiple megabytes of
single unwrapped lines to be formatted.
llvm-svn: 182861
Gets rid of AnnotatedToken, putting everything into FormatToken.
FormatTokens are created once, and only referenced by pointer. This
enables multiple future features, like having tokens shared between
multiple UnwrappedLines (while there's still work to do to fully enable
that).
llvm-svn: 182859
With option enabled (e.g. in Google-style):
template <typename T>
void f() {}
With option disabled:
template <typename T> void f() {}
Enabling this for Google-style and Chromium-style, not sure which other
styles would prefer that.
llvm-svn: 182849
With this patch, we create all tokens in one go before parsing and pass
an ArrayRef<FormatToken*> to the UnwrappedLineParser. The
UnwrappedLineParser is switched to use pointer-to-token internally.
The UnwrappedLineParser still copies the tokens into the UnwrappedLines.
This will be fixed in an upcoming patch.
llvm-svn: 182768
This gets turned into two ">" operators at the beginning in order to
simplify template parameter handling. Thus, we need a special case to
handle those two binary operators correctly.
With this patch, clang-format can now correctly handle cases like:
aaaaaa = aaaaaaa(aaaaaaa, // break
aaaaaa) >>
bbbbbb;
llvm-svn: 182754
Unify handling of whitespace when breaking protruding tokens with other
whitespace replacements.
As a side effect, the BreakableToken structure changed significantly:
- have a common base class for single-line breakable tokens, as they are
much more similar
- revamp handling of multi-line comments; we now calculate the
information about lines in multi-line comments similar to normal
tokens, and always issue replacements
As a result, we were able to get rid of special casing of trailing
whitespace deletion for comments in the whitespace manager and the
BreakableToken and fixed bugs related to tab handling and escaped
newlines.
llvm-svn: 182738
In general, we like to avoid line breaks like:
...
SomeParameter, OtherParameter).DoSomething(
...
as they tend to make code really hard to read (how would you even indent the
next line?). Previously we have implemented this in a hacky way, which has now
shown to lead to problems. This fixes a few weird looking formattings, such as:
Before:
aaaaa(
aaaaa(aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa, aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa)
.aaaaa(aaaaa),
aaaaaaaaaaaaaaaaaaaaa);
After:
aaaaa(aaaaa(aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa).aaaaa(aaaaa),
aaaaaaaaaaaaaaaaaaaaa);
llvm-svn: 182731
Replaces the use of WhitespaceStart + WhitspaceLength.
This made a bug in the formatter obvous where we would incorrectly
calculate the next column.
FIXME: There's a similar bug left regarding TokenLength. We should
probably also move to have a TokenRange instead.
llvm-svn: 182572
Before:
vector<int> x { 1, 2, 3 };
After:
vector<int> x{ 1, 2, 3 };
Also add a style option to remove the spaces inside braced lists,
so that the above becomes:
std::vector<int> v{1, 2, 3};
llvm-svn: 182570
Instead of selectively storing some changes and directly generating
replacements for others, we now notify the WhitespaceManager of the
whitespace before every token (and optionally with more changes inside
tokens).
Then, we run over all whitespace in the very end in original source
order, where we have all information available to correctly align
comments and escaped newlines.
The future direction is to pull more of the comment alignment
implementation that is now in the BreakableToken into the
WhitespaceManager.
This fixes a bug when aligning comments or escaped newlines in unwrapped
lines that are handled out of order:
#define A \
f({ \
g(); \
});
... now gets correctly layouted.
llvm-svn: 182467
This only affects styles that prevent bin packing. There, a break after
a template declaration also forced a line break after the function name.
Before:
template <class SomeType, class SomeOtherType>
SomeType
SomeFunction(SomeType Type, SomeOtherType OtherType) {}
After:
template <class SomeType, class SomeOtherType>
SomeType SomeFunction(SomeType Type, SomeOtherType OtherType) {}
This fixes llvm.org/PR16072.
llvm-svn: 182457
If clang-format is confronted with long and deeply nested lines (e.g.
complex static initializers or function calls), it can currently try too
hard to find the optimal solution and never finish. The reason is that
the memoization does not work effectively for deeply nested lines.
This patch removes an earlier workaround and instead opts for
accepting a non-optimal solution in rare cases. However, it only does
so only in cases where it would have to analyze an excessive number of
states (currently set to 10000 - the most complex line in Format.cpp
requires ~800 states) so this should not change the behavior in a
relevant way.
llvm-svn: 182449
Basically, the new rule is: The opening "{" always has to be on the
same line as the first element if the braced list is nested
(e.g. in another braced list or in a function).
The solution that clang-format produces almost always adheres to this
rule anyway and this makes clang-format significantly faster for larger
lists. Added a test cases for the only exception I could find
(which doesn't seem to be very important at first sight).
llvm-svn: 182082
It turns out that several implementations go through the trouble of
setting up a SourceManager and Lexer and abstracting this into a
function makes usage easier.
Also abstracts SourceManager-independent ranges out of
tooling::Refactoring and provides a convenience function to create them
from line ranges.
llvm-svn: 181997
Before:
namespace abc { class SomeClass; }
namespace def { void someFunction() {} }
After:
namespace abc {
class Def;
}
namespace def {
void someFunction() {}
}
Rationale:
a) Having anything other than forward declaration on the same line
as a namespace looks confusing.
b) Formatting namespace-forward-declaration-combinations different
from other stuff is inconsistent.
c) Wasting vertical space close to such forward declarations really
does not affect readability.
llvm-svn: 181887
We have been assuming that CharSourceRange::getTokenRange() by itself
expands a range until the end of a token, but in fact it only sets
IsTokenRange to true. Thus, we have so far only considered the first
character of the last token to belong to an unwrapped line. This
did not really manifest in symptoms as all edit integrations
expand ranges to fully lines.
llvm-svn: 181778
Before (in styles that allow it), clang-format would not merge an
if statement onto a single line, if only the second line was format
(e.g. in an editor integration):
if (a)
return; // clang-format invoked on this line.
With this patch, this gets properly merged to:
if (a) return; // ...
llvm-svn: 181770
This fixes indentation where there are for example multiple closing
parentheses after a string literal, and where those parentheses
run over the end of the line.
During testing this revealed a bug in the implementation of
breakProtrudingToken: we don't want to change the state if we didn't
actually do anything.
llvm-svn: 181767
We now support "Linux" and "Stroustrup" brace breaking styles, which
gets us one step closer to support formatting WebKit, KDE & Linux code.
Linux brace breaking style:
namespace a
{
class A
{
void f()
{
if (x) {
f();
} else {
g();
}
}
}
}
Stroustrup brace breaking style:
namespace a {
class A {
void f()
{
if (x) {
f();
} else {
g();
}
}
}
}
llvm-svn: 181700
Fake parentheses (i.e. emulated parentheses used to correctly handle
binary expressions) used to prevent the optimization implemented in
r180264.
llvm-svn: 181692
Otherwise (when indenting from the wrapped -> or .), this looks
like a confusing indent.
Before:
aaaaaaa //
.aaaaaaa( //
aaaaaaa);
After:
aaaaaaa //
.aaaaaaa( //
aaaaaaa);
llvm-svn: 181595
Thereby, the macro is consistently formatted (including the trailing
escaped newlines) even if clang-format is invoked only on single lines
of the macro.
llvm-svn: 181590
Summary:
Adds actual config file reading to the clang-format utility.
Configuration file name is .clang-format. It is looked up for each input file
in its parent directories starting from immediate one. First found .clang-format
file is used. When using standard input, .clang-format is searched starting from
the current directory.
Added -dump-config option to easily create configuration files.
Reviewers: djasper, klimek
Reviewed By: klimek
CC: cfe-commits, jordan_rose, kimgr
Differential Revision: http://llvm-reviews.chandlerc.com/D758
llvm-svn: 181589
If the LHS of a binary expression is broken, clang-format should also
break after the operator as otherwise:
- The RHS can be easy to miss
- It can look as if clang-format doesn't understand operator precedence
Before:
bool aaaaaaaaaaaaaaaaaaaaa = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa !=
bbbbbbbbbbbbbbbbbb && ccccccccc == ddddddddddd;
After:
bool aaaaaaaaaaaaaaaaaaaaa =
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa != bbbbbbbbbbbbbbbbbb &&
ccccccccc == ddddddddddd;
As an additional note, clang-format would also be ok with the following
formatting, it just has a higher penalty (IMO correctly so).
bool aaaaaaaaaaaaaaaaaaaaa = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa !=
bbbbbbbbbbbbbbbbbb &&
ccccccccc == ddddddddddd;
llvm-svn: 181430
Before:
aaaaaaaa::
aaaaaaaa::
aaaaaaaa();
After:
aaaaaaaa::
aaaaaaaa::
aaaaaaaa();
The reason for the change is that:
a) we are not sure which is better
b) it is a really rare edge case
c) it simplifies the code
d) it currently causes problems with memoization
llvm-svn: 181421
Summary:
Added parseConfiguration method, which reads FormatStyle from YAML
string. This supports all FormatStyle fields and an additional BasedOnStyle
field, which can be used to specify base style.
Reviewers: djasper, klimek
Reviewed By: djasper
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D754
llvm-svn: 181326
LLVM/Clang basically don't use such comments and for Google-style,
include-lines are explicitly exempt from the column limit. Also, for
most cases, where the column limit is violated, the "better" solution
would be to move the comment to before the include, which clang-format
cannot do (yet).
llvm-svn: 181191
clang-format did not indent any declarations/definitions when breaking
after the type. With this change, it indents for all declarations but
does not indent for function definitions, i.e.:
Before:
const SomeLongTypeName&
some_long_variable_name;
typedef SomeLongTypeName
SomeLongTypeAlias;
const SomeLongReturnType*
SomeLongFunctionName();
const SomeLongReturnType*
SomeLongFunctionName() { ... }
After:
const SomeLongTypeName&
some_long_variable_name;
typedef SomeLongTypeName
SomeLongTypeAlias;
const SomeLongReturnType*
SomeLongFunctionName();
const SomeLongReturnType*
SomeLongFunctionName() { ... }
While it might seem inconsistent to indent function declarations, but
not definitions, there are two reasons for that:
- Function declarations are very similar to declarations of function
type variables, so there is another side to consistency to consider.
- There can be many function declarations on subsequent lines and not
indenting can make them harder to identify. Function definitions
are already separated by their body and not indenting
makes the function name slighly easier to find.
llvm-svn: 181187
Deeply nested expressions basically break clang-format's memoization.
This patch slightly improves the situations and makes expressions like
aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(
aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(
aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(
aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(
aaaaa(aaaaa())))))))))))))))))))))))))))))))))))))));
work.
llvm-svn: 180264
This enables formattings like:
#define A \
int aaaa; \
int b; \
int ccc; \
int dddddddddd;
Enabling this for Google/Chromium styles only as I don't know whether it
is desired for Clang/LLVM.
llvm-svn: 180253
In the following snippet, clang-format incorrectly aligned the
trailing comment, when only the last line was formatted:
int aaaaaa; // comment
int b;
int c; // Formatting only this line moved this comment.
llvm-svn: 180173
In Google style, constructor initializers need to be all on one line or
one initializer per line if that does not fit. Without this patch, this
non-bin-packing-behavior incorrectly extends to the parameters of the
initializers.
Before:
Constructor()
: aaaaa(aaaaaaaaaaaaaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaaaa) {}
After:
Constructor()
: aaaaa(aaaaaaaaaaaaaaaaaaaaaa, aaaaaaaaaaaaaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaaaa) {}
llvm-svn: 180001
Summary:
Added BreakableLineComment, moved common code from
BreakableBlockComment to newly added BreakableComment. As a side-effect of the
rewrite, found another problem with escaped newlines and had to change
code which removes trailing whitespace from line comments not to break after
this patch.
Reviewers: klimek, djasper
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D682
llvm-svn: 179693
We do this in general, but missed a few cases.
Before:
void aaaaaaaaaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa, bbbb bbbb);
After:
void aaaaaaaaaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa,
bbbb bbbb);
llvm-svn: 179570
Summary:
Both strings and block comments are broken into lines in
breakProtrudingToken. Logic specific for strings or block comments is abstracted
in implementations of the BreakToken interface. Among other goodness, this
change fixes placement of backslashes after a block comment inside a
preprocessor directive (see removed FIXMEs in unit tests).
The code is far from being polished, and some parts of it will be changed for
line comments support.
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D665
llvm-svn: 179526
Previously we'd only detect structural errors on the very first level.
This leads to incorrectly balanced braces not being discovered, and thus
incorrect indentation.
This change fixes the problem by:
- changing the parser to use an error state that can be detected
anywhere inside the productions, for example if we get an eof on
SOME_MACRO({ some block <eof>
- previously we'd never break lines when we discovered a structural
error; now we break even in the case of a structural error if there
are two unwrapped lines within the same line; thus,
void f() { while (true) { g(); y(); } }
will still be re-formatted, even if there's missing braces somewhere
in the file
- still exclude macro definitions from generating structural error;
macro definitions are inbalanced snippets
llvm-svn: 179379
Function declarations are now broken with the following preferences:
1) break amongst arguments.
2) break after return type.
3) break after (.
4) break before after nested name specifiers.
Options #2 or #3 are preferred over #1 only if a substantial number of
lines can be saved by that.
llvm-svn: 179287
Before:
class A {
public : // test
};
After:
class A {
public: // test
};
Also remove duplicate methods calculating properties of AnnotatedTokens
and make them members of AnnotatedTokens so that they are in a common
place.
llvm-svn: 179167
The idea is to indent according to operator precedence and pretty much
identical to how stuff would be indented with parenthesis.
Before:
bool value = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa +
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa +
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ==
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa *
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb +
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb &&
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa *
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa >
ccccccccccccccccccccccccccccccccccccccccc;
After:
bool value = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa +
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa +
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ==
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa *
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb +
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb &&
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa *
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa >
ccccccccccccccccccccccccccccccccccccccccc;
llvm-svn: 179049
This combines several related changes:
a) Don't break before after the variable types in for loops with a
single variable.
b) Better indent DeclStmts defining multiple variables.
Before:
bool aaaaaaaaaaaaaaaaaaaaaaaaa =
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(aaaaaaaaaaaaaaaa),
bbbbbbbbbbbbbbbbbbbbbbbbb =
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb(bbbbbbbbbbbbbbbb);
for (aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaa = aaaaaaaaaaaaaaaa.aaaaaaaaaaaaaaa;
aaaaaaaaaaa != aaaaaaaaaaaaaaaaaaa; ++aaaaaaaaaaa) {
}
After:
bool aaaaaaaaaaaaaaaaaaaaaaaaa =
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(aaaaaaaaaaaaaaaa),
bbbbbbbbbbbbbbbbbbbbbbbbb =
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb(bbbbbbbbbbbbbbbb);
for (aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaa =
aaaaaaaaaaaaaaaa.aaaaaaaaaaaaaaa;
aaaaaaaaaaa != aaaaaaaaaaaaaaaaaaa; ++aaaaaaaaaaa) {
}
llvm-svn: 178641
Basically we have always special-cased the top-level statement of an
unwrapped line (the one with ParenLevel == 0) and that lead to several
inconsistencies. All added tests were formatted in a strange way, for
example:
Before:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa();
if (aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa()) {
}
After:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa();
if (aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa()) {
}
llvm-svn: 178542
Before:
int a; // not formatted
// formatting this line only
After:
int a; // not formatted
// formatting this line only
This makes clang-format stable independent of whether the whole
file or single lines are formatted in most cases.
llvm-svn: 177739
Apparently one needs to set LangOptions.LineComment.
Before "//* */" got reformatted to "/ /* */" as the lexer was returning
the token sequence (slash, comment). This could also lead to weird other
stuff, e.g. for people that like to using comments like:
//****************
llvm-svn: 177720
Summary:
1. When splitting one-line block comment, use indentation and *s.
2. Remove trailing whitespace from all lines of a comment, not only the ones being splitted.
3. Add backslashes for all lines if a comment is used insed a preprocessor directive.
Reviewers: djasper
Reviewed By: djasper
CC: cfe-commits, klimek
Differential Revision: http://llvm-reviews.chandlerc.com/D557
llvm-svn: 177635
Before (when only reformatting "int b"):
int a; // comment
// comment
int b;
After:
int a; // comment
// comment
int b;
This also fixes llvm.org/PR15433.
llvm-svn: 177524
This seems to be generally more desired.
Before:
if (aaaaaaaa &&
bbbbbbbb >
cccccccc) {}
After:
if (aaaaaaaa &&
bbbbbbbb >
cccccccc) {}
Also: Some formatting cleanup on clang-format's files.
llvm-svn: 177514
clang-format already prevented sequences like:
...
SomeParameter).someFunction(
...
as those are quite confusing. This failed on:
...
SomeParameter).someFunction(otherFunction(
...
Fixed in this patch.
llvm-svn: 177157
Summary:
Do this to avoid spoling nicely formatted multi-line comments (e.g.
with code examples or similar stuff).
Reviewers: djasper
Reviewed By: djasper
CC: cfe-commits, klimek
Differential Revision: http://llvm-reviews.chandlerc.com/D544
llvm-svn: 177153
Summary:
Aligns continuation lines of multi-line comments to the base
indentation level +1:
class A {
/*
* test
*/
void f() {}
};
The first revision is work in progress. The implementation is not yet complete.
Reviewers: djasper
Reviewed By: djasper
CC: cfe-commits, klimek
Differential Revision: http://llvm-reviews.chandlerc.com/D541
llvm-svn: 177080
Before:
A = new SomeType * [Length];
A = new SomeType *[Length]();
After:
A = new SomeType *[Length];
A = new SomeType *[Length]();
Small formatting cleanups with clang-format.
llvm-svn: 176936
1. We now ignore all non-default string literals, including raw
literals.
2. We do not break inside escape sequences any more.
FIXME: We still break in trigraphs.
llvm-svn: 176710
With the cursor located at "I", clang-format would not do anything to:
int a;
I
int b;
With this patch, it reduces the number of empty lines as necessary, and
removes unnecessary whitespace. It does not change/reformat "int a;" or
"int b;".
llvm-svn: 176650
With [] marking the selected range, clang-format invoked on
[ ] int a;
Would so far not reformat anything. With this patch, it formats a
line if its leading whitespace is touched.
llvm-svn: 176435
In builder type call, we indent to the laster function calls.
However, for the last element of such a call, we don't need to do
so, as that normally just wastes space and does not increase
readability.
Before:
aaaaaa->aaaaaa->aaaaaa( // break
aaaaaa);
aaaaaaaaaaaaaaaaaaaaa->aaaaaaaaaaaaaaaaaaaaaa
->aaaaaaaaaaaaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa);
After:
aaaaaa->aaaaaa->aaaaaa( // break
aaaaaa);
aaaaaaaaaaaaaaaaaaaaa->aaaaaaaaaaaaaaaaaaaaaa->aaaaaaaaaaaaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa);
llvm-svn: 176352
We now break at a slash if we do not find a space to break on.
Also fixes a bug where we would go over the limit when breaking the
second line.
llvm-svn: 176350
If we don't find a natural split point (currently space) in a string
literal protruding over the line, we just split at the last possible
point.
llvm-svn: 176349
Two improvements:
1) Always leave at least one space before "\". Otherwise is can look bad
and there is a risk of unwillingly joining to characters to a different
token.
2) Use the full column limit for single-line #defines.
Fixes llvm.org/PR15148
llvm-svn: 176245
After some discussions, it seems that this is the better path in
the long run. Does not change Chromium style, as there, bin packing
is forbidden by the style guide.
Also fix two minor bugs wrt. formatting:
1. If a call parameter is a function call itself and is split before
the "." or "->", split before the next parameter.
2. If a call parameter is string literal that has to be split onto
two lines, split before the next parameter.
llvm-svn: 176177
Empty lines followed by line comments are often used to highlight the
comment. Empty lines somewhere else are usually left over from manual or
automatic formatting and should probably be removed.
Before (clang-format would keep):
S s = {
a,
b
};
After:
S s = { a, b };
llvm-svn: 176086
We might want to move towards doing this if the formatting can be
significantly improved, but we need to carefully evaluate the different
situations first.
Before (the string literal was split by clang-format here):
aaaaaaaaaaaaaaaaaaaa(aaaaaaaaaaaaaaaaaaaa, aaaaaa("aaa aaaaa aaa aaa aaaaa aaa "
"aaaaa aaa aaa aaaaaa"));
After:
aaaaaaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaa,
aaaaaa("aaa aaaaa aaa aaa aaaaa aaa aaaaa aaa aaa aaaaaa"));
llvm-svn: 176084
This fixes llvm.org/PR15350.
Before:
Constructor(int Parameter = 0)
: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(aaaaaaaaaaaaaaaaa),
aaaaaaaaaaaa(aaaaaaaaaaaaaaaaa) {}
After:
Constructor(int Parameter = 0)
: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(aaaaaaaaaaaaaaaaa),
aaaaaaaaaaaa(aaaaaaaaaaaaaaaaa) {}
I think the correct solution is to put the VariablePos into
ParenState, not LineState. Added FIXME.
llvm-svn: 176027
In conditional expressions, if the condition is split over multiple
lines, also break before both operands.
This prevents formattings like:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ==
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ? b : c;
Which are bad, because they suggestion incorrect operator precedence:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ==
(aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ? b : c);
This lead to the discovery that the expression parser incorrectly
handled conditional operators and that it could also handle semicolons
(which in turn reduced the amount of special casing for for-loops). As a
side-effect, we can now apply the bin-packing configuration to the
sections of for-loops.
llvm-svn: 175973
This fixes llvm.org/PR15033.
Also: Always break before a parameter, if the previous parameter was
split over multiple lines. This was necessary to make the right
decisions in for-loops, almost always makes the code more readable and
also fixes llvm.org/PR14873.
Before:
for (llvm::ArrayRef<NamedDecl *>::iterator I = FD->getDeclsInPrototypeScope()
.begin(), E = FD->getDeclsInPrototypeScope().end();
I != E; ++I) {
}
foo(bar(bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb,
ccccccccccccccccccccccccccccc), d, bar(e, f));
After:
for (llvm::ArrayRef<NamedDecl *>::iterator
I = FD->getDeclsInPrototypeScope().begin(),
E = FD->getDeclsInPrototypeScope().end();
I != E; ++I) {
}
foo(bar(bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb,
ccccccccccccccccccccccccccccc),
d, bar(e, f));
llvm-svn: 175741
We now indent the following correctly:
1. some + "literal" /* comment */
"literal";
2. breaking string literals after which we have another string literal.
llvm-svn: 175628
If the code author decides to put empty lines anywhere into the code we
should treat them equally, i.e. reduce them to the configured
MaxEmptyLinesToKeep.
With this change, we e.g. keep the newline in:
SomeType ST = {
// First value
a,
// Second value
b
};
llvm-svn: 175620
An alternative strategy to calculating the break on demand when hitting
a token that would need to be broken would be to put all possible breaks
inside the token into the optimizer.
Currently only supports breaking at spaces; more break points to come.
llvm-svn: 175613
The key bug was
if (Other.StartOfLineLevel < StartOfLineLevel) ..
instead of
if (Other.StartOfLineLevel != StartOfLineLevel) ..
Also cleaned up the function to be more consistent in the comparisons.
llvm-svn: 175500
In builder-type calls, it can be very confusing to just indent
parameters from the start of the line. Instead, indent 4 from the
correct function call.
Before:
aaaaaaaaaaaaaaaaaaa()->aaaaaa(bbbbb)->aaaaaaaaaaaaaaaaaaa( // break
aaaaaaaaaaaaaa);
aaaaaaaaaaaaaaaaaaaaaaa *aaaaaaaaa = aaaaaa->aaaaaaaaaaaa()->aaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa)
->aaaaaaaaaaaaaaaaa();
After:
aaaaaaaaaaaaaaaaaaa()->aaaaaa(bbbbb)->aaaaaaaaaaaaaaaaaaa( // break
aaaaaaaaaaaaaa);
aaaaaaaaaaaaaaaaaaaaaaa *aaaaaaaaa = aaaaaa->aaaaaaaaaaaa()
->aaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa)
->aaaaaaaaaaaaaaaaa();
llvm-svn: 175444
An unwrapped line can get moved around if there is no newline before
it and the previous line was formatted.
Example:
template<typename T> // Cursor is on this line when hitting "format"
T *getFETokenInfo() const { return static_cast<T*>(FETokenInfo); }
"return .." is the second unwrapped line in this scenario. I does not
touch any reformatted region. Thus, the result of formatting is:
template <typename T> T *getFETokenInfo() const { return static_cast<T *>(FETokenInfo); }
After second format (and arguably desired end-result):
template <typename T> T *getFETokenInfo() const {
return static_cast<T *>(FETokenInfo);
}
This fixes: llvm.org/PR15060.
llvm-svn: 175440
This got lost and was untested as the same effect is achieved by
avoiding bin packing, which is active in Google style by default.
However, moving forward, we want more control over the bin packing
option(s) and thus, this flag should work as expected.
llvm-svn: 175277
This is almost always more readable.
Before:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
? aaaaaaaaaaaaaaaaaaaaaaaaaaa : aaaaaaaaaaaaaaaaaaaaaaaaaaa;
After:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
? aaaaaaaaaaaaaaaaaaaaaaaaaaa
: aaaaaaaaaaaaaaaaaaaaaaaaaaa;
llvm-svn: 175262
This gives a clearer separation of the context, e.g. in GMOCK
statements.
Before:
EXPECT_CALL(SomeObject,
SomeFunction(Parameter)).WillRepeatedly(Return(SomeValue));
After:
EXPECT_CALL(SomeObject, SomeFunction(Parameter))
.WillRepeatedly(Return(SomeValue));
Minor format cleanups.
llvm-svn: 175162
So far, clang-format has always assumed the whitespace belonging to the
subsequent token. This has the negative side-effect that when
clang-format formats a line, it does not remove its trailing whitespace,
as it belongs to the next token.
Thus, this patch fixes most of llvm.org/PR15062.
We are not zapping a file's trailing whitespace so far, as this does not
belong to any token we see during formatting. We need to fix this in a
subsequent patch.
llvm-svn: 175152
The formatter can now format:
void aaaaaaaaaaaaaaaaaa(int level,
double *min_x,
double *max_x,
double *min_y,
double *max_y,
double *min_z,
double *max_z, ) {
}
Although this is invalid code, it frequently happens during development and
clang-format should be nicer :-).
llvm-svn: 175151
This fixes llvm.org/PR15179.
Before:
class ColorChooserMac : public content::ColorChooser,
public content::WebContentsObserver {
};
After:
class ColorChooserMac : public content::ColorChooser,
public content::WebContentsObserver {
};
llvm-svn: 175147
This has so far been disabled for Google style, but should be done
before breaking at nested name specifiers or in template parameters.
Before (in Google style):
template <typename T>
aaaaaaaa::aaaaa::aaaaaa<T, aaaaaaaaaaaaaaaaaaaaaaaaa> aaaaaaaaaaaaaaaaaaaaaaaa<
T>::aaaaaaa() {}
After:
template <typename T>
aaaaaaaa::aaaaa::aaaaaa<T, aaaaaaaaaaaaaaaaaaaaaaaaa>
aaaaaaaaaaaaaaaaaaaaaaaa<T>::aaaaaaa() {}
llvm-svn: 175074
- clear ownership: the SpecificBumpPtrAllocator owns all StateNodes
- this allows us to simplify the memoization data structure into a
std::set (FIXME: figure out whether we want to use a hash based
data structure).
- introduces StateNode as recursive data structure, instead of using
Edge and the Seen-map combined to drill through the graph
- using a count to stabilize the penalty instead of relying on the
container
- pulled out a method to forward-apply states in the end
This leads to a ~40% runtime decrease on Nico's benchmark.
Main FiXME is that the parameter lists of some function get too long.
I'd vote for either pulling the Queue etc into the Formatter proper,
or creating an inner class just for the search algorithm.
llvm-svn: 175051
Before:
for (id foo in[self getStuffFor : bla]) {
}
Now:
for (id foo in [self getStuffFor:bla]) {
}
"in" is treated as loop keyword if the line starts with "for", and as a
regular identifier else. To check for "in", its IdentifierInfo is handed
through a few layers.
llvm-svn: 174889
In google style, trailing comments are separated by two spaces. This
patch fixes the counting of these spaces and prevents clang-format from
creating a line with 81 columns.
llvm-svn: 174879
With this patch, the formatter introduces 'fake' parenthesis according
to the operator precedence of binary operators.
Before:
return aaaa & AAAAAAAAAAAAAAAAAAAAAAAAAAAAA || bbbb &
BBBBBBBBBBBBBBBBBBBBBBBBBBBBB || cccc & CCCCCCCCCCCCCCCCCCCCCCCCCC ||
dddd & DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD;
f(aaaaaaaaaaaaaaaaaaaa && aaaaaaaaaaaaaaaaaaaa, aaaaaaaaaaaaaaaaaaaa &&
aaaaaaaaaaaaaaaaaaaa, aaaaaaaaaaaaaaaaaaaa && aaaaaaaaaaaaaaaaaaaa);
After:
return aaaa & AAAAAAAAAAAAAAAAAAAAAAAAAAAAA ||
bbbb & BBBBBBBBBBBBBBBBBBBBBBBBBBBBB ||
cccc & CCCCCCCCCCCCCCCCCCCCCCCCCC ||
dddd & DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD;
f(aaaaaaaaaaaaaaaaaaaa && aaaaaaaaaaaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaa && aaaaaaaaaaaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaa && aaaaaaaaaaaaaaaaaaaa);
Future improvements:
- Get rid of some of the hacky ways to nicely format certain constructs.
- Merge this parser and the AnnotatingParser as we now have several parsers
that analyze (), [], etc.
llvm-svn: 174714
With this patch, clang-format can analyze the input file for two
properties:
1. Is "int *a" or "int* a" more common.
2. Are non-C++03 constructs used, e.g. A<A<A>>.
With Google-style, clang-format will now use the more common style for
(1) and format C++03 compatible, unless it finds C++11 constructs in the
input.
llvm-svn: 174504
We can now format stuff like:
- (void)doSomethingWith:(GTMFoo *)theFoo
rect:(NSRect)theRect
interval:(float)theInterval {
[myObject doFooWith:arg1 //
name:arg2
error:arg3];
}
This seems to fix everything mentioned in llvm.org/PR14939.
llvm-svn: 174364
This combines several changes:
* Calculation token type (e.g. for * and &) in the AnnotatingParser.
* Calculate the scope binding strength in the AnnotatingParser.
* Let <> and [] scopes bind stronger than () and {} scopes.
* Add minimal debugging output.
llvm-svn: 174307
In order to end up with good solutions, clang-format needs to try
"all" combinations of line breaks, evaluate them and select the
best one. Before, we have done this using a DFS with memoization
and cut-off conditions. However, this approach is very limited
as shown by the huge static initializer in the attachment of
llvm.org/PR14959.
Instead, this new implementation uses a variant of Dijkstra's
algorithm to do a prioritized BFS over the solution space.
Some numbers:
lib/Format/TokenAnnotator.cpp: 1.5s -> 0.15s
Attachment of PR14959: 10min+ (didn't finish) -> 10s
No functional changes intended.
llvm-svn: 174166
1. Never avoid bin packing in static initializers as this can
lead to terrible results.
2. If an element has to be broken over multiple lines, break after
the following comma.
This should be a step forward, but there are still many cases
especially with nested static initializers that we handle badly.
More patches will follow.
llvm-svn: 174061
The style guide only forbids this for function declarations. So,
now
someFunction(
aaaaaaaaaaaaaaaaaaaa, aaaaaaaaaaaaaaaaa, aaaaaaaaaaaa);
Is allowed in Chromium mode.
llvm-svn: 173806
Before (in good cases):
for (auto aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa) {}
for (auto aaaaaaaaaaaaaaaaaaaa : aaaaaaaaaaaaaaaaaaaa(aaaaaaaaaaaaaaaaaaaa,
aaaa)) {}
After:
for (auto aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa :
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa) {}
for (auto aaaaaaaaaaaaaaaaaaaa :
aaaaaaaaaaaaaaaaaaaa(aaaaaaaaaaaaaaaaaaaa, aaaa)) {}
llvm-svn: 173684
Before we did not really systematically format those. Now, we format the
different cases as:
- 1 Line: a ? b : c;
- 2 Lines: short ? loooooooooong
: loooooooooong
- 2 Lines: loooooooooooooooong
? short : short
- 3 Lines: loooooooooooooooong
? loooooooooooooong
: loooooooooooooong
Not sure whether "?" and ":" should go on the new line, but it seems to
be the most consistent approach.
llvm-svn: 173683
1. Use a hanging ident for function calls nested in binary expressions.
E.g.:
int aaaaa = aaaaaaaaa && aaaaaaaaaa(
aaaaaaaaaa);
2. Slightly improve heuristic for builder type expressions and reduce
penalty for breaking before "." and "->" in those.
3. Remove mostly obsolete metric of decreasing indent level. This
fixes: llvm.org/PR14931.
Changes #1 and #2 were necessary to keep tests passing after #3.
llvm-svn: 173680
These always represent a continuation and we should increase the ident.
Before:
aaaaaaaaa(aaaaaaaaaaaaaaaaaaaaa::
aaaaaaaaaaaaaaaaaaaa);
After:
aaaaaaaaa(aaaaaaaaaaaaaaaaaaaaa::
aaaaaaaaaaaaaaaaaaaa);
llvm-svn: 173675
This combines two small changes:
1) Put a penalty on breaking after "<"
2) Only produce a hanging indent when parameters are separated by
commas.
Before:
aaaaaaaaaaaaaaaaaaaaaaaa<
aaaaaaaaaaaaaaaaa, aaaaaaaaaaaaaaaaa>(aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa);
aaaaaa(new Aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaaaaaaa));
After:
aaaaaaaaaaaaaaaaaaaaaaaa<aaaaaaaaaaaaaaaaa, aaaaaaaaaaaaaaaaa>(
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa);
aaaaaa(new Aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaaaaaaa));
This changes one ObjC test, but AFAICT this is not according to any
style guide (neither before nor after). We probably should be aligning
on the ":" there according to:
http://google-styleguide.googlecode.com/svn/trunk/objcguide.xml?showone=Method_Invocations#Method_Invocations
llvm-svn: 173457
Otherwise, really long nested name specifiers can easily lead to a
violation of the column limit.
Not sure about the rules for indentation in those cases, so input is
appreciated (see tests.).
llvm-svn: 173438
Before:
int aaaa = aaaaa().aaaaa() // force break
.aaaaa();
After:
int aaaa = aaaaa().aaaaa() // force break
.aaaaa();
The other indent is just wrong and confusing.
llvm-svn: 173273
Before:
bool aaaa = aaaaaaaaaaa(
aaaaaaaaaaaaaaaaa);
After:
bool aaaa = aaaaaaaaaaa(
aaaaaaaaaaaaaaaaa);
The other indentation was a nice attempt but doesn't work in many cases.
Not sure what the right long term solution is as the "After: " is still
not nice. We either need to figure out what to do in the cases where it
"doesn't work" or come up with a third solution, e.g. falling back to:
bool aaaa =
aaaaaaaaaaa(
aaaaaaaaaaaaaaaaa);
which should always work and nicely highlight the structure.
llvm-svn: 173268
Layouting would prevent breaking before + in
a[b + c] = d;
Regression detected by code review.
Also fixes an invalid-read found by the valgrind bot.
llvm-svn: 173262
Having seen more cases, this actually was not a good thing to do in the
first place. We can still improve on what we do now, but breaking after
the "=" is good in many cases.
Before:
aaaaaaaaaaaaa = aa->aaaaaaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaa(aaaaaaaaaaaaaa));
After:
aaaaaaaaaaaaa =
aa->aaaaaaaaaaaaaaaaaaaa(aaaaaaaaaaaaaaaaaaaaaaaa(aaaaaaaaaaaaaa));
llvm-svn: 173257
Before: if (int * a = &b) ...
After: if (int *a = &b) ...
Also changed all the existing tests to test the expressions in question
both in a declaration and in an expression context.
llvm-svn: 173256
We will need a more principled solution, but we should not leave this
unfixed until we come up with one.
Before: void f() { int * a; }
After: void f() { int *a; }
llvm-svn: 173252
This only affects styles where BinPackParameters is false.
With AllowAllParametersOnNextLine:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(
aaaaaaaaaa, aaaaaaaaaa, aaaaaaaaaa, aaaaaaaaaaa, aaaaaaaaaaa);
Without it:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(aaaaaaaaaa,
aaaaaaaaaa,
aaaaaaaaaa,
aaaaaaaaaaa,
aaaaaaaaaaa);
llvm-svn: 173246
This gives us the ability to guess better defaults for whether a *
between identifiers is a pointer dereference or binary operator.
Now correctly formats:
void f(a *b);
void f() { f(a * b); }
llvm-svn: 173243
Changing nextToken() in the UnwrappedLineParser to get the next
non-comment token. This allows us to correctly layout a whole class of
snippets, like:
if /* */(/* */ a /* */) /* */
f() /* */; /* */
else /* */
g();
Fixes a bug in the formatter where we would assume there is a previous
non-comment token.
Also adds the indent level of an unwrapped line to the debug output in
the parser.
llvm-svn: 173168
We used to align trailing comments belong to different things.
Before:
void f() { // some function..
}
int a; // some variable..
After:
void f() { // some function..
}
int a; // some variable..
llvm-svn: 173100
We now only put empty blocks into a single line, if all of:
- all tokens of the structural element fit into a single line
- we're not in a control flow statement
Note that we usually don't put record definitions into a single line, as
there's usually at least one more token (the semicolon) after the
closing brace. This doesn't hold when we are in a context where there is
no semicolon, like "enum E {}".
There were some missing tests around joining lines around the corner
cases of the allowed number of columns, so this patch adds some.
llvm-svn: 173055
Before: template <template <typename T>, typename P > class X;
After: template <template <typename T>, typename P> class X;
More importantly, the token annotations for the second ">" are now computed
correctly.
llvm-svn: 173047
').' is likely part of a builder pattern statement.
This is based upon a patch developed by Nico Weber. Thank you!
Before:
int foo() {
return llvm::StringSwitch<Reference::Kind>(name).StartsWith(
".eh_frame_hdr", ORDER_EH_FRAMEHDR).StartsWith(
".eh_frame", ORDER_EH_FRAME).StartsWith(".init", ORDER_INIT).StartsWith(
".fini", ORDER_FINI).StartsWith(".hash", ORDER_HASH).Default(ORDER_TEXT);
}
After:
int foo() {
return llvm::StringSwitch<Reference::Kind>(name)
.StartsWith(".eh_frame_hdr", ORDER_EH_FRAMEHDR)
.StartsWith(".eh_frame", ORDER_EH_FRAME)
.StartsWith(".init", ORDER_INIT).StartsWith(".fini", ORDER_FINI)
.StartsWith(".hash", ORDER_HASH).Default(ORDER_TEXT);
}
Probably not ideal, but makes many cases much more readable.
The changes to overriding-ftemplate-comments.cpp don't seem better or
worse. We should address those soon.
llvm-svn: 172804
It's generally not possible to know if 'a' '*' 'b' is a multiplication
expression or a variable declaration with a purely lexer-based approach. The
formatter currently uses a heuristic that classifies this token sequence as a
multiplication in rhs contexts (after '=' or 'return') and as a declaration
else.
Because of this, it gets bit tests in ifs, such as "if (a & b)" wrong. However,
declarations in ifs always have to be followed by '=', so this patch changes
the formatter to classify '&' as an operator if it's at the start of an if
statement.
Before:
if (a& b)
if (int* b = f())
Now:
if (a & b)
if (int* b = f())
llvm-svn: 172731
Also adding more tests.
We can now keep the formatting of something like:
static SomeType type = { aaaaaaaaaaaaaaaaaaaa, /* comment */
aaaaaaaaaaaaaaaaaaaa /* comment */,
/* comment */ aaaaaaaaaaaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaa, // comment
aaaaaaaaaaaaaaaaaaaa };
Note that the comment in the first line is handled like a trailing line comment
as that is likely what the user intended.
llvm-svn: 172711
Before: Constructor() : a(a), // comment a(a) {}
After: Constructor() : a(a), // comment
a(a) {}
Needed this as a quick fix. Will add more tests for this in a future
commit.
llvm-svn: 172624
We used to incorrectly parse
aaaaaa ? aaaaaa(aaaaaa) : aaaaaaaa;
Due to an l_paren being followed by a colon, we assumed it to be part of
a constructor initializer. Thus, we never found the colon belonging to
the conditional expression, marked the line as bing incorrect and did
not format it.
llvm-svn: 172621
"Bin-packing" here means allowing multiple parameters on one line, if a
function call/declaration is spread over multiple lines.
This is required by the Chromium style guide and probably desired for
the Google style guide. Not making changes to LLVM style as I don't have
enough data.
With this enabled, we format stuff like:
aaaaaaaaaaaaaaa(aaaaaaaaaa,
aaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaaaaa).aaaaaaaaaaaaaaaaaa();
llvm-svn: 172617
This makes the tedious fitsIntoLimit() method unnecessary and I can
replace one hack (constructor initializers) by a slightly better hack.
Furthermore, this will enable calculating whether a certain part of a
line fits into the limit for future modifications.
llvm-svn: 172604
It was quite convoluted leading to us accidentally introducing O(N^2)
complexity while copying from UnwrappedLine to AnnotatedLine. We might
still want to improve the datastructure in AnnotatedLine (most
importantly not put them in a vector where they need to be copied on
vector resizing but that will be done as a follow-up.
This fixes most of the regression in llvm.org/PR14959.
No formatting changes intended.
llvm-svn: 172602
This is an optimization that djasper spottet. For now, we do not format
anything after the first token that belongs to such an implicit string
literal. All our state is not made for handling that anyway, so we'll
revisit this if we find a problem.
llvm-svn: 172537
Treat tokens inside <> for includes and everything from the second token
of a warning / error on as an implicit string literal, e.g. do not
change its whitespace at all.
Now correctly formats:
#include < path with space >
#error Leave all white!!!!! space* alone!
Note that for #error and #warning we still format the space up to the
first token of the text, so:
# error Text
will become
#error Text
llvm-svn: 172536
We used to incorrectly identify some operators (*, &, +, -, etc.) if
there were comments around them.
Example:
Before: int a = /**/ - 1;
After: int a = /**/ -1;
llvm-svn: 172533
We now format this correctly:
Status::Rep Status::global_reps[3] = {
{ kGlobalRef, OK_CODE, NULL, NULL, NULL },
{ kGlobalRef, CANCELLED_CODE, NULL, NULL, NULL },
{ kGlobalRef, UNKNOWN_CODE, NULL, NULL, NULL }
};
- fixed a bug where BreakBeforeClosingBrace would be set on the wrong
state
- added penalties for breaking between = and {, and between { and any
other non-{ token
llvm-svn: 172433
Now, "if (a) return;" is only allowed, if this option is set.
Also add a Chromium style which is currently identical to Google style
except for this option.
llvm-svn: 172431
Before: if (a)
return;
After: if (a) return;
Not yet sure, whether this is always desired, but we can add options and
make this a style parameter as we go along.
llvm-svn: 172413
Main difference, add an AnnotatedLine class to hold information about a
line while formatting. At the same time degrade the UnwrappedLine class
to a class solely used for communicating between the UnwrappedLineParser
and the Formatter.
No functional changes intended.
llvm-svn: 172403
clang-format should not change whether or not there is a line break
before a line comment as this strongly influences the percieved binding.
User input: void f(int a,
// b is awesome
int b);
void g(int a, // a is awesome
int b);
Before: void f(int a, // b is awesome
int b);
void g(int a, // a is awesome
int b);
After: <unchanged from input>
llvm-svn: 172361
I am not aware of a case where that would be wrong. The specific case I
am fixing are function parameters wrapped in parenthesis (e.g. in
macros).
Before: function(a,(b));
After: function(a, (b));
llvm-svn: 172351
A ")" before any of "=", "{" or ";" won't be a cast. This fixes issues
with the formatting of unnamed parameters.
Before: void f(int *){}
After: void f(int *) {}
llvm-svn: 172349
canBreakBefore() does not allow breaking after ':' for LT_ObjCMethodDecl lines,
so if Newline is true in addTokenToState() for ':' then LT_ObjCMethodDecl
cannot be set. No functionality change.
llvm-svn: 172307
This follows the approach suggested by djasper in PR14911: When a '[' is
seen that's at the start of a line, follows a binary operator, or follows one
of : [ ( return throw, that '[' and its closing ']' are marked as
TT_ObjCMethodExpr and every ':' in that range that isn't part of a ternary
?: is marked as TT_ObjCMethodExpr as well.
Update the layout routines to not output spaces around ':' tokens that are
marked TT_ObjCMethodExpr, and only allow breaking after such tokens, not
before.
Before:
[self adjustButton : closeButton_ ofKind : NSWindowCloseButton];
Now:
[self adjustButton:closeButton_ ofKind:NSWindowCloseButton];
llvm-svn: 172304
Before:
void aaaaaaaaaaaa(int aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa) GUARDED_BY(
aaaaaaaaaaaaa);
After:
void aaaaaaaaaaaa(int aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa)
GUARDED_BY(aaaaaaaaaaaaa);
Also did some formatting cleanups with clang-format on the way.
llvm-svn: 172200
Added option to put each constructor initializer on its own line
if not all initializers fit on a single line. Enabling this for
Google style now as the style guide (arguable) suggests it. Not
sure whether we also want it for LLVM.
llvm-svn: 172196
As we keep adding more stuff to it, this structure is easier to
maintain. At one point we might think about making it an actual
class with specific accessors, etc.
llvm-svn: 172188
Objective-C method declarations look like this:
- (returntype)name:(type)argname anothername:(type)arg2name;
In google style, there's no space after the leading '-' but one after
"(returntype)" instead (but none after the argument types), see
http://google-styleguide.googlecode.com/svn/trunk/objcguide.xml#Method_Declarations_and_Definitions
Not inserting the space after '-' is easy, but to insert the space after the
return type, the formatter needs to know that a closing parenthesis ends the
return type. To do this, I tweaked the code in parse() to check for this, which
in turn required moving detection of TT_ObjCMethodSpecifier from annotate() to
parse(), because parse() runs before annotate().
(To keep things interesting, the return type is optional, but it's almost
always there in practice.)
http://llvm-reviews.chandlerc.com/D280
llvm-svn: 172140
Before:
@property(assign, getter = isEditable) BOOL editable;
Now:
@property(assign, getter=isEditable) BOOL editable;
It'd be nice if some Apple person could let me know if spaces are preferred
around '=' in @synthesize lines (see FIXME in the test).
llvm-svn: 172110
Don't do this in Google style though:
http://google-styleguide.googlecode.com/svn/trunk/objcguide.xml#Protocols
Most other places (function declarations, variable declarations) still get
this wrong, and since this looks very similiar to template instantiations to
the lexer (`id <MyProtocol> a = ...`), it's going to be hard to fix in some
places.
llvm-svn: 172099
The first token in @implementation, @interface, and @protocol lines is now
marked TT_ObjCDecl, and lines starting with a TT_ObjCDecl token are now marked
LT_ObjCMethodDecl.
llvm-svn: 172093
This prepares the code for single line optimizations and changes the
dependencies between single-line-formats to the indent of the first
token.
Conceptually, the first token is "between" the lines anyway, as the
whitespace for the first token includes the previous end-of-line, which
needs to be escaped when inside a preprocessor directive.
llvm-svn: 172083
We now decide whether a newline should go before the closing brace
depending on whether a newline was inserted after the opening brace.
For example, we now insert a newline before '};' in:
static SomeClass WithALoooooooooooooooooooongName = {
100000000, \"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\"
};
... while not inserting a newline here:
static SomeClass = { a, b, c, d, e, f, g, h, i, j,
looooooooooooooooooooooooooooooooooongname,
looooooooooooooooooooooooooooooong };
Also fixes the formating of (column limit 25):
int x = {
avariable,
b(alongervariable)
};
llvm-svn: 172076
We're now formatting (column limit 25):
int x = {
avariable,
b(alongervariable) };
This also fixes:
Aaa({
int i;
}, aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb,
ccccccccccccccccc));
... where we would previously break after the '},'.
Putting the closing curly into an extra line when there's a break
directly after the first curly will be done in a subsequent patch.
Paired with djasper.
llvm-svn: 172070
Before: int (^myBlock) (int) = ^(int num) {}
A<void ()>;
int (*b)(int);
After: int (^myBlock)(int) = ^(int num) {}
A<void()>;
int(*b)(int);
For function types and function pointer types, this patch only makes
the behavior consistent (for types that are keywords and other types).
For the latter function pointer type declarations, we'll probably
want to add a space after "int".
Also added LangOpts.Bool = 1, so we handle "A<bool()>" appropriately
Moved the LangOpts-settings to a public place for use by tests
and clang-format binary.
llvm-svn: 172065
Previously, we would not indent:
SOME_MACRO({
int i;
});
correctly. This is fixed by adding the trailing }); to the unwrapped
line starting with SOME_MACRO({, so the formatter can correctly match
the braces and indent accordingly.
Also fixes incorrect parsing of initializer lists, like:
int a[] = { 1 };
llvm-svn: 172058
This fixes llvm.org/PR14684.
Before: int *pa = (int *) & a;
After: int *pa = (int *)&a;
We still don't understand all kinds of casts. I added a FIXME to
address that.
llvm-svn: 172056
Previously, we'd always start at indent level 0 after a preprocessor
directive, now we layout the following snippet (column limit 69) as
follows:
functionCallTo(someOtherFunction(
withSomeParameters, whichInSequence,
areLongerThanALine(andAnotherCall,
B
withMoreParamters,
whichStronglyInfluenceTheLayout),
andMoreParameters),
trailing);
Note that the different jumping indent is a different issue that will be
addressed separately.
This is the first step towards handling #ifdef->#else->#endif chains
correctly.
llvm-svn: 171974
This fixes llvm.org/PR14870 and we no longer mess up:
template <typename T1, typename T2 = char, typename T3 = char,
typename T4 = char>
void f();
It removes the nice aligment for assignments inside other expressions,
but I am not sure those are actually practically relevant. If so, we can
fix those later.
llvm-svn: 171966
This addresses llvm.org/PR14847.
We can now format something like:
int aaaaaaaaaaaaaaaaaaaaaaaaaaa =
// aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaiaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb;
clang-format unavoidably exceeds the column limit, but does not just
flush everything into a single line. Moreover, it tries to minimize the
number of characters beyond the column limit.
llvm-svn: 171964
This fixes llvm.org/PR14860.
Before, we messed up the format of:
if (DeclaratorInfo.isFunctionDeclarator() &&
//getDeclSpecContextFromDeclaratorContext(Context) == DSC_top_level &&
Tok.is(tok::semi) && NextToken().is(tok::l_brace)) {
}
llvm-svn: 171961
This addresses llvm.org/PR14864.
We used to completely mess this up and now format as:
Diag(NewFD->getLocation(),
getLangOpts().MicrosoftExt ? diag::ext_function_specialization_in_class :
diag::err_function_specialization_in_class)
<< NewFD->getDeclName();
llvm-svn: 171957
In Clang/LLVM this seems to be the more common formatting for ##s. There
might still be case that we miss, but we'll fix those as we go along.
Before:
#define A(X)
void function ## X();
After:
#define A(X)
void function##X();
llvm-svn: 171862
This is a first step towards supporting more complex structures such
as #ifs inside unwrapped lines. This patch mostly converts the array-based
UnwrappedLine into a linked-list-based UnwrappedLine. Future changes will
allow multiple children for each Token turning the UnwrappedLine into a
tree.
No functional changes intended.
llvm-svn: 171856
This should make it slightly more readable as it more clearly separates
what happens where. No intended functional changes. More of this to
come..
llvm-svn: 171748
This addresses llvm.org/PR14830.
Before:
unsigned Cost =
TTI.getMemoryOpCost(I->getOpcode(), VectorTy, SI->getAlignment(),
SI->getPointerAddressSpace());
CharSourceRange LineRange =
CharSourceRange::getTokenRange(TheLine.Tokens.front().Tok.getLocation(),
TheLine.Tokens.back().Tok.getLocation());
After:
unsigned Cost = TTI.getMemoryOpCost(I->getOpcode(), VectorTy,
SI->getAlignment(),
SI->getPointerAddressSpace());
CharSourceRange LineRange = CharSourceRange::getTokenRange(
TheLine.Tokens.front().Tok.getLocation(),
TheLine.Tokens.back().Tok.getLocation());
This required rudimentary changes to static initializer lists, but we
are not yet formatting them in a reasonable way. That will be done in a
subsequent patch.
llvm-svn: 171731
Before:
virtual void write(ELFWriter *writer, OwningPtr<FileOutputBuffer> &buffer) =
0
After:
virtual void write(ELFWriter *writerrr,
OwningPtr<FileOutputBuffer> &buffer) = 0;
This addresses llvm.org/PR14815.
To implement this I introduced a line type during parsing and moved the
definition of TokenType out of the struct for increased readability.
Should have done the latter in a separate patch, but it would be hard to
pull apart now.
llvm-svn: 171724
We would format:
#define A \
int f(a); int i;
as
#define A \
int f(a);\
int i
The fix will break up macro definitions that could fit a line, but hit
the last column; fixing that is more involved, though, as it requires
looking at the following line.
llvm-svn: 171715
Previously, we'd format
int i;\
// comment
as
int i; // comment
The problem is that the escaped newline is part of the next token, and
thus the raw token text of the comment doesn't start with "//".
llvm-svn: 171713
If a token follows directly on an escaped newline, the escaped newline
is stored with the token. Since we re-layout escaped newlines, we need
to treat them just like normal whitespace - thus, we need to increase
the whitespace-length of the token, while decreasing the token length
(otherwise the token length contains the length of the escaped newline
and we double-count it while indenting).
llvm-svn: 171706
To parse # correctly, we need to know whether it is the first token in a
line - we can deduct this either from the whitespace or seeing that the
token is the first in the file - we already calculate this information.
This patch moves the identification of the first token into the
getNextToken method and stores it inside the FormatToken, so the
UnwrappedLineParser can stay independent of the SourceManager.
llvm-svn: 171640
Some of this is still pretty rough (note the load of FIXMEs), but it is
strictly an improvement and fixes various bugs that were related to
macro processing but are also imporant in non-macro use cases.
Specific fixes:
- correctly puts espaced newlines at the end of the line
- fixes counting of white space before a token when escaped newlines are
present
- fixes parsing of "trailing" tokens when eof() is hit
- puts macro parsing orthogonal to parsing other structure
- general support for parsing of macro definitions
Due to the fix to format trailing tokens, this change also includes a
bunch of fixes to the c-index tests.
llvm-svn: 171556
Fixes:
- incorrect handling of multiple consecutive preprocessor directives
- crash when trying to right align the escpaed newline for a line that
is longer than the column limit
- using only ColumnLimit-1 columns when layouting with escaped newlines
inside preprocessor directives
llvm-svn: 171401
This fixes llvm.org/PR14786.
We will need to split there as a last resort, but that should be done
consistently independent of whether the type is a template type or not.
Before:
template <typename T>
aaaaaaaaaaaaaaaaaaa(aaaaaaaaaaaaaaaaaa, aaaaaaaaaaaaaaaaaaaaaaaaaa<T>
::aaaaaaaaaa, aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa);
After:
template <typename T>
aaaaaaaaaaaaaaaaaaa(aaaaaaaaaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaaaaaaaa<T>::aaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa);
llvm-svn: 171400
This fixes llvm.org/PR14687.
Also fixes segfault for lines starting with * or &.
Before:
a *~b;
*a = 1; // <- this segfaulted
After:
a * ~b;
*a = 1; // no segfault :-)
llvm-svn: 171396
This is the first step towards handling preprocessor directives. This
patch only fixes the most pressing issue, namely correctly escaping
newlines for tokens within a sequence of a preprocessor directive.
The next step will be to fix incorrect format decisions on #define
directives.
llvm-svn: 171393
Apply all formatting changes that clang-format would apply to its own source
code. All choices seem to improve readability (or at least not make it worse).
No functional changes.
llvm-svn: 171039
This prevents formattings like this (assuming "parameter" doesn't fit the line):
bool f = someFunction() && someFunctionWithParam(
parameter) && someOtherFunction();
Here, "parameter" - the start of line 2 - has a parenthesis level of 2, but
there are subsequent tokens ("&&" and "someOtherFunction") with a lower level.
This is bad for readability as "parameter" hides "someOtherFunction". With this
patch, this changes to:
bool f = someFunction() &&
someFunctionWithParam(parameter) &&
someOtherFunction();
llvm-svn: 171038
This changes:
int Result = a + // force break
b;
return Result + // force break
5;
To:
int Result = a + // force break
b;
return Result + // force break
5;
llvm-svn: 171032
This fixes llvm.org/pr14686.
We used to add too many spaces for different versions of overloaded operator
function declarations/definitions. This patch changes, e.g.
operator *() {}
operator >() {}
operator () () {}
to
operator*() {}
operator>() {}
operator()() {}
llvm-svn: 171028
With this patch, splitting after binary operators has a panelty corresponding
to the operator's precedence. We used to ignore this and eagerly format like:
if (aaaaaaaaaaaaaaaaaaaaaaaaa || bbbbbbbbbbbbbbbbbbbbbbbbb &&
ccccccccccccccccccccccccc) { .. }
With this patch, this becomes:
if (aaaaaaaaaaaaaaaaaaaaaaaaa ||
bbbbbbbbbbbbbbbbbbbbbbbbb && ccccccccccccccccccccccccc) { .. }
llvm-svn: 171007
We used to not really format them. Now we do:
for (MachineBasicBlock::succ_iterator SI = BB->succ_begin(),
SE = BB->succ_end();
SI != SE; ++SI) {
This is just one example and I am sure we still mess some of them up, but it
is a step forward.
llvm-svn: 170899
We used to format initializers like this (with a sort of hacky implementation):
Constructor()
: Val1(A),
Val2(B) {
and now format like this (with a somewhat better solution):
Constructor()
: Val1(A), Val2(B) {
assuming this would not fit on a single line. Also added tests.
As a side effect we now first analyze whether an UnwrappedLine needs to be
split at all. If not, not splitting it is the best solution by definition. As
this should be a very common case in normal code, not exploring the entire
solution space can provide significant speedup.
llvm-svn: 170457
avoided.
This required a minor modification of the memoization as now the
"CurrentPenalty" depends on whether or not we break before the current
token. Therefore, the CurrentPenalty should not be memoized but added
after retrieving a value from memory. This should not affect the runtime
behavior.
llvm-svn: 170337
More specifically:
- Improve formatting of static initializers.
- Fix formatting of lines comments in enums.
- Fix formmating of trailing line comments.
llvm-svn: 170316
Summary: FormatTokenLexer is here, FormatTokenBuffer is on the way. This will allow to re-parse unwrapped lines when needed.
Reviewers: djasper, klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D186
llvm-svn: 169605
- Fix behavior of memoization together with optimization
- Correctly attribute the PenaltyIndentLevel (breaking directly after "(" did
not count towards the inner level)
- Recognize more tokens as assignments
Review: http://llvm-reviews.chandlerc.com/D172
llvm-svn: 169384
uncovered.
This required manually correcting all of the incorrect main-module
headers I could find, and running the new llvm/utils/sort_includes.py
script over the files.
I also manually added quite a few missing headers that were uncovered by
shuffling the order or moving headers up to be main-module-headers.
llvm-svn: 169237
The necessity of this fix points to a problem with the design
of the addToken during the optimiation phase, which we need to address
in a much more principled way.
llvm-svn: 169151