In general, we like to avoid line breaks like:
...
SomeParameter, OtherParameter).DoSomething(
...
as they tend to make code really hard to read (how would you even indent the
next line?). Previously we have implemented this in a hacky way, which has now
shown to lead to problems. This fixes a few weird looking formattings, such as:
Before:
aaaaa(
aaaaa(aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa, aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa)
.aaaaa(aaaaa),
aaaaaaaaaaaaaaaaaaaaa);
After:
aaaaa(aaaaa(aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa).aaaaa(aaaaa),
aaaaaaaaaaaaaaaaaaaaa);
llvm-svn: 182731
Before:
@{ NSFontAttributeNameeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
: regularFont, };
Now:
@{ NSFontAttributeNameeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee :
regularFont, };
':'s in dictionary literals (and the corresponding {}s) are now marked as
TT_ObjCDictLiteral too, which makes further improvements to dict literal
layout possible.
llvm-svn: 182716
Previously we started sequences to align for single line comments when
the previous line had a trailing comment, but the sequence was broken
for other reasons.
Now we re-format:
// a
// b
f(); // c
to:
// a
// b
f(); // c
llvm-svn: 182608
Now correctly leaves:
f(); // comment
// comment
g(); // comment
... alone if the middle comment was aligned with g() before formatting.
llvm-svn: 182605
Previously we would align:
f(); // comment
// other comment
g();
Even if // other comment was at the start of the line. Now we do not
align trailing comments if they have been already aligned correctly
with the next line.
Thus,
f(); // comment
// other comment
g();
will not be changed, while:
f(); // comment
// other commment
g();
will lead to the two trailing comments being aligned.
llvm-svn: 182577
Replaces the use of WhitespaceStart + WhitspaceLength.
This made a bug in the formatter obvous where we would incorrectly
calculate the next column.
FIXME: There's a similar bug left regarding TokenLength. We should
probably also move to have a TokenRange instead.
llvm-svn: 182572
Before:
vector<int> x { 1, 2, 3 };
After:
vector<int> x{ 1, 2, 3 };
Also add a style option to remove the spaces inside braced lists,
so that the above becomes:
std::vector<int> v{1, 2, 3};
llvm-svn: 182570
Allows formatting of C++11 braced init list constructs, like:
vector<int> v { 1, 2, 3 };
f({ 1, 2 });
This involves some changes of how tokens are handled in the
UnwrappedLineFormatter. Note that we have a plan to evolve the
design of the token flow into one where we create all tokens
up-front and then annotate them in the various layers (as we
currently already have to create all tokens at once anyway, the
current abstraction does not help). Thus, this introduces
FIXMEs towards that goal.
llvm-svn: 182568
Instead of selectively storing some changes and directly generating
replacements for others, we now notify the WhitespaceManager of the
whitespace before every token (and optionally with more changes inside
tokens).
Then, we run over all whitespace in the very end in original source
order, where we have all information available to correctly align
comments and escaped newlines.
The future direction is to pull more of the comment alignment
implementation that is now in the BreakableToken into the
WhitespaceManager.
This fixes a bug when aligning comments or escaped newlines in unwrapped
lines that are handled out of order:
#define A \
f({ \
g(); \
});
... now gets correctly layouted.
llvm-svn: 182467
clang-format was a bit too aggressive when trying to keep labels and
values on the same line.
Before:
llvm::outs()
<< "aaaaaaaaaaaaaaaaaaa: " << aaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaaaaaa);
After:
llvm::outs() << "aaaaaaaaaaaaaaaaaaa: "
<< aaaaaaaaaaaaa(aaaaaaaaaaaaaaaaaaaaaaaaaaaa);
llvm-svn: 182458
This only affects styles that prevent bin packing. There, a break after
a template declaration also forced a line break after the function name.
Before:
template <class SomeType, class SomeOtherType>
SomeType
SomeFunction(SomeType Type, SomeOtherType OtherType) {}
After:
template <class SomeType, class SomeOtherType>
SomeType SomeFunction(SomeType Type, SomeOtherType OtherType) {}
This fixes llvm.org/PR16072.
llvm-svn: 182457
If clang-format is confronted with long and deeply nested lines (e.g.
complex static initializers or function calls), it can currently try too
hard to find the optimal solution and never finish. The reason is that
the memoization does not work effectively for deeply nested lines.
This patch removes an earlier workaround and instead opts for
accepting a non-optimal solution in rare cases. However, it only does
so only in cases where it would have to analyze an excessive number of
states (currently set to 10000 - the most complex line in Format.cpp
requires ~800 states) so this should not change the behavior in a
relevant way.
llvm-svn: 182449
With this patch, clang-format will try to keep the cursor at the
original code position in editor integrations (implemented for emacs and
vim). This means, after formatting, clang-format will try to keep the
cursor on the same character of the same token.
llvm-svn: 182373
Basically, the new rule is: The opening "{" always has to be on the
same line as the first element if the braced list is nested
(e.g. in another braced list or in a function).
The solution that clang-format produces almost always adheres to this
rule anyway and this makes clang-format significantly faster for larger
lists. Added a test cases for the only exception I could find
(which doesn't seem to be very important at first sight).
llvm-svn: 182082
This ensures that we format:
void longFunctionName {
} // long comment here
And not:
void longFunctionName {}
// long comment here
As requested in post-commit-review.
llvm-svn: 182024
It turns out that several implementations go through the trouble of
setting up a SourceManager and Lexer and abstracting this into a
function makes usage easier.
Also abstracts SourceManager-independent ranges out of
tooling::Refactoring and provides a convenience function to create them
from line ranges.
llvm-svn: 181997
Before:
namespace abc { class SomeClass; }
namespace def { void someFunction() {} }
After:
namespace abc {
class Def;
}
namespace def {
void someFunction() {}
}
Rationale:
a) Having anything other than forward declaration on the same line
as a namespace looks confusing.
b) Formatting namespace-forward-declaration-combinations different
from other stuff is inconsistent.
c) Wasting vertical space close to such forward declarations really
does not affect readability.
llvm-svn: 181887
The function type detection in r181438 and r181764 detected function
types too eagerly. This led to inconsistent formatting of inline
assembly and (together with r181687) to an incorrect formatting of calls
in macros.
Before: #define DEREF_AND_CALL_F(parameter) f (*parameter)
After: #define DEREF_AND_CALL_F(parameter) f(*parameter)
llvm-svn: 181870
The most common (non-buggy) case are where such objects are used as
return expressions in bool-returning functions or as boolean function
arguments. In those cases I've used (& added if necessary) a named
function to provide the equivalent (or sometimes negative, depending on
convenient wording) test.
DiagnosticBuilder kept its implicit conversion operator owing to the
prevalent use of it in return statements.
One bug was found in ExprConstant.cpp involving a comparison of two
PointerUnions (PointerUnion did not previously have an operator==, so
instead both operands were converted to bool & then compared). A test
is included in test/SemaCXX/constant-expression-cxx1y.cpp for the fix
(adding operator== to PointerUnion in LLVM).
llvm-svn: 181869
We have been assuming that CharSourceRange::getTokenRange() by itself
expands a range until the end of a token, but in fact it only sets
IsTokenRange to true. Thus, we have so far only considered the first
character of the last token to belong to an unwrapped line. This
did not really manifest in symptoms as all edit integrations
expand ranges to fully lines.
llvm-svn: 181778
Before (in styles that allow it), clang-format would not merge an
if statement onto a single line, if only the second line was format
(e.g. in an editor integration):
if (a)
return; // clang-format invoked on this line.
With this patch, this gets properly merged to:
if (a) return; // ...
llvm-svn: 181770
This library supports all the features of the compile-time based ASTMatcher
library, but allows the user to specify and construct the matchers at runtime.
It contains the following modules:
- A variant type, to be used by the matcher factory.
- A registry, where the matchers are indexed by name and have a factory method
with a generic signature.
- A simple matcher expression parser, that can be used to convert a matcher
expression string into actual matchers that can be used with the AST at
runtime.
Many features where omitted from this first revision to simplify this code
review. The main ideas are still represented in this change and it already has
support working use cases.
Things that are missing:
- Support for polymorphic matchers. These requires supporting code in the
registry, the marshallers and the variant type.
- Support for numbers, char and bool arguments to the matchers. This requires
supporting code in the parser and the variant type.
- A command line program putting everything together and providing an already
functional tool.
Patch by Samuel Benzaquen.
llvm-svn: 181768
This fixes indentation where there are for example multiple closing
parentheses after a string literal, and where those parentheses
run over the end of the line.
During testing this revealed a bug in the implementation of
breakProtrudingToken: we don't want to change the state if we didn't
actually do anything.
llvm-svn: 181767
We now support "Linux" and "Stroustrup" brace breaking styles, which
gets us one step closer to support formatting WebKit, KDE & Linux code.
Linux brace breaking style:
namespace a
{
class A
{
void f()
{
if (x) {
f();
} else {
g();
}
}
}
}
Stroustrup brace breaking style:
namespace a {
class A {
void f()
{
if (x) {
f();
} else {
g();
}
}
}
}
llvm-svn: 181700
Fake parentheses (i.e. emulated parentheses used to correctly handle
binary expressions) used to prevent the optimization implemented in
r180264.
llvm-svn: 181692
This seems to be the vastly more common case. If we find enough
examples to the contrary, we can make it smarter.
Before: #define MACRO void f(int * a)
After: #define MACRO void f(int *a)
llvm-svn: 181687
Otherwise (when indenting from the wrapped -> or .), this looks
like a confusing indent.
Before:
aaaaaaa //
.aaaaaaa( //
aaaaaaa);
After:
aaaaaaa //
.aaaaaaa( //
aaaaaaa);
llvm-svn: 181595
Thereby, the macro is consistently formatted (including the trailing
escaped newlines) even if clang-format is invoked only on single lines
of the macro.
llvm-svn: 181590
Summary:
Adds actual config file reading to the clang-format utility.
Configuration file name is .clang-format. It is looked up for each input file
in its parent directories starting from immediate one. First found .clang-format
file is used. When using standard input, .clang-format is searched starting from
the current directory.
Added -dump-config option to easily create configuration files.
Reviewers: djasper, klimek
Reviewed By: klimek
CC: cfe-commits, jordan_rose, kimgr
Differential Revision: http://llvm-reviews.chandlerc.com/D758
llvm-svn: 181589
Before, the actual operator of an overloaded operator declaration was
handled as a binary operator an thus, clang-format could not find valid
formattings for many examples, e.g.:
template <typename AAAAAAA, typename BBBBBBB>
AAAAAAA operator/(const AAAAAAA &a, BBBBBBB &b);
llvm-svn: 181585
With style where the *s go with the type:
Before: typedef bool* (Class:: *Member)() const;
After: typedef bool* (Class::*Member)() const;
llvm-svn: 181439
If the LHS of a binary expression is broken, clang-format should also
break after the operator as otherwise:
- The RHS can be easy to miss
- It can look as if clang-format doesn't understand operator precedence
Before:
bool aaaaaaaaaaaaaaaaaaaaa = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa !=
bbbbbbbbbbbbbbbbbb && ccccccccc == ddddddddddd;
After:
bool aaaaaaaaaaaaaaaaaaaaa =
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa != bbbbbbbbbbbbbbbbbb &&
ccccccccc == ddddddddddd;
As an additional note, clang-format would also be ok with the following
formatting, it just has a higher penalty (IMO correctly so).
bool aaaaaaaaaaaaaaaaaaaaa = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa !=
bbbbbbbbbbbbbbbbbb &&
ccccccccc == ddddddddddd;
llvm-svn: 181430
Before:
aaaaaaaa::
aaaaaaaa::
aaaaaaaa();
After:
aaaaaaaa::
aaaaaaaa::
aaaaaaaa();
The reason for the change is that:
a) we are not sure which is better
b) it is a really rare edge case
c) it simplifies the code
d) it currently causes problems with memoization
llvm-svn: 181421
Summary:
Added parseConfiguration method, which reads FormatStyle from YAML
string. This supports all FormatStyle fields and an additional BasedOnStyle
field, which can be used to specify base style.
Reviewers: djasper, klimek
Reviewed By: djasper
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D754
llvm-svn: 181326
LLVM/Clang basically don't use such comments and for Google-style,
include-lines are explicitly exempt from the column limit. Also, for
most cases, where the column limit is violated, the "better" solution
would be to move the comment to before the include, which clang-format
cannot do (yet).
llvm-svn: 181191
clang-format did not indent any declarations/definitions when breaking
after the type. With this change, it indents for all declarations but
does not indent for function definitions, i.e.:
Before:
const SomeLongTypeName&
some_long_variable_name;
typedef SomeLongTypeName
SomeLongTypeAlias;
const SomeLongReturnType*
SomeLongFunctionName();
const SomeLongReturnType*
SomeLongFunctionName() { ... }
After:
const SomeLongTypeName&
some_long_variable_name;
typedef SomeLongTypeName
SomeLongTypeAlias;
const SomeLongReturnType*
SomeLongFunctionName();
const SomeLongReturnType*
SomeLongFunctionName() { ... }
While it might seem inconsistent to indent function declarations, but
not definitions, there are two reasons for that:
- Function declarations are very similar to declarations of function
type variables, so there is another side to consistency to consider.
- There can be many function declarations on subsequent lines and not
indenting can make them harder to identify. Function definitions
are already separated by their body and not indenting
makes the function name slighly easier to find.
llvm-svn: 181187
This seems to be more common in LLVM, Google and Chromium.
Before:
class AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA :
public BBBBBBBBBBBBBBBBBBBBBBBBBBBBBB,
public CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC {
};
After:
class AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
: public BBBBBBBBBBBBBBBBBBBBBBBBBBBBBB,
public CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC {
};
llvm-svn: 181183
Deeply nested expressions basically break clang-format's memoization.
This patch slightly improves the situations and makes expressions like
aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(
aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(
aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(
aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(aaaaa(
aaaaa(aaaaa())))))))))))))))))))))))))))))))))))))));
work.
llvm-svn: 180264
This enables formattings like:
#define A \
int aaaa; \
int b; \
int ccc; \
int dddddddddd;
Enabling this for Google/Chromium styles only as I don't know whether it
is desired for Clang/LLVM.
llvm-svn: 180253
In the following snippet, clang-format incorrectly aligned the
trailing comment, when only the last line was formatted:
int aaaaaa; // comment
int b;
int c; // Formatting only this line moved this comment.
llvm-svn: 180173
In Google style, constructor initializers need to be all on one line or
one initializer per line if that does not fit. Without this patch, this
non-bin-packing-behavior incorrectly extends to the parameters of the
initializers.
Before:
Constructor()
: aaaaa(aaaaaaaaaaaaaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaaaa) {}
After:
Constructor()
: aaaaa(aaaaaaaaaaaaaaaaaaaaaa, aaaaaaaaaaaaaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaaaa) {}
llvm-svn: 180001
Summary:
Added BreakableLineComment, moved common code from
BreakableBlockComment to newly added BreakableComment. As a side-effect of the
rewrite, found another problem with escaped newlines and had to change
code which removes trailing whitespace from line comments not to break after
this patch.
Reviewers: klimek, djasper
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D682
llvm-svn: 179693
We do this in general, but missed a few cases.
Before:
void aaaaaaaaaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa, bbbb bbbb);
After:
void aaaaaaaaaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa,
bbbb bbbb);
llvm-svn: 179570
Summary:
Both strings and block comments are broken into lines in
breakProtrudingToken. Logic specific for strings or block comments is abstracted
in implementations of the BreakToken interface. Among other goodness, this
change fixes placement of backslashes after a block comment inside a
preprocessor directive (see removed FIXMEs in unit tests).
The code is far from being polished, and some parts of it will be changed for
line comments support.
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D665
llvm-svn: 179526
Previously we'd only detect structural errors on the very first level.
This leads to incorrectly balanced braces not being discovered, and thus
incorrect indentation.
This change fixes the problem by:
- changing the parser to use an error state that can be detected
anywhere inside the productions, for example if we get an eof on
SOME_MACRO({ some block <eof>
- previously we'd never break lines when we discovered a structural
error; now we break even in the case of a structural error if there
are two unwrapped lines within the same line; thus,
void f() { while (true) { g(); y(); } }
will still be re-formatted, even if there's missing braces somewhere
in the file
- still exclude macro definitions from generating structural error;
macro definitions are inbalanced snippets
llvm-svn: 179379
Function declarations are now broken with the following preferences:
1) break amongst arguments.
2) break after return type.
3) break after (.
4) break before after nested name specifiers.
Options #2 or #3 are preferred over #1 only if a substantial number of
lines can be saved by that.
llvm-svn: 179287
Before:
class A {
public : // test
};
After:
class A {
public: // test
};
Also remove duplicate methods calculating properties of AnnotatedTokens
and make them members of AnnotatedTokens so that they are in a common
place.
llvm-svn: 179167
isVirtual - matches CXXMethodDecl nodes for virtual methods
isOverride - matches CXXMethodDecl nodes for methods that override virtual methods from a base class.
Author: Philip Dunstan <phil@philipdunstan.com>
llvm-svn: 179126
Summary:
Some codebases use these kinds of macros in functions, e.g. Chromium's
IPC_BEGIN_MESSAGE_MAP, IPC_BEGIN_MESSAGE_HANDLER, etc.
Reviewers: djasper, klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D645
llvm-svn: 179099
The idea is to indent according to operator precedence and pretty much
identical to how stuff would be indented with parenthesis.
Before:
bool value = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa +
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa +
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ==
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa *
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb +
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb &&
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa *
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa >
ccccccccccccccccccccccccccccccccccccccccc;
After:
bool value = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa +
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa +
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ==
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa *
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb +
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb &&
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa *
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa >
ccccccccccccccccccccccccccccccccccccccccc;
llvm-svn: 179049
This combines several related changes:
a) Don't break before after the variable types in for loops with a
single variable.
b) Better indent DeclStmts defining multiple variables.
Before:
bool aaaaaaaaaaaaaaaaaaaaaaaaa =
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(aaaaaaaaaaaaaaaa),
bbbbbbbbbbbbbbbbbbbbbbbbb =
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb(bbbbbbbbbbbbbbbb);
for (aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaa = aaaaaaaaaaaaaaaa.aaaaaaaaaaaaaaa;
aaaaaaaaaaa != aaaaaaaaaaaaaaaaaaa; ++aaaaaaaaaaa) {
}
After:
bool aaaaaaaaaaaaaaaaaaaaaaaaa =
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(aaaaaaaaaaaaaaaa),
bbbbbbbbbbbbbbbbbbbbbbbbb =
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb(bbbbbbbbbbbbbbbb);
for (aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaa =
aaaaaaaaaaaaaaaa.aaaaaaaaaaaaaaa;
aaaaaaaaaaa != aaaaaaaaaaaaaaaaaaa; ++aaaaaaaaaaa) {
}
llvm-svn: 178641
Summary:
It turns out that we don't need to store CommentsBeforeNextToken in the
line state, but rather flush them before we start parsing preprocessor
directives. This fixes wrong comment indentation in code blocks in macro calls
(the test is included).
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D617
llvm-svn: 178638