Before: void f(int */* unused */) {}
After: void f(int * /* unused */) {}
The previous version seems to be valid C++ code but confuses many syntax
highlighters.
llvm-svn: 185320
explicitly specify use of C++98 or C++11. Lang_CXX is preserved as
an alias for Lang_CXX98.
This does not add Lang_CXX1Y or Lang_C11, on the assumption that it's
better to add them if/when they are needed.
(This is a prerequisite for a test in a later patch for RecursiveASTVisitor.)
Reviewed by Richard Smith.
llvm-svn: 185276
Summary:
Some valid pre-C++11 constructs change meaning when lexed in C++11
mode, e.g.
#define x(_a) printf("foo"_a);
(example from http://llvm.org/bugs/show_bug.cgi?id=16342). "foo"_a is treated as
a user-defined string literal when parsed in C++11 mode.
In order to deal with this correctly, we need to set lexing mode according to
which standard the code conforms to. We already have a configuration value for
this (FormatStyle.Standard), which seems to be appropriate to use in this case
as well.
Reviewers: klimek
CC: cfe-commits, gribozavr
Differential Revision: http://llvm-reviews.chandlerc.com/D1028
llvm-svn: 185149
They are mostly duplicated and got out of sync during the PathV1 removal. We
should factor the code somewhere, but for now a FIXME will do.
llvm-svn: 185019
- Added conversion routines and checks in Matcher<T> that take a DynTypedMatcher.
- Added type information on the error messages for the marshallers.
- Allows future work on Polymorphic/overloaded matchers. We should be
able to disambiguate at runtime and choose the appropriate overload.
llvm-svn: 184429
Summary:
Fixes a problem where \t,\v or \f could lead to a crash when placed as
a first character in a line comment. The cause is that rtrim and ltrim handle
these characters, but our code didn't, so some invariants could be broken.
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1013
llvm-svn: 184425
Summary:
A trailing block comment having multiple lines would cause extremely
high penalties if the summary length of its lines is more than the column limit.
Fixed by always considering only the last line of a multi-line block comment.
Removed a long-standing FIXME from relevant tests and added a motivating test
modelled after problem cases from real code.
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1010
llvm-svn: 184340
Added ASTNodeKind as a standalone way to represent node kinds and their hierarchy.
This change is to support ongoing work on D815.
Reviewers: klimek
CC: cfe-commits
llvm-svn: 184331
This is in preparation for the backwards references to bound
nodes, which will expose a lot more about how matches occur. Main
changes:
- instead of building the tree of bound nodes, we build a "set" of bound
nodes and explode all possible match combinations while running
through the matchers; this will allow us to also implement matchers
that filter down the current set of matches, like "equalsBoundNode"
- take the set of bound nodes at the start of the match into
consideration when doing memoization; as part of that, reevaluated
that memoization gives us benefits that are large enough (it still
does - the effect on common match patterns is up to an order of
magnitude)
- reset the bound nodes when a node does not match, thus never leaking
information from partial sub-matcher matches for failing matchers
Effects:
- we can now correctly "explode" combinatorial matches, for example:
allOf(forEachDescendant(...bind("a")),
forEachDescendant(...bind("b"))) will now trigger matches for all
combinations of matching "a" and "b"s.
- we now never expose bound nodes from partial matches in matchers that
did not match in the end - this fixes a long-standing issue
FIXMEs:
- rename BoundNodesTreeBuilder to BoundNodesBuilder or
BoundNodesSetBuilder, as we don't build a tree any more; this is out
of scope for this change, though
- we're seeing some performance regressions (around 10%), but I expect
some performance tuning will get that back, and it's easily worth
the increase in expressiveness for now
llvm-svn: 184313
Summary: Split strings at word boundaries, when there are no spaces and slashes.
Reviewers: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1003
llvm-svn: 184304
Summary:
E.g. the second line in
return aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa +
b; //
is indented 4 characters more than in
return aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa +
b;
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D984
llvm-svn: 184078
Summary:
Selectively propagate the information about token kind in
WhitespaceManager::replaceWhitespaceInToken.For correct alignment of new
segments of line comments in order to align them correctly. Don't set
BreakBeforeParameter in breakProtrudingToken for line comments, as it introduces
a break after the _next_ parameter. Added tests for related functions.
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D980
llvm-svn: 184076
The big changes are:
- Deleting Driver/(Arg|Opt)*
- Rewriting includes to llvm/Option/ and re-sorting
- 'using namespace llvm::opt' in clang::driver
- Fixing the autoconf build by adding option everywhere
As discussed in the review, this change includes using directives in
header files. I'll make follow up changes to remove those in favor of
name specifiers.
Reviewers: espindola
Differential Revision: http://llvm-reviews.chandlerc.com/D975
llvm-svn: 183989
Summary:
Don't remove backslashes from block comments. Previously this
/* \ \ \ \ \ \
*/
would be turned to this:
/*
*/
which spoils some kinds of ASCII-art, people use in their comments. The behavior
was related to handling escaped newlines in block comments inside preprocessor
directives. This patch makes handling it in a more civilized way.
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D979
llvm-svn: 183978
Summary:
Basically, don't special-case line comments in this regard. And fixed
an incorrect test, that relied on the wrong behavior.
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D962
llvm-svn: 183851
Summary:
"//Test" becomes "// Test". This change is aimed to improve code
readability and conformance to certain coding styles. If a comment starts with a
non-alphanumeric character, the space isn't added, e.g. "//-*-c++-*-" stays
unchanged.
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D949
llvm-svn: 183750
correctly in the presence of qualified types.
(I had to change the unittest because it was trying to cast a
QualifiedTypeLoc to TemplateSpecializationTypeLoc.)
llvm-svn: 183563
Summary: Remove them from the TokenText as well.
Reviewers: klimek
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D935
llvm-svn: 183536
Summary:
Introduced two new style parameters: PenaltyBreakComment and
PenaltyBreakString. Add penalty for each character of a breakable token beyond
the column limit (this relates mainly to comments, as they are broken only on
whitespace). Tuned PenaltyBreakComment to prefer comment breaking over breaking
inside most binary expressions.
Fixed a bug that prevented *, & and && from being considered TT_BinaryOperator
in the presense of adjacent comments.
Reviewers: klimek, djasper
Reviewed By: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D933
llvm-svn: 183530
Before, clang-format would happily move a trailing block comment to a
new line, which normally changes the perceived binding of that comment.
E.g., it would move:
void f() { /* comment */
...
}
to:
void f() {
/* comment */
...
}
llvm-svn: 183420
The leading "}" in the construct "} else if (..) {" was confusing the
expression parser. Thus, no fake parentheses were generated and the
indentation was broken in some cases.
llvm-svn: 183393
Summary:
Detect if the file is valid UTF-8, and if this is the case, count code
points instead of just using number of bytes in all (hopefully) places, where
number of columns is needed. In particular, use the new
FormatToken.CodePointCount instead of TokenLength where appropriate.
Changed BreakableToken implementations to respect utf-8 character boundaries
when in utf-8 mode.
Reviewers: klimek, djasper
Reviewed By: djasper
CC: cfe-commits, rsmith, gribozavr
Differential Revision: http://llvm-reviews.chandlerc.com/D918
llvm-svn: 183312
Summary: Add support on the parser, registry, and DynTypedMatcher for binding IDs dynamically.
Reviewers: klimek
CC: cfe-commits, revane
Differential Revision: http://llvm-reviews.chandlerc.com/D911
llvm-svn: 183144
This patch ensures that APValues are deallocated with the ASTContext by
registering a deallocation function for APValues to the ASTContext.
Original version of the patch by James Dennett.
llvm-svn: 183101
An oversight in this detection made clang-format unable to format
the following nicely:
void aaaaaaaaaaaaaaaaaaa<aaaaaaaaaaaaaaaaaaaaaaaaaaa,
bbbbbbbbbbbbbbbbbbbbbbbbbb>(
cccccccccccccccccccccccccccc);
llvm-svn: 183097
Before, clang-format would not find a solution for formatting:
if ((aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ||
bbbbbbbbbbbbbbbbbb) && // aaaaaaaaaaaaaaaa
cccccc) {
}
llvm-svn: 183096
If a "}" is found inside parenthesis, this is probably a case of
missing parenthesis. This enables continuing to format after stuff code
like:
class A {
void f(
};
..
llvm-svn: 183009
With this patch, the simplified rule is:
If the block is part of a declaration (class, namespace, function,
enum, ..), merge an empty block onto a single line. Otherwise
(specifically for the compound statements of if, for, while, ...),
keep the braces on two separate lines.
The reasons are:
- Mostly the formatting of empty blocks does not matter much.
- Empty compound statements are really rare and are usually just
inserted while still working on the code. If they are on two lines,
inserting code is easier. Also, overlooking the "{}" of an
"if (...) {}" can be really bad.
- Empty declarations are not uncommon, e.g. empty constructors. Putting
them on one line saves vertical space at no loss of readability.
llvm-svn: 183008
When trying to fall back to search from the end onwards, we
would still find leading whitespace if the leading whitespace
went on after the end of the line.
llvm-svn: 182886
newFrontendActionFactory() took a pointer to a callback to call when a source
file was done being processed by an action. This revision updates the callback
to include an ante-processing callback as well.
Callback-providing class renamed and callback functions themselves renamed.
Functions are no longer pure-virtual so users aren't forced to implement both
callbacks if one isn't needed.
llvm-svn: 182864
If an identifier is on its own line and it is all upper case, it is highly
likely that this is a macro that is meant to stand on a line by itself.
Before:
class A : public QObject {
Q_OBJECT A() {}
};
Ater:
class A : public QObject {
Q_OBJECT
A() {}
};
llvm-svn: 182855
With option enabled (e.g. in Google-style):
template <typename T>
void f() {}
With option disabled:
template <typename T> void f() {}
Enabling this for Google-style and Chromium-style, not sure which other
styles would prefer that.
llvm-svn: 182849
This made it necessary to remove an error detection which would let us
bail out of braced lists in certain situations of missing "}". However,
as we always entirely escape from the braced list on finding ";", this
should not be a big problem.
With this, we can no format braced lists with uniformat inits:
return { arg1, SomeType { parameter } };
llvm-svn: 182788
To fully support this, we also need to expand tabs in the text before
the block comment. This patch breaks indentation when there was a
non-standard mixture of spaces and tabs used for indentation, but
fixes a regression in the simple case:
{
/*
* Comment.
*/
int i;
}
Is now formatted correctly, if there were tabs used for indentation
before.
llvm-svn: 182760
Block comment indentation of empty lines regressed, as we did not
have a test for it.
/* Comment with...
*
* empty line. */
is now formatted correctly again.
llvm-svn: 182757
Before:
int (*func)(void*);
void f() { int(*func)(void*); }
After (consistent space after "int"):
int (*func)(void*);
void f() { int (*func)(void*); }
llvm-svn: 182756
This gets turned into two ">" operators at the beginning in order to
simplify template parameter handling. Thus, we need a special case to
handle those two binary operators correctly.
With this patch, clang-format can now correctly handle cases like:
aaaaaa = aaaaaaa(aaaaaaa, // break
aaaaaa) >>
bbbbbb;
llvm-svn: 182754
Unify handling of whitespace when breaking protruding tokens with other
whitespace replacements.
As a side effect, the BreakableToken structure changed significantly:
- have a common base class for single-line breakable tokens, as they are
much more similar
- revamp handling of multi-line comments; we now calculate the
information about lines in multi-line comments similar to normal
tokens, and always issue replacements
As a result, we were able to get rid of special casing of trailing
whitespace deletion for comments in the whitespace manager and the
BreakableToken and fixed bugs related to tab handling and escaped
newlines.
llvm-svn: 182738
In general, we like to avoid line breaks like:
...
SomeParameter, OtherParameter).DoSomething(
...
as they tend to make code really hard to read (how would you even indent the
next line?). Previously we have implemented this in a hacky way, which has now
shown to lead to problems. This fixes a few weird looking formattings, such as:
Before:
aaaaa(
aaaaa(aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa, aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa)
.aaaaa(aaaaa),
aaaaaaaaaaaaaaaaaaaaa);
After:
aaaaa(aaaaa(aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa).aaaaa(aaaaa),
aaaaaaaaaaaaaaaaaaaaa);
llvm-svn: 182731
Before:
@{ NSFontAttributeNameeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
: regularFont, };
Now:
@{ NSFontAttributeNameeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee :
regularFont, };
':'s in dictionary literals (and the corresponding {}s) are now marked as
TT_ObjCDictLiteral too, which makes further improvements to dict literal
layout possible.
llvm-svn: 182716
Previously we started sequences to align for single line comments when
the previous line had a trailing comment, but the sequence was broken
for other reasons.
Now we re-format:
// a
// b
f(); // c
to:
// a
// b
f(); // c
llvm-svn: 182608
Now correctly leaves:
f(); // comment
// comment
g(); // comment
... alone if the middle comment was aligned with g() before formatting.
llvm-svn: 182605
Previously we would align:
f(); // comment
// other comment
g();
Even if // other comment was at the start of the line. Now we do not
align trailing comments if they have been already aligned correctly
with the next line.
Thus,
f(); // comment
// other comment
g();
will not be changed, while:
f(); // comment
// other commment
g();
will lead to the two trailing comments being aligned.
llvm-svn: 182577
Replaces the use of WhitespaceStart + WhitspaceLength.
This made a bug in the formatter obvous where we would incorrectly
calculate the next column.
FIXME: There's a similar bug left regarding TokenLength. We should
probably also move to have a TokenRange instead.
llvm-svn: 182572
Before:
vector<int> x { 1, 2, 3 };
After:
vector<int> x{ 1, 2, 3 };
Also add a style option to remove the spaces inside braced lists,
so that the above becomes:
std::vector<int> v{1, 2, 3};
llvm-svn: 182570
Allows formatting of C++11 braced init list constructs, like:
vector<int> v { 1, 2, 3 };
f({ 1, 2 });
This involves some changes of how tokens are handled in the
UnwrappedLineFormatter. Note that we have a plan to evolve the
design of the token flow into one where we create all tokens
up-front and then annotate them in the various layers (as we
currently already have to create all tokens at once anyway, the
current abstraction does not help). Thus, this introduces
FIXMEs towards that goal.
llvm-svn: 182568
Instead of selectively storing some changes and directly generating
replacements for others, we now notify the WhitespaceManager of the
whitespace before every token (and optionally with more changes inside
tokens).
Then, we run over all whitespace in the very end in original source
order, where we have all information available to correctly align
comments and escaped newlines.
The future direction is to pull more of the comment alignment
implementation that is now in the BreakableToken into the
WhitespaceManager.
This fixes a bug when aligning comments or escaped newlines in unwrapped
lines that are handled out of order:
#define A \
f({ \
g(); \
});
... now gets correctly layouted.
llvm-svn: 182467
clang-format was a bit too aggressive when trying to keep labels and
values on the same line.
Before:
llvm::outs()
<< "aaaaaaaaaaaaaaaaaaa: " << aaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaaaaaa);
After:
llvm::outs() << "aaaaaaaaaaaaaaaaaaa: "
<< aaaaaaaaaaaaa(aaaaaaaaaaaaaaaaaaaaaaaaaaaa);
llvm-svn: 182458
This only affects styles that prevent bin packing. There, a break after
a template declaration also forced a line break after the function name.
Before:
template <class SomeType, class SomeOtherType>
SomeType
SomeFunction(SomeType Type, SomeOtherType OtherType) {}
After:
template <class SomeType, class SomeOtherType>
SomeType SomeFunction(SomeType Type, SomeOtherType OtherType) {}
This fixes llvm.org/PR16072.
llvm-svn: 182457
If clang-format is confronted with long and deeply nested lines (e.g.
complex static initializers or function calls), it can currently try too
hard to find the optimal solution and never finish. The reason is that
the memoization does not work effectively for deeply nested lines.
This patch removes an earlier workaround and instead opts for
accepting a non-optimal solution in rare cases. However, it only does
so only in cases where it would have to analyze an excessive number of
states (currently set to 10000 - the most complex line in Format.cpp
requires ~800 states) so this should not change the behavior in a
relevant way.
llvm-svn: 182449
With this patch, clang-format will try to keep the cursor at the
original code position in editor integrations (implemented for emacs and
vim). This means, after formatting, clang-format will try to keep the
cursor on the same character of the same token.
llvm-svn: 182373
Basically, the new rule is: The opening "{" always has to be on the
same line as the first element if the braced list is nested
(e.g. in another braced list or in a function).
The solution that clang-format produces almost always adheres to this
rule anyway and this makes clang-format significantly faster for larger
lists. Added a test cases for the only exception I could find
(which doesn't seem to be very important at first sight).
llvm-svn: 182082