Commit Graph

360 Commits

Author SHA1 Message Date
Alexander Kornienko 1efe0a07bb Fixed typo: NoneComment -> NonComment, no other changes.
llvm-svn: 185640
2013-07-04 14:47:51 +00:00
Alexander Kornienko 5861171893 Added AlwaysBreakBeforeMultilineStrings option.
Summary:
Always breaking before multiline strings can help format complex
expressions containing multiline strings more consistently, and avoid consuming
too much horizontal space.

Reviewers: djasper

Reviewed By: djasper

CC: cfe-commits, klimek

Differential Revision: http://llvm-reviews.chandlerc.com/D1097

llvm-svn: 185622
2013-07-04 12:02:44 +00:00
Daniel Jasper 7ae41cdd22 Don't insert confusing line breaks in comparisons.
In general, clang-format breaks after an operator if the LHS spans
multiple lines. Otherwise, this can lead to confusing effects and
effectively hide the operator precendence, e.g. in

if (aaaaaaaaaaaaaa ==
        bbbbbbbbbbbbbb && c) { ...

This patch removes this rule for comparisons, if the LHS is not a binary
expression itself as many users were wondering why clang-format inserts
an unnecessary linebreak.

Before:
if (aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(
        aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa) >
    5) { ...

After:
if (aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(
        aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa) > 5) { ...

In the long run, we might:
- Want to do this for other binary expressions as well.
- Do this only if the RHS is short or even only if it is a literal.

llvm-svn: 185530
2013-07-03 10:34:47 +00:00
Alexander Kornienko aa620e187e Avoid column limit violation in block comments in certain cases.
Summary:
Add penalty when an excessively long line in a block comment can not be
broken on a leading whitespace. Lack of this addition can lead to severe column
width violations when they can be easily avoided.

Reviewers: djasper

Reviewed By: djasper

CC: cfe-commits, klimek

Differential Revision: http://llvm-reviews.chandlerc.com/D1071

llvm-svn: 185337
2013-07-01 13:42:42 +00:00
Craig Topper af35e8521a Put helper classes into anonymous namespace.
llvm-svn: 185295
2013-06-30 22:29:28 +00:00
Alexander Kornienko 1e80887d63 Use lexing mode based on FormatStyle.Standard.
Summary:
Some valid pre-C++11 constructs change meaning when lexed in C++11
mode, e.g.
#define x(_a) printf("foo"_a);
(example from http://llvm.org/bugs/show_bug.cgi?id=16342). "foo"_a is treated as
a user-defined string literal when parsed in C++11 mode.
In order to deal with this correctly, we need to set lexing mode according to
which standard the code conforms to. We already have a configuration value for
this (FormatStyle.Standard), which seems to be appropriate to use in this case
as well.

Reviewers: klimek

CC: cfe-commits, gribozavr

Differential Revision: http://llvm-reviews.chandlerc.com/D1028

llvm-svn: 185149
2013-06-28 12:51:24 +00:00
Nico Weber f579ab302a Fix a comment.
llvm-svn: 184905
2013-06-26 02:42:46 +00:00
Nico Weber 9096fc0dab Run clang-format on lib/Format code after r184894. No other changes.
llvm-svn: 184896
2013-06-26 00:30:14 +00:00
Manuel Klimek 836c2868f9 Add an option to not indent declarations when breaking after the type.
Make that option the default for LLVM style.

llvm-svn: 184563
2013-06-21 17:25:42 +00:00
Alexander Kornienko a3555e2416 Fixed long-standing issue with incorrect length calculation of multi-line comments.
Summary:
A trailing block comment having multiple lines would cause extremely
high penalties if the summary length of its lines is more than the column limit.
Fixed by always considering only the last line of a multi-line block comment.
Removed a long-standing FIXME from relevant tests and added a motivating test
modelled after problem cases from real code.

Reviewers: klimek

Reviewed By: klimek

CC: cfe-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D1010

llvm-svn: 184340
2013-06-19 19:50:11 +00:00
Alexander Kornienko 4d26b6efef Fixes incorrect indentation of line comments after break and re-alignment.
Summary:
Selectively propagate the information about token kind in
WhitespaceManager::replaceWhitespaceInToken.For correct alignment of new
segments of line comments in order to align them correctly. Don't set
BreakBeforeParameter in breakProtrudingToken for line comments, as it introduces
a break after the _next_ parameter. Added tests for related functions.

Reviewers: klimek

Reviewed By: klimek

CC: cfe-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D980

llvm-svn: 184076
2013-06-17 12:59:44 +00:00
Alexander Kornienko be633908be Don't remove backslashes from block comments.
Summary:
Don't remove backslashes from block comments. Previously this
/* \    \ \ \ \ \
*/
would be turned to this:
/*
*/
which spoils some kinds of ASCII-art, people use in their comments. The behavior
was related to handling escaped newlines in block comments inside preprocessor
directives. This patch makes handling it in a more civilized way.

Reviewers: klimek

Reviewed By: klimek

CC: cfe-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D979

llvm-svn: 183978
2013-06-14 11:46:10 +00:00
Alexander Kornienko f370ad9055 Preserve newlines before block comments in static initializers.
Summary:
Basically, don't special-case line comments in this regard. And fixed
an incorrect test, that relied on the wrong behavior.

Reviewers: klimek

Reviewed By: klimek

CC: cfe-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D962

llvm-svn: 183851
2013-06-12 19:04:12 +00:00
Alexander Kornienko ee4ca9ba0e Improved handling of escaped newlines at the token start.
Summary: Remove them from the TokenText as well.

Reviewers: klimek

Reviewed By: klimek

CC: cfe-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D935

llvm-svn: 183536
2013-06-07 17:45:07 +00:00
Alexander Kornienko dd7ece53a2 Fixed calculation of penalty when breaking tokens.
Summary:
Introduced two new style parameters: PenaltyBreakComment and
PenaltyBreakString. Add penalty for each character of a breakable token beyond
the column limit (this relates mainly to comments, as they are broken only on
whitespace). Tuned PenaltyBreakComment to prefer comment breaking over breaking
inside most binary expressions.
Fixed a bug that prevented *, & and && from being considered TT_BinaryOperator
in the presense of adjacent comments.

Reviewers: klimek, djasper

Reviewed By: klimek

CC: cfe-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D933

llvm-svn: 183530
2013-06-07 16:02:52 +00:00
Alexander Kornienko ffcc010767 UTF-8 support for clang-format.
Summary:
Detect if the file is valid UTF-8, and if this is the case, count code
points instead of just using number of bytes in all (hopefully) places, where
number of columns is needed. In particular, use the new
FormatToken.CodePointCount instead of TokenLength where appropriate.
Changed BreakableToken implementations to respect utf-8 character boundaries
when in utf-8 mode.

Reviewers: klimek, djasper

Reviewed By: djasper

CC: cfe-commits, rsmith, gribozavr

Differential Revision: http://llvm-reviews.chandlerc.com/D918

llvm-svn: 183312
2013-06-05 14:09:10 +00:00
Daniel Jasper 1027c6e5dd Let clang-format remove empty lines before "}".
These lines almost never aid readability.

Before:
void f() {
  int i;  // some variable

}

After:
void f() {
  int i;  // some variable
}

llvm-svn: 183112
2013-06-03 16:16:41 +00:00
Daniel Jasper 8050395236 Improve detection preventing certain kind of formatting patterns.
An oversight in this detection made clang-format unable to format
the following nicely:
void aaaaaaaaaaaaaaaaaaa<aaaaaaaaaaaaaaaaaaaaaaaaaaa,
                         bbbbbbbbbbbbbbbbbbbbbbbbbb>(
    cccccccccccccccccccccccccccc);

llvm-svn: 183097
2013-06-03 09:54:46 +00:00
Daniel Jasper 68d888cfed Fix line-breaking problem caused by comment.
Before, clang-format would not find a solution for formatting:
if ((aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ||
     bbbbbbbbbbbbbbbbbb) && // aaaaaaaaaaaaaaaa
    cccccc) {
}

llvm-svn: 183096
2013-06-03 08:42:05 +00:00
Daniel Jasper a9eb2aafa1 Make formatting of empty blocks more consistent.
With this patch, the simplified rule is:
If the block is part of a declaration (class, namespace, function,
enum, ..), merge an empty block onto a single line. Otherwise
(specifically for the compound statements of if, for, while, ...),
keep the braces on two separate lines.

The reasons are:
- Mostly the formatting of empty blocks does not matter much.
- Empty compound statements are really rare and are usually just
  inserted while still working on the code. If they are on two lines,
  inserting code is easier. Also, overlooking the "{}" of an
  "if (...) {}" can be really bad.
- Empty declarations are not uncommon, e.g. empty constructors. Putting
  them on one line saves vertical space at no loss of readability.

llvm-svn: 183008
2013-05-31 14:56:20 +00:00
Manuel Klimek 4c5c28bb36 Use a non-recursive implementation to reconstruct line breaks.
Now that the TokenAnnotator does not require stack space anymore,
reconstructing the lines has become the limiting factor. This patch
fixes that problem, allowing large files with multiple megabytes of
single unwrapped lines to be formatted.

llvm-svn: 182861
2013-05-29 15:10:11 +00:00
Manuel Klimek 6e6310ec84 The second step in the token refactoring.
Gets rid of AnnotatedToken, putting everything into FormatToken.
FormatTokens are created once, and only referenced by pointer.  This
enables multiple future features, like having tokens shared between
multiple UnwrappedLines (while there's still work to do to fully enable
that).

llvm-svn: 182859
2013-05-29 14:47:47 +00:00
Daniel Jasper 61e6bbf850 Add option to always break template declarations.
With option enabled (e.g. in Google-style):
template <typename T>
void f() {}

With option disabled:
template <typename T> void f() {}

Enabling this for Google-style and Chromium-style, not sure which other
styles would prefer that.

llvm-svn: 182849
2013-05-29 12:07:31 +00:00
Manuel Klimek 591ab5a830 Make UnwrappedLines and AnnotatedToken contain pointers to FormatToken.
The FormatToken is now not copyable any more.

llvm-svn: 182772
2013-05-28 13:42:28 +00:00
Manuel Klimek 15dfe7ac40 A first step towards giving format tokens pointer identity.
With this patch, we create all tokens in one go before parsing and pass
an ArrayRef<FormatToken*> to the UnwrappedLineParser. The
UnwrappedLineParser is switched to use pointer-to-token internally.

The UnwrappedLineParser still copies the tokens into the UnwrappedLines.
This will be fixed in an upcoming patch.

llvm-svn: 182768
2013-05-28 11:55:06 +00:00
Daniel Jasper bca4bbe30a Initial support for designated initializers.
llvm-svn: 182767
2013-05-28 11:30:49 +00:00
Daniel Jasper 9f82df295e Fix formatting of expressions containing ">>".
This gets turned into two ">" operators at the beginning in order to
simplify template parameter handling. Thus, we need a special case to
handle those two binary operators correctly.

With this patch, clang-format can now correctly handle cases like:
aaaaaa = aaaaaaa(aaaaaaa, // break
                 aaaaaa) >>
         bbbbbb;

llvm-svn: 182754
2013-05-28 07:42:44 +00:00
Manuel Klimek 9043c74f49 Major refactoring of BreakableToken.
Unify handling of whitespace when breaking protruding tokens with other
whitespace replacements.

As a side effect, the BreakableToken structure changed significantly:
- have a common base class for single-line breakable tokens, as they are
  much more similar
- revamp handling of multi-line comments; we now calculate the
  information about lines in multi-line comments similar to normal
  tokens, and always issue replacements

As a result, we were able to get rid of special casing of trailing
whitespace deletion for comments in the whitespace manager and the
BreakableToken and fixed bugs related to tab handling and escaped
newlines.

llvm-svn: 182738
2013-05-27 15:23:34 +00:00
Daniel Jasper 7b27a10b1e Improve indentation of assignments.
Before:
unsigned OriginalStartColumn = SourceMgr.getSpellingColumnNumber(
    Current.FormatTok.getStartOfNonWhitespace()) -
                               1;

After:
unsigned OriginalStartColumn =
    SourceMgr.getSpellingColumnNumber(
        Current.FormatTok.getStartOfNonWhitespace()) -
    1;

llvm-svn: 182733
2013-05-27 12:45:09 +00:00
Daniel Jasper 32a796bc5b Fix hacky way of preventing a certain type of line break.
In general, we like to avoid line breaks like:

  ...
  SomeParameter, OtherParameter).DoSomething(
  ...

as they tend to make code really hard to read (how would you even indent the
next line?). Previously we have implemented this in a hacky way, which has now
shown to lead to problems. This fixes a few weird looking formattings, such as:

Before:
aaaaa(
    aaaaa(aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa, aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa)
        .aaaaa(aaaaa),
    aaaaaaaaaaaaaaaaaaaaa);
After:
aaaaa(aaaaa(aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa,
            aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa).aaaaa(aaaaa),
      aaaaaaaaaaaaaaaaaaaaa);

llvm-svn: 182731
2013-05-27 11:50:16 +00:00
Daniel Jasper 5bd0b9e53b Improve formatting of braced lists.
Before: vector<int> v{ -1};
After:  vector<int> v{-1};
llvm-svn: 182597
2013-05-23 18:05:18 +00:00
Manuel Klimek 5c24cca0f0 Use a SourceRange for the whitespace location in FormatToken.
Replaces the use of WhitespaceStart + WhitspaceLength.
This made a bug in the formatter obvous where we would incorrectly
calculate the next column.

FIXME: There's a similar bug left regarding TokenLength. We should
probably also move to have a TokenRange instead.

llvm-svn: 182572
2013-05-23 10:56:37 +00:00
Daniel Jasper e5777d25d6 Improve formatting of braced lists.
Before:
vector<int> x { 1, 2, 3 };
After:
vector<int> x{ 1, 2, 3 };

Also add a style option to remove the spaces inside braced lists,
so that the above becomes:
std::vector<int> v{1, 2, 3};

llvm-svn: 182570
2013-05-23 10:15:45 +00:00
Manuel Klimek 4fe43002f8 Makes whitespace management more consistent.
Instead of selectively storing some changes and directly generating
replacements for others, we now notify the WhitespaceManager of the
whitespace before every token (and optionally with more changes inside
tokens).

Then, we run over all whitespace in the very end in original source
order, where we have all information available to correctly align
comments and escaped newlines.

The future direction is to pull more of the comment alignment
implementation that is now in the BreakableToken into the
WhitespaceManager.

This fixes a bug when aligning comments or escaped newlines in unwrapped
lines that are handled out of order:
  #define A \
    f({     \
      g();  \
    });
... now gets correctly layouted.

llvm-svn: 182467
2013-05-22 12:51:29 +00:00
Daniel Jasper 53e8d854fd Fix function declaration behavior.
This only affects styles that prevent bin packing. There, a break after
a template declaration also forced a line break after the function name.

Before:
template <class SomeType, class SomeOtherType>
SomeType
SomeFunction(SomeType Type, SomeOtherType OtherType) {}

After:
template <class SomeType, class SomeOtherType>
SomeType SomeFunction(SomeType Type, SomeOtherType OtherType) {}

This fixes llvm.org/PR16072.

llvm-svn: 182457
2013-05-22 08:55:55 +00:00
Daniel Jasper f8114cf621 Cut-off clang-format analysis.
If clang-format is confronted with long and deeply nested lines (e.g.
complex static initializers or function calls), it can currently try too
hard to find the optimal solution and never finish. The reason is that
the memoization does not work effectively for deeply nested lines.

This patch removes an earlier workaround and instead opts for
accepting a non-optimal solution in rare cases. However, it only does
so only in cases where it would have to analyze an excessive number of
states (currently set to 10000 - the most complex line in Format.cpp
requires ~800 states) so this should not change the behavior in a
relevant way.

llvm-svn: 182449
2013-05-22 05:27:42 +00:00
Alexander Kornienko 06e0033427 Minor fix: don't crash on empty configuration file, consider empty configuration files invalid.
llvm-svn: 182290
2013-05-20 15:18:01 +00:00
Alexander Kornienko 006b5c89ce Clang-format: allow -style="{yaml/json}" on command line
Summary: + improved handling of default style and predefined styles.

Reviewers: djasper, klimek

Reviewed By: klimek

CC: cfe-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D813

llvm-svn: 182205
2013-05-19 00:53:30 +00:00
Daniel Jasper 473c62c485 Slightly modify the formatting rules for braced lists.
Basically, the new rule is: The opening "{" always has to be on the
same line as the first element if the braced list is nested
(e.g. in another braced list or in a function).

The solution that clang-format produces almost always adheres to this
rule anyway and this makes clang-format significantly faster for larger
lists. Added a test cases for the only exception I could find
(which doesn't seem to be very important at first sight).

llvm-svn: 182082
2013-05-17 09:35:01 +00:00
Daniel Jasper 8bb99e8911 Don't insert a break into include lines with trailing comments.
llvm-svn: 182003
2013-05-16 12:59:13 +00:00
Daniel Jasper 3a685df7e0 Add option to put short loops on a single line.
This enables things like:

for (int &v : vec) v *= 2;

Enabled for Google style.

llvm-svn: 182000
2013-05-16 12:12:21 +00:00
Daniel Jasper ec04c0dba0 Add a more convenient interface to use clang-format.
It turns out that several implementations go through the trouble of
setting up a SourceManager and Lexer and abstracting this into a
function makes usage easier.

Also abstracts SourceManager-independent ranges out of
tooling::Refactoring and provides a convenience function to create them
from line ranges.

llvm-svn: 181997
2013-05-16 10:40:07 +00:00
Daniel Jasper f9eb9b1883 Comments should not prevent single-line functions.
Before:
void f() {}
void g() {
} // comment

After:
void f() {}
void g() {} // comment

llvm-svn: 181996
2013-05-16 10:17:39 +00:00
Daniel Jasper 7dd22c51b1 Add back accidentally deleted line and add test for it.
Before:
f("a", "b"
  "c");
After:
f("a", "b"
       "c");

llvm-svn: 181980
2013-05-16 04:26:02 +00:00
Daniel Jasper abca58c9ee Don't put short namespace on a single line.
Before:
namespace abc { class SomeClass; }
namespace def { void someFunction() {} }

After:
namespace abc {
class Def;
}
namespace def {
void someFunction() {}
}

Rationale:
a) Having anything other than forward declaration on the same line
   as a namespace looks confusing.
b) Formatting namespace-forward-declaration-combinations different
   from other stuff is inconsistent.
c) Wasting vertical space close to such forward declarations really
   does not affect readability.

llvm-svn: 181887
2013-05-15 14:09:55 +00:00
Daniel Jasper c6fbc2192c Break function declarations after multi-line return types.
Before:
template <typename A>
SomeLoooooooooooooooooooooongType<
    typename some_namespace::SomeOtherType<A>::Type> Function() {}

After:
template <typename A>
SomeLoooooooooooooooooooooongType<
    typename some_namespace::SomeOtherType<A>::Type>
Function() {}

llvm-svn: 181877
2013-05-15 09:35:08 +00:00
Daniel Jasper 00aca707d5 Don't merge one-line functions in weird brace styles.
llvm-svn: 181872
2013-05-15 08:30:06 +00:00
Daniel Jasper d2ae41a7c6 Remove diagnostics from clang-format.
We only ever implemented one and that one is not actually all that
helpful (e.g. gets incorrectly triggered by macros).

llvm-svn: 181871
2013-05-15 08:14:19 +00:00
Daniel Jasper 571f1af0bb Fix expression breaking for one-parameter-per-line styles.
Before:
  if (aaaaaaaaaaaaaaaaaaaaaaaaaaaa || aaaaaaaaaaaaaaaaaaaaaaaaaaaa ||
      aaaaaaaaaaaaaaaaaaaaaaaaaaaa ||
      aaaaaaaaaaaaaaaaaaaaaaaaaaaa ||
      aaaaaaaaaaaaaaaaaaaaaaaaaaaa) {}
After:
  if (aaaaaaaaaaaaaaaaaaaaaaaaaaaa || aaaaaaaaaaaaaaaaaaaaaaaaaaaa ||
      aaaaaaaaaaaaaaaaaaaaaaaaaaaa || aaaaaaaaaaaaaaaaaaaaaaaaaaaa ||
      aaaaaaaaaaaaaaaaaaaaaaaaaaaa) {}

llvm-svn: 181828
2013-05-14 20:39:56 +00:00
Daniel Jasper cdd0662b4e Correctly determine ranges for clang-format.
We have been assuming that CharSourceRange::getTokenRange() by itself
expands a range until the end of a token, but in fact it only sets
IsTokenRange to true. Thus, we have so far only considered the first
character of the last token to belong to an unwrapped line. This
did not really manifest in symptoms as all edit integrations
expand ranges to fully lines.

llvm-svn: 181778
2013-05-14 10:31:09 +00:00