Commit Graph

17 Commits

Author SHA1 Message Date
Alexander Kornienko b93062e236 Use the same set of whitespace characters for all operations in BreakableToken.
Summary:
Fixes a problem where \t,\v or \f could lead to a crash when placed as
a first character in a line comment. The cause is that rtrim and ltrim handle
these characters, but our code didn't, so some invariants could be broken.

Reviewers: klimek

Reviewed By: klimek

CC: cfe-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D1013

llvm-svn: 184425
2013-06-20 13:58:37 +00:00
Alexander Kornienko 7285207486 Split long strings on word boundaries.
Summary: Split strings at word boundaries, when there are no spaces and slashes.

Reviewers: klimek

CC: cfe-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D1003

llvm-svn: 184304
2013-06-19 14:22:47 +00:00
Alexander Kornienko be633908be Don't remove backslashes from block comments.
Summary:
Don't remove backslashes from block comments. Previously this
/* \    \ \ \ \ \
*/
would be turned to this:
/*
*/
which spoils some kinds of ASCII-art, people use in their comments. The behavior
was related to handling escaped newlines in block comments inside preprocessor
directives. This patch makes handling it in a more civilized way.

Reviewers: klimek

Reviewed By: klimek

CC: cfe-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D979

llvm-svn: 183978
2013-06-14 11:46:10 +00:00
Alexander Kornienko 555efc36d0 Insert a space at the start of a line comment in case it starts with an alphanumeric character.
Summary:
"//Test" becomes "// Test". This change is aimed to improve code
readability and conformance to certain coding styles. If a comment starts with a
non-alphanumeric character, the space isn't added, e.g. "//-*-c++-*-" stays
unchanged.

Reviewers: klimek

Reviewed By: klimek

CC: cfe-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D949

llvm-svn: 183750
2013-06-11 16:01:49 +00:00
Alexander Kornienko dd7ece53a2 Fixed calculation of penalty when breaking tokens.
Summary:
Introduced two new style parameters: PenaltyBreakComment and
PenaltyBreakString. Add penalty for each character of a breakable token beyond
the column limit (this relates mainly to comments, as they are broken only on
whitespace). Tuned PenaltyBreakComment to prefer comment breaking over breaking
inside most binary expressions.
Fixed a bug that prevented *, & and && from being considered TT_BinaryOperator
in the presense of adjacent comments.

Reviewers: klimek, djasper

Reviewed By: klimek

CC: cfe-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D933

llvm-svn: 183530
2013-06-07 16:02:52 +00:00
Alexander Kornienko ffcc010767 UTF-8 support for clang-format.
Summary:
Detect if the file is valid UTF-8, and if this is the case, count code
points instead of just using number of bytes in all (hopefully) places, where
number of columns is needed. In particular, use the new
FormatToken.CodePointCount instead of TokenLength where appropriate.
Changed BreakableToken implementations to respect utf-8 character boundaries
when in utf-8 mode.

Reviewers: klimek, djasper

Reviewed By: djasper

CC: cfe-commits, rsmith, gribozavr

Differential Revision: http://llvm-reviews.chandlerc.com/D918

llvm-svn: 183312
2013-06-05 14:09:10 +00:00
Daniel Jasper ce257f296b More fixes for clang-format's multiline comment breaking.
llvm-svn: 182940
2013-05-30 17:27:48 +00:00
Daniel Jasper 58dd2f0652 Fix another clang-format crasher related to multi-line comments.
This fixes:
/*
*
* something long going over the column limit.
*/

llvm-svn: 182932
2013-05-30 15:20:29 +00:00
Manuel Klimek 8910d192d0 Add asserts to guard against regressions.
llvm-svn: 182916
2013-05-30 07:45:53 +00:00
Daniel Jasper 51fb2b2151 Fix crasher when formatting certain block comments.
Smallest reproduction:
/*
**
*/

llvm-svn: 182913
2013-05-30 06:40:07 +00:00
Manuel Klimek ae1fbfb740 Fixes error when splitting block comments.
When trying to fall back to search from the end onwards, we
would still find leading whitespace if the leading whitespace
went on after the end of the line.

llvm-svn: 182886
2013-05-29 22:06:18 +00:00
Manuel Klimek 34d15151c4 Disable tab expansion when counting the columns in block comments.
To fully support this, we also need to expand tabs in the text before
the block comment. This patch breaks indentation when there was a
non-standard mixture of spaces and tabs used for indentation, but
fixes a regression in the simple case:
{
  /*
   * Comment.
   */
  int i;
}
Is now formatted correctly, if there were tabs used for indentation
before.

llvm-svn: 182760
2013-05-28 10:01:59 +00:00
Manuel Klimek 281dcbe026 Fixes indentation of empty lines in block comments.
Block comment indentation of empty lines regressed, as we did not
have a test for it.
 /* Comment with...
  *
  * empty line. */
is now formatted correctly again.

llvm-svn: 182757
2013-05-28 08:55:01 +00:00
Manuel Klimek 9043c74f49 Major refactoring of BreakableToken.
Unify handling of whitespace when breaking protruding tokens with other
whitespace replacements.

As a side effect, the BreakableToken structure changed significantly:
- have a common base class for single-line breakable tokens, as they are
  much more similar
- revamp handling of multi-line comments; we now calculate the
  information about lines in multi-line comments similar to normal
  tokens, and always issue replacements

As a result, we were able to get rid of special casing of trailing
whitespace deletion for comments in the whitespace manager and the
BreakableToken and fixed bugs related to tab handling and escaped
newlines.

llvm-svn: 182738
2013-05-27 15:23:34 +00:00
Manuel Klimek 4fe43002f8 Makes whitespace management more consistent.
Instead of selectively storing some changes and directly generating
replacements for others, we now notify the WhitespaceManager of the
whitespace before every token (and optionally with more changes inside
tokens).

Then, we run over all whitespace in the very end in original source
order, where we have all information available to correctly align
comments and escaped newlines.

The future direction is to pull more of the comment alignment
implementation that is now in the BreakableToken into the
WhitespaceManager.

This fixes a bug when aligning comments or escaped newlines in unwrapped
lines that are handled out of order:
  #define A \
    f({     \
      g();  \
    });
... now gets correctly layouted.

llvm-svn: 182467
2013-05-22 12:51:29 +00:00
Alexander Kornienko 9e90b62e01 Unified token breaking logic: support for line comments.
Summary:
Added BreakableLineComment, moved common code from
BreakableBlockComment to newly added BreakableComment. As a side-effect of the
rewrite, found another problem with escaped newlines and had to change
code which removes trailing whitespace from line comments not to break after
this patch.

Reviewers: klimek, djasper

Reviewed By: klimek

CC: cfe-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D682

llvm-svn: 179693
2013-04-17 17:34:05 +00:00
Alexander Kornienko cb45bc1861 Unified token breaking logic for strings and block comments.
Summary:
Both strings and block comments are broken into lines in
breakProtrudingToken. Logic specific for strings or block comments is abstracted
in implementations of the BreakToken interface. Among other goodness, this
change fixes placement of backslashes after a block comment inside a
preprocessor directive (see removed FIXMEs in unit tests).

The code is far from being polished, and some parts of it will be changed for
line comments support.

Reviewers: klimek

Reviewed By: klimek

CC: cfe-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D665

llvm-svn: 179526
2013-04-15 14:28:00 +00:00