Previously, if an escaped newline was followed by a newline or a nul, we'd lex
the escaped newline as a bogus space character. This led to a bunch of
different broken corner cases:
For the pattern "\\\n\0#", we would then have a (horizontal) space whose
spelling ends in a newline, and would decide that the '#' is at the start of a
line, and incorrectly start preprocessing a directive in the middle of a
logical source line. If we were already in the middle of a directive, this
would result in our attempting to process multiple directives at the same time!
This resulted in crashes, asserts, and hangs on invalid input, as discovered by
fuzz-testing.
For the pattern "\\\n" at EOF (with an implicit following nul byte), we would
produce a bogus trailing space character with spelling "\\\n". This was mostly
harmless, but would lead to clang-format getting confused and misformatting in
rare cases. We now produce a trailing EOF token with spelling "\\\n",
consistent with our handling for other similar cases -- an escaped newline is
always part of the token containing the next character, if any.
For the pattern "\\\n\n", this was somewhat more benign, but would produce an
extraneous whitespace token to clients who care about preserving whitespace.
However, it turns out that our lexing for line comments was relying on this bug
due to an off-by-one error in its computation of the end of the comment, on the
slow path where the comment might contain escaped newlines.
llvm-svn: 300515
At present, clang-format mangles Java containing logical right shift operators
('>>>=' or '>>>'), splitting them in two, resulting in invalid code:
public class Minimal {
public void func(String args) {
int i = 42;
- i >>>= 1;
+ i >> >= 1;
return i;
}
}
This adds both forms of logical right shift to the FormatTokenLexer, so
clang-format won't attempt to split them and insert bogus whitespace.
https://reviews.llvm.org/D31652
Patch from Richard Bradfield <bradfier@fstab.me>!
llvm-svn: 299952
Hopefully fix crashes by unshadowing the variable.
Original commit message:
A big part of the clone detection code is functionality for filtering clones and
clone groups based on different criteria. So far this filtering process was
hardcoded into the CloneDetector class, which made it hard to understand and,
ultimately, to extend.
This patch splits the CloneDetector's logic into a sequence of reusable
constraints that are used for filtering clone groups. These constraints
can be turned on and off and reodreder at will, and new constraints are easy
to implement if necessary.
Unit tests are added for the new constraint interface.
This is a refactoring patch - no functional change intended.
Patch by Raphael Isemann!
Differential Revision: https://reviews.llvm.org/D23418
llvm-svn: 299653
clang-format <<END
auto c1 = u8'a';
auto c2 = u'a';
END
Before:
auto c1 = u8 'a';
auto c2 = u'a';
Now:
auto c1 = u8'a';
auto c2 = u'a';
Patch from Denis Gladkikh <llvm@denis.gladkikh.email>!
llvm-svn: 299574
A big part of the clone detection code is functionality for filtering clones and
clone groups based on different criteria. So far this filtering process was
hardcoded into the CloneDetector class, which made it hard to understand and,
ultimately, to extend.
This patch splits the CloneDetector's logic into a sequence of reusable
constraints that are used for filtering clone groups. These constraints
can be turned on and off and reodreder at will, and new constraints are easy
to implement if necessary.
Unit tests are added for the new constraint interface.
This is a refactoring patch - no functional change intended.
Patch by Raphael Isemann!
Differential Revision: https://reviews.llvm.org/D23418
llvm-svn: 299544
Summary:
The new test case was crashing before. Now it passes
as expected.
Reviewers: djasper
Subscribers: klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D31441
llvm-svn: 299465
Summary: ... which applies a set of `AtomicChange`s on code.
Reviewers: klimek, djasper
Reviewed By: djasper
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D30777
llvm-svn: 298913
The getVirtualFile method would create entries for e.g. libclang's
CXUnsavedFile but not mark them as valid. The effect is that a lookup
through getFile where the file name is not exactly matching the virtual
file (e.g. through mixing slashes and backslashes on Windows) would
result in a normal file "lookup", and re-using the file entry found
by using the UniqueID, and overwrite the file entry fields. Because the
lookup involves opening the file, and moving it into the file entry, the
file is now open. The SourceManager keys its buffers on the UniqueID
(which is still the same), so it will find an already loaded buffer.
Because only the loading a buffer from disk will close the file, the
FileEntry will hold on to an open file for as long as the FileManager
is around. As the FileManager will only get destroyed at a reparse,
you can't safe to the "leaked" and locked file on Windows.
llvm-svn: 298905
This is a fixup for the unit tests from r298278 (originally r298165).
Since the buffer that RawB2 pointed at was later deleted, a new call to
getBuffer may very well return a buffer at the same/old address. Which is
fine. Just delete the spurious check.
A Windows bot was occasionally hitting this in practice:
http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/7086
llvm-svn: 298414
This reverts commit r298185, effectively reapplying r298165, after fixing the
new unit tests (PR32338). The memory buffer generator doesn't null-terminate
the MemoryBuffer it creates; this version of the commit informs getMemBuffer
about that to avoid the assert.
Original commit message follows:
----
Clang's internal build system for implicit modules uses lock files to
ensure that after a process writes a PCM it will read the same one back
in (without contention from other -cc1 commands). Since PCMs are read
from disk repeatedly while invalidating, building, and importing, the
lock is not released quickly. Furthermore, the LockFileManager is not
robust in every environment. Other -cc1 commands can stall until
timeout (after about eight minutes).
This commit changes the lock file from being necessary for correctness
to a (possibly dubious) performance hack. The remaining benefit is to
reduce duplicate work in competing -cc1 commands which depend on the
same module. Follow-up commits will change the internal build system to
continue after a timeout, and reduce the timeout. Perhaps we should
reconsider blocking at all.
This also fixes a use-after-free, when one part of a compilation
validates a PCM and starts using it, and another tries to swap out the
PCM for something new.
The PCMCache is a new type called MemoryBufferCache, which saves memory
buffers based on their filename. Its ownership is shared by the
CompilerInstance and ModuleManager.
- The ModuleManager stores PCMs there that it loads from disk, never
touching the disk if the cache is hot.
- When modules fail to validate, they're removed from the cache.
- When a CompilerInstance is spawned to build a new module, each
already-loaded PCM is assumed to be valid, and is frozen to avoid
the use-after-free.
- Any newly-built module is written directly to the cache to avoid the
round-trip to the filesystem, making lock files unnecessary for
correctness.
Original patch by Manman Ren; most testcases by Adrian Prantl!
llvm-svn: 298278
Clang's internal build system for implicit modules uses lock files to
ensure that after a process writes a PCM it will read the same one back
in (without contention from other -cc1 commands). Since PCMs are read
from disk repeatedly while invalidating, building, and importing, the
lock is not released quickly. Furthermore, the LockFileManager is not
robust in every environment. Other -cc1 commands can stall until
timeout (after about eight minutes).
This commit changes the lock file from being necessary for correctness
to a (possibly dubious) performance hack. The remaining benefit is to
reduce duplicate work in competing -cc1 commands which depend on the
same module. Follow-up commits will change the internal build system to
continue after a timeout, and reduce the timeout. Perhaps we should
reconsider blocking at all.
This also fixes a use-after-free, when one part of a compilation
validates a PCM and starts using it, and another tries to swap out the
PCM for something new.
The PCMCache is a new type called MemoryBufferCache, which saves memory
buffers based on their filename. Its ownership is shared by the
CompilerInstance and ModuleManager.
- The ModuleManager stores PCMs there that it loads from disk, never
touching the disk if the cache is hot.
- When modules fail to validate, they're removed from the cache.
- When a CompilerInstance is spawned to build a new module, each
already-loaded PCM is assumed to be valid, and is frozen to avoid
the use-after-free.
- Any newly-built module is written directly to the cache to avoid the
round-trip to the filesystem, making lock files unnecessary for
correctness.
Original patch by Manman Ren; most testcases by Adrian Prantl!
llvm-svn: 298165
clang-format treats MSVC `__super` keyword like all other keywords adding
a single space after. This change disables this behavior for `__super`.
Patch originally by jutocz (thanks!).
Differential Revision: https://reviews.llvm.org/D30932
llvm-svn: 297936
This prevents unwanted fallout from r296664. Specifically in proto formatting,
this changed:
optional Aaaaaaaa aaaaaaaa = 12 [
(aaa) = aaaa,
(bbbbbbbbbbbbbbbbbbbbbbbbbb) = {
aaaaaaaaaaaaaaaaa: true,
aaaaaaaaaaaaaaaa: true
}
];
Into:
optional Aaaaaaaa aaaaaaaa = 12 [
(aaa) = aaaa,
(bbbbbbbbbbbbbbbbbbbbbbbbbb) =
{aaaaaaaaaaaaaaaaa: true, aaaaaaaaaaaaaaaa: true}
];
Which is considered less readable. Generally, it seems preferable to
format such dict literals as blocks rather than contract them to one
line.
llvm-svn: 297696
Modified the tests to accept any iteration order, to run only on Unix, and added
additional error reporting to investigate SystemZ bot issue.
The VFS directory iterator and recursive directory iterator behave differently
from the LLVM counterparts. Once the VFS iterators hit a broken symlink they
immediately abort. The LLVM counterparts don't stat entries unless they have to
descend into the next directory, which allows to recover from this issue by
clearing the error code and skipping to the next entry.
This change adds similar behavior to the VFS iterators. There should be no
change in current behavior in the current CLANG source base, because all
clients have loop exit conditions that also check the error code.
This fixes rdar://problem/30934619.
Differential Revision: https://reviews.llvm.org/D30768
llvm-svn: 297693
Summary:
@see is special among JSDoc tags in that it is commonly followed by URLs. The JSDoc spec suggests that users should wrap URLs in an additional {@link url...} tag (@see http://usejsdoc.org/tags-see.html), but this is very commonly violated, with @see being followed by a "naked" URL.
This change special cases all JSDoc lines that contain an @see not to be wrapped to account for that.
Reviewers: djasper
Subscribers: klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D30883
llvm-svn: 297607
Summary:
Previously clang-format would not break after any !. However in TypeScript, ! can be used as a post fix operator for non-nullability:
x.foo()!.bar()!;
With this change, clang-format will wrap after the ! if it is likely a post-fix non null operator.
Reviewers: djasper
Subscribers: klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D30705
llvm-svn: 297606
Summary:
`interface` and `type` are pseudo keywords and cause automatic semicolon
insertion when followed by a line break:
interface // gets parsed as a long variable access to "interface"
VeryLongInterfaceName {
}
With this change, clang-format not longer wraps after `interface` or `type`.
Reviewers: djasper
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D30874
llvm-svn: 297605
Modified the tests to accept any iteration order.
The VFS directory iterator and recursive directory iterator behave differently
from the LLVM counterparts. Once the VFS iterators hit a broken symlink they
immediately abort. The LLVM counterparts allow to recover from this issue by
clearing the error code and skipping to the next entry.
This change adds the same functionality to the VFS iterators. There should be
no change in current behavior in the current CLANG source base, because all
clients have loop exit conditions that also check the error code.
This fixes rdar://problem/30934619.
Differential Revision: https://reviews.llvm.org/D30768
llvm-svn: 297528
The VFS directory iterator and recursive directory iterator behave differently
from the LLVM counterparts. Once the VFS iterators hit a broken symlink they
immediately abort. The LLVM counterparts allow to recover from this issue by
clearing the error code and skipping to the next entry.
This change adds the same functionality to the VFS iterators. There should be
no change in current behavior in the current CLANG source base, because all
clients have loop exit conditions that also check the error code.
This fixes rdar://problem/30934619.
Differential Revision: https://reviews.llvm.org/D30768
llvm-svn: 297510
Summary:
This patch makes ContinuationIndenter call breakProtrudingToken only if
NoLineBreak and NoLineBreakInOperand is false.
Previously, clang-format required two runs to converge on the following example with 24 columns:
Note that the second operand shouldn't be splitted according to NoLineBreakInOperand, but the
token breaker doesn't take that into account:
```
func(a, "long long long long", c);
```
After first run:
```
func(a, "long long "
"long long",
c);
```
After second run, where NoLineBreakInOperand is taken into account:
```
func(a,
"long long "
"long long",
c);
```
With the patch, clang-format now obtains in one run:
```
func(a,
"long long long"
"long",
c);
```
which is a better token split overall.
Reviewers: djasper
Reviewed By: djasper
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D30575
llvm-svn: 297274
Summary:
This patch enables comment reflowing of lines not matching the comment pragma regex
in multiline comments containing comment pragma lines. Previously, these comments
were dumped without being reindented to the result.
Reviewers: djasper, mprobst
Reviewed By: mprobst
Subscribers: klimek, mprobst, cfe-commits
Differential Revision: https://reviews.llvm.org/D30697
llvm-svn: 297261
Summary:
This patch adds support for namespaces ending in semicolon to the namespace comment fixer.
source:
```
namespace A {
int i;
int j;
};
```
clang-format before:
```
namespace A {
int i;
int j;
} // namespace A;
```
clang-format after:
```
namespace A {
int i;
int j;
}; // namespace A
```
Reviewers: djasper
Reviewed By: djasper
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D30688
llvm-svn: 297140
Summary:
I've included a unit test with a function template containing a variable
of incomplete type. Clang compiles this without errors (the standard
does not require a diagnostic in this case). Without the fix, this case
triggers the crash.
Reviewers: klimek
Reviewed By: klimek
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D30636
llvm-svn: 297129
Summary:
Until now, NamespaceEndCommentFixer was adding missing comments for every run,
which results in multiple end comments for:
```
namespace {
int i;
int j;
}
#if A
int a = 1;
#else
int a = 2;
#endif
```
result before:
```
namespace {
int i;
int j;
}// namespace // namespace
#if A
int a = 1;
#else
int a = 2;
#endif
```
result after:
```
namespace {
int i;
int j;
}// namespace
#if A
int a = 1;
#else
int a = 2;
#endif
```
Reviewers: djasper
Reviewed By: djasper
Subscribers: klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D30659
llvm-svn: 297028
Summary:
This patch makes the namespace comment fixer use the number of unwrapped lines
that a namespace spans to detect it that namespace is short, thus not needing
end comments to be added.
This is needed to ensure clang-format is idempotent. Previously, a short namespace
was detected by the original source code lines. This has the effect of requiring two
runs for this example:
```
namespace { class A; }
```
after first run:
```
namespace {
class A;
}
```
after second run:
```
namespace {
class A;
} // namespace
```
Reviewers: djasper
Reviewed By: djasper
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D30528
llvm-svn: 296736
Many things were wrong:
- We didn't always allow wrapping after "as", which can be necessary.
- We used to Undestand the identifier after "as" as a start of a name.
- We didn't properly parse the structure of the expression with "as"
having the precedence of relational operators
llvm-svn: 296659