Commit Graph

5020 Commits

Author SHA1 Message Date
Björn Schäpers a74ff3ac2e [clang-format][NFC] Rename test and remove comments
Why put "InMacros" in the name? We test other things to, and I will add
more, withut macros.

Also all our tests are regression tests.

Differential Revision: https://reviews.llvm.org/D120401
2022-02-26 21:22:37 +01:00
Björn Schäpers 1d03548f63 [clang-format][NFC] Remove redundant semi
All "calls" have a semi, as they should, remove the one from the macro.

Differential Revision: https://reviews.llvm.org/D120359
2022-02-26 21:22:36 +01:00
owenca c05da55bdf [clang-format] Handle trailing comment for InsertBraces
Differential Revision: https://reviews.llvm.org/D120503
2022-02-25 12:29:47 -08:00
Stanislav Gatev 53dcd9efd1 [clang][dataflow] Add SAT solver interface and implementation
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D120289
2022-02-25 14:46:52 +00:00
Stanislav Gatev baa0f221d6 [clang][dataflow] Update StructValue child when assigning a value
When assigning a value to a storage location of a struct member we
need to also update the value in the corresponding `StructValue`.

This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D120414
2022-02-24 16:41:48 +00:00
Marek Kurdej bfb4afee74 [clang-format] Avoid inserting space after C++ casts.
Fixes https://github.com/llvm/llvm-project/issues/53876.

This is a solution for standard C++ casts: const_cast, dynamic_cast, reinterpret_cast, static_cast.

A general approach handling all possible casts is not possible without semantic information.
Consider the code:
```
static_cast<T>(*function_pointer_variable)(arguments);
```
vs.
```
some_return_type<T> (*function_pointer_variable)(parameters);
// Later used as:
function_pointer_variable = &some_function;
return function_pointer_variable(args);
```
In the latter case, it's not a cast but a variable declaration of a pointer to function.
Without knowing what `some_return_type<T>` is (and clang-format does not know it), it's hard to distinguish between the two cases. Theoretically, one could check whether "parameters" are types (not a cast) and "arguments" are value/expressions (a cast), but that might be inefficient (needs lots of lookahead).

Reviewed By: MyDeveloperDay, HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D120140
2022-02-24 10:21:02 +01:00
Marek Kurdej 46f6c834d9 [clang-format] Fix QualifierOrder breaking the code with requires clause.
Fixes https://github.com/llvm/llvm-project/issues/53962.

Given the config:
```
BasedOnStyle: LLVM
QualifierAlignment: Custom
QualifierOrder: ['constexpr', 'type']
```

The code:
```
template <typename F>
  requires std::invocable<F>
constexpr constructor();
```
was incorrectly formatted to:
```
template <typename F>
  requires
constexpr std::invocable<F> constructor();
```
because we considered `std::invocable<F> constexpr` as a type, not recognising the requires clause.

This patch avoids moving the qualifier across the boundary of the requires clause (checking `ClosesRequiresClause`).

Reviewed By: HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D120309
2022-02-24 10:16:10 +01:00
Luis Penagos dbc4d281bd [clang-format] Do not insert space after new/delete keywords in C function declarations
Fixes https://github.com/llvm/llvm-project/issues/46915.

Reviewed By: curdeius, HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D120374
2022-02-24 10:06:40 +01:00
Sam McCall 7c1ee5e95f [Pseudo] Token/TokenStream, PP directive parser.
The TokenStream class is the representation of the source code that will
be fed into the GLR parser.

This patch allows a "raw" TokenStream to be built by reading source code.
It also supports scanning a TokenStream to find the directive structure.

Next steps (with placeholders in the code): heuristically choosing a
path through #ifs, preprocessing the code by stripping directives and comments.
These will produce a suitable stream to feed into the parser proper.

Differential Revision: https://reviews.llvm.org/D119162
2022-02-23 17:52:02 +01:00
Stanislav Gatev 03dff12197 Revert "Revert "[clang][dataflow] Add support for global storage values""
This reverts commit 169e1aba55.

It also fixes an incorrect assumption in `initGlobalVars`.
2022-02-23 13:57:34 +00:00
Stanislav Gatev 169e1aba55 Revert "[clang][dataflow] Add support for global storage values"
This reverts commit 7ea103de14.
2022-02-23 10:32:17 +00:00
Nathan James c34d898183
[ASTMatchers] Expand isInline matcher to VarDecl
Add support to the `isInline` matcher for C++17's inline variables.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D118900
2022-02-23 08:34:00 +00:00
Chuanqi Xu f85a6a8127 [NFC] Add unittest for Decl::isInExportDeclContext 2022-02-23 16:29:42 +08:00
Stanislav Gatev 7ea103de14 [clang][dataflow] Add support for global storage values
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D120149
2022-02-23 08:27:58 +00:00
Haojian Wu a2fab82f33 [pseudo] Implement LRTable.
This patch introduces a dense implementation of the LR parsing table, which is
used by LR parsers.

We build a SLR(1) parsing table from the LR(0) graph.

Statistics of the LR parsing table on the C++ spec grammar:
  - number of states: 1449
  - number of actions: 83069
  - size of the table (bytes): 334928

Differential Revision: https://reviews.llvm.org/D118196
2022-02-23 09:21:34 +01:00
Björn Schäpers 923c3755ea [clang-format] Don't break semi after requires clause ...
..regardless of the chosen style.

Fixes https://github.com/llvm/llvm-project/issues/53818

Differential Revision: https://reviews.llvm.org/D120278
2022-02-22 22:08:03 +01:00
Marek Kurdej 071f870e7f [clang-format] Avoid parsing "requires" as a keyword in non-C++-like languages.
Fixes the issue raised post-review in D113319 (cf. https://reviews.llvm.org/D113319#3337485).

Reviewed By: krasimir

Differential Revision: https://reviews.llvm.org/D120324
2022-02-22 16:55:38 +01:00
Krasimir Georgiev c9592ae49b [clang-format] Fix preprocessor nesting after commit 529aa4b011
In 529aa4b011
by setting the identifier info to nullptr, we started to subtly
interfere with the parts in the beginning of the function,
529aa4b011/clang/lib/Format/UnwrappedLineParser.cpp (L991)
causing the preprocessor nesting to change in some cases. E.g., for the
added regression test, clang-format started incorrectly guessing the
language as C++.

This tries to address this by introducing an internal identifier info
element to use instead.

Reviewed By: curdeius, MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D120315
2022-02-22 15:43:18 +01:00
owenca 77e60bc42c [clang-format] Add option to insert braces after control statements
Adds a new option InsertBraces to insert the optional braces after
if, else, for, while, and do in C++.

Differential Revision: https://reviews.llvm.org/D120217
2022-02-21 20:16:25 -08:00
Björn Schäpers be9a7fdd6a [clang-format] Fixed handling of requires clauses followed by attributes
Fixes https://github.com/llvm/llvm-project/issues/53820.

Differential Revision: https://reviews.llvm.org/D119893
2022-02-20 22:33:26 +01:00
Marek Kurdej 4701bcae97 Revert "[clang-format] Avoid inserting space after C++ casts."
This reverts commit e021987273.

This commit provokes failures in formatting tests of polly.
Cf. https://lab.llvm.org/buildbot/#/builders/205/builds/3320.

That's probably because of `)` being annotated as `CastRParen` instead of `Unknown` before, hence being kept on the same line with the next token.
2022-02-20 22:19:51 +01:00
Marek Kurdej e021987273 [clang-format] Avoid inserting space after C++ casts.
Fixes https://github.com/llvm/llvm-project/issues/53876.

This is a solution for standard C++ casts: const_cast, dynamic_cast, reinterpret_cast, static_cast.

A general approach handling all possible casts is not possible without semantic information.
Consider the code:
```
static_cast<T>(*function_pointer_variable)(arguments);
```
vs.
```
some_return_type<T> (*function_pointer_variable)(parameters);
// Later used as:
function_pointer_variable = &some_function;
return function_pointer_variable(args);
```
In the latter case, it's not a cast but a variable declaration of a pointer to function.
Without knowing what `some_return_type<T>` is (and clang-format does not know it), it's hard to distinguish between the two cases. Theoretically, one could check whether "parameters" are types (not a cast) and "arguments" are value/expressions (a cast), but that might be inefficient (needs lots of lookahead).

Reviewed By: MyDeveloperDay, HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D120140
2022-02-20 21:58:37 +01:00
Luis Penagos d9567babef Fix extraneous whitespace addition in line comments on clang-format directives
Fixes https://github.com/llvm/llvm-project/issues/53844.
I believe this regression was caused by not accounting for clang-format directives in https://reviews.llvm.org/D92257.

Reviewed By: HazardyKnusperkeks, curdeius

Differential Revision: https://reviews.llvm.org/D120188
2022-02-20 21:53:50 +01:00
David Goldman 805f7a4fa4 [clang] Add `ObjCProtocolLoc` to represent protocol references
Add `ObjCProtocolLoc` which behaves like `TypeLoc` but for
`ObjCProtocolDecl` references.

RecursiveASTVisitor now synthesizes `ObjCProtocolLoc` during traversal
and the `ObjCProtocolLoc` can be stored in a `DynTypedNode`.

In a follow up patch, I'll update clangd to make use of this
to properly support protocol references for hover + goto definition.

Differential Revision: https://reviews.llvm.org/D119363
2022-02-18 15:24:00 -05:00
Marek Kurdej 331e8e4e27 [clang-format] Do not add space after return-like keywords in macros.
Fixes https://github.com/llvm/llvm-project/issues/33336.

Reviewed By: HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D120028
2022-02-17 22:12:39 +01:00
Stanislav Gatev dd4dde8d39 [clang][dataflow] Add transfer functions for logical and, or, not.
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: xazax.hun

Differential Revision: https://reviews.llvm.org/D119953
2022-02-17 09:09:59 +00:00
Marek Kurdej 0ae2464fcd [clang-format] Fix wrong assertion with non-negative shift when aligning tokens.
Fixes https://github.com/llvm/llvm-project/issues/53880.
2022-02-17 09:49:00 +01:00
Marek Kurdej ef39235cb9 [clang-format] Make checking for a record more robust and avoid a loop. 2022-02-16 23:05:49 +01:00
Marek Kurdej d81f003ce1 [clang-format] Fix formatting of struct-like records followed by variable declaration.
Fixes https://github.com/llvm/llvm-project/issues/24781.
Fixes https://github.com/llvm/llvm-project/issues/38160.

This patch splits `TT_RecordLBrace` for classes/enums/structs/unions (and other records, e.g. interfaces) and uses the brace type to avoid the error-prone scanning for record token.

The mentioned bugs were provoked by the scanning being too limited (and so not considering `const` or `constexpr`, or other qualifiers, on an anonymous struct variable declaration).

Moreover, the proposed solution is more efficient as we parse tokens once only (scanning being parsing too).

Reviewed By: MyDeveloperDay, HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D119785
2022-02-16 22:37:32 +01:00
Marek Kurdej fdee512048 [clang-format] Add test for SpacesInLineCommentPrefix. NFC.
Fixes https://github.com/llvm/llvm-project/issues/52649.
This was already fixed in commit e967d97a35.
2022-02-16 13:54:55 +01:00
Björn Schäpers b786a4aefe [clang-format] Extend SpaceBeforeParens for requires
We can now configure the space between requires and the following paren,
seperate for clauses and expressions.

Differential Revision: https://reviews.llvm.org/D113369
2022-02-15 21:37:36 +01:00
Björn Schäpers bcd1e4612f [clang-format] Further improve support for requires expressions
Detect requires expressions in more unusable contexts. This is far from
perfect, but currently we have no good metric to decide between a
requires expression and a trailing requires clause.

Differential Revision: https://reviews.llvm.org/D119138
2022-02-15 21:37:35 +01:00
Marek Kurdej e21db15be8 [clang-format] Honour PointerAlignment in statements with initializers.
Fixes https://github.com/llvm/llvm-project/issues/53843.

Reviewed By: HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D119814
2022-02-15 18:06:32 +01:00
Eric Li d1e3235f60 [libTooling] Change Tranformer's consumer to take multiple changes
Previously, Transformer would invoke the consumer once per file modified per
match, in addition to any errors encountered. The consumer is not aware of which
AtomicChanges come from any particular match. It is unclear which sets of edits
may be related or whether an error invalidates any previously emitted changes.

Modify the signature of the consumer to accept a set of changes. This keeps
related changes (i.e. all edits from a single match) together, and clarifies
that errors don't produce partial changes.

Reviewed By: ymandel

Differential Revision: https://reviews.llvm.org/D119745
2022-02-15 16:34:36 +00:00
Jan Svoboda d8298f04a9 [clang][lex][minimizer] Avoid treating path separators as comments
The minimizer strips out single-line comments (introduced by `//`). This sequence of characters can also appear in `#include` or `#import` directives where they play the role of path separators. We already avoid stripping this character sequence for `#include` but not for `#import` (which has the same semantics). This patch makes it so `#import <A//A.h>` is not affected by minimization. Previously, we would incorrectly reduce it into `#import <A`.

Reviewed By: arphaman

Differential Revision: https://reviews.llvm.org/D119226
2022-02-15 09:49:19 +01:00
Jan Svoboda fd2dff17c5 [clang][lex][minimizer] Ensure whitespace between squashed lines
The minimizer tries to squash multi-line macro definitions into single line. For that to work, contents of each line need to be separated by a space. Since we always strip leading whitespace on lines of a macro definition, the code currently tries to preserve exactly one space that appeared before the backslash.

This means the following code:

```
#define FOO(BAR) \
  #BAR           \
  baz
```

gets minimized into:

```
#define FOO(BAR) #BAR baz
```

However, if there are no spaces before the backslash on line 2:

```
#define FOO(BAR) \
  #BAR\
  baz
```

no space can be preserved, leading to (most likely) malformed macro definition:

```
#define FOO(BAR) #BARbaz
```

This patch makes sure we always put exactly one space at the end of line ending with a backslash.

Reviewed By: arphaman

Differential Revision: https://reviews.llvm.org/D119231
2022-02-15 09:49:03 +01:00
Alex Lorenz 00cd6c0420 [Preprocessor] Reduce the memory overhead of `#define` directives (Recommit)
Recently we observed high memory pressure caused by clang during some parallel builds.
We discovered that we have several projects that have a large number of #define directives
in their TUs (on the order of millions), which caused huge memory consumption in clang due
to a lot of allocations for MacroInfo. We would like to reduce the memory overhead of
clang for a single #define to reduce the memory overhead for these files, to allow us to
reduce the memory pressure on the system during highly parallel builds. This change achieves
that by removing the SmallVector in MacroInfo and instead storing the tokens in an array
allocated using the bump pointer allocator, after all tokens are lexed.

The added unit test with 1000000 #define directives illustrates the problem. Prior to this
change, on arm64 macOS, clang's PP bump pointer allocator allocated 272007616 bytes, and
used roughly 272 bytes per #define. After this change, clang's PP bump pointer allocator
allocates 120002016 bytes, and uses only roughly 120 bytes per #define.

For an example test file that we have internally with 7.8 million #define directives, this
change produces the following improvement on arm64 macOS: Persistent allocation footprint for
this test case file as it's being compiled to LLVM IR went down 22% from 5.28 GB to 4.07 GB
and the total allocations went down 14% from 8.26 GB to 7.05 GB. Furthermore, this change
reduced the total number of allocations made by the system for this clang invocation from
1454853 to 133663, an order of magnitude improvement.

The recommit fixes the LLDB build failure.

Differential Revision: https://reviews.llvm.org/D117348
2022-02-14 09:27:44 -08:00
Marek Kurdej c72fdad71b [clang-format] Reformat. NFC. 2022-02-14 14:05:05 +01:00
Marek Kurdej e967d97a35 [clang-format] Fix SpacesInLineCommentPrefix deleting tokens.
Fixes https://github.com/llvm/llvm-project/issues/53799.

Reviewed By: HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D119680
2022-02-14 09:53:16 +01:00
Marek Kurdej e01f624adb [clang-format] Fix PointerAlignment within lambdas in a multi-variable declaration statement.
Fixes https://github.com/llvm/llvm-project/issues/43115.

Also, handle while loops with initializers (C++20) the same way as for loops.

Reviewed By: HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D119648
2022-02-14 09:41:24 +01:00
Balázs Kéri 83028ad934 [clang][AST][ASTImporter] Set record to complete during import of its members.
At import of a member it may require that the record is already set to complete.
(For example 'computeDependence' at create of some Expr nodes.)
The record at this time may not be completely imported, the result of layout
calculations can be incorrect, but at least no crash occurs this way.

A good solution would be if fields of every encountered record are imported
before other members of all records. This is much more difficult to implement.

Differential Revision: https://reviews.llvm.org/D116155
2022-02-14 08:27:44 +01:00
Marek Kurdej 09559bc59a Avoid a vulgarism. NFC. 2022-02-13 22:01:06 +01:00
Marek Kurdej 25282bd6c4 [clang-format] Handle PointerAlignment in `if` and `switch` statements with initializers (C++17) the same way as in `for` loops.
Reviewed By: MyDeveloperDay, owenpan

Differential Revision: https://reviews.llvm.org/D119650
2022-02-13 21:36:58 +01:00
Marek Kurdej 9cb9445979 [clang-format] Correctly format loops and `if` statements even if preceded with comments.
Fixes https://github.com/llvm/llvm-project/issues/53758.

Braces in loops and in `if` statements with leading (block) comments were formatted according to `BraceWrapping.AfterFunction` and not `AllowShortBlocksOnASingleLine`/`AllowShortLoopsOnASingleLine`/`AllowShortIfStatementsOnASingleLine`.

Previously, the code:
```
while (true) {
  f();
}
/*comment*/ while (true) {
  f();
}
```

was incorrectly formatted to:
```
while (true) {
  f();
}
/*comment*/ while (true) { f(); }
```

when using config:
```
BasedOnStyle: LLVM
BreakBeforeBraces: Custom
BraceWrapping:
  AfterFunction: false
AllowShortBlocksOnASingleLine: false
AllowShortLoopsOnASingleLine: false
```

and it was (correctly but by chance) formatted to:
```
while (true) {
  f();
}
/*comment*/ while (true) {
  f();
}
```

when using enabling brace wrapping after functions:
```
BasedOnStyle: LLVM
BreakBeforeBraces: Custom
BraceWrapping:
  AfterFunction: true
AllowShortBlocksOnASingleLine: false
AllowShortLoopsOnASingleLine: false
```

Reviewed By: MyDeveloperDay, HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D119649
2022-02-13 21:22:17 +01:00
Alex Lorenz 3f05192c4c Revert "[Preprocessor] Reduce the memory overhead of `#define` directives"
This reverts commit 0d9b91524e.

This change broke LLDB's build. I will need to recommit after fixing LLDB.
2022-02-11 15:53:16 -08:00
Alex Lorenz 0d9b91524e [Preprocessor] Reduce the memory overhead of `#define` directives
Recently we observed high memory pressure caused by clang during some parallel builds.
We discovered that we have several projects that have a large number of #define directives
in their TUs (on the order of millions), which caused huge memory consumption in clang due
to a lot of allocations for MacroInfo. We would like to reduce the memory overhead of
clang for a single #define to reduce the memory overhead for these files, to allow us to
reduce the memory pressure on the system during highly parallel builds. This change achieves
that by removing the SmallVector in MacroInfo and instead storing the tokens in an array
allocated using the bump pointer allocator, after all tokens are lexed.

The added unit test with 1000000 #define directives illustrates the problem. Prior to this
change, on arm64 macOS, clang's PP bump pointer allocator allocated 272007616 bytes, and
used roughly 272 bytes per #define. After this change, clang's PP bump pointer allocator
allocates 120002016 bytes, and uses only roughly 120 bytes per #define.

For an example test file that we have internally with 7.8 million #define directives, this
change produces the following improvement on arm64 macOS: Persistent allocation footprint for
this test case file as it's being compiled to LLVM IR went down 22% from 5.28 GB to 4.07 GB
and the total allocations went down 14% from 8.26 GB to 7.05 GB. Furthermore, this change
reduced the total number of allocations made by the system for this clang invocation from
1454853 to 133663, an order of magnitude improvement.

Differential Revision: https://reviews.llvm.org/D117348
2022-02-11 15:01:10 -08:00
Björn Schäpers 9aab0db13f [clang-format] Improve require and concept handling
- Added an option where to put the requires clauses.
- Renamed IndentRequires to IndentRequiresClause.
- Changed BreakBeforeConceptDeclaration from bool to an enum.

Fixes https://llvm.org/PR32165, and https://llvm.org/PR52401.

Differential Revision: https://reviews.llvm.org/D113319
2022-02-11 22:42:37 +01:00
Marek Kurdej fd16eeea9d [clang-format] Assert default style instead of commenting. NFC. 2022-02-11 12:01:25 +01:00
Marek Kurdej a218706cba [clang-format] Add tests for spacing between ref-qualifier and `noexcept`. NFC.
Cf. https://github.com/llvm/llvm-project/issues/44542.
Cf. ae1b7859cb.
2022-02-11 10:50:05 +01:00
Balazs Benics abc873694f [analyzer] Restrict CallDescription fuzzy builtin matching
`CallDescriptions` for builtin functions relaxes the match rules
somewhat, so that the `CallDescription` will match for calls that have
some prefix or suffix. This was achieved by doing a `StringRef::contains()`.
However, this is somewhat problematic for builtins that are substrings
of each other.

Consider the following:

`CallDescription{ builtin, "memcpy"}` will match for
`__builtin_wmemcpy()` calls, which is unfortunate.

This patch addresses/works around the issue by checking if the
characters around the function's name are not part of the 'name'
semantically. In other words, to accept a match for `"memcpy"` the call
should not have alphanumeric (`[a-zA-Z]`) characters around the 'match'.

So, `CallDescription{ builtin, "memcpy"}` will not match on:

 - `__builtin_wmemcpy: there is a `w` alphanumeric character before the match.
 - `__builtin_memcpyFOoBar_inline`: there is a `F` character after the match.
 - `__builtin_memcpyX_inline`: there is an `X` character after the match.

But it will still match for:
 - `memcpy`: exact match
 - `__builtin_memcpy`: there is an _ before the match
 - `__builtin_memcpy_inline`: there is an _ after the match
 - `memcpy_inline_builtinFooBar`: there is an _ after the match

Reviewed By: NoQ

Differential Revision: https://reviews.llvm.org/D118388
2022-02-11 10:45:18 +01:00
Marek Kurdej 6c7e6fc7b6 [clang-format] Do not remove required spaces when aligning tokens.
Fixes https://github.com/llvm/llvm-project/issues/44292.
Fixes https://github.com/llvm/llvm-project/issues/45874.

Reviewed By: HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D119419
2022-02-10 19:15:27 +01:00
Ivan Murashko 71d7c8d870 [clangd] Crash in __memcmp_avx2_movbe
There is a clangd crash at `__memcmp_avx2_movbe`. Short problem description is below.

The method `HeaderIncludes::addExistingInclude` stores `Include` objects by reference at 2 places: `ExistingIncludes` (primary storage) and `IncludesByPriority` (pointer to the object's location at ExistingIncludes). `ExistingIncludes` is a map where value is a `SmallVector`. A new element is inserted by `push_back`. The operation might do resize. As result pointers stored at `IncludesByPriority` might become invalid.

Typical stack trace
```
    frame #0: 0x00007f11460dcd94 libc.so.6`__memcmp_avx2_movbe + 308
    frame #1: 0x00000000004782b8 clangd`llvm::StringRef::compareMemory(Lhs="
\"t2.h\"", Rhs="", Length=6) at StringRef.h:76:22
    frame #2: 0x0000000000701253 clangd`llvm::StringRef::compare(this=0x0000
7f10de7d8610, RHS=(Data = "", Length = 7166742329480737377)) const at String
Ref.h:206:34
  * frame #3: 0x00000000007603ab clangd`llvm::operator<(llvm::StringRef, llv
m::StringRef)(LHS=(Data = "\"t2.h\"", Length = 6), RHS=(Data = "", Length =
7166742329480737377)) at StringRef.h:907:23
    frame #4: 0x0000000002d0ad9f clangd`clang::tooling::HeaderIncludes::inse
rt(this=0x00007f10de7fb1a0, IncludeName=(Data = "t2.h\"", Length = 4), IsAng
led=false) const at HeaderIncludes.cpp:365:22
    frame #5: 0x00000000012ebfdd clangd`clang::clangd::IncludeInserter::inse
rt(this=0x00007f10de7fb148, VerbatimHeader=(Data = "\"t2.h\"", Length = 6))
const at Headers.cpp:262:70
```

A unit test test for the crash was created (`HeaderIncludesTest.RepeatedIncludes`). The proposed solution is to use std::list instead of llvm::SmallVector

Test Plan
```
./tools/clang/unittests/Tooling/ToolingTests --gtest_filter=HeaderIncludesTest.RepeatedIncludes
```

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D118755
2022-02-10 09:40:44 -08:00
Marek Kurdej a7b5e5b413 [clang-format] Fix formatting of macro definitions with a leading comment.
Fixes https://github.com/llvm/llvm-project/issues/43206.

Reviewed By: HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D118924
2022-02-09 22:39:59 +01:00
Marek Kurdej a77c67f939 [clang-format] Fix formatting of the array form of delete.
Fixes https://github.com/llvm/llvm-project/issues/53576.

There was an inconsistency in formatting of delete expressions.

Before:
```
delete (void*)a;
delete[](void*) a;
```

After this patch:
```
delete (void*)a;
delete[] (void*)a;
```

Reviewed By: HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D119117
2022-02-09 22:36:13 +01:00
Marek Kurdej e329b5866f [clang-format] Honour "// clang-format off" when using QualifierOrder.
Fixes https://github.com/llvm/llvm-project/issues/53643.

Reviewed By: MyDeveloperDay, HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D119218
2022-02-09 22:15:20 +01:00
Haojian Wu f1984b1433 [pseudo] Implement LRGraph
LRGraph is the key component of the clang pseudo parser, it is a
deterministic handle-finding finite-state machine, which is used to
generated the LR parsing table.

Separate from https://reviews.llvm.org/D118196.

Differential Revision: https://reviews.llvm.org/D119172
2022-02-09 11:20:07 +01:00
Kirill Bobyrev 46a6f5ae14 [clangd] NFC: Move stdlib headers handling to Clang
This will allow moving the IncludeCleaner library essentials to Clang
and decoupling them from the majority of clangd.

The patch itself just moves the code, it doesn't change existing
functionality.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D119130
2022-02-09 11:05:39 +01:00
Haojian Wu fe932a88e9 [pseudo] Add first and follow set computation in Grammar.
These will be used when building parsing table for LR parsers.

Separate from https://reviews.llvm.org/D118196.

Differential Revision: https://reviews.llvm.org/D118990
2022-02-09 09:16:27 +01:00
ksyx a70549ae43 [clang-format] Fix DefSeparator empty line issues
- Add or remove empty lines surrounding union blocks.
- Fixes https://github.com/llvm/llvm-project/issues/53229, in which
  keywords like class and struct in a line ending with left brace or
  whose next line is left brace only, will be falsely recognized as
  definition line, causing extra empty lines inserted surrounding blocks
  with no need to be formatted.

Reviewed By: MyDeveloperDay, curdeius, HazardyKnusperkeks, owenpan
Differential Revision: https://reviews.llvm.org/D119067
2022-02-07 14:23:21 +00:00
Kim Gräsman d7ddad408f Reformat CastExpr unittest suite; NFC
In preparation for adding new tests. No functional change.
2022-02-07 09:21:41 -05:00
Kadir Cetinkaya f59787084e
[clang][Lexer] Fix tests after ff77071a4d 2022-02-07 14:06:32 +01:00
Paul Robinson 0d54457f8a [IntrospectionTest] Replace "return" with "GTEST_SKIP"
If a test simply returns, it gets mis-reported as a pass; being
reported as SKIPPED is correct.

Found by the Rotten Green Tests project.
2022-02-04 12:35:44 -08:00
Owen Pan 35f7dd601d [clang-format][NFC] Fix a bug in setting type FunctionLBrace
The l_brace token in a macro definition should not be set to
TT_FunctionLBrace.

This patch could have fixed #42087.

Differential Revision: https://reviews.llvm.org/D118969
2022-02-04 11:36:30 -08:00
David Goldman 9385ece95a [HeaderSearch] Track framework name in LookupFile
Previously, the Framework name was only set if the file
came from a header mapped framework; now we'll always
set the framework name if the file is in a framework.

Differential Revision: https://reviews.llvm.org/D117830
2022-02-04 13:32:39 -05:00
Sam McCall cc8ed7b5aa [Format] Also test rvalue-qualified functions 2022-02-04 12:17:25 +01:00
Sam McCall acc3ce945c [Format] Don't derive pointers right based on space before method ref-qualifiers
The second space in `void foo() &` is always produced by clang-format,
and isn't evidence of any particular style.

Before this patch, it was considered evidence of PAS_Right, because
there is a space before a pointerlike ampersand.

This caused the following code to have "unstable" pointer alignment:
  void a() &;
  void b() &;
  int *x;
PAS_Left, Derive=false would produce 'int* x' with other lines unchanged.
But subsequent formatting with Derive=true would produce 'int *x' again.

Differential Revision: https://reviews.llvm.org/D118921
2022-02-04 12:13:58 +01:00
Haojian Wu b94f09524e [pseudo] NFC, clangSyntaxPsuedo => clangToolingSyntaxPseudo
To be consistent with existing name pattern.
2022-02-04 09:57:20 +01:00
Haojian Wu 2189960e65 [pseudo] Rename Tests.cpp => Test.cpp
To be consistent with other files in clang unittest directory.
2022-02-04 09:48:14 +01:00
ksyx 88e4e6be16 [clang-format] Use wider comment prefix space rule
This commit changes the condition of requiring comment to start with
alphanumeric characters to make no change only for a certain set of
characters, currently horizontal whitespace and punctuation characters,
to support wider set of leading characters unrelated to documentation
generation directives.

Reviewed By: HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D118869
2022-02-03 21:49:10 +00:00
mydeveloperday 23fc20e06c [clang-format] regression from clang-format v13
https://github.com/llvm/llvm-project/issues/53567

The following source

```
namespace A {

template <int N> struct Foo<char[N]> {
  void foo() { std::cout << "Bar"; }
}; // namespace A
```

is incorrectly formatted as:

```
namespace A {

template <int N> struct Foo<char[N]>{void foo(){std::cout << "Bar";
}
}
; // namespace A
```

This looks to be caused by 5c2e7c9ca0

Reviewed By: curdeius

Differential Revision: https://reviews.llvm.org/D118911
2022-02-03 18:37:43 +00:00
Marek Kurdej ca0d97072e [clang-format] Avoid merging macro definitions.
Fixes https://github.com/llvm/llvm-project/issues/42087.

Reviewed By: HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D118879
2022-02-03 18:54:46 +01:00
Marek Kurdej 529aa4b011 [clang-format] Avoid adding space after the name of a function-like macro when the name is a keyword.
Fixes https://github.com/llvm/llvm-project/issues/31086.

Before the code:
```
#define if(x)
```

was erroneously formatted to:
```
#define if (x)
```

Reviewed By: HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D118844
2022-02-03 18:45:51 +01:00
Owen Pan eaef54f213 [clang-format] Revert a feature in RemoveBracesLLVM
Revert the handling of a single-statement block that gets wrapped.

See issue #53543.

Differential Revision: https://reviews.llvm.org/D118873
2022-02-03 02:56:09 -08:00
Haojian Wu 20e05b9f0e [syntax][pseudo] Add Grammar for the clang pseudo-parser
This patch introduces the Grammar class, which is a critial piece for constructing
a tabled-based parser.

As the first patch, the scope is limited to:
  - define base types (symbol, rules) of modeling the grammar
  - construct Grammar by parsing the BNF file (annotations are excluded for now)

Differential Revision: https://reviews.llvm.org/D114790
2022-02-03 11:28:27 +01:00
Marek Kurdej bc40b76b5b [clang-format] Correctly parse C99 digraphs: "<:", ":>", "<%", "%>", "%:", "%:%:".
Fixes https://github.com/llvm/llvm-project/issues/31592.

This commits enables lexing of digraphs in C++11 and onwards.
Enabling them in C++03 is error-prone, as it would unconditionally treat sequences like "<:" as digraphs, even if they are followed by a single colon, e.g. "<::" would be treated as "[:" instead of "<" followed by "::". Lexing in C++11 doesn't have this problem as it looks ahead the following token.
The relevant excerpt from Lexer::LexTokenInternal:
```
        // C++0x [lex.pptoken]p3:
        //  Otherwise, if the next three characters are <:: and the subsequent
        //  character is neither : nor >, the < is treated as a preprocessor
        //  token by itself and not as the first character of the alternative
        //  token <:.
```

Also, note that both clang and gcc turn on digraphs by default (-fdigraphs), so clang-format should match this behaviour.

Reviewed By: MyDeveloperDay, HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D118706
2022-02-02 10:25:24 +01:00
Stanislav Gatev 6b8800dfb5 [clang][dataflow] Enable comparison of distinct values in Environment
Make specializations of `DataflowAnalysis` extendable with domain-specific
logic for comparing distinct values when comparing environments.

This includes a breaking change to the `runDataflowAnalysis` interface
as the return type is now `llvm::Expected<...>`.

This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D118596
2022-02-01 15:25:59 +00:00
Marek Kurdej fd33cca762 [clang-format] Fix AlignConsecutiveAssignments breaking lambda formatting.
Fixes https://github.com/llvm/llvm-project/issues/52772.

This patch fixes the formatting of the code:
```
auto aaaaaaaaaaaaaaaaaaaaa = {};
auto b                     = g([] {
  return;
});
```
which should be left as is, but before this patch was formatted to:
```
auto aaaaaaaaaaaaaaaaaaaaa = {};
auto b                     = g([] {
  return;
                    });
```

Reviewed By: MyDeveloperDay, HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D115972
2022-02-01 09:17:59 +01:00
Marek Kurdej 95bf0a9ebd [clang-format] Don't break block comments when sorting includes.
Fixes https://github.com/llvm/llvm-project/issues/34626.

Before, the include sorter would break the code:
```
#include <stdio.h>
#include <stdint.h> /* long
                       comment */
```
and change it into:
```
#include <stdint.h> /* long
#include <stdio.h>
                       comment */
```

This commit handles only the most basic case of a single block comment on an include line, but does not try to handle all the possible edge cases with multiple comments.

Reviewed By: HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D118627
2022-02-01 08:51:10 +01:00
Kadir Cetinkaya ff77071a4d
[clang][Lexer] Make raw and normal lexer behave the same for line comments
Normally there are heruistics in lexer to treat `//*` specially in
language modes that don't have line comments (to emit `/`). Unfortunately this
only applied to the first occurence of a line comment inside the file, as the
subsequent line comments were treated as if language had support for them.

This unfortunately only holds in normal lexing mode, as in raw mode all
occurences of line comments received this treatment, which created discrepancies
when comparing expanded and spelled tokens.

The proper fix would be to just make sure we treat all the line comments with a
subsequent `*` the same way, but it would imply breaking some code that's
accepted by clang today. So instead we introduce the same bug into raw lexing
mode.

Fixes https://github.com/clangd/clangd/issues/1003.

Differential Revision: https://reviews.llvm.org/D118471
2022-01-31 16:15:16 +01:00
Marek Kurdej 438f0e1f00 [clang-format] Use EXPECT_EQ instead of setting style to a default value. NFC. 2022-01-31 09:06:00 +01:00
Philip Sigillito d1aed486ef [clang-format] Handle C variables with name that matches c++ access specifier
Reviewed By: MyDeveloperDay, curdeius, HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D117416
2022-01-30 20:56:50 +01:00
Stanislav Gatev 56cc697323 [clang][dataflow] Merge distinct pointer values in Environment::join
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D118480
2022-01-29 16:33:15 +00:00
Marek Kurdej 64df51624f [clang-format] Fix misaligned trailing comments in the presence of an empty block comment.
Fixes https://github.com/llvm/llvm-project/issues/53441.

Expected code:
```
/**/   //
int a; //
```

was before misformatted to:
```
/**/     //
int a; //
```

Because the "remaining length" (after the starting `/*`) of an empty block comment `/**/` was computed to be 0 instead of 2.

Reviewed By: MyDeveloperDay, HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D118475
2022-01-28 22:28:48 +01:00
Marek Kurdej f4d5195d2f [clang-format] Move irrelevant code from getRangeLength to getRemainingLength. NFC. 2022-01-28 12:01:02 +01:00
Martin Probst 03c59765b3 clang-format: [JS] sort import aliases. Users can define aliases for long symbols using import aliases:
import X = A.B.C;

Previously, these were unhandled and would terminate import sorting.
With this change, aliases sort as their own group, coming last after all
other imports.

Aliases are not sorted within their group, as they may reference each
other, so order is significant.

This reverts commit f750c3d95a. It fixes
the msan issue by not parsing past the end of the line when handling
import aliases.

Differential Revision: https://reviews.llvm.org/D118446
2022-01-28 11:51:28 +01:00
Vitaly Buka f750c3d95a Revert "clang-format: [JS] sort import aliases."
Triggers MSAN report.

This reverts commit c6d5efb5d9.
2022-01-27 21:16:53 -08:00
Marek Kurdej 36622c4e1a [clang-format] Fix AllowShortFunctionsOnASingleLine: InlineOnly with wrapping after record.
Fixes https://github.com/llvm/llvm-project/issues/53430.

Initially, I had a quick and dirty approach, but it led to a myriad of special cases handling comments (that may add unwrapped lines).
So I added TT_RecordLBrace type annotations and it seems like a much nicer solution.
I think that in the future it will allow us to clean up some convoluted code that detects records.

Reviewed By: MyDeveloperDay, HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D118337
2022-01-27 18:06:31 +01:00
Martin Probst c6d5efb5d9 clang-format: [JS] sort import aliases.
Users can define aliases for long symbols using import aliases:

    import X = A.B.C;

Previously, these were unhandled and would terminate import sorting.
With this change, aliases sort as their own group, coming last after all
other imports.

Aliases are not sorted within their group, as they may reference each
other, so order is significant.

Revision URI: https://reviews.llvm.org/D118361
2022-01-27 16:16:37 +01:00
Jim Lin ad39b5bc59 [NFC] Remove duplicate include 2022-01-27 13:56:13 +08:00
Yitzhak Mandelbaum 3595189217 [clang][dataflow] Allow clients to disable built-in transfer functions.
These built-in functions build the (sophisticated) model of the code's
memory. This model isn't used by all analyses, so we provide for disabling it to
avoid incurring the costs associated with its construction.

Differential Revision: https://reviews.llvm.org/D118178
2022-01-26 17:24:59 +00:00
Benjamin Kramer f15014ff54 Revert "Rename llvm::array_lengthof into llvm::size to match std::size from C++17"
This reverts commit ef82063207.

- It conflicts with the existing llvm::size in STLExtras, which will now
  never be called.
- Calling it without llvm:: breaks C++17 compat
2022-01-26 16:55:53 +01:00
Stanislav Gatev 75c22b382f [clang][dataflow] Add a transfer function for CXXBoolLiteralExpr
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: xazax.hun

Differential Revision: https://reviews.llvm.org/D118236
2022-01-26 15:33:00 +00:00
serge-sans-paille ef82063207 Rename llvm::array_lengthof into llvm::size to match std::size from C++17
As a conquence move llvm::array_lengthof from STLExtras.h to
STLForwardCompat.h (which is included by STLExtras.h so no build
breakage expected).
2022-01-26 16:17:45 +01:00
Marek Kurdej 93948c5299 [clang-format] Correctly format lambdas with variadic template parameters.
Fixes https://github.com/llvm/llvm-project/issues/53405.

Reviewed By: MyDeveloperDay, owenpan

Differential Revision: https://reviews.llvm.org/D118220
2022-01-26 16:10:52 +01:00
Kadir Cetinkaya b777d354f6
[clang][DeclPrinter] Fix printing for noexcept expressions
We are already building into the final result, no need to append it
again.

Fixes https://github.com/clangd/vscode-clangd/issues/290.

Differential Revision: https://reviews.llvm.org/D118245
2022-01-26 16:04:24 +01:00
Sam McCall 33c3ef2fbe [CodeCompletion][clangd] Clean __uglified parameter names in completion & hover
Underscore-uglified identifiers are used in standard library implementations to
guard against collisions with macros, and they hurt readability considerably.
(Consider `push_back(Tp_ &&__value)` vs `push_back(Tp value)`.
When we're describing an interface, the exact names of parameters are not
critical so we can drop these prefixes.

This patch adds a new PrintingPolicy flag that can applies this stripping
when recursively printing pieces of AST.
We set it in code completion/signature help, and in clangd's hover display.
All three features also do a bit of manual poking at names, so fix up those too.

Fixes https://github.com/clangd/clangd/issues/736

Differential Revision: https://reviews.llvm.org/D116387
2022-01-26 15:51:17 +01:00
Stanislav Gatev d3597ec0aa [clang][dataflow] Enable merging distinct values in Environment::join
Make specializations of `DataflowAnalysis` extendable with domain-specific
logic for merging distinct values when joining environments. This could be
a strict lattice join or a more general widening operation.

This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: xazax.hun

Differential Revision: https://reviews.llvm.org/D118038
2022-01-26 11:40:51 +00:00
Stanislav Gatev 188d28f73c [clang][dataflow] Assign aggregate storage locations to union stmts
This patch ensures that the dataflow analysis framework does not crash
when it encounters access to members of union types.

This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: xazax.hun

Differential Revision: https://reviews.llvm.org/D118226
2022-01-26 10:36:49 +00:00
Marek Kurdej 72e29caf03 [clang-format] Fix regression in parsing pointers to arrays.
Fixes https://github.com/llvm/llvm-project/issues/53293.

After commit 5c2e7c9, the code:
```
template <> struct S : Template<int (*)[]> {};
```
was misformatted as:
```
template <> struct S : Template<int (*)[]>{};
```

Reviewed By: MyDeveloperDay, HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D118106
2022-01-26 09:27:38 +01:00
Yitzhak Mandelbaum 0944c196c5 [libTooling] Adds more support for constructing object access expressions.
This patch adds a `buildAccess` function, which constructs a string with the
proper operator to use based on the expression's form and type. It also adds two
predicates related to smart pointers, which are needed by `buildAccess` but are
also of general value.

We deprecate `buildDot` and `buildArrow` in favor of the more general
`buildAccess`. These will be removed in a future patch.

Differential Revision: https://reviews.llvm.org/D116377
2022-01-25 19:43:36 +00:00