Summary:
During CTU analysis of complex projects, the loaded AST contents of
imported TUs can grow bigger than the available system memory. This option
introduces a threshold on the number of TUs to be imported for a single
TU in order to prevent such cases.
Differential Revision: https://reviews.llvm.org/D59798
llvm-svn: 365314
This patch is a major part of my GSoC project, aimed to improve the bug
reports of the analyzer.
TL;DR: Help the analyzer understand that some conditions are important
and should be explained better. If a CFGBlock is a control dependency
of a block where an expression's value is tracked, explain the condition
expression better by tracking it as well.
if (A) // let's explain why we believe A to be true
  10 / x; // division by zero
This is an experimental feature, and can be enabled by the
off-by-default analyzer configuration "track-conditions".
In detail:
This idea was inspired by the program slicing algorithm. Essentially,
two things are used to produce a program slice (a subset of the program
relevant to a (statement, variable) pair): data and control
dependencies. The bug path (the linear path in the ExplodedGraph that leads
from the beginning of the analysis to the error node) enables the
analyzer to reason about data dependencies with relative ease.
Control dependencies are a different slice of the cake entirely.
Just because we reached a branch during symbolic execution, it
doesn't mean that that particular branch has any effect on whether the
bug would've occurred. This means that we can't simply rely on the bug
path to gather control dependencies.
In previous patches, LLVM's IDFCalculator, which works on a control flow
graph rather than the ExplodedGraph, was generalized to solve this issue.
We use this information to heuristically guess that the value of a tracked
expression depends greatly on its control dependencies, and start
tracking them as well.
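An illustrative sketch of what this means in practice (made-up example, not
from the patch):
```
// Illustrative example, not from the patch. The division is control
// dependent on `coin`: tracking only the data dependencies of `y` would
// never explain why the true branch was taken, so the condition itself
// is worth tracking and explaining in the report.
int foo(bool coin, int x) {
  int y = x;       // data dependency of the tracked value
  if (coin)        // control dependency of the buggy block
    return 10 / y; // warning: division by zero when x == 0
  return 0;
}
```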
After plenty of evaluations this was seen as a great idea, but one still
lacking refinements (we should have different descriptions of a
condition's value), hence it's off by default.
Differential Revision: https://reviews.llvm.org/D62883
llvm-svn: 365207
Summary:
This new piece is similar to our macro expansion printing in HTML reports:
It pops up on variables on a mouse-hover event. Similar to note pieces, it
supports `plist` diagnostics as well.
It is optional and on by default: `add-pop-up-notes=true`.
Extra: In HTML reports `background-color: LemonChiffon` was too light, so it
was changed to `PaleGoldenRod`.
Reviewers: NoQ, alexfh
Reviewed By: NoQ
Subscribers: cfe-commits, gerazo, gsd, george.karpenkov, alexfh, xazax.hun,
baloghadamsoftware, szepet, a.sidorin, mikhail.ramalho,
Szelethus, donat.nagy, dkrupp
Tags: #clang
Differential Revision: https://reviews.llvm.org/D60670
llvm-svn: 362014
The more entries we have in AnalyzerOptions::ConfigTable, the more helpful
debug.ConfigDumper is. With this patch, I'm pretty confident that it'll now emit
the entire state of the analyzer, minus the frontend flags.
It would be nice to reserve the config table specifically for checker options
only, as storing the regular analyzer configs is somewhat redundant.
Differential Revision: https://reviews.llvm.org/D57922
llvm-svn: 361006
Summary: This is just changing naming and documentation to be general about external definitions that can be imported for cross translation unit analysis. There is at least a plan to add VarDecls: D46421
Reviewers: NoQ, xazax.hun, martong, a.sidorin, george.karpenkov, serge-sans-paille
Reviewed By: xazax.hun, martong
Subscribers: mgorny, whisperity, baloghadamsoftware, szepet, rnkovacs, mikhail.ramalho, Szelethus, donat.nagy, dkrupp, cfe-commits
Differential Revision: https://reviews.llvm.org/D56441
llvm-svn: 350852
Summary:
With a new switch the analyzer can print to stderr whenever a new TU is being
loaded during CTU analysis. This is very important for higher level scripts
(like CodeChecker): they can parse this output and create e.g. a zip file
containing all the related TU files in case of a Clang crash.
Reviewers: xazax.hun, Szelethus, a_sidorin, george.karpenkov
Subscribers: whisperity, baloghadamsoftware, szepet, rnkovacs, a.sidorin, mikhail.ramalho, donat.nagy, dkrupp,
Differential Revision: https://reviews.llvm.org/D55135
llvm-svn: 348594
In earlier patches regarding AnalyzerOptions, a lot of effort went into
gathering all config options, and changing the interface so that potential
misuse can be eliminated.
Up until this point, AnalyzerOptions only evaluated an option when it was
queried. For example, if we had a "-no-false-positives" flag, AnalyzerOptions
would store an Optional field for it that would remain None until the flag's
getter function was called somewhere in the code.
However, now that we're confident that we've gathered all configs, we can
evaluate all of them before analysis, so we can emit an error on invalid input
even if that particular flag will not matter in that particular run of the
analyzer. Another very big benefit of this is that debug.ConfigDumper will now
show the value of all configs every single time.
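A minimal hypothetical sketch of the difference (not the actual
AnalyzerOptions code; the "-no-false-positives" flag is the made-up example
from above):
```
#include <map>
#include <optional>
#include <string>

// Old behaviour: the option is parsed lazily, only when queried, so an
// invalid value may never be noticed and ConfigDumper cannot print it.
struct LazyOptions {
  std::map<std::string, std::string> ConfigTable;
  std::optional<bool> NoFalsePositives;

  bool getNoFalsePositives() {
    if (!NoFalsePositives)
      NoFalsePositives = ConfigTable["no-false-positives"] == "true";
    return *NoFalsePositives;
  }
};

// New behaviour: every option is parsed up front, so invalid input can be
// diagnosed before analysis starts and all values are always available.
struct EagerOptions {
  std::map<std::string, std::string> ConfigTable;
  bool NoFalsePositives;

  explicit EagerOptions(std::map<std::string, std::string> Table)
      : ConfigTable(std::move(Table)),
        NoFalsePositives(ConfigTable["no-false-positives"] == "true") {}
};
```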
Also, almost all option-related classes now have a similar interface, so
uniformity is a benefit as well.
The implementation for errors on invalid input will be committed shortly.
Differential Revision: https://reviews.llvm.org/D53692
llvm-svn: 348031
I'm in the process of refactoring AnalyzerOptions. The main motivation behind
this is to emit warnings if an invalid -analyzer-config option is given on
the command line, and to be able to list them all.
In this patch, I found some flags that should've been used as checker options,
that have no mention at all in AnalyzerOptions, or that are nonexistent.
- NonLocalizedStringChecker now uses its "AggressiveReport" flag as a checker
option
- lib/StaticAnalyzer/Frontend/ModelInjector.cpp now accesses the "model-path"
option through a getter in AnalyzerOptions
- -analyzer-config path-diagnostics-alternate=false is not a thing, so I removed it,
- lib/StaticAnalyzer/Checkers/AllocationDiagnostics.cpp and
lib/StaticAnalyzer/Checkers/AllocationDiagnostics.h are weird, they actually
only contain an option getter. I deleted them, and fixed RetainCountChecker
to get its "leak-diagnostics-reference-allocation" option as a checker option,
- "region-store-small-struct-limit" has a proper getter now.
Differential Revision: https://reviews.llvm.org/D53276
llvm-svn: 345985
Before C++17 copy elision was optional, even if the elidable copy/move
constructor had arbitrary side effects. The elidable constructor is present
in the AST, but marked as elidable.
In these cases the CFG now contains additional information that allows its
clients to figure out if a temporary object is only being constructed in order
to pass it to an elidable constructor. If so, it includes a reference to the
elidable constructor's construction context, so that the client can elide the
elidable constructor and construct the object directly at its final destination.
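A sketch of the kind of code this is about (illustrative, not from the patch):
```
// Before C++17 the copy below is only optionally elided, so the elidable
// copy constructor is present in the AST. The CFG can now tell its clients
// that the temporary exists only to feed this elidable constructor, so the
// object may be constructed directly in the return value.
struct Widget {
  Widget();
  Widget(const Widget &); // elidable copy constructor with side effects
};

Widget make() {
  return Widget(); // temporary materialized, then (elidably) copied
}
```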
Differential Revision: https://reviews.llvm.org/D47616
llvm-svn: 335795
This patch adds two new CFG elements CFGScopeBegin and CFGScopeEnd that indicate
when a local scope begins and ends respectively. We use the first VarDecl declared
in a scope to uniquely identify it and add CFGScopeBegin and CFGScopeEnd elements
into corresponding basic blocks.
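For example (illustrative, not from the patch):
```
// The compound statement below forms a local scope; its first VarDecl (`x`)
// uniquely identifies it, so a CFGScopeBegin element is added where the
// scope starts and a CFGScopeEnd element where it is left.
void f(bool cond) {
  if (cond) {
    int x = 0; // first VarDecl of the scope: identifies it
    int y = x;
    (void)y;
  } // CFGScopeEnd for the scope identified by `x`
}
```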
Differential Revision: https://reviews.llvm.org/D16403
llvm-svn: 327258
Don't enable c++-temp-dtor-inlining by default yet, due to this reference
counting pointer problem.
Otherwise the new mode seems stable and allows us to incrementally fix C++
problems in much less hacky ways.
Differential Revision: https://reviews.llvm.org/D43804
llvm-svn: 326461
It was introduced when two -analyzer-config options were added almost
simultaneously in r324793 and r324668 and the option count was not
rebased correctly in the tests.
Fixes the buildbots.
llvm-svn: 324801
This patch adds a new CFGStmt sub-class, CFGConstructor, which replaces
the regular CFGStmt with CXXConstructExpr in it whenever the CFG has additional
information to provide regarding what sort of object is being constructed.
It is useful for figuring out what memory is initialized in clients of the
CFG such as the Static Analyzer, which do not operate by recursive AST
traversal, but instead rely on the CFG to provide all the information when they
need it. Otherwise, the statement that triggers the construction and defines
what memory is being initialized would normally occur after the
construct-expression, and the client would need to peek at the next CFG element
or use a statement parent map to understand the necessary facts about
the construct-expression.
As a proof of concept, CFGConstructors are added for new-expressions
and the respective test cases are provided to demonstrate how it works.
For now, the only additional data contained in the CFGConstructor element is
the "trigger statement", such as new-expression, which is the parent of the
constructor. It will be significantly expanded in later commits. The additional
data is organized as an auxiliary structure - the "construction context",
which is allocated separately from the CFGElement.
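A small illustrative example (not from the patch) of the new-expression case:
```
// For the new-expression below, the CXXConstructExpr appears in the CFG as a
// CFGConstructor whose construction context stores the new-expression as the
// trigger statement, so a client such as the analyzer knows the object is
// constructed in the heap memory returned by `new` without peeking at later
// CFG elements.
struct S {
  S();
};

void g() {
  S *p = new S(); // trigger statement: the new-expression
  delete p;
}
```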
Differential Revision: https://reviews.llvm.org/D42672
llvm-svn: 324668
This patch introduces a new CFG element, CFGLoopExit, that indicates when a loop
ends. It does not deal with returnStmts yet (left as a TODO).
It is hidden behind a new analyzer-config flag called cfg-loopexit (false by
default).
Test cases added.
The main purpose of this patch right now is to make loop unrolling and loop
widening easier and more efficient. However, this information can be useful for
future improvements in the StaticAnalyzer core too.
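For example (illustrative, not from the patch):
```
// With the cfg-loopexit flag enabled, the CFG gains a CFGLoopExit element
// for the for-statement at the point where the loop has been left.
void h(int n) {
  for (int i = 0; i < n; ++i) {
    // loop body
  }
  // <- a CFGLoopExit element for the for-statement appears about here
}
```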
Differential Revision: https://reviews.llvm.org/D35668
llvm-svn: 311235
This feature allows the analyzer to completely unroll certain loops.
New requirements/rules (for unrolling) can be added easily via ASTMatchers.
Right now it is hidden behind a flag; the aim is to find the correct heuristic
and create a solution which results in higher coverage and more precise
analysis, so that it can be enabled by default.
Right now the blocks which belong to an unrolled loop are marked by the
LoopVisitor, which adds them to the ProgramState.
Then, whenever we encounter a marked CFGBlock in processCFGBlockEntrance,
we skip investigating it. That means it won't be considered to have been
visited more than the maximal visiting bound, since the check is skipped.
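An illustrative example of a loop this could apply to (not from the patch):
```
// A loop with a known constant bound that the unrolling heuristic could
// decide to unroll completely: its blocks are marked in the ProgramState,
// the visit-count check is skipped for them, and every iteration gets
// analyzed.
void init(int *buf) {
  for (int i = 0; i < 8; ++i) // constant bound, simple increment
    buf[i] = 0;
  // analysis continues here having executed all eight iterations
}
```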
llvm-svn: 309006
New requirements/rules (for unrolling) can be added easily via ASTMatchers.
The current implementation is hidden behind a flag.
Right now the blocks which belong to an unrolled loop are marked by the
LoopVisitor, which adds them to the ProgramState. Then, whenever we encounter a
marked CFGBlock in processCFGBlockEntrance, we skip investigating it. That means
it won't be considered to have been visited more than the maximal visiting
bound, since the check is skipped.
Differential Revision: https://reviews.llvm.org/D34260
llvm-svn: 308558
Summary:
This mimics the implementation for the implicit destructors. The
generation of these scope-leaving elements is hidden behind
a flag to the CFGBuilder, so it should not affect existing code.
Currently, I'm missing a test (it's implicitly tested by the clang-tidy
lifetime checker that I'm proposing).
I thought about a test using debug.DumpCFG, but then I would
have to add an option to StaticAnalyzer/Core/AnalyzerOptions
to enable the scope-leaving CFGElement,
which would only be useful to that particular test.
Any other ideas how I could make a test for this feature?
Reviewers: krememek, jordan_rose
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D15031
llvm-svn: 307759
This makes the analyzer around 10% slower by default,
allowing it to find deeper bugs.
Default values for the following -analyzer-config options change:
max-nodes: 150000 -> 225000;
max-inlinable-size: 50 -> 100.
rdar://problem/32539666
Differential Revision: https://reviews.llvm.org/D34277
llvm-svn: 305900
Summary:
Dear All,
We have been looking at the following problem, where any code after the constant bound loop is not analyzed because of the limit on how many times the same block is visited, as described in bugzillas #7638 and #23438. This problem is of interest to us because we have identified significant bugs that the checkers are not locating. We have been discussing a solution involving ranges as a longer term project, but I would like to propose a patch to improve the current implementation.
Example issue:
```
for (int i = 0; i < 1000; ++i) {...something...}
int *p = 0;
*p = 0xDEADBEEF;
```
The proposal is to go through the first and last iterations of the loop. The patch creates an exploded node for the approximate last iteration of constant bound loops, before the max loop limit / block visit limit is reached. It does this by identifying the variable in the loop condition and finding the value which is “one away” from the loop being false. For example, if the condition is (x < 10), then an exploded node is created where the value of x is 9. Evaluating the loop body with x = 9 will then result in the analysis continuing after the loop, providing x is incremented.
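A small sketch of the idea on the example above (illustrative only):
```
// Before the block visit limit is hit, an extra exploded node is created
// where i == 999, i.e. "one away" from (i < 1000) becoming false. Executing
// the body once more makes the condition false, so the analysis proceeds
// past the loop and reaches the null dereference.
void k() {
  for (int i = 0; i < 1000; ++i) {
    /* ...something... */
  }
  int *p = 0;
  *p = 0xDEADBEEF; // now reachable after the loop
}
```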
The patch passes all the tests, with some modifications to coverage.c, in order to make the ‘function_which_gives_up’ continue to give up, since the changes allowed the analysis to progress past the loop.
This patch does introduce possible false positives, as a result of not knowing the state of variables which might be modified in the loop. I believe that, as a user, I would rather have false positives after loops than do no analysis at all. I understand this may not be the common opinion and am interested in hearing your views. There are also issues regarding break statements, which are not considered. A more advanced implementation of this approach might be able to consider other conditions in the loop, which would allow paths leading to breaks to be analyzed.
Lastly, I have performed a study on large code bases and I think there is little benefit in having “max-loop” default to 4 with the patch. For variable bound loops this tends to result in duplicated analysis after the loop, and it makes little difference to any constant bound loop which will do more than a few iterations. It might be beneficial to lower the default to 2, especially for the shallow analysis setting.
Please let me know your opinions on this approach to processing constant bound loops and the patch itself.
Regards,
Sean Eveson
SN Systems - Sony Computer Entertainment Group
Reviewers: jordan_rose, krememek, xazax.hun, zaks.anna, dcoughlin
Subscribers: krememek, xazax.hun, cfe-commits
Differential Revision: http://reviews.llvm.org/D12358
llvm-svn: 251621
Add an option (-analyzer-config min-blocks-for-inline-large=14) to control the function
size the inliner considers as large, in relation to "max-times-inline-large". The option
defaults to the original hard-coded behaviour, which I believe should be adjustable with
the other inlining settings.
The analyzer-config test has been modified so that the analyzer will reach the
getMinBlocksForInlineLarge() method and store the result in the ConfigTable, to ensure it
is dumped by the debug checker.
A patch by Sean Eveson!
Differential Revision: http://reviews.llvm.org/D12406
llvm-svn: 247463
The analyzer doesn't currently expect CFG blocks with terminators to be
empty, but this can happen when generating conditional destructors for
a complex logical expression, such as (a && (b || Temp{})). Moreover,
the branch conditions for these expressions are not persisted in the
state. Even for handling noreturn destructors this needs more work.
This reverts r186498.
llvm-svn: 186925
Summary:
This patch enables ExprEngine to reason about temporary object destructors.
However, these destructor calls are never inlined, since this feature is still
broken. Still, this is sufficient to properly handle noreturn temporary
destructors and close bug #15599. I have also enabled the cfg-temporary-dtors
analyzer option by default.
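An illustrative example of the noreturn case (not from the patch):
```
// The temporary's destructor is noreturn; modeling temporary destructors
// lets the analyzer see that the code after the guarding full-expression
// is unreachable when x == 0, so no division by zero is reported below.
struct Fatal {
  ~Fatal() __attribute__((noreturn));
};

int divide(int x) {
  if (x == 0)
    Fatal(); // the temporary's noreturn destructor runs here
  return 10 / x;
}
```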
Reviewers: jordan_rose
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1131
llvm-svn: 186498
The analyzer uses LazyCompoundVals to represent rvalues of aggregate types,
most importantly structs and arrays. This allows us to efficiently copy
around an entire struct, rather than doing a memberwise load every time a
struct rvalue is encountered. This can also keep memory usage down by
allowing several structs to "share" the same snapshotted bindings.
However, /lookup/ through LazyCompoundVals can be expensive, especially
since they can end up chaining back to the original value. While we try
to reuse LazyCompoundVals whenever it's safe, and cache information about
this transitivity, the fact is it's sometimes just not a good idea to
perpetuate LazyCompoundVals -- the tradeoffs just aren't worth it.
This commit changes RegionStore so that binding a LazyCompoundVal to a struct
will do a memberwise copy if the struct is simple enough. Today's definition
of "simple enough" is "up to N scalar members" (see below), but that could
easily be changed in the future. This is enough to bring the test case in
PR15697 back down to a manageable analysis time (within 20% of its original
time, in an unfair test where the new analyzer is not compiled with LTO).
The actual value of "N" is controlled by a new -analyzer-config option,
'region-store-small-struct-limit'. It defaults to "2", meaning structs with
zero, one, or two scalar members will be considered "simple enough" for
this code path.
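For example (illustrative, not from the patch):
```
// With the default region-store-small-struct-limit=2, the struct below has
// only two scalar members, so binding `q` is done with a memberwise copy
// instead of a LazyCompoundVal, avoiding long chains of lazy bindings.
struct Point { int x, y; };

void copyPoint(struct Point p) {
  struct Point q = p; // memberwise copy of x and y
  (void)q;
}
```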
It's worth noting that a more straightforward implementation would do this
on load, not on bind, and make use of the structure we already have for this:
CompoundVal. A long time ago, this was actually how RegionStore modeled
aggregate-to-aggregate copies, but today it's only used for compound literals.
Unfortunately, it seems that we've special-cased LazyCompoundVal in certain
places (such as liveness checks) but failed to similarly special-case
CompoundVal in all of them. Until we're confident that CompoundVal is
handled properly everywhere, this solution is safer, since the entire
optimization is just an implementation detail of RegionStore.
<rdar://problem/13599304>
llvm-svn: 179767
This is an opt-in tweak for leak diagnostics to reference the allocation
site if the diagnostic consumer only wants a pithy amount of information,
and not the entire path.
This is a strawman enhancement that I expect to see some experimentation
with over the next week, and can go away if we don't want it.
Currently it is only used by RetainCountChecker, but could be used
by MallocChecker if and when we decide this should stay in.
llvm-svn: 179634
Redefine the shallow mode to inline all functions for which we have a
definite definition (ipa=inlining). However, only inline functions that
are up to 4 basic blocks large and cut the max exploded nodes generated
per top level function in half.
This makes shallow faster and allows us to keep inlining small
functions. For example, we would keep inlining wrapper functions and
constructors/destructors.
With the new shallow, it takes 104s to analyze sqlite3, whereas
the deep mode is 658s and previous shallow is 209s.
llvm-svn: 173958
The idea is to introduce a higher level "user mode" option for
different use scenarios. For example, if one wants to run the analyzer
for a small project each time the code is built, they would use
the "shallow" mode.
The user mode option will influence the default settings for the
lower-level analyzer options. For now, this just influences the ipa
modes, but we plan to find more optimal settings for them.
llvm-svn: 173386
The idea is to eventually place all analyzer options under
"analyzer-config". In addition, this lays the ground for introduction of
a high-level analyzer mode option, which will influence the
default setting for IPAMode.
llvm-svn: 173385
performance heuristic
After inlining a function with more than 13 basic blocks 32 times, we
are not going to inline it anymore. The idea is that inlining large
functions leads to drastic performance penalties. Since the function
has already been inlined, we know that we've analyzed it in many
contexts.
The following metrics are used (see the sketch after this list):
- A large function is a function with more than 13 basic blocks (we
should switch to another metric, like cyclomatic complexity).
- We consider that we've inlined a function many times if it's been
inlined 32 times. This number is configurable with -analyzer-config
max-times-inline-large=xx
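A hypothetical sketch of the heuristic (names made up, not the actual
analyzer code):
```
// A "large" function (more than 13 basic blocks) is not inlined again
// once it has already been inlined max-times-inline-large (32) times.
bool shouldInlineLargeFunction(unsigned NumBasicBlocks,
                               unsigned TimesInlined,
                               unsigned MaxTimesInlineLarge) {
  const unsigned LargeFunctionBlocks = 13;
  if (NumBasicBlocks > LargeFunctionBlocks &&
      TimesInlined >= MaxTimesInlineLarge)
    return false; // already analyzed in many contexts; too expensive
  return true;
}
```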
This heuristic addresses a performance regression introduced with
inlining on one benchmark. The analyzer on this benchmark became 60
times slower with inlining turned on. The heuristic allows us to analyze
it in 24% of the time. The performance improvements on the other
benchmarks I've tested with are much lower - under 10%, which is
expected.
llvm-svn: 170361