Add a new callback that notifies checkers when a const pointer escapes. Currently, this only works
for const pointers passed as a top level parameter into a function. We need to differentiate the const
pointers escape from regular escape since the content pointed by const pointer will not change;
if it’s a file handle, a file cannot be closed; but delete is allowed on const pointers.
This should suppress several false positives reported by the NewDelete checker on llvm codebase.
llvm-svn: 178310
We should only suppress a bug report if the IDCed or null returned nil value is directly related to the value we are warning about. This was
not the case for nil receivers - we would suppress a bug report that had an IDCed nil receiver on the path regardless of how it’s
related to the warning.
1) Thread EnableNullFPSuppression parameter through the visitors to differentiate between tracking the value which
is directly responsible for the bug and other values that visitors are tracking (ex: general tracking of nil receivers).
2) in trackNullOrUndef specifically address the case when a value of the message send is nil due to the receiver being nil.
llvm-svn: 178309
+ Improved display names for allocators and deallocators
The checker checks if a deallocation function matches allocation one. ('free' for 'malloc', 'delete' for 'new' etc.)
llvm-svn: 178250
These types will not have a CXXConstructExpr to do the initialization for
them. Previously we just used a simple call to ProgramState::bindLoc, but
that doesn't trigger proper checker callbacks (like pointer escape).
Found by Anton Yartsev.
llvm-svn: 178160
The visitor should look for the PreStmt node as the receiver is nil in the PreStmt and this is the node. Also, tag the nil
receiver nodes with a special tag for consistency.
llvm-svn: 178152
Register the nil tracking visitors with the region and refactor trackNullOrUndefValue a bit.
Also adds the cast and paren stripping before checking if the value is an OpaqueValueExpr
or ExprWithCleanups.
llvm-svn: 178093
This addresses an undefined value false positive from concreteOffsetBindingIsInvalidatedBySymbolicOffsetAssignment.
Fixes PR14877; radar://12991168.
llvm-svn: 177905
These aren't generated by default, but they are needed when either side of
the comparison is tainted.
Should fix our internal buildbot.
llvm-svn: 177846
In C, comparisons between signed and unsigned numbers are always done in
unsigned-space. Thus, we should know that "i >= 0U" is always true, even
if 'i' is signed. Similarly, "u >= 0" is also always true, even though '0'
is signed.
Part of <rdar://problem/13239003> (false positives related to std::vector)
llvm-svn: 177806
For two concrete locations, we were producing another concrete location and
then casting it to an integer. We should just create a nonloc::ConcreteInt
to begin with.
No functionality change.
llvm-svn: 177805
We can support the full range of comparison operations between two locations
by canonicalizing them as subtraction, as in the previous commit.
This won't work (well) if either location includes an offset, or (again)
if the comparisons are not consistent about which region comes first.
<rdar://problem/13239003>
llvm-svn: 177803
Canonicalizing these two forms allows us to better model containers like
std::vector, which use "m_start != m_finish" to implement empty() but
"m_finish - m_start" to implement size(). The analyzer should have a
consistent interpretation of these two symbolic expressions, even though
it's not properly reasoning about either one yet.
The other unfortunate thing is that while the size() expression will only
ever be written "m_finish - m_start", the comparison may be written
"m_finish == m_start" or "m_start == m_finish". Right now the analyzer does
not attempt to canonicalize those two expressions, since it doesn't know
which length expression to pick. Doing this correctly will probably require
implementing unary minus as a new SymExpr kind (<rdar://problem/12351075>).
For now, the analyzer inverts the order of arguments in the comparison to
build the subtraction, on the assumption that "begin() != end()" is
written more often than "end() != begin()". This is purely speculation.
<rdar://problem/13239003>
llvm-svn: 177801
We just treat this as opaque symbols, but even that allows us to handle
simple cases where the same condition is tested twice. This is very common
in the STL, which means that any project using the STL gets spurious errors.
Part of <rdar://problem/13239003>.
llvm-svn: 177800
The algorithm used here was ridiculously slow when a potential back-edge
pointed to a node that already had a lot of successors. The previous commit
makes this feature unnecessary anyway.
This reverts r177468 / f4cf6b10f863b9bc716a09b2b2a8c497dcc6aa9b.
Conflicts:
lib/StaticAnalyzer/Core/BugReporter.cpp
llvm-svn: 177765
For a given bug equivalence class, we'd like to emit the report with the
shortest path. So far to do this we've been trimming the ExplodedGraph to
only contain relevant nodes, then doing a reverse BFS (starting at all the
error nodes) to find the shortest paths from the root. However, this is
fairly expensive when we are suppressing many bug reports in the same
equivalence class.
r177468-9 tried to solve this problem by breaking cycles during graph
trimming, then updating the BFS priorities after each suppressed report
instead of recomputing the whole thing. However, breaking cycles is not
a cheap operation because an analysis graph minus cycles is still a DAG,
not a tree.
This fix changes the algorithm to do a single forward BFS (starting from the
root) and to use that to choose the report with the shortest path by looking
at the error nodes with the lowest BFS priorities. This was Anna's idea, and
has the added advantage of requiring no update step: we can just pick the
error node with the next lowest priority to produce the next bug report.
<rdar://problem/13474689>
llvm-svn: 177764
This fixes some mistaken condition logic in RegionStore that caused
global variables to be invalidated when /any/ region was invalidated,
rather than only as part of opaque function calls. This was only
being used by CStringChecker, and so users will now see that strcpy()
and friends do not invalidate global variables.
Also, add a test case we don't handle properly: explicitly-assigned
global variables aren't being invalidated by opaque calls. This is
being tracked by <rdar://problem/13464044>.
llvm-svn: 177572
Due to improper modelling of copy constructors (specifically, their
const reference arguments), we were producing spurious leak warnings
for allocated memory stored in structs. In order to silence this, we
decided to consider storing into a struct to be the same as escaping.
However, the previous commit has fixed this issue and we can now properly
distinguish leaked memory that happens to be in a struct from a buffer
that escapes within a struct wrapper.
Originally applied in r161511, reverted in r174468.
<rdar://problem/12945937>
llvm-svn: 177571
In this case, the value of 'x' may be changed after the call to indirectAccess:
struct Wrapper {
int *ptr;
};
void indirectAccess(const Wrapper &w);
void test() {
int x = 42;
Wrapper w = { x };
clang_analyzer_eval(x == 42); // TRUE
indirectAccess(w);
clang_analyzer_eval(x == 42); // UNKNOWN
}
This is important for modelling return-by-value objects in C++, to show
that the contents of the struct are escaping in the return copy-constructor.
<rdar://problem/13239826>
llvm-svn: 177570
This is a bit of old code trying to deal with the fact that functions that
take pointers often use them to access an entire array via pointer
arithmetic. However, RegionStore already conservatively assumes you can use
pointer arithmetic to access any part of a region.
Some day we may want to go back to handling this specifically for calls,
but we can do that in the future.
No functionality change.
llvm-svn: 177569
With the assurance that the trimmed graph does not contain cycles,
this patch is safe (with a few tweaks), and provides the performance
boost it was intended to.
Part of performance work for <rdar://problem/13433687>.
llvm-svn: 177469
Having a trimmed graph with no cycles (a DAG) is much more convenient for
trying to find shortest paths, which is exactly what BugReporter needs to do.
Part of the performance work for <rdar://problem/13433687>.
llvm-svn: 177468
This fixes a crash when analyzing LLVM that was exposed by r177220 (modeling of
trivial copy/move assignment operators).
When we look up a lazy binding for “Builder”, we see the direct binding of Loc at offset 0.
Previously, we believed the binding, which led to a crash. Now, we do not believe it as
the types do not match.
llvm-svn: 177453
The whole reason we were doing a BFS in the first place is because an
ExplodedGraph can have cycles. Unfortunately, my removeErrorNode "update"
doesn't work at all if there are cycles.
I'd still like to be able to avoid doing the BFS every time, but I'll come
back to it later.
This reverts r177353 / 481fa5071c203bc8ba4f88d929780f8d0f8837ba.
llvm-svn: 177448
Splitting the graph trimming and the path-finding (r177216) already
recovered quite a bit of performance lost to increased suppression.
We can still do better by also performing the reverse BFS up front
(needed for shortest-path-finding) and only walking the shortest path
for each report. This does mean we have to walk back up the path and
invalidate all the BFS numbers if the report turns out to be invalid,
but it's probably still faster than redoing the full BFS every time.
More performance work for <rdar://problem/13433687>
llvm-svn: 177353
r175234 allowed the analyzer to model trivial copy/move constructors as
an aggregate bind. This commit extends that to trivial assignment
operators as well. Like the last commit, one of the motivating factors here
is not warning when the right-hand object is partially-initialized, which
can have legitimate uses.
<rdar://problem/13405162>
llvm-svn: 177220
When we generate a path diagnostic for a bug report, we have to take the
full ExplodedGraph and limit it down to a single path. We do this in two
steps: "trimming", which limits the graph to all the paths that lead to
this particular bug, and "creating the report graph", which finds the
shortest path in the trimmed path to any error node.
With BugReporterVisitor false positive suppression, this becomes more
expensive: it's possible for some paths through the trimmed graph to be
invalid (i.e. likely false positives) but others to be valid. Therefore
we have to run the visitors over each path in the graph until we find one
that is valid, or until we've ruled them all out. This can become quite
expensive.
This commit separates out graph trimming from creating the report graph,
performing the first only once per bug equivalence class and the second
once per bug report. It also cleans up that portion of the code by
introducing some wrapper classes.
This seems to recover most of the performance regression described in my
last commit.
<rdar://problem/13433687>
llvm-svn: 177216
...in favor of this typedef:
typedef llvm::DenseMap<const ExplodedNode *, const ExplodedNode *>
InterExplodedGraphMap;
Use this everywhere the previous class and typedef were used.
Took the opportunity to ArrayRef-ize ExplodedGraph::trim while I'm at it.
No functionality change.
llvm-svn: 177215
I removed this check in the recursion->iteration commit, but forgot that
generatePathDiagnostic may be called multiple times if there are multiple
PathDiagnosticConsumers.
llvm-svn: 177214
Fixes a FIXME, improves dead symbol collection, suppresses a false positive,
which resulted from reusing the same symbol twice for simulation of 2 calls to the same function.
Fixing this lead to 2 possible false negatives in CString checker. Since the checker is still alpha and
the solution will not require revert of this commit, move the tests to a FIXME section.
llvm-svn: 177206
The previous generatePathDiagnostic() was intended to be tail-recursive,
restarting and trying again if a report was marked invalid. However:
(1) this leaked all the cloned visitors, which weren't being deleted, and
(2) this wasn't actually tail-recursive because some local variables had
non-trivial destructors.
This was causing us to overflow the stack on inputs with large numbers of
reports in the same equivalence class, such as sqlite3.c. Being iterative
at least prevents us from blowing out the stack, but doesn't solve the
performance issue: suppressing thousands (yes, thousands) of paths in the
same equivalence class is expensive. I'm looking into that now.
<rdar://problem/13423498>
llvm-svn: 177189