Add the wide character strdup variants (wcsdup, _wcsdup) and the MSVC
version of alloca (_alloca) and other differently named function used
by the Malloc checker.
A patch by Alexander Riccio!
Differential Revision: http://reviews.llvm.org/D17688
llvm-svn: 262894
The analyzer assumes that system functions will not free memory or modify the
arguments in other ways, so we assume that arguments do not escape when
those are called. However, this may lead to false positive leak errors. For
example, in code like this where the pointers added to the rb_tree are freed
later on:
struct alarm_event *e = calloc(1, sizeof(*e));
<snip>
rb_tree_insert_node(&alarm_tree, e);
Add a heuristic to assume that calls to system functions taking void*
arguments allow for pointer escape.
llvm-svn: 251449
Currently realloc(ptr, 0) is treated as free() which seems to be not correct. C
standard (N1570) establishes equivalent behavior for malloc(0) and realloc(ptr,
0): "7.22.3 Memory management functions calloc, malloc, realloc: If the size of
the space requested is zero, the behavior is implementation-defined: either a
null pointer is returned, or the behavior is as if the size were some nonzero
value, except that the returned pointer shall not be used to access an object."
The patch equalizes the processing of malloc(0) and realloc(ptr,0). The patch
also enables unix.Malloc checker to detect references to zero-allocated memory
returned by realloc(ptr,0) ("Use of zero-allocated memory" warning).
A patch by Антон Ярцев!
Differential Revision: http://reviews.llvm.org/D9040
llvm-svn: 248336
The analyzer trims unnecessary nodes from the exploded graph before reporting
path diagnostics. However, in some cases it can trim all nodes (including the
error node), leading to an assertion failure (see
https://llvm.org/bugs/show_bug.cgi?id=24184).
This commit addresses the issue by adding two new APIs to CheckerContext to
explicitly create error nodes. Unless the client provides a custom tag, these
APIs tag the node with the checker's tag -- preventing it from being trimmed.
The generateErrorNode() method creates a sink error node, while
generateNonFatalErrorNode() creates an error node for a path that should
continue being explored.
The intent is that one of these two methods should be used whenever a checker
creates an error node.
This commit updates the checkers to use these APIs. These APIs
(unlike addTransition() and generateSink()) do not take an explicit Pred node.
This is because there are not any error nodes in the checkers that were created
with an explicit different than the default (the CheckerContext's Pred node).
It also changes generateSink() to require state and pred nodes (previously
these were optional) to reduce confusion.
Additionally, there were several cases where checkers did check whether a
generated node could be null; we now explicitly check for null in these places.
This commit also includes a test case written by Ying Yi as part of
http://reviews.llvm.org/D12163 (that patch originally addressed this issue but
was reverted because it introduced false positive regressions).
Differential Revision: http://reviews.llvm.org/D12780
llvm-svn: 247859
This is making our internal build bot fail because it results in extra warnings being
emitted past what should be sink nodes. (There is actually an example of this in the
updated malloc.c test in the reverted commit.)
I'm working on a patch to fix the original issue by adding a new checker API to explicitly
create error nodes. This API will ensure that error nodes are always tagged in order to
prevent them from being reclaimed.
This reverts commit r246188.
llvm-svn: 247103
The assertion is caused by reusing a “filler” ExplodedNode as an error node.
The “filler” nodes are only used for intermediate processing and are not
essential for analyzer history, so they can be reclaimed when the
ExplodedGraph is trimmed by the “collectNode” function. When a checker finds a
bug, they generate a new transition in the ExplodedGraph. The analyzer will
try to reuse the existing predecessor node. If it cannot, it creates a new
ExplodedNode, which always has a tag to uniquely identify the creation site.
The assertion is caused when the analyzer reuses a “filler” node.
In the test case, some “filler” nodes were reused and then reclaimed later
when the ExplodedGraph was trimmed. This caused an assertion because the node
was needed to generate the report. The “filler” nodes should not be reused as
error nodes. The patch adds a constraint to prevent this happening, which
solves the problem and makes the test cases pass.
Differential Revision: http://reviews.llvm.org/D11433
Patch by Ying Yi!
llvm-svn: 246188
TODO: support realloc(). Currently it is not possible due to the present realloc() handling. Currently RegionState is not being attached to realloc() in case of a zero Size argument.
llvm-svn: 234889
This is another regression fixed by reverting r189090.
In this case, the problem is not live variables but the approach that was taken in r189090. This regression was caused by explicitly binding "true" to the condition when we take the true branch. Normally that's okay, but in this case we're planning to reuse that condition as the value of the expression.
llvm-svn: 196599
New rules of invalidation/escape of the source buffer of memcpy: the source buffer contents is invalidated and escape while the source buffer region itself is neither invalidated, nor escape.
In the current modeling of memcpy the information about allocation state of regions, accessible through the source buffer, is not copied to the destination buffer and we can not track the allocation state of those regions anymore. So we invalidate/escape the source buffer indirect regions in anticipation of their being invalidated for real later. This eliminates false-positive leaks reported by the unix.Malloc and alpha.cplusplus.NewDeleteLeaks checkers for the cases like
char *f() {
void *x = malloc(47);
char *a;
memcpy(&a, &x, sizeof a);
return a;
}
llvm-svn: 194953
This keeps the analyzer from making silly assumptions, like thinking
strlen(foo)+1 could wrap around to 0. This fixes PR16558.
Patch by Karthik Bhat!
llvm-svn: 188680
When a region is realloc()ed, MallocChecker records whether it was known
to be allocated or not. If it is, and the reallocation fails, the original
region has to be freed. Previously, when an allocated region escaped,
MallocChecker completely stopped tracking it, so a failed reallocation
still (correctly) wouldn't require freeing the original region. Recently,
however, MallocChecker started tracking escaped symbols, so that if it were
freed we could check that the deallocator matched the allocator. This
broke the reallocation model for whether or not a symbol was allocated.
Now, MallocChecker will actually check if a symbol is owned, and only
require freeing after a failed reallocation if it was owned before.
PR16730
llvm-svn: 188468
Consider this example:
char *p = malloc(sizeof(char));
systemFunction(&p);
free(p);
In this case, when we call systemFunction, we know (because it's a system
function) that it won't free 'p'. However, we /don't/ know whether or not
it will /change/ 'p', so the analyzer is forced to invalidate 'p', wiping
out any bindings it contains. But now the malloc'd region looks like a
leak, since there are no more bindings pointing to it, and we'll get a
spurious leak warning.
The fix for this is to notice when something is becoming inaccessible due
to invalidation (i.e. an imperfect model, as opposed to being explicitly
overwritten) and stop tracking it at that point. Currently, the best way
to determine this for a call is the "indirect escape" pointer-escape kind.
In practice, all the patch does is take the "system functions don't free
memory" special case and limit it to direct parameters, i.e. just the
arguments to a call and not other regions accessible to them. This is a
conservative change that should only cause us to escape regions more
eagerly, which means fewer leak warnings.
This isn't perfect for several reasons, the main one being that this
example is treated the same as the one above:
char **p = malloc(sizeof(char *));
systemFunction(p + 1);
// leak
Currently, "addresses accessible by offsets of the starting region" and
"addresses accessible through bindings of the starting region" are both
considered "indirect" regions, hence this uniform treatment.
Another issue is our longstanding problem of not distinguishing const and
non-const bindings; if in the first example systemFunction's parameter were
a char * const *, we should know that the function will not overwrite 'p',
and thus we can safely report the leak.
<rdar://problem/13758386>
llvm-svn: 181607
Test that the path notes do not change. I don’t think we should print a note on escape.
Also, I’ve removed a check that assumed that the family stored in the RefStete could be
AF_None and added an assert in the constructor.
llvm-svn: 179075
Due to improper modelling of copy constructors (specifically, their
const reference arguments), we were producing spurious leak warnings
for allocated memory stored in structs. In order to silence this, we
decided to consider storing into a struct to be the same as escaping.
However, the previous commit has fixed this issue and we can now properly
distinguish leaked memory that happens to be in a struct from a buffer
that escapes within a struct wrapper.
Originally applied in r161511, reverted in r174468.
<rdar://problem/12945937>
llvm-svn: 177571
Fixes a FIXME, improves dead symbol collection, suppresses a false positive,
which resulted from reusing the same symbol twice for simulation of 2 calls to the same function.
Fixing this lead to 2 possible false negatives in CString checker. Since the checker is still alpha and
the solution will not require revert of this commit, move the tests to a FIXME section.
llvm-svn: 177206
The malloc checker will now catch the case when a previously malloc'ed
region is freed, but the pointer passed to free does not point to the
start of the allocated memory. For example:
int *p1 = malloc(sizeof(int));
p1++;
free(p1); // warn
From the "memory.LeakPtrValChanged enhancement to unix.Malloc" entry
in the list of potential checkers.
A patch by Branden Archer!
llvm-svn: 174678
The checkPointerEscape callback previously did not specify how a
pointer escaped. This change includes an enum which describes the
different ways a pointer may escape. This enum is passed to the
checkPointerEscape callback when a pointer escapes. If the escape
is due to a function call, the call is passed. This changes
previous behavior where the call is passed as NULL if the escape
was due to indirectly invalidating the region the pointer referenced.
A patch by Branden Archer!
llvm-svn: 174677
This is a "quick fix".
The underlining issue is that when a const pointer to a struct is passed
into a function, we do not invalidate the pointer fields. This results
in false positives that are common in C++ (since copy constructors are
prevalent). (Silences two llvm false positives.)
llvm-svn: 174468
This fixes a few cases where we'd emit path notes like this:
+---+
1| v
p = malloc(len);
^ |2
+---+
In general this should make path notes more consistent and more correct,
especially in cases where the leak happens on the false branch of an if
that jumps directly to the end of the function. There are a couple places
where the leak is reported farther away from the cause; these are usually
cases where there are several levels of nested braces before the end of
the function. This still matches our current behavior for when there /is/
a statement after all the braces, though.
llvm-svn: 168070
This allows us to properly remove dead bindings at the end of the top-level
stack frame, using the ReturnStmt, if there is one, to keep the return value
live. This in turn removes the need for a check::EndPath callback in leak
checkers.
This does cause some changes in the path notes for leak checkers. Previously,
a leak would be reported at the location of the closing brace in a function.
Now, it gets reported at the last statement. This matches the way leaks are
currently reported for inlined functions, but is less than ideal for both.
llvm-svn: 168066
'Inputs' subdirectory.
The general desire has been to have essentially all of the non-test
input files live in such directories, with some exceptions for obvious
and common patterns like 'foo.c' using 'foo.h'.
This came up because our distributed test runner couldn't find some of
the headers, for example with stl.cpp.
No functionality changed, just shuffling around here.
llvm-svn: 163674
This fixes several issues:
- removes egregious hack where PlistDiagnosticConsumer would forward to HTMLDiagnosticConsumer,
but diagnostics wouldn't be generated consistently in the same way if PlistDiagnosticConsumer
was used by itself.
- emitting diagnostics to the terminal (using clang's diagnostic machinery) is no longer a special
case, just another PathDiagnosticConsumer. This also magically resolved some duplicate warnings,
as we now use PathDiagnosticConsumer's diagnostic pruning, which has scope for the entire translation
unit, not just the scope of a BugReporter (which is limited to a particular ExprEngine).
As an interesting side-effect, diagnostics emitted to the terminal also have their trailing "." stripped,
just like with diagnostics emitted to plists and HTML. This required some tests to be updated, but now
the tests have higher fidelity with what users will see.
There are some inefficiencies in this patch. We currently generate the report graph (from the ExplodedGraph)
once per PathDiagnosticConsumer, which is a bit wasteful, but that could be pulled up higher in the
logic stack. There is some intended duplication, however, as we now generate different PathDiagnostics (for the same issue)
for different PathDiagnosticConsumers. This is necessary to produce the diagnostics that a particular
consumer expects.
llvm-svn: 162028
Unfortunately, generalized region printing is very difficult:
- ElementRegions are used both for casting and as actual elements.
- Accessing values through a pointer means going through an intermediate
SymbolRegionValue; symbolic regions are untyped.
- Referring to implicitly-defined variables like 'this' and 'self' could be
very confusing if they come from another stack frame.
We fall back to simply not printing the region name if we can't be sure it
will print well. This will allow us to improve in the future.
llvm-svn: 161512
The main blocker on this (besides the previous commit) was that
ScanReachableSymbols was not looking through LazyCompoundVals.
Once that was fixed, it's easy enough to clear out malloc data on return,
just like we do when we bind to a global region.
<rdar://problem/10872635>
llvm-svn: 161511
There is no reason why we should not track the memory which was not
allocated in the current function, but was freed there. This would
allow to catch more use-after-free and double free with no/limited IPA.
Also fix a realloc issue which surfaced as the result of this patch.
llvm-svn: 161248
This involved refactoring some common pointer-escapes code onto CallEvent,
then having MallocChecker use those callbacks for whether or not to consider
a pointer's /ownership/ as escaping. This still needs to be pinned down, and
probably we want to make the new argumentsMayEscape() function a little more
discerning (content invalidation vs. ownership/metadata invalidation), but
this is a good improvement.
As a bonus, also remove CallOrObjCMessage from the source completely.
llvm-svn: 159557
Specifically, although the bitmap context does not take ownership of the
buffer (unlike CGBitmapContextCreateWithData), the data buffer can be extracted
out of the created CGContextRef. Thus the buffer is not leaked even if its
original pointer goes out of scope, as long as
- the context escapes, or
- it is retrieved via CGBitmapContextGetData and freed.
Actually implementing that logic is beyond the current scope of MallocChecker,
so for now CGBitmapContextCreate goes on our system function exception list.
llvm-svn: 158579
I falsely assumed that the memory spaces are equal when we reach this
point, they might not be when memory space of one or more is stack or
Unknown. We don't want a region from Heap space alias something with
another memory space.
llvm-svn: 158165
Add a concept of symbolic memory region belonging to heap memory space.
When comparing symbolic regions allocated on the heap, assume that they
do not alias.
Use symbolic heap region to suppress a common false positive pattern in
the malloc checker, in code that relies on malloc not returning the
memory aliased to other malloc allocations, stack.
llvm-svn: 158136
We need to identify the value of ptr as
ElementRegion (result of pointer arithmetic) in the following code.
However, before this commit '(2-x)' evaluated to Unknown value, and as
the result, 'p + (2-x)' evaluated to Unknown value as well.
int *p = malloc(sizeof(int));
ptr = p + (2-x);
llvm-svn: 156052
The change resulted in multiple issues on the buildbot, so it's not
ready for prime time. Only enable history tracking for tainted
data(which is experimental) for now.
llvm-svn: 156049
reason about the expression.
This essentially keeps more history about how symbolic values were
constructed. As an optimization, previous to this commit, we only kept
the history if one of the symbols was tainted, but it's valuable keep
the history around for other purposes as well: it allows us to avoid
constructing conjured symbols.
Specifically, we need to identify the value of ptr as
ElementRegion (result of pointer arithmetic) in the following code.
However, before this commit '(2-x)' evaluated to Unknown value, and as
the result, 'p + (2-x)' evaluated to Unknown value as well.
int *p = malloc(sizeof(int));
ptr = p + (2-x);
This change brings 2% slowdown on sqlite. Fixes radar://11329382.
llvm-svn: 155944