Summary:
Dear All,
We have been looking at the following problem, where any code after the constant bound loop is not analyzed because of the limit on how many times the same block is visited, as described in bugzillas #7638 and #23438. This problem is of interest to us because we have identified significant bugs that the checkers are not locating. We have been discussing a solution involving ranges as a longer term project, but I would like to propose a patch to improve the current implementation.
Example issue:
```
for (int i = 0; i < 1000; ++i) {...something...}
int *p = 0;
*p = 0xDEADBEEF;
```
The proposal is to go through the first and last iterations of the loop. The patch creates an exploded node for the approximate last iteration of constant bound loops, before the max loop limit / block visit limit is reached. It does this by identifying the variable in the loop condition and finding the value which is “one away” from the loop being false. For example, if the condition is (x < 10), then an exploded node is created where the value of x is 9. Evaluating the loop body with x = 9 will then result in the analysis continuing after the loop, providing x is incremented.
The patch passes all the tests, with some modifications to coverage.c, in order to make the ‘function_which_gives_up’ continue to give up, since the changes allowed the analysis to progress past the loop.
This patch does introduce possible false positives, as a result of not knowing the state of variables which might be modified in the loop. I believe that, as a user, I would rather have false positives after loops than do no analysis at all. I understand this may not be the common opinion and am interested in hearing your views. There are also issues regarding break statements, which are not considered. A more advanced implementation of this approach might be able to consider other conditions in the loop, which would allow paths leading to breaks to be analyzed.
Lastly, I have performed a study on large code bases and I think there is little benefit in having “max-loop” default to 4 with the patch. For variable bound loops this tends to result in duplicated analysis after the loop, and it makes little difference to any constant bound loop which will do more than a few iterations. It might be beneficial to lower the default to 2, especially for the shallow analysis setting.
Please let me know your opinions on this approach to processing constant bound loops and the patch itself.
Regards,
Sean Eveson
SN Systems - Sony Computer Entertainment Group
Reviewers: jordan_rose, krememek, xazax.hun, zaks.anna, dcoughlin
Subscribers: krememek, xazax.hun, cfe-commits
Differential Revision: http://reviews.llvm.org/D12358
llvm-svn: 251621
The analyzer assumes that system functions will not free memory or modify the
arguments in other ways, so we assume that arguments do not escape when
those are called. However, this may lead to false positive leak errors. For
example, in code like this where the pointers added to the rb_tree are freed
later on:
struct alarm_event *e = calloc(1, sizeof(*e));
<snip>
rb_tree_insert_node(&alarm_tree, e);
Add a heuristic to assume that calls to system functions taking void*
arguments allow for pointer escape.
llvm-svn: 251449
This patch adds hashes to the plist and html output to be able to identfy bugs
for suppressing false positives or diff results against a baseline. This hash
aims to be resilient for code evolution and is usable to identify bugs in two
different snapshots of the same software. One missing piece however is a
permanent unique identifier of the checker that produces the warning. Once that
issue is resolved, the hashes generated are going to change. Until that point
this feature is marked experimental, but it is suitable for early adoption.
Differential Revision: http://reviews.llvm.org/D10305
Original patch by: Bence Babati!
llvm-svn: 251011
Prevent invalidation of `this' when a method is const; fixing PR 21606.
A patch by Sean Eveson!
Differential Revision: http://reviews.llvm.org/D13099
llvm-svn: 250237
These test updates almost exclusively around the change in behavior
around enum: enums without a definition are considered incomplete except
when targeting MSVC ABIs. Since these tests are interested in the
'incomplete-enum' behavior, restrict them to %itanium_abi_triple.
llvm-svn: 249660
Change the analyzer's modeling of memcpy to be more precise when copying into fixed-size
array fields. With this change, instead of invalidating the entire containing region the
analyzer now invalidates only offsets for the array itself when it can show that the
memcpy stays within the bounds of the array.
This addresses false positive memory leak warnings of the kind reported by
krzysztof in https://llvm.org/bugs/show_bug.cgi?id=22954
(This is the second attempt, now with assertion failures resolved.)
A patch by Pierre Gousseau!
Differential Revision: http://reviews.llvm.org/D12571
llvm-svn: 248516
This patch ignores malloc-overflow bug in two cases:
Case1:
x = a/b; where n < b
malloc (x*n); Then x*n will not overflow.
Case2:
x = a; // when 'a' is a known value.
malloc (x*n);
Also replaced isa with dyn_cast.
Reject multiplication by zero cases in MallocOverflowSecurityChecker
Currently MallocOverflowSecurityChecker does not catch cases like:
malloc(n * 0 * sizeof(int));
This patch rejects such cases.
Two test cases added. malloc-overflow2.c has an example inspired from a code
in linux kernel where the current checker flags a warning while it should not.
A patch by Aditya Kumar!
Differential Revision: http://reviews.llvm.org/D9924
llvm-svn: 248446
Various improvements to the localization checker:
* Adjusted copy to be consistent with diagnostic text in other Apple
API checkers.
* Added in ~150 UIKit / AppKit methods that require localized strings in
UnlocalizedStringsChecker.
* UnlocalizedStringChecker now checks for UI methods up the class hierarchy and
UI methods that conform for a certain Objective-C protocol.
* Added in alpha version of PluralMisuseChecker and some regression tests. False
positives are still not ideal.
(This is the second attempt, with the memory issues on Linux resolved.)
A patch by Kulpreet Chilana!
Differential Revision: http://reviews.llvm.org/D12417
llvm-svn: 248432
Various improvements to the localization checker:
* Adjusted copy to be consistent with diagnostic text in other Apple
API checkers.
* Added in ~150 UIKit / AppKit methods that require localized strings in
UnlocalizedStringsChecker.
* UnlocalizedStringChecker now checks for UI methods up the class hierarchy and
UI methods that conform for a certain Objective-C protocol.
* Added in alpha version of PluralMisuseChecker and some regression tests. False
positives are still not ideal.
A patch by Kulpreet Chilana!
Differential Revision: http://reviews.llvm.org/D12417
llvm-svn: 248350
Currently realloc(ptr, 0) is treated as free() which seems to be not correct. C
standard (N1570) establishes equivalent behavior for malloc(0) and realloc(ptr,
0): "7.22.3 Memory management functions calloc, malloc, realloc: If the size of
the space requested is zero, the behavior is implementation-defined: either a
null pointer is returned, or the behavior is as if the size were some nonzero
value, except that the returned pointer shall not be used to access an object."
The patch equalizes the processing of malloc(0) and realloc(ptr,0). The patch
also enables unix.Malloc checker to detect references to zero-allocated memory
returned by realloc(ptr,0) ("Use of zero-allocated memory" warning).
A patch by Антон Ярцев!
Differential Revision: http://reviews.llvm.org/D9040
llvm-svn: 248336
This fixes PR16833, in which the analyzer was using large amounts of memory
for switch statements with large case ranges.
rdar://problem/14685772
A patch by Aleksei Sidorin!
Differential Revision: http://reviews.llvm.org/D5102
llvm-svn: 248318
Summary:
`TypeTraitExpr`s are not supported by the ExprEngine today. Analyzer
creates a sink, and aborts the block. Therefore, certain bugs that
involve type traits intrinsics cannot be detected (see PR24710).
This patch creates boolean `SVal`s for `TypeTraitExpr`s, which are
evaluated by the compiler.
Test within the patch is a summary of PR24710.
Reviewers: zaks.anna, dcoughlin, krememek
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D12482
llvm-svn: 248314
Summary:
Name `Out` refers to the parameter. It is moved into the member `Out`
in ctor-init. Dereferencing null pointer will crash clang, if user
passes '-analyzer-viz-egraph-ubigraph' argument.
Reviewers: zaks.anna, krememek
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D12119
llvm-svn: 248050
The analyzer trims unnecessary nodes from the exploded graph before reporting
path diagnostics. However, in some cases it can trim all nodes (including the
error node), leading to an assertion failure (see
https://llvm.org/bugs/show_bug.cgi?id=24184).
This commit addresses the issue by adding two new APIs to CheckerContext to
explicitly create error nodes. Unless the client provides a custom tag, these
APIs tag the node with the checker's tag -- preventing it from being trimmed.
The generateErrorNode() method creates a sink error node, while
generateNonFatalErrorNode() creates an error node for a path that should
continue being explored.
The intent is that one of these two methods should be used whenever a checker
creates an error node.
This commit updates the checkers to use these APIs. These APIs
(unlike addTransition() and generateSink()) do not take an explicit Pred node.
This is because there are not any error nodes in the checkers that were created
with an explicit different than the default (the CheckerContext's Pred node).
It also changes generateSink() to require state and pred nodes (previously
these were optional) to reduce confusion.
Additionally, there were several cases where checkers did check whether a
generated node could be null; we now explicitly check for null in these places.
This commit also includes a test case written by Ying Yi as part of
http://reviews.llvm.org/D12163 (that patch originally addressed this issue but
was reverted because it introduced false positive regressions).
Differential Revision: http://reviews.llvm.org/D12780
llvm-svn: 247859
In Objective-C, method calls with nil receivers are essentially no-ops. They
do not fault (although the returned value may be garbage depending on the
declared return type and architecture). Programmers are aware of this
behavior and will complain about a false alarm when the analyzer
diagnoses API violations for method calls when the receiver is known to
be nil.
Rather than require each individual checker to be aware of this behavior
and suppress a warning when the receiver is nil, this commit
changes ExprEngineObjC so that VisitObjCMessage skips calling checker
pre/post handlers when the receiver is definitely nil. Instead, it adds a
new event, ObjCMessageNil, that is only called in that case.
The CallAndMessageChecker explicitly cares about this case, so I've changed it
to add a callback for ObjCMessageNil and moved the logic in PreObjCMessage
that handles nil receivers to the new callback.
rdar://problem/18092611
Differential Revision: http://reviews.llvm.org/D12123
llvm-svn: 247653
Add an option (-analyzer-config min-blocks-for-inline-large=14) to control the function
size the inliner considers as large, in relation to "max-times-inline-large". The option
defaults to the original hard coded behaviour, which I believe should be adjustable with
the other inlining settings.
The analyzer-config test has been modified so that the analyzer will reach the
getMinBlocksForInlineLarge() method and store the result in the ConfigTable, to ensure it
is dumped by the debug checker.
A patch by Sean Eveson!
Differential Revision: http://reviews.llvm.org/D12406
llvm-svn: 247463
This is making our internal build bot fail because it results in extra warnings being
emitted past what should be sink nodes. (There is actually an example of this in the
updated malloc.c test in the reverted commit.)
I'm working on a patch to fix the original issue by adding a new checker API to explicitly
create error nodes. This API will ensure that error nodes are always tagged in order to
prevent them from being reclaimed.
This reverts commit r246188.
llvm-svn: 247103
Change the analyzer's modeling of memcpy to be more precise when copying into fixed-size
array fields. With this change, instead of invalidating the entire containing region the
analyzer now invalidates only offsets for the array itself when it can show that the
memcpy stays within the bounds of the array.
This addresses false positive memory leak warnings of the kind reported by
krzysztof in https://llvm.org/bugs/show_bug.cgi?id=22954
A patch by Pierre Gousseau!
Differential Revision: http://reviews.llvm.org/D11832
llvm-svn: 246345
The assertion is caused by reusing a “filler” ExplodedNode as an error node.
The “filler” nodes are only used for intermediate processing and are not
essential for analyzer history, so they can be reclaimed when the
ExplodedGraph is trimmed by the “collectNode” function. When a checker finds a
bug, they generate a new transition in the ExplodedGraph. The analyzer will
try to reuse the existing predecessor node. If it cannot, it creates a new
ExplodedNode, which always has a tag to uniquely identify the creation site.
The assertion is caused when the analyzer reuses a “filler” node.
In the test case, some “filler” nodes were reused and then reclaimed later
when the ExplodedGraph was trimmed. This caused an assertion because the node
was needed to generate the report. The “filler” nodes should not be reused as
error nodes. The patch adds a constraint to prevent this happening, which
solves the problem and makes the test cases pass.
Differential Revision: http://reviews.llvm.org/D11433
Patch by Ying Yi!
llvm-svn: 246188
Add checkers that detect code-level localizability issues for OS X / iOS:
- A path sensitive checker that warns about uses of non-localized
NSStrings passed to UI methods expecting localized strings.
- A syntax checker that warns against not including a comment in
NSLocalizedString macros.
A patch by Kulpreet Chilana!
(This is the second attempt with the compilation issue on Windows and
the random test failures resolved.)
llvm-svn: 245093
This reverts commit fc885033a30b6e30ccf82398ae7c30e646727b10.
Revert all localization checker commits until the proper fix is implemented.
llvm-svn: 244394
Add checkers that detect code-level localizability issues for OS X / iOS:
- A path sensitive checker that warns about uses of non-localized
NSStrings passed to UI methods expecting localized strings.
- A syntax checker that warns against not including a comment in
NSLocalizedString macros.
A patch by Kulpreet Chilana!
llvm-svn: 244389
The ObjCSuperCallChecker issues alarms for various Objective-C APIs that require
a subclass to call to its superclass's version of a method when overriding it.
So, for example, it raises an alarm when the -viewDidLoad method in a subclass
of UIViewController does not call [super viewDidLoad].
This patch fixes a false alarm where the analyzer erroneously required the
implementation of the superclass itself (e.g., UIViewController) to call
super.
rdar://problem/18416944
Differential Revision: http://reviews.llvm.org/D11842
llvm-svn: 244386
BlockDecl has a poor AST representation because it doesn't carry its type
with it. Instead, the containing BlockExpr has the full type. This almost
never matters for the analyzer, but if the block decl contains static
local variables we need to synthesize a region to put them in, and this
region will necessarily not have the right type.
Even /that/ doesn't matter, unless
(1) the block calls the function or method containing the block, and
(2) the value of the block expr is used in some interesting way.
In this case, we actually end up needing the type of the block region,
and it will be set to our synthesized type. It turns out we've been doing
a terrible job faking that type -- it wasn't a block pointer type at all.
This commit fixes that to at least guarantee a block pointer type, using
the signature written by the user if there is one.
This is not really a correct answer because the block region's type will
/still/ be wrong, but further efforts to make this right in the analyzer
would probably be silly. We should just change the AST.
rdar://problem/21698099
llvm-svn: 241944
A patch by Karthik Bhat!
This patch fixes a regression introduced by r224398. Prior to r224398
we were able to analyze the following code in test-include.c and report
a null deref in this case. But post r224398 this analysis is being skipped.
E.g.
// test-include.c
#include "test-include.h"
void test(int * data) {
data = 0;
*data = 1;
}
// test-include.h
void test(int * data);
This patch uses the function body (instead of its declaration) as the location
of the function when deciding if the Decl should be analyzed with path-sensitive
analysis. (Prior to r224398, the call graph was guaranteed to have a definition
when available.)
llvm-svn: 240800
Addresses a conflict with glibc's __nonnull macro by renaming the type
nullability qualifiers as follows:
__nonnull -> _Nonnull
__nullable -> _Nullable
__null_unspecified -> _Null_unspecified
This is the major part of rdar://problem/21530726, but does not yet
provide the Darwin-specific behavior for the old names.
llvm-svn: 240596
That is,
void cf2(CFTypeRef * __nullable p CF_RETURNS_NOT_RETAINED);
is equivalent to
void cf2(CFTypeRef __nullable * __nullable p CF_RETURNS_NOT_RETAINED);
More rdar://problem/18742441
llvm-svn: 240186
Includes a simple static analyzer check and not much else, but we'll also
be able to take advantage of this in Swift.
This feature can be tested for using __has_feature(cf_returns_on_parameters).
This commit also contains two fixes:
- Look through non-typedef sugar when deciding whether something is a CF type.
- When (cf|ns)_returns(_not)?_retained is applied to invalid properties,
refer to "property" instead of "method" in the error message.
rdar://problem/18742441
llvm-svn: 240185
Update ObjCContainersChecker to be notified when pointers escape so it can
remove size information for escaping CFMutableArrayRefs. When such pointers
escape, un-analyzed code could mutate the array and cause the size information
to be incorrect.
rdar://problem/19406485
llvm-svn: 239709
Based on previous discussion on the mailing list, clang currently lacks support
for C99 partial re-initialization behavior:
Reference: http://lists.cs.uiuc.edu/pipermail/cfe-dev/2013-April/029188.html
Reference: http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_253.htm
This patch attempts to fix this problem.
Given the following code snippet,
struct P1 { char x[6]; };
struct LP1 { struct P1 p1; };
struct LP1 l = { .p1 = { "foo" }, .p1.x[2] = 'x' };
// this example is adapted from the example for "struct fred x[]" in DR-253;
// currently clang produces in l: { "\0\0x" },
// whereas gcc 4.8 produces { "fox" };
// with this fix, clang will also produce: { "fox" };
Differential Review: http://reviews.llvm.org/D5789
llvm-svn: 239446
Emit warning when operand to `delete` is allocated with `new[]` or
operand to `delete[]` is allocated with `new`.
rev 2 update:
`getNewExprFromInitListOrExpr` should return `dyn_cast_or_null`
instead of `dyn_cast`, since `E` might be null.
Reviewers: rtrieu, jordan_rose, rsmith
Subscribers: majnemer, cfe-commits
Differential Revision: http://reviews.llvm.org/D4661
llvm-svn: 237608
This reverts commit 742dc9b6c9686ab52860b7da39c3a126d8a97fbc.
This is generating multiple segfaults in our internal builds.
Test case coming up shortly.
llvm-svn: 237391
Emit warning when operand to `delete` is allocated with `new[]` or
operand to `delete[]` is allocated with `new`.
Reviewers: rtrieu, jordan_rose, rsmith
Subscribers: majnemer, cfe-commits
Differential Revision: http://reviews.llvm.org/D4661
llvm-svn: 237368
TODO: support realloc(). Currently it is not possible due to the present realloc() handling. Currently RegionState is not being attached to realloc() in case of a zero Size argument.
llvm-svn: 234889
Again, this is being applied in a separate commit so that the previous commit
can be reverted while leaving the test in place.
rdar://problem/20335433
llvm-svn: 233593
This is imitating a pre-r228174 state where ivars are not considered tracked by
default, but with the addition that even ivars /with/ retain count information
(e.g. "[_ivar retain]; [ivar _release];") are not being tracked as well. This is
to ensure that we don't regress on values accessed through both properties and
ivars, which is what r228174 was trying to fix.
The issue occurs in code like this:
[_contentView retain];
[_contentView removeFromSuperview];
[self addSubview:_contentView]; // invalidates 'self'
[_contentView release];
In this case, the call to -addSubview: may change the value of self->_contentView,
and so the analyzer can't be sure that we didn't leak the original _contentView.
This is a correct conservative view of the world, but not a useful one. Until we
have a heuristic that allows us to not consider this a leak, not emitting a
diagnostic is our best bet.
This commit disables all of the ivar-related retain count tests, but does not
remove them to ensure that we don't crash trying to evaluate either valid or
erroneous code. The next commit will add a new test for the example above so
that this commit (and the previous one) can be reverted wholesale when a better
solution is implemented.
Rest of rdar://problem/20335433
llvm-svn: 233592
Give up this checking in order to continue tracking that these values came from
direct ivar access, which will be important in the next commit.
Part of rdar://problem/20335433
llvm-svn: 233591
Fixes https://llvm.org/bugs/show_bug.cgi?id=20744
struct A {
A() = default;
};
Previously the source range of the declaration of A ended at the ')'. It should
include the '= default' part as well. The same for '= delete'.
Note: this will break one of the clang-tidy fixers, which is going to be
addessed in a follow-up patch.
Differential Revision: http://reviews.llvm.org/D8465
llvm-svn: 233028
Similarly, don't assume +0 if the property's setter is manually implemented.
In both cases, if the property's ownership is explicitly written, then we /do/
assume the ivar has the same ownership.
rdar://problem/20218183
llvm-svn: 232849
CloudABI also supports the arc4random() function. We can enable compiler
warnings for rand(), random() and *rand48() on this system as well.
llvm-svn: 231914
In theory we could assume a CF property is stored at +0 if there's not a custom
setter, but that's not really worth the complexity. What we do know is that a
CF property can't have ownership attributes, and so we shouldn't assume anything
about the ownership of the ivar.
rdar://problem/20076963
llvm-svn: 231553
Modules and Tooling tests in particular tend to want to change the cwd,
so we were missing test coverage in this area on Windows. It should now
be easier to write such portable tests.
llvm-svn: 231029
We expect in general that any nil value has no retain count information
associated with it; violating this results in unexpected state unification
/later/ when we decide to throw the information away. Unexpectedly caching
out can lead to an assertion failure or crash.
rdar://problem/19862648
llvm-svn: 229934
+ separate bug report for "Free alloca()" error to be able to customize checkers responsible for this error.
+ Muted "Free alloca()" error for NewDelete checker that is not responsible for c-allocated memory, turned on for unix.MismatchedDeallocator checker.
+ RefState for alloca() - to be able to detect usage of zero-allocated memory by upcoming ZeroAllocDereference checker.
+ AF_Alloca family to handle alloca() consistently - keep proper family in RefState, handle 'alloca' by getCheckIfTracked() facility, etc.
+ extra tests.
llvm-svn: 229850
PrettyStackTrace now requires the ENABLE_BACKTRACES option. We need to check for that here or these tests fail llvm-lit.
Reviewed by chandlerc
llvm-svn: 228735
to the plist output. This check_name field does not guaranteed to be the
same as the name of the checker in the future.
Reviewer: Anna Zaks
Differential Revision: http://reviews.llvm.org/D6841
llvm-svn: 228624
The analyzer thinks that ArraySubscriptExpr cannot be an r-value (ever).
However, it can be in some corner cases. Specifically, C forbids expressions
of unqualified void type from being l-values.
Note, the analyzer will keep modeling the subscript expr as an l-value. The
analyzer should be treating void* as a char array
(https://gcc.gnu.org/onlinedocs/gcc-4.3.0/gcc/Pointer-Arith.html).
llvm-svn: 228249
Instead of handling edge cases (mostly involving blocks), where we have difficulty finding
an allocation statement, allow the allocation site to be in a parent node.
Previously we assumed that the allocation site can always be found in the same frame
as allocation, but there are scenarios in which an element is leaked in a child
frame but is allocated in the parent.
llvm-svn: 228247
A refinement of r204730, itself a refinement of r198953, to better handle
cases where an object is accessed both through a property getter and
through direct ivar access. An object accessed through a property should
always be treated as +0, i.e. not owned by the caller. However, an object
accessed through an ivar may be at +0 or at +1, depending on whether the
ivar is a strong reference. Outside of ARC, we don't always have that
information.
The previous attempt would clear out the +0 provided by a getter, but only
if that +0 hadn't already participated in other retain counting operations.
(That is, "self.foo" is okay, but "[[self.foo retain] autorelease]" is
problematic.) This turned out to not be good enough when our synthesized
getters get involved.
This commit drops the notion of "overridable" reference counting and instead
just tracks whether a value ever came from a (strong) ivar. If it has, we
allow one more release than we otherwise would. This has the added benefit
of being able to catch /some/ overreleases of instance variables, though
it's not likely to come up in practice.
We do still get some false negatives because we currently throw away
refcount state upon assigning a value into an ivar. We should probably
improve on that in the future, especially once we synthesize setters as
well as getters.
rdar://problem/18075108
llvm-svn: 228174
A patch by Daniel DeFreez!
We were previously dropping edges on re-declarations. Store the
canonical declarations in the graph to ensure that different
references to the same function end up reflected with the same call graph
node.
(Note, this might lead to performance fluctuation because call graph
is used to determine the function analysis order.)
llvm-svn: 224398
This suppresses a common false positive when analyzing libc++.
Along the way, introduce some tests to show this checker actually
works with C++ static_cast<>.
llvm-svn: 220160
The return value of mempcpy is only correct when the destination type is
one byte in size. This patch casts the argument to a char* so the
calculation is also correct for structs, ints etc.
A patch by Daniel Fahlgren!
llvm-svn: 219024
in the super class, do not issue the warning about property
in current class's protocol will not be auto synthesized.
// rdar://18179833
llvm-svn: 216769
People have been incorrectly using "-analyzer-disable-checker" to
silence analyzer warnings on a file, when analyzing a project. Add
the "-analyzer-disable-all-checks" option, which would allow the
suppression and suggest it as part of the error message for
"-analyzer-disable-checker". The idea here is to compose this with
"--analyze" so that users can selectively opt out specific files from
static analysis.
llvm-svn: 216763
Do not warn when property declared in class's protocol will be auto-synthesized
by its uper class implementation because super class has also declared this
property while this class has not. Continue to warn if current class
has declared the property also (because this declaration will not result
in a 2nd synthesis).
rdar://18152478
llvm-svn: 216753
Currently the analyzer lazily models some functions using 'BodyFarm',
which constructs a fake function implementation that the analyzer
can simulate that approximates the semantics of the function when
it is called. BodyFarm does this by constructing the AST for
such definitions on-the-fly. One strength of BodyFarm
is that all symbols and types referenced by synthesized function
bodies are contextual adapted to the containing translation unit.
The downside is that these ASTs are hardcoded in Clang's own
source code.
A more scalable model is to allow these models to be defined as source
code in separate "model" files and have the analyzer use those
definitions lazily when a function body is needed. Among other things,
it will allow more customization of the analyzer for specific APIs
and platforms.
This patch provides the initial infrastructure for this feature.
It extends BodyFarm to use an abstract API 'CodeInjector' that can be
used to synthesize function bodies. That 'CodeInjector' is
implemented using a new 'ModelInjector' in libFrontend, which lazily
parses a model file and injects the ASTs into the current translation
unit.
Models are currently found by specifying a 'model-path' as an
analyzer option; if no path is specified the CodeInjector is not
used, thus defaulting to the current behavior in the analyzer.
Models currently contain a single function definition, and can
be found by finding the file <function name>.model. This is an
initial starting point for something more rich, but it bootstraps
this feature for future evolution.
This patch was contributed by Gábor Horváth as part of his
Google Summer of Code project.
Some notes:
- This introduces the notion of a "model file" into
FrontendAction and the Preprocessor. This nomenclature
is specific to the static analyzer, but possibly could be
generalized. Essentially these are sources pulled in
exogenously from the principal translation.
Preprocessor gets a 'InitializeForModelFile' and
'FinalizeForModelFile' which could possibly be hoisted out
of Preprocessor if Preprocessor exposed a new API to
change the PragmaHandlers and some other internal pieces. This
can be revisited.
FrontendAction gets a 'isModelParsingAction()' predicate function
used to allow a new FrontendAction to recycle the Preprocessor
and ASTContext. This name could probably be made something
more general (i.e., not tied to 'model files') at the expense
of losing the intent of why it exists. This can be revisited.
- This is a moderate sized patch; it has gone through some amount of
offline code review. Most of the changes to the non-analyzer
parts are fairly small, and would make little sense without
the analyzer changes.
- Most of the analyzer changes are plumbing, with the interesting
behavior being introduced by ModelInjector.cpp and
ModelConsumer.cpp.
- The new functionality introduced by this change is off-by-default.
It requires an analyzer config option to enable.
llvm-svn: 216550
Also, make it slightly clearer what's being tested by only differentiating integer
literals based on their suffix, rather than using a very large constant.
llvm-svn: 216133
Yet more problems due to the missing CXXBindTemporaryExpr in the CFG for
default arguments.
Unfortunately we cannot just switch off inserting temporaries for the
corresponding default arguments, as that breaks existing tests
(test/SemaCXX/return-noreturn.cpp:245).
llvm-svn: 215554
In cases like:
struct C { ~C(); }
void f(C c = C());
void t() {
f();
}
We currently do not add the CXXBindTemporaryExpr for the temporary (the
code mentions that as the default parameter expressions are owned by
the declaration, we'd otherwise add the same expression multiple times),
but we add the temporary destructor pointing to the CXXBindTemporaryExpr.
We need to fix that before we can re-enable the assertion.
llvm-svn: 215357
Changes to the original patch:
- model the CFG for temporary destructors in conditional operators so that
the destructors of the true and false branch are always exclusive. This
is necessary because we must not have impossible paths for the path
based analysis to work.
- add multiple regression tests with ternary operators
Original description:
Fix modelling of non-lifetime-extended temporary destructors in the
analyzer.
Changes to the CFG:
When creating the CFG for temporary destructors, we create a structure
that mirrors the branch structure of the conditionally executed
temporary constructors in a full expression.
The branches we create use a CXXBindTemporaryExpr as terminator which
corresponds to the temporary constructor which must have been executed
to enter the destruction branch.
2. Changes to the Analyzer:
When we visit a CXXBindTemporaryExpr we mark the CXXBindTemporaryExpr as
executed in the state; when we reach a branch that contains the
corresponding CXXBindTemporaryExpr as terminator, we branch out
depending on whether the corresponding CXXBindTemporaryExpr was marked
as executed.
llvm-svn: 215096
This reverts commit r214962 because after the change the
following code doesn't compile with -Wreturn-type -Werror.
#include <cstdlib>
class NoReturn {
public:
~NoReturn() __attribute__((noreturn)) { exit(1); }
};
int check() {
true ? NoReturn() : NoReturn();
}
llvm-svn: 214998
1. Changes to the CFG:
When creating the CFG for temporary destructors, we create a structure
that mirrors the branch structure of the conditionally executed
temporary constructors in a full expression.
The branches we create use a CXXBindTemporaryExpr as terminator which
corresponds to the temporary constructor which must have been executed
to enter the destruction branch.
2. Changes to the Analyzer:
When we visit a CXXBindTemporaryExpr we mark the CXXBindTemporaryExpr as
executed in the state; when we reach a branch that contains the
corresponding CXXBindTemporaryExpr as terminator, we branch out
depending on whether the corresponding CXXBindTemporaryExpr was marked
as executed.
llvm-svn: 214962
MaterializeTemporaryExpr already contains information about the lifetime
of the temporary; if the lifetime is not the full statement, we do not
want to emit a destructor at the end of the full statement for it.
llvm-svn: 214292
This new checker, alpha.core.TestAfterDivZero, catches issues like this:
int sum = ...
int avg = sum / count; // potential division by zero...
if (count == 0) { ... } // ...caught here
Because the analyzer does not necessarily explore /all/ paths through a program,
this check is restricted to only work on zero checks that immediately follow a
division operation (/ % /= %=). This could later be expanded to handle checks
dominated by a division operation but not necessarily in the same CFG block.
Patch by Anders Rönnholm! (with very minor modifications by me)
llvm-svn: 212731
This silences false positives (leaks, use of uninitialized value) in simple
code that uses containers such as std::vector and std::list. The analyzer
cannot reason about the internal invariances of those data structures which
leads to false positives. Until we come up with a better solution to that
problem, let's just not inline the methods of the containers and allow objects
to escape whenever such methods are called.
This just extends an already existing flag "c++-container-inlining" and applies
the heuristic not only to constructors and destructors of the containers, but
to all of their methods.
We have a bunch of distinct user reports all related to this issue
(radar://16058651, radar://16580751, radar://16384286, radar://16795491
[PR19637]).
llvm-svn: 211832
When adding the implicit compound statement (required for Codegen?), the
end location was previously overridden by the start location, probably
based on the assumptions:
* The location of the compound statement should be the member's location
* The compound statement if present is the last element of a FunctionDecl
This patch changes the location of the compound statement to the
member's end location.
Code review: http://reviews.llvm.org/D4175
llvm-svn: 211344
Doing this caused us to mistakenly think we'd seen a particular state before
when we actually hadn't, which resulted in false negatives. Credit to
Rafael Auler for discovering this issue!
llvm-svn: 211209
Fixes a crash in Retain Count checker error reporting logic by handing
the allocation statement retrieval from a BlockEdge program point.
Also added a simple CFG dump routine for debugging.
llvm-svn: 210960
This applies to __attribute__((pure)) as well, but 'const' is more interesting
because many of our builtins are marked 'const'.
PR19661
llvm-svn: 208154
The assignment needs to be before the destruction of the temporary.
This patch calls out to addStmt, which invokes VisitDeclStmt, which has
all the correct logic for handling temporaries.
llvm-svn: 207985
Document and simplify ResolveCondition.
1. Introduce a temporary special case for temporary desctructors when resolving
the branch condition - in an upcoming patch, alexmc will change temporary
destructor conditions to not run through this logic, in which case we can remove
this (marked as FIXME); this currently fixes a crash.
2. Simplify ResolveCondition; while documenting the function, I noticed that it
always returns the last statement - either that statement is the condition
itself (in which case the condition was returned anyway), or the rightmost
leaf is returned; for correctness, the rightmost leaf must be evaluated anyway
(which the CFG does in the last statement), thus we can just return the last
statement in that case, too. Added an assert to verify the invariant.
llvm-svn: 207957
The constructor that comes right before a variable declaration in the CFG might
not be the initialization of that variable. Previously, we just checked that
the variable's initializer expression was different from the construction
expression, but forgot to see whether the variable had an initializer expression
at all.
Thanks for the prompting, David.
llvm-svn: 207562
While we don't model pointer-to-member operators yet (neither .* nor ->*),
CallAndMessageChecker still checks to make sure the 'this' object is not
null or undefined first. However, it also expects that the object should
always have a valid MemRegion, something that's generally important elsewhere
in the analyzer as well. Ensure this is true ahead of time, just like we do
for member access.
PR19531
llvm-svn: 207561
This also includes some infrastructure to make it easier to build multi-argument
selectors, rather than trying to use string matching on each piece. There's a bit
more setup code, but less cost at runtime.
PR18908
llvm-svn: 205827
Fixes a false positive when temporary destructors are enabled where a temporary
is destroyed after a variable is constructed but before the VarDecl itself is
processed, which occurs when the variable is in the condition of an if or while.
Patch by Alex McCarthy, with an extra test from me.
llvm-svn: 205661
For namespaces, this is consistent with mangling and GCC's debug info
behavior. For structs, GCC uses <anonymous struct> but we prefer
consistency between all anonymous entities but don't want to confuse
them with template arguments, etc, so we'll just go with parens in all
cases.
llvm-svn: 205398
If we're trying to get the zero element region of something that's not a region,
we should be returning UnknownVal, which is what ProgramState::getLValue will
do for us.
Patch by Alex McCarthy!
llvm-svn: 205327
Add M_ZERO awareness to malloc() static analysis in Clang for FreeBSD,
NetBSD, and OpenBSD in a similar fashion to O_CREAT for open(2).
These systems have a three-argument malloc() in the kernel where the
third argument contains flags; the M_ZERO flag will zero-initialize the
allocated buffer.
This should reduce the number of false positives when running static
analysis on BSD kernels.
Additionally, add kmalloc() (Linux kernel malloc()) and treat __GFP_ZERO
like M_ZERO on Linux.
Future work involves a better method of checking for named flags without
hardcoding values.
Patch by Conrad Meyer, with minor modifications by me.
llvm-svn: 204832
A refinement of r198953 to handle cases where an object is accessed both through
a property getter and through direct ivar access. An object accessed through a
property should always be treated as +0, i.e. not owned by the caller. However,
an object accessed through an ivar may be at +0 or at +1, depending on whether
the ivar is a strong reference. Outside of ARC, we don't have that information,
so we just don't track objects accessed through ivars.
With this change, accessing an ivar directly will deliberately override the +0
provided by a getter, but only if the +0 hasn't participated in other retain
counting yet. That isn't perfect, but it's already unusual for people to be
mixing property access with direct ivar access. (The primary use case for this
is in setters, init methods, and -dealloc.)
Thanks to Ted for spotting a few mistakes in private review.
<rdar://problem/16333368>
llvm-svn: 204730
Passing a pointer to an uninitialized memory buffer is normally okay,
but if the function is declared to take a pointer-to-const then it's
very unlikely it will be modifying the buffer. In this case the analyzer
should warn that there will likely be a read of uninitialized memory.
This doesn't check all elements of an array, only the first one.
It also doesn't yet check Objective-C methods, only C functions and
C++ methods.
This is controlled by a new check: alpha.core.CallAndMessageUnInitRefArg.
Patch by Per Viberg!
llvm-svn: 203822
Like the binary operator check of r201702, this actually checks the
condition of every if in a chain against every other condition, an
O(N^2) operation. In most cases N should be small enough to make this
practical, and checking all cases like this makes it much more likely
to catch a copy-paste error within the same series of branches.
Part of IdenticalExprChecker; patch by Daniel Fahlgren!
llvm-svn: 203585
null comparison when the pointer is known to be non-null.
This catches the array to pointer decay, function to pointer decay and
address of variables. This does not catch address of function since this
has been previously used to silence a warning.
Pointer to bool conversion is under -Wbool-conversion.
Pointer to null comparison is under -Wtautological-pointer-compare, a sub-group
of -Wtautological-compare.
void foo() {
int arr[5];
int x;
// warn on these conditionals
if (foo);
if (arr);
if (&x);
if (foo == null);
if (arr == null);
if (&x == null);
if (&foo); // no warning
}
llvm-svn: 202216
For now, just ignore them. Later, we could try looking through LazyCompoundVals,
but we at least shouldn't crash.
<rdar://problem/16153464>
llvm-svn: 202212
Somehow both Daniel and I missed the fact that while loops are only identical
if they have identical bodies.
Patch by Daniel Fahlgren!
llvm-svn: 201829
IdenticalExprChecker now warns if any expressions in a logical or bitwise
chain (&&, ||, &, |, or ^) are the same. Unlike the previous patch, this
actually checks all subexpressions against each other (an O(N^2) operation,
but N is likely to be small).
Patch by Daniel Fahlgren!
llvm-svn: 201702
This extends the checks for identical expressions to handle identical
statements, and compares the consequent and alternative ("then" and "else")
branches of an if-statement to see if they are identical, treating a single
statement surrounded by braces as equivalent to one without braces.
This does /not/ check subsequent branches in an if/else chain, let alone
branches that are not consecutive. This may improve in a future patch, but
it would certainly take more work.
Patch by Daniel Fahlgren!
llvm-svn: 201701
This will let us stage in the modeling of operator new. The -analyzer-config
opton 'c++-inline-allocators' is currently off by default.
Patch by Karthik Bhat!
llvm-svn: 201122
This means always walking the whole call stack for the end path node, but
we'll assume that's always fairly tractable.
<rdar://problem/15952973>
llvm-svn: 200980
redeclaration, not just when looking them up for a use -- we need the implicit
declaration to appropriately check various properties of them (notably, whether
they're deleted).
llvm-svn: 200729
Due to statement expressions supported as GCC extension, it is possible
to put 'break' or 'continue' into a loop/switch statement but outside
its body, for example:
for ( ; ({ if (first) { first = 0; continue; } 0; }); )
This code is rejected by GCC if compiled in C mode but is accepted in C++
code. GCC bug 44715 tracks this discrepancy. Clang used code generation
that differs from GCC in both modes: only statement of the third
expression of 'for' behaves as if it was inside loop body.
This change makes code generation more close to GCC, considering 'break'
or 'continue' statement in condition and increment expressions of a
loop as it was inside the loop body. It also adds error for the cases
when 'break'/'continue' appear outside loop due to this syntax. If
code generation differ from GCC, warning is issued.
Differential Revision: http://llvm-reviews.chandlerc.com/D2518
llvm-svn: 199897
If there are non-trivially-copyable types /other/ than C++ records, we
won't have a synthesized copy expression, but we can't just use a simple
load/return.
Also, add comments and shore up tests, making sure to test in both ARC
and non-ARC.
llvm-svn: 199869
Citation: C++11 [expr.shift]p1 (and the equivalent text in C11).
This fixes PR18073, but the right thing to do (as noted in the FIXME) is to
have a real checker for too-large shifts.
llvm-svn: 199405
test the CC1 layer.
This actually uncovered that the test semes to no longer be passing for
the reasons intended. =[ The name of the test would lead me to believe
that it should be testing the semantics of noreturn in the static
analyzer.... but there are in fact no -verify assertions about noreturn
that i can find. And the noreturn checker is no longer in 'alpha.core'.
It is in 'core.builtins'. The test *does* have one assertion for a null
dereference warning. This *also* isn't in 'alpha.core', but the driver
inserts a pile of other checker packages, including 'core' which has
this warning.
So I have switch the RUN line to actually do the minimal thing that this
test currently exercises, but someone who works on the static analyzer
should probably look at this and either nuke it or move it to actually
check the noreturn behavior.
llvm-svn: 199307
In preparation for making the Win32 triple imply MS ABI mode,
make all tests pass in this mode, or make them use the Itanium
mode explicitly.
Differential Revision: http://llvm-reviews.chandlerc.com/D2401
llvm-svn: 199130
In an expression like "new (a, b) Foo(x, y)", two things happen:
- Memory is allocated by calling a function named 'operator new'.
- The memory is initialized using the constructor for 'Foo'.
Currently the analyzer only models the second event, though it has special
cases for both the default and placement forms of operator new. This patch
is the first step towards properly modeling both events: it changes the CFG
so that the above expression now generates the following elements.
1. a
2. b
3. (CFGNewAllocator)
4. x
5. y
6. Foo::Foo
The analyzer currently ignores the CFGNewAllocator element, but the next
step is to treat that as a call like any other.
The CFGNewAllocator element is not added to the CFG for analysis-based
warnings, since none of them take advantage of it yet.
llvm-svn: 199123
...by synthesizing their body to be "return self->_prop;", with an extra
nudge to RetainCountChecker to still treat the value as +0 if we have no
other information.
This doesn't handle weak properties, but that's mostly correct anyway,
since they can go to nil at any time. This also doesn't apply to properties
whose implementations we can't see, since they may not be backed by an
ivar at all. And finally, this doesn't handle properties of C++ class type,
because we can't invoke the copy constructor. (Sema has actually done this
work already, but the AST it synthesizes is one the analyzer doesn't quite
handle -- it has an rvalue DeclRefExpr.)
Modeling setters is likely to be more difficult (since it requires
handling strong/copy), but not impossible.
<rdar://problem/11956898>
llvm-svn: 198953
...rather somewhere in the destructor when we try to access something and
realize the object has already been deleted. This is necessary because
the destructor is processed before the 'delete' itself.
Patch by Karthik Bhat!
llvm-svn: 198779
...even though the argument is declared "const void *", because this is
just a way to pass pointers around as objects. (Though NSData is often
a better one.)
PR18262
llvm-svn: 198710
This checker has not been updated to work with interprocedural analysis,
and actually contains both logical correctness issues but also
memory bugs. We can resuscitate it from version control once there
is focused interest in making it a real viable checker again.
llvm-svn: 198476
We have assertions for this, but a few edge cases had snuck through where
we were still unconditionally using 'int'.
<rdar://problem/15703011>
llvm-svn: 197733
Fixes <rdar://problem/15584219> and <rdar://problem/12241361>.
This change looks large, but all it does is reuse and consolidate
the delayed diagnostic logic for deprecation warnings with unavailability
warnings. By doing so, it showed various inconsistencies between the
diagnostics, which were close, but not consistent. It also revealed
some missing "note:"'s in the deprecated diagnostics that were showing
up in the unavailable diagnostics, etc.
This change also changes the wording of the core deprecation diagnostics.
Instead of saying "function has been explicitly marked deprecated"
we now saw "'X' has been been explicitly marked deprecated". It
turns out providing a bit more context is useful, and often we
got the actual term wrong or it was not very precise
(e.g., "function" instead of "destructor"). By just saying the name
of the thing that is deprecated/deleted/unavailable we define
this issue away. This diagnostic can likely be further wordsmithed
to be shorter.
llvm-svn: 197627
cstring, converted to NSString, produce the
matching AST for it. This also required some
refactoring of the previous code. // rdar://14106083
llvm-svn: 197605
This patch was submitted to the list for review and didn't receive a LGTM.
(In fact one explicit objection and one query were raised.)
This reverts commit r197295.
llvm-svn: 197299
Previously, a line like
// expected-error-re {{foo}}
treats the entirety of foo as a regex. This is inconvenient when matching type
names containing regex characters. For example, to match
"void *(class test8::A::*)(void)" inside such a regex, one would have to type
"void \*\(class test8::A::\*\)\(void\)".
This patch changes the semantics of expected-error-re to only treat the parts
of the directive wrapped in double curly braces as regexes. This avoids the
escaping problem and leads to nicer patterns for those cases; see e.g. the
change to test/Sema/format-strings-scanf.c.
(The balanced search for closing }} of a directive also makes us handle the
full directive in test\SemaCXX\constexpr-printing.cpp:41 and :53.)
Differential Revision: http://llvm-reviews.chandlerc.com/D2388
llvm-svn: 197092
Warn if both result expressions of a ternary operator (? :) are the same.
Because only one of them will be executed, this warning will fire even if
the expressions have side effects.
Patch by Anders Rönnholm and Per Viberg!
llvm-svn: 196937
This is another regression fixed by reverting r189090.
In this case, the problem is not live variables but the approach that was taken in r189090. This regression was caused by explicitly binding "true" to the condition when we take the true branch. Normally that's okay, but in this case we're planning to reuse that condition as the value of the expression.
llvm-svn: 196599
This reverts commit r189090.
The original patch introduced regressions (see the added live-variables.* tests). The patch depends on the correctness of live variable analyses, which are not computed correctly. I've opened PR18159 to track the proper resolution to this problem.
The patch was a stepping block to r189746. This is why part of the patch reverts temporary destructor tests that started crashing. The temporary destructors feature is disabled by default.
llvm-svn: 196593
New rules of invalidation/escape of the source buffer of memcpy: the source buffer contents is invalidated and escape while the source buffer region itself is neither invalidated, nor escape.
In the current modeling of memcpy the information about allocation state of regions, accessible through the source buffer, is not copied to the destination buffer and we can not track the allocation state of those regions anymore. So we invalidate/escape the source buffer indirect regions in anticipation of their being invalidated for real later. This eliminates false-positive leaks reported by the unix.Malloc and alpha.cplusplus.NewDeleteLeaks checkers for the cases like
char *f() {
void *x = malloc(47);
char *a;
memcpy(&a, &x, sizeof a);
return a;
}
llvm-svn: 194953
This is similar to r194004: because we can't reason about the data structure
invariants of std::basic_string, the analyzer decides it's possible for an
allocator to be used to deallocate the string's inline storage. Just ignore
this by walking up the stack, skipping past methods in classes with
"allocator" in the name, and seeing if we reach std::basic_string that way.
PR17866
llvm-svn: 194764
This syntactic checker looks for expressions on both sides of comparison
operators that are structurally the same. As a special case, the
floating-point idiom "x != x" for "isnan(x)" is left alone.
Currently this only checks comparison operators, but in the future we could
extend this to include logical operators or chained if-conditionals.
Checker by Per Viberg!
llvm-svn: 194236
An Objective-C for-in loop will have zero iterations if the collection is
empty. Previously, we could only detect this case if the program asked for
the collection's -count /before/ the for-in loop. Now, the analyzer
distinguishes for-in loops that had zero iterations from those with at
least one, and can use this information to constrain the result of calling
-count after the loop.
In order to make this actually useful, teach the checker that methods on
NSArray, NSDictionary, and the other immutable collection classes don't
change the count.
<rdar://problem/14992886>
llvm-svn: 194235
The path note that says "Loop body executed 0 times" has been changed to
"Loop body skipped when range is empty" for C++11 for-range loops, and to
"Loop body skipped when collection is empty" for Objective-C for-in loops.
Part of <rdar://problem/14992886>
llvm-svn: 194234
We could certainly be more precise in many of our diagnostics, but before we
were printing "Assuming x is && y", which is just ridiculous.
<rdar://problem/15167979>
llvm-svn: 193455
This ensures that variables accessible through a union are invalidated when
the union value is passed to a function. We still don't fully handle union
values, but this should at least quiet some false positives.
PR16596
llvm-svn: 193265
This patch wasn't reviewed, and isn't correctly preserving the behaviors
relied upon by QT. I don't have a direct example of fallout, but it
should go through the standard code review process. For example, it
should never have removed the QT test case that was added when fixing
those users.
llvm-svn: 193174
Due to statement expressions supported as GCC extension, it is possible
to put 'break' or 'continue' into a loop/switch statement but outside its
body, for example:
for ( ; ({ if (first) { first = 0; continue; } 0; }); )
Such usage must be diagnosed as an error, GCC rejects it. To recognize
this and similar patterns the flags BreakScope and ContinueScope are
temporarily turned off while parsing condition expression.
Differential Revision: http://llvm-reviews.chandlerc.com/D1762
llvm-svn: 193073
Since these aren't lexically in the constructor, drawing arrows would
be a horrible jump across the body of the class. We could still do
better here by skipping over unimportant initializers, but this at least
keeps everything within the body of the constructor.
<rdar://problem/14960554>
llvm-svn: 192818
This will emit a warning if a call to clang_analyzer_warnIfReached is
executed, printing REACHABLE. This is a more explicit way to declare
expected reachability than using clang_analyzer_eval or triggering
a bug (divide-by-zero or null dereference), and unlike the former will
work the same in inlined functions and top-level functions. Like the
other debug helpers, it is part of the debug.ExprInspection checker.
Patch by Jared Grubb!
llvm-svn: 191909
Also add some tests that there is actually a message and that the bug is
actually a hard error. This actually behaved correctly before, because:
- addTransition() doesn't actually add a transition if the new state is null;
it assumes you want to propagate the predecessor forward and does nothing.
- generateSink() is called in order to emit a bug report.
- If at least one new node has been generated, the predecessor node is /not/
propagated forward.
But now it's spelled out explicitly.
Found by Richard Mazorodze, who's working on a patch that may require this.
llvm-svn: 191805
...rather than trying to figure it out from the call site, and having
people complain that we guessed wrong and that a prototype-less call is
the same as a variadic call on their system. More importantly, fix a
crash when there's no decl at the call site (though we could have just
returned a default value).
<rdar://problem/15037033>
llvm-svn: 191599
Now that the CFG includes nodes for the destructors in a delete-expression,
process them in the analyzer using the same common destructor interface
currently used for local, member, and base destructors. Also, check for when
the value is known to be null, in which case no destructor is actually run.
This does not yet handle destructors for deleted /arrays/, which may need
more CFG work. It also causes a slight regression in the location of
double delete warnings; the double delete is detected at the destructor
call, which is implicit, and so is reported on the first access within the
destructor instead of at the 'delete' statement. This will be fixed soon.
Patch by Karthik Bhat!
llvm-svn: 191381
We now have symbols with floating-point type to make sure that
(double)x == (double)x comes out true, but we still can't do much with
these. For now, don't even bother trying to create a floating-point zero
value; just give up on conversion to bool.
PR14634, C++ edition.
llvm-svn: 190953
"+method_name: cannot take ownership of memory allocated by 'new'."
instead of the old
"Memory allocated by 'new' should be deallocated by 'delete', not +method_name"
llvm-svn: 190800
variable uninitialized every time we reach its (reachable) declaration, or
every time we call the surrounding function, promote the warning from
-Wmaybe-uninitialized to -Wsometimes-uninitialized.
This is still slightly weaker than desired: we should, in general, warn
if a use is uninitialized the first time it is evaluated.
llvm-svn: 190623
RegionStore tries to protect against accidentally initializing the same
region twice, but it doesn't take subregions into account very well. If
the outer region being initialized is a struct with an empty base class,
the offset of the first field in the struct will be 0. When we initialize
the base class, we may invalidate the contents of the struct by providing
a default value of Unknown (or some new symbol). We then go to initialize
the member with a zeroing constructor, only to find that the region at
that offset in the struct already has a value. The best we can do here is
to invalidate that value and continue; neither the old default value nor
the new 0 is correct for the entire struct after the member constructor call.
The correct solution for this is to track region extents in the store.
<rdar://problem/14914316>
llvm-svn: 190530
Summary:
If a noreturn destructor is executed while returning a value from a function,
the resulting CFG has had two edges to the exit block. This crashed the analyzer,
because it expects that blocks with no terminators have only one outgoing edge.
I added code to avoid creating the second edge in this case.
PS: The crashes did not manifest themselves always, as usually the
NoReturnFunctionChecker would stop program evaluation before the analyzer hit
the assertion, but in the case of lifetime extended temporaries, the checker
failed to do that (which is a separate bug in itself).
Reviewers: jordan_rose
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1513
llvm-svn: 190125
Summary:
I've had a test failure here while experimenting and I've found that it's
impossible to find what is wrong with the previous structure of the file. So I
have grouped the expected output with the function that produces it, to make
searching for discrepancies more obvious.
Reviewers: jordan_rose
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1595
llvm-svn: 190037
This paves the way for adding support for modeling the destructor of a
region before it is deleted. The statement "delete <expr>" now generates
this series of CFG elements:
1. <expr>
2. [B1.1]->~Foo() (Implicit destructor)
3. delete [B1.1]
Patch by Karthik Bhat!
llvm-svn: 189828
This is an improved version of r186498. It enables ExprEngine to reason about
temporary object destructors. However, these destructor calls are never
inlined, since this feature is still broken. Still, this is sufficient to
properly handle noreturn temporary destructors.
Now, the analyzer correctly handles expressions like "a || A()", and executes the
destructor of "A" only on the paths where "a" evaluted to false.
Temporary destructor processing is still off by default and one has to
explicitly request it by setting cfg-temporary-dtors=true.
Reviewers: jordan_rose
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1259
llvm-svn: 189746
This will never happen in the analyzed code code, but can happen for checkers
that over-eagerly dereference pointers without checking that it's safe.
UnknownVal is a harmless enough value to get back.
Fixes an issue added in r189590, caught by our internal buildbot.
llvm-svn: 189688
Summary:
Previously, Sema was reusing parts of the AST when synthesizing an assignment
operator, turning it into a AS-dag. This caused problems for the static
analyzer, which assumed an expression appears in the tree only once.
Here I make sure to always create a fresh Expr, when inserting something into
the AST, fixing PR16745 in the process.
Reviewers: doug.gregor
CC: cfe-commits, jordan_rose
Differential Revision: http://llvm-reviews.chandlerc.com/D1425
llvm-svn: 189659
Summary:
RegionStoreManager had an optimization which replaces references to empty
structs with UnknownVal. Unfortunately, this check didn't take into account
possible field members in base classes.
To address this, I changed this test to "is empty and has no base classes". I
don't consider it worth the trouble to go through base classes and check if all
of them are empty.
Reviewers: jordan_rose
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1547
llvm-svn: 189590
When casting the address of a FunctionTextRegion to bool, or when adding
constraints to such an address, use a stand-in symbol to represent the
presence or absence of the function if the function is weakly linked.
This is groundwork for possible simple availability testing checks, and
can already catch mistakes involving inverted null checks for
weakly-linked functions.
Currently, the implementation reuses the "extent" symbols, originally created
for tracking the size of a malloc region. Since FunctionTextRegions cannot
be dereferenced, the extent symbol will never be used for anything else.
Still, this probably deserves a refactoring in the future.
This patch does not attempt to support testing the presence of weak
/variables/ (global variables), which would likely require much more of
a change and a generalization of "region structure metadata", like the
current "extents", vs. "region contents metadata", like CStringChecker's
"string length".
Patch by Richard <tarka.t.otter@googlemail.com>!
llvm-svn: 189492
Summary:
-fno-exceptions does not implicitly attach a nothrow specifier to every operator
new. Even in this mode, non-nothrow new must not return a null pointer. Failure
to allocate memory can be signalled by other means, or just by killing the
program. This behaviour is consistent with the compiler - even with
-fno-exceptions, the generated code never tests for null (and would segfault if
the opeator actually happened to return null).
Reviewers: jordan_rose
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1528
llvm-svn: 189452
Summary:
Instead of digging through the ExplodedGraph, to figure out which edge brought
us here, I compute the value of conditional expression by looking at the
sub-expression values.
To do this, I needed to change the liveness algorithm a bit -- now, the full
conditional expression also depends on all atomic sub-expressions, not only the
outermost ones.
Reviewers: jordan_rose
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1340
llvm-svn: 189090
This is still an alpha checker, but we use it in certain tests to make sure
something is not being executed.
This should fix the buildbots.
llvm-svn: 188682
This keeps the analyzer from making silly assumptions, like thinking
strlen(foo)+1 could wrap around to 0. This fixes PR16558.
Patch by Karthik Bhat!
llvm-svn: 188680
This builtin does not actually evaluate its arguments for side effects,
so we shouldn't include them in the CFG. In the analyzer, rely on the
constant expression evaluator to get the proper semantics, at least for
now. (In the future, we could get ambitious and try to provide path-
sensitive size values.)
In theory, this does pose a problem for liveness analysis: a variable can
be used within the __builtin_object_size argument expression but not show
up as live. However, it is very unlikely that such a value would be used
to compute the object size and not used to access the object in some way.
<rdar://problem/14760817>
llvm-svn: 188679
This once again restores notes to following their associated warnings
in -analyzer-output=text mode. (This is still only intended for use as a
debugging aid.)
One twist is that the warning locations in "regular" analysis output modes
(plist, multi-file-plist, html, and plist-html) are reported at a different
location on the command line than in the output file, since the command
line has no path context. This commit makes -analyzer-output=text behave
like a normal output format, which means that the *command line output
will be different* in -analyzer-text mode. Again, since -analyzer-text is
a debugging aid and lo-fi stand-in for a regular output mode, this change
makes sense.
Along the way, remove a few pieces of stale code related to the path
diagnostic consumers.
llvm-svn: 188514
When a region is realloc()ed, MallocChecker records whether it was known
to be allocated or not. If it is, and the reallocation fails, the original
region has to be freed. Previously, when an allocated region escaped,
MallocChecker completely stopped tracking it, so a failed reallocation
still (correctly) wouldn't require freeing the original region. Recently,
however, MallocChecker started tracking escaped symbols, so that if it were
freed we could check that the deallocator matched the allocator. This
broke the reallocation model for whether or not a symbol was allocated.
Now, MallocChecker will actually check if a symbol is owned, and only
require freeing after a failed reallocation if it was owned before.
PR16730
llvm-svn: 188468
Various tests had sprung up over the years which had --check-prefix=ABC on the
RUN line, but "CHECK-ABC:" later on. This happened to work before, but was
strictly incorrect. FileCheck is getting stricter soon though.
Patch by Ron Ofir.
llvm-svn: 188174
Summary:
ExprEngine had code which specificaly disabled using CXXTempObjectRegions in
InitListExprs. This was a hack put in r168757 to silence a false positive.
The underlying problem seems to have been fixed in the mean time, as removing
this code doesn't seem to break anything. Therefore I propose to remove it and
solve PR16629 in the process.
Reviewers: jordan_rose
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1325
llvm-svn: 188059
We process autorelease counts when we exit functions, but if there's an
issue in a synthesized body the report will get dropped. Just skip the
processing for now and let it get handled when the caller gets around to
processing autoreleases.
(This is still suboptimal: objects autoreleased in the caller context
should never be warned about when exiting a callee context, synthesized
or not.)
Second half of <rdar://problem/14611722>
llvm-svn: 187625
Much of our diagnostic machinery is set up to assume that the report
end path location is valid. Moreover, the user may be quite confused
when something goes wrong in our BodyFarm-synthesized function bodies,
which may be simplified or modified from the real implementations.
Rather than try to make this all work somehow, just drop the report so
that we don't try to go on with an invalid source location.
Note that we still handle reports whose /paths/ go through invalid
locations, just not those that are reported in one.
We do have to be careful not to lose warnings because of this.
The impetus for this change was an autorelease being processed within
the synthesized body, and there may be other possible issues that are
worth reporting in some way. We'll take these as they come, however.
<rdar://problem/14611722>
llvm-svn: 187624
Summary:
When binding a temporary object to a static local variable, the analyzer would
complain about a dangling reference even though the temporary's lifetime should
be extended past the end of the function. This commit tries to detect these
cases and construct them in a global memory region instead of a local one.
Reviewers: jordan_rose
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1133
llvm-svn: 187196
These are cases where a scalar type is "destructed", usually due to
template instantiation (e.g. "obj.~T()", where 'T' is 'int'). This has
no actual effect and the analyzer should just skip over it.
llvm-svn: 186927
The analyzer doesn't currently expect CFG blocks with terminators to be
empty, but this can happen when generating conditional destructors for
a complex logical expression, such as (a && (b || Temp{})). Moreover,
the branch conditions for these expressions are not persisted in the
state. Even for handling noreturn destructors this needs more work.
This reverts r186498.
llvm-svn: 186925
Previously, we would simply abort the path when we saw a default member
initialization; now, we actually attempt to evaluate it. Like default
arguments, the contents of these expressions are not actually part of the
current function, so we fall back to constant evaluation.
llvm-svn: 186521
Previously, SValBuilder knew how to evaluate StringLiterals, but couldn't
handle an array-to-pointer decay for constant values. Additionally,
RegionStore was being too strict about loading from an array, refusing to
return a 'char' value from a 'const char' array. Both of these have been
fixed.
llvm-svn: 186520
Previously, the use of a std::initializer_list (actually, a
CXXStdInitializerListExpr) would cause the analyzer to give up on the rest
of the path. Now, it just uses an opaque symbolic value for the
initializer_list and continues on.
At some point in the future we can add proper support for initializer_list,
with access to the elements in the InitListExpr.
<rdar://problem/14340207>
llvm-svn: 186519
Summary:
This patch enables ExprEndgine to reason about temporary object destructors.
However, these destructor calls are never inlined, since this feature is still
broken. Still, this is sufficient to properly handle noreturn temporary
destructors and close bug #15599. I have also enabled the cfg-temporary-dtors
analyzer option by default.
Reviewers: jordan_rose
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1131
llvm-svn: 186498
Previously, we asserted that whenever 'new' did not include a constructor
call, the type must be a non-record type. In C++11, however, uniform
initialization syntax (braces) allow 'new' to construct records with
list-initialization: "new Point{1, 2}".
Removing this assertion should be perfectly safe; the code here matches
what VisitDeclStmt does for regions allocated on the stack.
<rdar://problem/14403437>
llvm-svn: 186028
list is the name of a class, not a namespace. Change the test as well - the previous
version did not test properly.
Fixes radar://14317928.
llvm-svn: 185898
The motivation is to suppresses false use-after-free reports that occur when calling
std::list::pop_front() or std::list::pop_back() twice. The analyzer does not
reason about the internal invariants of the list implementation, so just do not report
any of warnings in std::list.
Fixes radar://14317928.
llvm-svn: 185609
Summary:
The analyzer incorrectly handled noreturn destructors which were hidden inside
function calls. This happened because NoReturnFunctionChecker only listened for
PostStmt events, which are not executed for destructor calls. I've changed it to
listen to PostCall events, which should catch both cases.
Reviewers: jordan_rose
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1056
llvm-svn: 185522
While we don't model pointers-to-members besides "null" and "non-null",
we were using Loc symbols for valid pointers and NonLoc integers for the
null case. This hit the assert committed in r185401.
Fixed by using a true (Loc) null for null member pointers.
llvm-svn: 185444
Summary:
Static analyzer used to abort when encountering AttributedStmts, because it
asserted that the statements should not appear in the CFG. This is however not
the case, since at least the clang::fallthrough annotation makes it through.
This commit simply makes the analyzer ignore the statement attributes.
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1030
llvm-svn: 185417
Re-apply r184511, reverted in r184561, with the trivial default constructor
fast path removed -- it turned out not to be necessary here.
Certain expressions can cause a constructor invocation to zero-initialize
its object even if the constructor itself does no initialization. The
analyzer now handles that before evaluating the call to the constructor,
using the same "default binding" mechanism that calloc() uses, rather
than simply ignoring the zero-initialization flag.
<rdar://problem/14212563>
llvm-svn: 184815
In order to make sure virtual base classes are always initialized once,
the AST contains initializers for the base class in /all/ of its
descendents, not just the immediate descendents. However, at runtime,
the most-derived object is responsible for initializing all the virtual
base classes; all the other initializers will be ignored.
The analyzer now checks to see if it's being called from another base
constructor, and if so does not perform virtual base initialization.
<rdar://problem/14236851>
llvm-svn: 184814
This fixes false positives by allowing us to know that a loop is always entered if
the collection count method returns a positive value and vice versa.
Addresses radar://14169391.
llvm-svn: 184618
Per review from Anna, this really should have been two commits, and besides
it's causing problems on our internal buildbot. Reverting until these have
been worked out.
This reverts r184511 / 98123284826bb4ce422775563ff1a01580ec5766.
llvm-svn: 184561
Certain expressions can cause a constructor invocation to zero-initialize
its object even if the constructor itself does no initialization. The
analyzer now handles that before evaluating the call to the constructor,
using the same "default binding" mechanism that calloc() uses, rather
than simply ignoring the zero-initialization flag.
As a bonus, trivial default constructors are now no longer inlined; they
are instead processed explicitly by ExprEngine. This has a (positive)
effect on the generated path edges: they no longer stop at a default
constructor call unless there's a user-provided implementation.
<rdar://problem/14212563>
llvm-svn: 184511
Summary:
When doing a reinterpret+dynamic cast from an incomplete type, the analyzer
would crash (bug #16308). This fix makes the dynamic cast evaluator ignore
incomplete types, as they can never be used in a dynamic_cast. Also adding a
regression test.
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1006
llvm-svn: 184403
Summary:
When processing a call to a function, which got passed less arguments than it
expects, the analyzer would crash.
I've also added a test for that and a analyzer warning which detects these
cases.
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D994
llvm-svn: 184288
This silences warnings that could occur when one is swapping partially initialized structs. We suppress
not only the assignments of uninitialized members, but any values inside swap because swap could
potentially be used as a subroutine to swap class members.
This silences a warning from std::try::function::swap() on partially initialized objects.
llvm-svn: 184256
We drew the diagnostic edges to wrong statements in cases the note was on a macro.
The fix is simple, but seems to work just fine for a whole bunch of test cases (plist-macros.cpp).
Also, removes an unnecessary edge in edges-new.mm, when function signature starts with a macro.
llvm-svn: 183599
The function in which we were doing it used to be conditionalized. Add a new unconditional
cleanup step.
This fixes PR16227 (radar://14073870) - a crash when generating html output for one of the test files.
llvm-svn: 183451
Previously our edges were completely broken here; now, the final result
is a very simple set of edges in most cases: one up to the "for" keyword
for context, and one into the body of the loop. This matches the behavior
for ObjC for-in loops.
In the AST, however, CXXForRangeStmts are handled very differently from
ObjCForCollectionStmts. Since they are specified in terms of equivalent
statements in the C++ standard, we actually have implicit AST nodes for
all of the semantic statements. This makes evaluation very easy, but
diagnostic locations a bit trickier. Fortunately, the problem can be
generally defined away by marking all of the implicit statements as
part of the top-level for-range statement.
One of the implicit statements in a for-range statement is the declaration
of implicit iterators __begin and __end. The CFG synthesizes two
separate DeclStmts to match each of these decls, but until now these
synthetic DeclStmts weren't in the function's ParentMap. Now, the CFG
keeps track of its synthetic statements, and the AnalysisDeclContext will
make sure to add them to the ParentMap.
<rdar://problem/14038483>
llvm-svn: 183449
We based decisions during analysis and during path generation on whether
or not an expression is consumed, so if a top-level expression has
cleanups it's important for us to look through that.
<rdar://problem/14076125>
llvm-svn: 183368
We previously asserted that there was a top-level function entry edge, but
if the function decl's location is invalid (or within a macro) this edge
might not exist. Change the assertion to an actual check, and don't drop
the first path piece if it doesn't match.
<rdar://problem/14070304>
llvm-svn: 183358
The edge optimizer needs to see edges for, say, implicit casts (which have
the same source location as their operand) to uniformly simplify the
entire path. However, we still don't want to produce edges from a statement
to /itself/, which could occur when two nodes in a row have the same
statement location.
This necessitated moving the check for redundant notes to after edge
optimization, since the check relies on notes being adjacent in the path.
<rdar://problem/14061675>
llvm-svn: 183357
Consider the case where a SwitchStmt satisfied isAllEnumCasesCovered()
as well as having no cases at all (i.e. the enum it covers has no
enumerators).
In this case, we should add a successor to repair the CFG.
This fixes PR16212.
llvm-svn: 183237
...but don't yet migrate over the existing plist tests. Some of these
would be trivial to migrate; others could use a bit of inspection first.
In any case, though, the new edge algorithm seems to have proven itself,
and we'd like more coverage (and more usage) of it going forwards.
llvm-svn: 183165
A.1 -> A -> B
becomes
A.1 -> B
This only applies if there's an edge from a subexpression to its parent
expression, and that is immediately followed by another edge from the
parent expression to a subsequent expression. Normally this is useful for
bringing the edges back to the left side of the code, but when the
subexpression is on a different line the backedge ends up looking strange,
and may even obscure code. In these cases, it's better to just continue
to the next top-level statement.
llvm-svn: 183164
Specifically, if the line is over 80 characters, or if the top-level
statement spans mulitple lines, we should preserve sub-expression edges
even if they form a simple cycle as described in the last commit, because
it's harder to infer what's going on than it is for shorter lines.
llvm-svn: 183163
Generating context arrows can result in quite a few arrows surrounding a
relatively simple expression, often containing only a single path note.
|
1 +--2---+
v/ v
auto m = new m // 3 (the path note)
|\ |
5 +--4---+
v
Note also that 5 and 1 are two ends of the "same" arrow, i.e. they go from
event to event. 3 is not an arrow but the path note itself.
Now, if we see a pair of edges like 2 and 4---where 4 is the reverse of 2
and there is optionally a single path note between them---we will
eliminate /both/ edges. Anything more complicated will be left as is
(more edges involved, an inlined call, etc).
The next commit will refine this to preserve the arrows in a larger
expression, so that we don't lose all context.
llvm-svn: 183162
The old edge builder didn't have a notion of nested statement contexts,
so there was no special treatment of a logical operator inside an if
(or inside another logical operator). The new edge builder always tries
to establish the full context up to the top-level statement, so it's
important to know how much context has been established already rather
than just checking the innermost context.
This restores some of the old behavior for the old edge generation:
the context of a logical operator's non-controlling expression is the
subexpression in the old edge algorithm, but the entire operator
expression in the new algorithm.
llvm-svn: 183160
The current edge-generation algorithm sometimes creates edges from a
top-level statement A to a sub-expression B.1 that's not at the start of B.
This creates a "swoosh" effect where the arrow is drawn on top of the
text at the start of B. In these cases, the results are clearer if we see
an edge from A to B, then another one from B to B.1.
Admittedly, this does create a /lot/ of arrows, some of which merely hop
into a subexpression and then out again for a single note. The next commit
will eliminate these if the subexpression is simple enough.
This updates and reuses some of the infrastructure from the old edge-
generation algorithm to find the "enclosing statement" context for a
given expression. One change in particular marks the context of the
LHS or RHS of a logical binary operator (&&, ||) as the entire operator
expression, rather than the subexpression itself. This matches our behavior
for ?:, and allows us to handle nested context information.
<rdar://problem/13902816>
llvm-svn: 183159
Neither the compiler nor the analyzer are doing anything with non-VarDecl
decls in the CFG, and having them there creates extra nodes in the
analyzer's path diagnostics. Simplify the CFG (and the path edges) by
simply leaving them out. We can always add interesting decls back in when
they become relevant.
Note that this only affects decls declared in a DeclStmt, and then only
those that appear within a function body.
llvm-svn: 183157
Jordan has pointed out that it is valuable to warn in cases when the arguments to init escape.
For example, NSData initWithBytes id not going to free the memory.
llvm-svn: 183062
In many cases, the edge from the "if" to the condition, followed by an edge from the branch condition to the target code, is uninteresting.
In such cases, we should fold the two edges into one from the "if" to the target.
This also applies to loops.
Implements <rdar://problem/14034763>.
llvm-svn: 183018
...and make this work correctly in the current codebase.
After living on this for a while, it turns out to look very strange for
inlined functions that have only a single statement, and somewhat strange
for inlined functions in general (since they are still conceptually in the
middle of the path, and there is a function-entry path note).
It's worth noting that this only affects inlined functions; in the new
arrow generation algorithm, the top-level function still starts at the
first real statement in the function body, not the enclosing CompoundStmt.
This reverts r182078 / dbfa950abe0e55b173286a306ee620eff5f72ea.
llvm-svn: 182963
It is okay to declare a block without an argument list: ^ {} or ^void {}.
In these cases, the BlockDecl's signature-as-written will just contain
the return type, rather than the entire function type. It is unclear if
this is intentional, but the analyzer shouldn't crash because of it.
<rdar://problem/14018351>
llvm-svn: 182948
Most loop notes (like "entering loop body") are attached to the condition
expression guarding a loop or its equivalent. For loops may not have a
condition expression, though. Rather than crashing, just use the entire
ForStmt as the location. This is probably the best we can do.
<rdar://problem/14016063>
llvm-svn: 182904
In C, 'void' is treated like any other incomplete type, and though it is
never completed, you can cast the address of a void-typed variable to do
something useful. (In C++ it's illegal to declare a variable with void type.)
Previously we asserted on this code; now we just treat it like any other
incomplete type.
And speaking of incomplete types, we don't know their extent. Actually
check that in TypedValueRegion::getExtent, though that's not being used
by any checkers that are on by default.
llvm-svn: 182880
This gives slightly better precision, specifically, in cases where a non-typed region represents the array
or when the type is a non-array type, which can happen when an array is a result of a reinterpret_cast.
llvm-svn: 182810
It’s important for us to reason about the cast as it is used in std::addressof. The reason we did not
handle the cast previously was a crash on a test case (see commit r157478). The crash was in
processing array to pointer decay when the region type was not an array. Address the issue, by
just returning an unknown in that case.
llvm-svn: 182808
In addition to enabling more code reuse, this suppresses some false positives by allowing us to
compare an element region to its base. See the ptr-arith.cpp test cases for an example.
llvm-svn: 182780
When generating path notes, implicit function bodies are shown at the call
site, so that, say, copying a POD type in C++ doesn't jump you to a header
file. This is especially important when the synthesized function itself
calls another function (or block), in which case we should try to jump the
user around as little as possible.
By checking whether a called function has a body in the AST, we can tell
if the analyzer synthesized the body, and if we should therefore collapse
the call down to the call site like a true implicitly-defined function.
<rdar://problem/13978414>
llvm-svn: 182677
The new edge algorithm would keep track of the previous location in each
location context, so that it could draw arrows coming in and out of each
inlined call. However, it tried to access the location of the call before
it was actually set (at the CallEnter node). This only affected
unterminated calls at the end of a path; calls with visible exit nodes
already had a valid location.
This patch ditches the location context map, since we're processing the
nodes in order anyway, and just unconditionally updates the PrevLoc
variable after popping out of an inlined call.
<rdar://problem/13983470>
llvm-svn: 182676
Currently, blocks instantiated in templates lose their "signature as
written"; it's not clear if this is intentional. Change the analyzer's
use of BlockDecl::getSignatureAsWritten to check whether or not the
signature is actually there.
<rdar://problem/13954714>
llvm-svn: 182497
The crash is triggered by the newly added option (-analyzer-config report-in-main-source-file=true) introduced in r182058.
Note, ideally, we’d like to report the issue within the main source file here as well.
For now, just do not crash.
llvm-svn: 182445
The analyzer can't see the reference count for shared_ptr, so it doesn't
know whether a given destruction is going to delete the referenced object.
This leads to spurious leak and use-after-free warnings.
For now, just ban destructors named '~shared_ptr', which catches
std::shared_ptr, std::tr1::shared_ptr, and boost::shared_ptr.
PR15987
llvm-svn: 182071
Previously, we’ve used the last location of the analyzer issue path as the location of the
report. This might not provide the best user experience, when one analyzer a source
file and the issue appears in the header. Introduce an option to use the last location
of the path that is in the main source file as the report location.
New option can be enabled with -analyzer-config report-in-main-source-file=true.
llvm-svn: 182058
found for a receiver, note where receiver class
is declaraed (this is most common when receiver is a forward
class). // rdar://3258331
llvm-svn: 181847
In most cases it is, by just looking at the name. Also, this check prevents the heuristic from working in strange user settings.
radar://13839692
llvm-svn: 181615
Consider this example:
char *p = malloc(sizeof(char));
systemFunction(&p);
free(p);
In this case, when we call systemFunction, we know (because it's a system
function) that it won't free 'p'. However, we /don't/ know whether or not
it will /change/ 'p', so the analyzer is forced to invalidate 'p', wiping
out any bindings it contains. But now the malloc'd region looks like a
leak, since there are no more bindings pointing to it, and we'll get a
spurious leak warning.
The fix for this is to notice when something is becoming inaccessible due
to invalidation (i.e. an imperfect model, as opposed to being explicitly
overwritten) and stop tracking it at that point. Currently, the best way
to determine this for a call is the "indirect escape" pointer-escape kind.
In practice, all the patch does is take the "system functions don't free
memory" special case and limit it to direct parameters, i.e. just the
arguments to a call and not other regions accessible to them. This is a
conservative change that should only cause us to escape regions more
eagerly, which means fewer leak warnings.
This isn't perfect for several reasons, the main one being that this
example is treated the same as the one above:
char **p = malloc(sizeof(char *));
systemFunction(p + 1);
// leak
Currently, "addresses accessible by offsets of the starting region" and
"addresses accessible through bindings of the starting region" are both
considered "indirect" regions, hence this uniform treatment.
Another issue is our longstanding problem of not distinguishing const and
non-const bindings; if in the first example systemFunction's parameter were
a char * const *, we should know that the function will not overwrite 'p',
and thus we can safely report the leak.
<rdar://problem/13758386>
llvm-svn: 181607
This occurs because in C++11 the compound literal syntax can trigger a
constructor call via list-initialization. That is, "Point{x, y}" and
"(Point){x, y}" end up being equivalent. If this occurs, the inner
CXXConstructExpr will have already handled the object construction; the
CompoundLiteralExpr just needs to propagate that value forwards.
<rdar://problem/13804098>
llvm-svn: 181213
FindLastStoreBRVisitor is responsible for finding where a particular region
gets its value; if the region is a VarRegion, it's possible that value was
assigned at initialization, i.e. at its DeclStmt. However, if a function is
called recursively, the same DeclStmt may be evaluated multiple times in
multiple stack frames. FindLastStoreBRVisitor was not taking this into
account and just picking the first one it saw.
<rdar://problem/13787723>
llvm-svn: 180997
There were actually two bugs here:
- if we decided to look for an interesting lvalue or call expression, we
wouldn't go find its node if we also knew we were at a (different) call.
- if we looked through one message send with a nil receiver, we thought we
were still looking at an argument to the original call.
Put together, this kept us from being able to track the right values, which
means sub-par diagnostics and worse false-positive suppression.
Noticed by inspection.
llvm-svn: 180996
...and don't consider '0' to be a null pointer constant if it's the
initializer for a float!
Apparently null pointer constant evaluation looks through both
MaterializeTemporaryExpr and ImplicitCastExpr, so we have to be more
careful about types in the callers. For RegionStore this just means giving
up a little more; for ExprEngine this means handling the
MaterializeTemporaryExpr case explicitly.
Follow-up to r180894.
llvm-svn: 180944
It is unfortunate that we have to mark these exceptions in multiple places.
This was already in CallEvent. I suppose it does let us be more precise
about saying /which/ arguments have their retain counts invalidated -- the
connection's is still valid even though the context object's isn't -- but
we're not tracking the retain count of XPC objects anyway.
<rdar://problem/13783514>
llvm-svn: 180904
Previously, this was scattered across Environment (literal expressions),
ExprEngine (default arguments), and RegionStore (global constants). The
former special-cased several kinds of simple constant expressions, while
the latter two deferred to the AST's constant evaluator.
Now, these are all unified as SValBuilder::getConstantVal(). To keep
Environment fast, the special cases for simple constant expressions have
been left in, but the main benefits are that (a) unusual constants like
ObjCStringLiterals now work as default arguments and global constant
initializers, and (b) we're not duplicating code between ExprEngine and
RegionStore.
This actually caught a bug in our test suite, which is awesome: we stop
tracking allocated memory if it's passed as an argument along with some
kind of callback, but not if the callback is 0. We were testing this in
a case where the callback parameter had a default value, but that value
was 0. After this change, the analyzer now (correctly) flags that as a
leak!
<rdar://problem/13773117>
llvm-svn: 180894
This goes with r178516, which instructed the analyzer not to inline the
constructors and destructors of C++ container classes. This goes a step
further and does the same thing for iterators, so that the analyzer won't
falsely decide we're trying to construct an iterator pointing to a
nonexistent element.
The heuristic for determining whether something is an iterator is the
presence of an 'iterator_category' member. This is controlled under the
same -analyzer-config option as container constructor/destructor inlining:
'c++-container-inlining'.
<rdar://problem/13770187>
llvm-svn: 180890
This doesn't appear to be the cause of the slowdown. I'll have to try a
manual bisect to see if there's really anything there, or if it's just
the bot itself taking on additional load. Meanwhile, this change helps
with correctness.
This changes an assertion and adds a test case, then re-applies r180638,
which was reverted in r180714.
<rdar://problem/13296133> and PR15863
llvm-svn: 180864
This seems to be causing quite a slowdown on our internal analyzer bot,
and I'm not sure why. Needs further investigation.
This reverts r180638 / 9e161ea981f22ae017b6af09d660bfc3ddf16a09.
llvm-svn: 180714
In an Objective-C for-in loop "for (id element in collection) {}", the loop
will run 0 times if the collection is nil. This is because the for-in loop
is implemented using a protocol method that returns 0 when there are no
elements to iterate, and messages to nil will result in a 0 return value.
At some point we may want to actually model this message send, but for now
we may as well get the nil case correct, and avoid the false positives that
would come with this case.
<rdar://problem/13744632>
llvm-svn: 180639
Casts to bool (and _Bool) are equivalent to checks against zero,
not truncations to 1 bit or 8 bits.
This improved reasoning does cause a change in the behavior of the alpha
BoolAssignment checker. Previously, this checker complained about statements
like "bool x = y" if 'y' was known not to be 0 or 1. Now it does not, since
that conversion is well-defined. It's hard to say what the "best" behavior
here is: this conversion is safe, but might be better written as an explicit
comparison against zero.
More usefully, besides improving our model of booleans, this fixes spurious
warnings when returning the address of a local variable cast to bool.
<rdar://problem/13296133>
llvm-svn: 180638
We get a CallEnter with a null expression, when processing a destructor. All other users of
CallEnter::getCallExpr work fine with null as return value.
(Addresses PR15832, Thanks to Jordan for reducing the test case!)
llvm-svn: 180234
- If only partial invalidators exist and there are no full invalidators in @implementation, report every ivar that has
not been invalidated. (Previously, we reported the first Ivar in the list, which could actually have been invalidated
by a partial invalidator. The code assumed you cannot have only partial invalidators.)
- Do not report missing invalidation method declaration if a partial invalidation method declaration exists.
llvm-svn: 180170
The uniqueing location is the location which is part of the hash used to determine if two reports are
the same. This is used by the CmpRuns.py script to compare two analyzer runs and determine which
warnings are new.
llvm-svn: 180166
The 2 functions were computing the same location using different logic (each one had edge case bugs that the other
one did not). Refactor them to rely on the same logic.
The location of the warning reported in text/command line output format will now match that of the plist file.
There is one change in the plist output as well. When reporting an error on a BinaryOperator, we use the location of the
operator instead of the beginning of the BinaryOperator expression. This matches our output on command line and
looks better in most cases.
llvm-svn: 180165
The analyzer represents all pointer-to-pointer bitcasts the same way, but
this can be problematic if an implicit base cast gets layered on top of a
manual base cast (performed with reinterpret_cast instead of static_cast).
Fix this (and avoid a valid assertion) by looking through cast regions.
Using reinterpret_cast this way is only valid if the base class is at the
same offset as the derived class; this is checked by -Wreinterpret-base-class.
In the interest of performance, the analyzer doesn't repeat this check
anywhere; it will just silently do the wrong thing (use the wrong offsets
for fields of the base class) if the user code is wrong.
PR15394
llvm-svn: 180052
Introduce a new helper function, which computes the first symbolic region in
the base region chain. The corresponding symbol has been used for assuming that
a pointer is null. Now, it will also be used for checking if it is null.
This ensures that we are tracking a null pointer correctly in the BugReporter.
llvm-svn: 179916
The analyzer uses LazyCompoundVals to represent rvalues of aggregate types,
most importantly structs and arrays. This allows us to efficiently copy
around an entire struct, rather than doing a memberwise load every time a
struct rvalue is encountered. This can also keep memory usage down by
allowing several structs to "share" the same snapshotted bindings.
However, /lookup/ through LazyCompoundVals can be expensive, especially
since they can end up chaining back to the original value. While we try
to reuse LazyCompoundVals whenever it's safe, and cache information about
this transitivity, the fact is it's sometimes just not a good idea to
perpetuate LazyCompoundVals -- the tradeoffs just aren't worth it.
This commit changes RegionStore so that binding a LazyCompoundVal to struct
will do a memberwise copy if the struct is simple enough. Today's definition
of "simple enough" is "up to N scalar members" (see below), but that could
easily be changed in the future. This is enough to bring the test case in
PR15697 back down to a manageable analysis time (within 20% of its original
time, in an unfair test where the new analyzer is not compiled with LTO).
The actual value of "N" is controlled by a new -analyzer-config option,
'region-store-small-struct-limit'. It defaults to "2", meaning structs with
zero, one, or two scalar members will be considered "simple enough" for
this code path.
It's worth noting that a more straightforward implementation would do this
on load, not on bind, and make use of the structure we already have for this:
CompoundVal. A long time ago, this was actually how RegionStore modeled
aggregate-to-aggregate copies, but today it's only used for compound literals.
Unfortunately, it seems that we've special-cased LazyCompoundVal in certain
places (such as liveness checks) but failed to similarly special-case
CompoundVal in all of them. Until we're confident that CompoundVal is
handled properly everywhere, this solution is safer, since the entire
optimization is just an implementation detail of RegionStore.
<rdar://problem/13599304>
llvm-svn: 179767
A C++ overloaded operator may be implemented as an instance method, and
that instance method may be called on an rvalue object, which has no
associated region. The analyzer handles this by creating a temporary region
just for the evaluation of this call; however, it is possible that /by
creating the region/, the analyzer ends up in a previously-explored state.
In this case we don't need to continue along this path.
This doesn't actually show any behavioral change now, but it starts being
used with the next commit and prevents an assertion failure there.
llvm-svn: 179766
In the committed example, we now see a note that tells us when the pointer
was assumed to be null.
This is the only case in which getDerefExpr returned null (failed to get
the dereferenced expr) throughout our regression tests. (There were multiple
occurrences of this one.)
llvm-svn: 179736
We always register the visitor on a node in which the value we are tracking is live and constrained. However,
the visitation can restart at a node, later on the path, in which the value is under constrained because
it is no longer live. Previously, we just silently stopped tracking in that case.
llvm-svn: 179731
This was slightly tricky because BlockDecls don't currently store an
inferred return type. However, we can rely on the fact that blocks with
inferred return types will have return statements that match the inferred
type.
<rdar://problem/13665798>
llvm-svn: 179699
VerifyDiagnosticConsumer previously would not check that the diagnostic and
its matching directive referenced the same source file. Common practice was
to create directives that referenced other files but only by line number,
and this led to problems such as when the file containing the directive
didn't have enough lines to match the location of the diagnostic in the
other file, leading to bizarre file formatting and other oddities.
This patch causes VerifyDiagnosticConsumer to match source files as well as
line numbers. Therefore, a new syntax is made available for directives, for
example:
// expected-error@file:line {{diagnostic message}}
This extends the @line feature where "file" is the file where the diagnostic
is generated. The @line syntax is still available and uses the current file
for the diagnostic. "file" can be specified either as a relative or absolute
path - although the latter has less usefulness, I think! The #include search
paths will be used to locate the file and if it is not found an error will be
generated.
The new check is not optional: if the directive is in a different file to the
diagnostic, the file must be specified. Therefore, a number of test-cases
have been updated with regard to this.
This closes out PR15613.
llvm-svn: 179677
This is an opt-in tweak for leak diagnostics to reference the allocation
site if the diagnostic consumer only wants a pithy amount of information,
and not the entire path.
This is a strawman enhancement that I expect to see some experimentation
with over the next week, and can go away if we don't want it.
Currently it is only used by RetainCountChecker, but could be used
by MallocChecker if and when we decide this should stay in.
llvm-svn: 179634
When computing the value of ?: expression, we rely on the last expression in
the previous basic block to be the resulting value of the expression. This is
not the case for binary "?:" operator (GNU extension) in C++. As the last
basic block has the expression for the condition subexpression, which is an
R-value, whereas the true subexpression is the L-value.
Note the operator evaluation just happens to work in C since the true
subexpression is an R-value (like the condition subexpression). CFG is the
same in C and C++ case, but the AST nodes are different, which the LValue to
Rvalue conversion happening after the BinaryConditionalOperator evaluation.
Changed the logic to only use the last expression from the predecessor only
if it matches either true or false subexpression. Note, the logic needed
fortification anyway: L and R were passed but not even used by the function.
Also, change the conjureSymbolVal to correctly compute the type, when the
expression is an LG-value.
llvm-svn: 179574
While we don't do anything intelligent with pointers-to-members today,
it's perfectly legal to need a temporary of pointer-to-member type to, say,
pass by const reference. Tweak an assertion to allow this.
PR15742 and PR15747
llvm-svn: 179563
Now that we're invalidating global regions properly, we want to continue
taking advantage of a particular optimization: if all global regions are
invalidated together, we can represent the bindings of each region with
a "derived region value" symbol. Essentially, this lazily links each
global region with a single symbol created at invalidation time, rather
than binding each region with a new symbolic value.
We used to do this, but haven't been for a while; the previous commit
re-enabled this code path, and this handles the fallout.
<rdar://problem/13464044>
llvm-svn: 179554
This fixes a regression where a call to a function we can't reason about
would not actually invalidate global regions that had explicit bindings.
void test_that_now_works() {
globalInt = 42;
clang_analyzer_eval(globalInt == 42); // expected-warning{{TRUE}}
invalidateGlobals();
clang_analyzer_eval(globalInt == 42); // expected-warning{{UNKNOWN}}
}
This has probably been around since the initial "cluster" refactoring of
RegionStore, if not longer.
<rdar://problem/13464044>
llvm-svn: 179553
Some checkers ascribe different behavior to functions declared in system
headers, so when working with standard library functions it's probably best
to always have them in a standard location.
Test change only (no functionality change), but necessary for the next commit.
llvm-svn: 179552
There are few cases where we can track the region, but cannot print the note,
which makes the testing limited. (Though, I’ve tested this manually by making
all regions non-printable.) Even though the applicability is limited now, the enhancement
will be more relevant as we start tracking more regions.
llvm-svn: 179396
Before:
1. Calling 'foo'
2. Doing something interesting
3. Returning from 'foo'
4. Some kind of error here
After:
1. Calling 'foo'
2. Doing something interesting
3. Returning from 'foo'
4. Some kind of error here
The location of the note is already in the caller, not the callee, so this
just brings the "depth" attribute in line with that.
This only affects plist diagnostic consumers (i.e. Xcode). It's necessary
for Xcode to associate the control flow arrows with the right stack frame.
<rdar://problem/13634363>
llvm-svn: 179351
In this code
int getZero() {
return 0;
}
void test() {
int problem = 1 / getZero(); // expected-warning {{Division by zero}}
}
we generate these arrows:
+-----------------+
| v
int problem = 1 / getZero();
^ |
+---+
where the top one represents the control flow up to the first call, and the
bottom one represents the flow to the division.* It turns out, however, that
we were generating the top arrow twice, as if attempting to "set up context"
after we had already returned from the call. This resulted in poor
highlighting in Xcode.
* Arguably the best location for the division is the '/', but that's a
different problem.
<rdar://problem/13326040>
llvm-svn: 179350
For this source:
const int &ref = someStruct.bitfield;
We used to generate this AST:
DeclStmt [...]
`-VarDecl [...] ref 'const int &'
`-MaterializeTemporaryExpr [...] 'const int' lvalue
`-ImplicitCastExpr [...] 'const int' lvalue <NoOp>
`-MemberExpr [...] 'int' lvalue bitfield .bitfield [...]
`-DeclRefExpr [...] 'struct X' lvalue ParmVar [...] 'someStruct' 'struct X'
Notice the lvalue inside the MaterializeTemporaryExpr, which is very
confusing (and caused an assertion to fire in the analyzer - PR15694).
We now generate this:
DeclStmt [...]
`-VarDecl [...] ref 'const int &'
`-MaterializeTemporaryExpr [...] 'const int' lvalue
`-ImplicitCastExpr [...] 'int' <LValueToRValue>
`-MemberExpr [...] 'int' lvalue bitfield .bitfield [...]
`-DeclRefExpr [...] 'struct X' lvalue ParmVar [...] 'someStruct' 'struct X'
Which makes a lot more sense. This allows us to remove code in both
CodeGen and AST that hacked around this special case.
The commit also makes Clang accept this (legal) C++11 code:
int &&ref = std::move(someStruct).bitfield
PR15694 / <rdar://problem/13600396>
llvm-svn: 179250
The heuristic here (proposed by Jordan) is that, usually, if a leak is due to an early exit from init, the allocation site will be
a call to alloc. Note that in other cases init resets self to [super init], which becomes the allocation site of the object.
llvm-svn: 179221
Previously, the analyzer used isIntegerType() everywhere, which uses the C
definition of "integer". The C++ predicate with the same behavior is
isIntegerOrUnscopedEnumerationType().
However, the analyzer is /really/ using this to ask if it's some sort of
"integrally representable" type, i.e. it should include C++11 scoped
enumerations as well. hasIntegerRepresentation() sounds like the right
predicate, but that includes vectors, which the analyzer represents by its
elements.
This commit audits all uses of isIntegerType() and replaces them with the
general isIntegerOrEnumerationType(), except in some specific cases where
it makes sense to exclude scoped enumerations, or any enumerations. These
cases now use isIntegerOrUnscopedEnumerationType() and getAs<BuiltinType>()
plus BuiltinType::isInteger().
isIntegerType() is hereby banned in the analyzer - lib/StaticAnalysis and
include/clang/StaticAnalysis. :-)
Fixes real assertion failures. PR15703 / <rdar://problem/12350701>
llvm-svn: 179081
Test that the path notes do not change. I don’t think we should print a note on escape.
Also, I’ve removed a check that assumed that the family stored in the RefStete could be
AF_None and added an assert in the constructor.
llvm-svn: 179075
This is important because sometimes two nodes are identical, except the
second one is a sink.
This bug has probably been around for a while, but it wouldn't have been an
issue in the old report graph algorithm. I'm ashamed to say I actually looked
at this the first time around and thought it would never be a problem...and
then didn't include an assertion to back that up.
PR15684
llvm-svn: 178944
As mentioned in the previous commit message, the use-after-free and
double-free warnings for 'delete' are worth enabling even while the
leak warnings still have false positives.
llvm-svn: 178891
This splits the leak-checking part of alpha.cplusplus.NewDelete into a
separate user-level checker, alpha.cplusplus.NewDeleteLeaks. All the
difficult false positives we've seen with the new/delete checker have been
spurious leak warnings; the use-after-free warnings and mismatched
deallocator warnings, while rare, have always been valid.
<rdar://problem/6194569>
llvm-svn: 178890