In an expression like "new (a, b) Foo(x, y)", two things happen:
- Memory is allocated by calling a function named 'operator new'.
- The memory is initialized using the constructor for 'Foo'.
Currently the analyzer only models the second event, though it has special
cases for both the default and placement forms of operator new. This patch
is the first step towards properly modeling both events: it changes the CFG
so that the above expression now generates the following elements.
1. a
2. b
3. (CFGNewAllocator)
4. x
5. y
6. Foo::Foo
The analyzer currently ignores the CFGNewAllocator element, but the next
step is to treat that as a call like any other.
The CFGNewAllocator element is not added to the CFG for analysis-based
warnings, since none of them take advantage of it yet.
llvm-svn: 199123
...by synthesizing their body to be "return self->_prop;", with an extra
nudge to RetainCountChecker to still treat the value as +0 if we have no
other information.
This doesn't handle weak properties, but that's mostly correct anyway,
since they can go to nil at any time. This also doesn't apply to properties
whose implementations we can't see, since they may not be backed by an
ivar at all. And finally, this doesn't handle properties of C++ class type,
because we can't invoke the copy constructor. (Sema has actually done this
work already, but the AST it synthesizes is one the analyzer doesn't quite
handle -- it has an rvalue DeclRefExpr.)
Modeling setters is likely to be more difficult (since it requires
handling strong/copy), but not impossible.
<rdar://problem/11956898>
llvm-svn: 198953
...rather somewhere in the destructor when we try to access something and
realize the object has already been deleted. This is necessary because
the destructor is processed before the 'delete' itself.
Patch by Karthik Bhat!
llvm-svn: 198779
...even though the argument is declared "const void *", because this is
just a way to pass pointers around as objects. (Though NSData is often
a better one.)
PR18262
llvm-svn: 198710
RetainCountChecker has to track returned object values to know if they are
retained or not. Under ARC, even methods that return +1 are tracked by the
system and should be treated as +0. However, this effect behaves exactly
like NotOwned(ObjC), i.e. a generic Objective-C method that actually returns
+0, so we don't need a special case for it.
No functionality change.
llvm-svn: 198709
encodes the canonical rules for LLVM's style. I noticed this had drifted
quite a bit when cleaning up LLVM, so wanted to clean up Clang as well.
llvm-svn: 198686
This has the dual effect of (1) enabling more dead-stripping in release builds
and (2) ensuring that debug helper functions aren't stripped away in debug
builds, as they're intended to be called from the debugger.
Note that the attribute is applied to definitions rather than declarations in
headers going forward because it's now conditional on NDEBUG:
/// \brief Mark debug helper function definitions like dump() that should not be
/// stripped from debug builds.
Requires corresponding macro added in LLVM r198456.
llvm-svn: 198489
This checker has not been updated to work with interprocedural analysis,
and actually contains both logical correctness issues but also
memory bugs. We can resuscitate it from version control once there
is focused interest in making it a real viable checker again.
llvm-svn: 198476
Summary:
This allows for a better alternative to the FrontendAction hack used in
clang-tidy in order to get static analyzer's ASTConsumer.
Reviewers: jordan_rose, krememek
Reviewed By: jordan_rose
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2505
llvm-svn: 198426
Remove UnaryTypeTraitExpr and switch all remaining type trait related handling
over to TypeTraitExpr.
The UTT/BTT/TT enum prefix and evaluation code is retained pending further
cleanup.
This is part of the ongoing work to unify type traits following the removal of
BinaryTypeTraitExpr in r197273.
llvm-svn: 198271
We have assertions for this, but a few edge cases had snuck through where
we were still unconditionally using 'int'.
<rdar://problem/15703011>
llvm-svn: 197733
There's nothing special about type traits accepting two arguments.
This commit eliminates BinaryTypeTraitExpr and switches all related handling
over to TypeTraitExpr.
Also fixes a CodeGen failure with variadic type traits appearing in a
non-constant expression.
The BTT/TT prefix and evaluation code is retained as-is for now but will soon
be further cleaned up.
This is part of the ongoing work to unify type traits.
llvm-svn: 197273
With the introduction of explicit address space casts into LLVM, there's
a need to provide a new cast kind the front-end can create for C/OpenCL/CUDA
and code to produce address space casts from those kinds when appropriate.
Patch by Michele Scandale!
llvm-svn: 197036
Warn if both result expressions of a ternary operator (? :) are the same.
Because only one of them will be executed, this warning will fire even if
the expressions have side effects.
Patch by Anders Rönnholm and Per Viberg!
llvm-svn: 196937
This reverts commit r189090.
The original patch introduced regressions (see the added live-variables.* tests). The patch depends on the correctness of live variable analyses, which are not computed correctly. I've opened PR18159 to track the proper resolution to this problem.
The patch was a stepping block to r189746. This is why part of the patch reverts temporary destructor tests that started crashing. The temporary destructors feature is disabled by default.
llvm-svn: 196593
look at the attribute spelling instead. The 'ownership_*' attributes should
probably be split into separate *Attr classes, but that's more than I wanted to
do here.
llvm-svn: 195805
New rules of invalidation/escape of the source buffer of memcpy: the source buffer contents is invalidated and escape while the source buffer region itself is neither invalidated, nor escape.
In the current modeling of memcpy the information about allocation state of regions, accessible through the source buffer, is not copied to the destination buffer and we can not track the allocation state of those regions anymore. So we invalidate/escape the source buffer indirect regions in anticipation of their being invalidated for real later. This eliminates false-positive leaks reported by the unix.Malloc and alpha.cplusplus.NewDeleteLeaks checkers for the cases like
char *f() {
void *x = malloc(47);
char *a;
memcpy(&a, &x, sizeof a);
return a;
}
llvm-svn: 194953
This is similar to r194004: because we can't reason about the data structure
invariants of std::basic_string, the analyzer decides it's possible for an
allocator to be used to deallocate the string's inline storage. Just ignore
this by walking up the stack, skipping past methods in classes with
"allocator" in the name, and seeing if we reach std::basic_string that way.
PR17866
llvm-svn: 194764
This has no effect on user-visible output, but can be used by post-processing
tools that work with the generated HTML, rather than using CmpRuns.py's
interface to work with plists.
Patch by György Orbán!
llvm-svn: 194763
I've added the missing ImutProfileInfo [sic] specialization for bool,
so this patch on r194235 is no longer needed.
This reverts r194244 / 2baea2887dfcf023c8e3560e5d4713c42eed7b6b.
llvm-svn: 194265
In ADT/ImmutableSet, ImutProfileInfo<bool> cannot be matched to ImutProfileInteger.
I didn't have idea it'd the right way if PROFILE_INTEGER_INFO(bool) could be added there.
llvm-svn: 194244
This syntactic checker looks for expressions on both sides of comparison
operators that are structurally the same. As a special case, the
floating-point idiom "x != x" for "isnan(x)" is left alone.
Currently this only checks comparison operators, but in the future we could
extend this to include logical operators or chained if-conditionals.
Checker by Per Viberg!
llvm-svn: 194236
An Objective-C for-in loop will have zero iterations if the collection is
empty. Previously, we could only detect this case if the program asked for
the collection's -count /before/ the for-in loop. Now, the analyzer
distinguishes for-in loops that had zero iterations from those with at
least one, and can use this information to constrain the result of calling
-count after the loop.
In order to make this actually useful, teach the checker that methods on
NSArray, NSDictionary, and the other immutable collection classes don't
change the count.
<rdar://problem/14992886>
llvm-svn: 194235
The path note that says "Loop body executed 0 times" has been changed to
"Loop body skipped when range is empty" for C++11 for-range loops, and to
"Loop body skipped when collection is empty" for Objective-C for-in loops.
Part of <rdar://problem/14992886>
llvm-svn: 194234
We could certainly be more precise in many of our diagnostics, but before we
were printing "Assuming x is && y", which is just ridiculous.
<rdar://problem/15167979>
llvm-svn: 193455
This ensures that variables accessible through a union are invalidated when
the union value is passed to a function. We still don't fully handle union
values, but this should at least quiet some false positives.
PR16596
llvm-svn: 193265
Since these aren't lexically in the constructor, drawing arrows would
be a horrible jump across the body of the class. We could still do
better here by skipping over unimportant initializers, but this at least
keeps everything within the body of the constructor.
<rdar://problem/14960554>
llvm-svn: 192818
Re-commit r191910 (reverted in r191936) with layering violation fixed, by
moving the bug categories to StaticAnalyzerCore instead of ...Checkers.
llvm-svn: 191937
One small functionality change is to bring the sizeof-pointer checker in
line with the other checkers by making its category be "Logic error"
instead of just "Logic". There should be no other functionality changes.
Patch by Daniel Marjamäki!
llvm-svn: 191910
This will emit a warning if a call to clang_analyzer_warnIfReached is
executed, printing REACHABLE. This is a more explicit way to declare
expected reachability than using clang_analyzer_eval or triggering
a bug (divide-by-zero or null dereference), and unlike the former will
work the same in inlined functions and top-level functions. Like the
other debug helpers, it is part of the debug.ExprInspection checker.
Patch by Jared Grubb!
llvm-svn: 191909
Also add some tests that there is actually a message and that the bug is
actually a hard error. This actually behaved correctly before, because:
- addTransition() doesn't actually add a transition if the new state is null;
it assumes you want to propagate the predecessor forward and does nothing.
- generateSink() is called in order to emit a bug report.
- If at least one new node has been generated, the predecessor node is /not/
propagated forward.
But now it's spelled out explicitly.
Found by Richard Mazorodze, who's working on a patch that may require this.
llvm-svn: 191805
...rather than trying to figure it out from the call site, and having
people complain that we guessed wrong and that a prototype-less call is
the same as a variadic call on their system. More importantly, fix a
crash when there's no decl at the call site (though we could have just
returned a default value).
<rdar://problem/15037033>
llvm-svn: 191599
Now that the CFG includes nodes for the destructors in a delete-expression,
process them in the analyzer using the same common destructor interface
currently used for local, member, and base destructors. Also, check for when
the value is known to be null, in which case no destructor is actually run.
This does not yet handle destructors for deleted /arrays/, which may need
more CFG work. It also causes a slight regression in the location of
double delete warnings; the double delete is detected at the destructor
call, which is implicit, and so is reported on the first access within the
destructor instead of at the 'delete' statement. This will be fixed soon.
Patch by Karthik Bhat!
llvm-svn: 191381
With this patch things like preserving contents of regions (either hi- or low-level ones) or processing of the only top-level region can be implemented easily without passing around extra parameters.
This patch is a first step towards adequate modeling of memcpy() by the CStringChecker checker and towards eliminating of majority of false-positives produced by the NewDeleteLeaks checker.
llvm-svn: 191342
Apart from being more compact and already implemented, this also handles the
case where the parent is null. (It does also ignore all casts, not just
implicit ones, but this is more efficient to test and in the case we care
about---a message in a PseudoObjectExpr---there should only be implicit casts
anyway.
This should fix our internal buildbot.
llvm-svn: 191094
We now have symbols with floating-point type to make sure that
(double)x == (double)x comes out true, but we still can't do much with
these. For now, don't even bother trying to create a floating-point zero
value; just give up on conversion to bool.
PR14634, C++ edition.
llvm-svn: 190953
LLVM supports applying conversion instructions to vectors of the same number of
elements (fptrunc, fptosi, etc.) but there had been no way for a Clang user to
cause such instructions to be generated when using builtin vector types.
C-style casting on vectors is already defined in terms of bitcasts, and so
cannot be used for these conversions as well (without leading to a very
confusing set of semantics). As a result, this adds a __builtin_convertvector
intrinsic (patterned after the OpenCL __builtin_astype intrinsic). This is
intended to aid the creation of vector intrinsic headers that create generic IR
instead of target-dependent intrinsics (in other words, this is a generic
_mm_cvtepi32_ps). As noted in the documentation, the action of
__builtin_convertvector is defined in terms of the action of a C-style cast on
each vector element.
llvm-svn: 190915
This has a side effect of preventing a crash, which occurs because we get a
property getter declaration, which is overriding but is declared inside
@protocol. Will file a bug about this inconsistency internally. Getting a
small test case is very challenging.
llvm-svn: 190836
"+method_name: cannot take ownership of memory allocated by 'new'."
instead of the old
"Memory allocated by 'new' should be deallocated by 'delete', not +method_name"
llvm-svn: 190800
RegionStore tries to protect against accidentally initializing the same
region twice, but it doesn't take subregions into account very well. If
the outer region being initialized is a struct with an empty base class,
the offset of the first field in the struct will be 0. When we initialize
the base class, we may invalidate the contents of the struct by providing
a default value of Unknown (or some new symbol). We then go to initialize
the member with a zeroing constructor, only to find that the region at
that offset in the struct already has a value. The best we can do here is
to invalidate that value and continue; neither the old default value nor
the new 0 is correct for the entire struct after the member constructor call.
The correct solution for this is to track region extents in the store.
<rdar://problem/14914316>
llvm-svn: 190530
This paves the way for adding support for modeling the destructor of a
region before it is deleted. The statement "delete <expr>" now generates
this series of CFG elements:
1. <expr>
2. [B1.1]->~Foo() (Implicit destructor)
3. delete [B1.1]
Patch by Karthik Bhat!
llvm-svn: 189828
This is an improved version of r186498. It enables ExprEngine to reason about
temporary object destructors. However, these destructor calls are never
inlined, since this feature is still broken. Still, this is sufficient to
properly handle noreturn temporary destructors.
Now, the analyzer correctly handles expressions like "a || A()", and executes the
destructor of "A" only on the paths where "a" evaluted to false.
Temporary destructor processing is still off by default and one has to
explicitly request it by setting cfg-temporary-dtors=true.
Reviewers: jordan_rose
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1259
llvm-svn: 189746
This will never happen in the analyzed code code, but can happen for checkers
that over-eagerly dereference pointers without checking that it's safe.
UnknownVal is a harmless enough value to get back.
Fixes an issue added in r189590, caught by our internal buildbot.
llvm-svn: 189688
Summary:
RegionStoreManager had an optimization which replaces references to empty
structs with UnknownVal. Unfortunately, this check didn't take into account
possible field members in base classes.
To address this, I changed this test to "is empty and has no base classes". I
don't consider it worth the trouble to go through base classes and check if all
of them are empty.
Reviewers: jordan_rose
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1547
llvm-svn: 189590
When casting the address of a FunctionTextRegion to bool, or when adding
constraints to such an address, use a stand-in symbol to represent the
presence or absence of the function if the function is weakly linked.
This is groundwork for possible simple availability testing checks, and
can already catch mistakes involving inverted null checks for
weakly-linked functions.
Currently, the implementation reuses the "extent" symbols, originally created
for tracking the size of a malloc region. Since FunctionTextRegions cannot
be dereferenced, the extent symbol will never be used for anything else.
Still, this probably deserves a refactoring in the future.
This patch does not attempt to support testing the presence of weak
/variables/ (global variables), which would likely require much more of
a change and a generalization of "region structure metadata", like the
current "extents", vs. "region contents metadata", like CStringChecker's
"string length".
Patch by Richard <tarka.t.otter@googlemail.com>!
llvm-svn: 189492
Summary:
-fno-exceptions does not implicitly attach a nothrow specifier to every operator
new. Even in this mode, non-nothrow new must not return a null pointer. Failure
to allocate memory can be signalled by other means, or just by killing the
program. This behaviour is consistent with the compiler - even with
-fno-exceptions, the generated code never tests for null (and would segfault if
the opeator actually happened to return null).
Reviewers: jordan_rose
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1528
llvm-svn: 189452
Summary:
Instead of digging through the ExplodedGraph, to figure out which edge brought
us here, I compute the value of conditional expression by looking at the
sub-expression values.
To do this, I needed to change the liveness algorithm a bit -- now, the full
conditional expression also depends on all atomic sub-expressions, not only the
outermost ones.
Reviewers: jordan_rose
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1340
llvm-svn: 189090
Basically, isInMainFile considers line markers, and isWrittenInMainFile
doesn't. Distinguishing between the two is useful when dealing with
files which are preprocessed files or rewritten with -frewrite-includes
(so we don't, for example, print useless warnings).
llvm-svn: 188968
This is still an alpha checker, but we use it in certain tests to make sure
something is not being executed.
This should fix the buildbots.
llvm-svn: 188682
This keeps the analyzer from making silly assumptions, like thinking
strlen(foo)+1 could wrap around to 0. This fixes PR16558.
Patch by Karthik Bhat!
llvm-svn: 188680
This builtin does not actually evaluate its arguments for side effects,
so we shouldn't include them in the CFG. In the analyzer, rely on the
constant expression evaluator to get the proper semantics, at least for
now. (In the future, we could get ambitious and try to provide path-
sensitive size values.)
In theory, this does pose a problem for liveness analysis: a variable can
be used within the __builtin_object_size argument expression but not show
up as live. However, it is very unlikely that such a value would be used
to compute the object size and not used to access the object in some way.
<rdar://problem/14760817>
llvm-svn: 188679
Summary:
ScanReachableSymbols uses a "visited" set to avoid scanning the same object
twice. However, it did not use the optimization for LazyCompoundVal objects,
which resulted in exponential complexity for long chains of temporary objects.
Adding this resulted in a decrease of analysis time from >3h to 3 seconds for
some files.
Reviewers: jordan_rose
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1398
llvm-svn: 188677
This once again restores notes to following their associated warnings
in -analyzer-output=text mode. (This is still only intended for use as a
debugging aid.)
One twist is that the warning locations in "regular" analysis output modes
(plist, multi-file-plist, html, and plist-html) are reported at a different
location on the command line than in the output file, since the command
line has no path context. This commit makes -analyzer-output=text behave
like a normal output format, which means that the *command line output
will be different* in -analyzer-text mode. Again, since -analyzer-text is
a debugging aid and lo-fi stand-in for a regular output mode, this change
makes sense.
Along the way, remove a few pieces of stale code related to the path
diagnostic consumers.
llvm-svn: 188514
When a region is realloc()ed, MallocChecker records whether it was known
to be allocated or not. If it is, and the reallocation fails, the original
region has to be freed. Previously, when an allocated region escaped,
MallocChecker completely stopped tracking it, so a failed reallocation
still (correctly) wouldn't require freeing the original region. Recently,
however, MallocChecker started tracking escaped symbols, so that if it were
freed we could check that the deallocator matched the allocator. This
broke the reallocation model for whether or not a symbol was allocated.
Now, MallocChecker will actually check if a symbol is owned, and only
require freeing after a failed reallocation if it was owned before.
PR16730
llvm-svn: 188468
This is intended to be a simplified API, whose internals are
deliberately less efficient for the purpose of a simplified interface,
for use with clients that want to query the analyzer's heuristics for
determining retain count semantics.
There are no immediate clients, but it is intended to be used
by the ObjC modernizer.
llvm-svn: 188433
Summary:
ExprEngine had code which specificaly disabled using CXXTempObjectRegions in
InitListExprs. This was a hack put in r168757 to silence a false positive.
The underlying problem seems to have been fixed in the mean time, as removing
this code doesn't seem to break anything. Therefore I propose to remove it and
solve PR16629 in the process.
Reviewers: jordan_rose
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1325
llvm-svn: 188059
This field is just IsDefaulted && !IsDeleted; in all places it's used,
a simple check for isDefaulted() is superior anyway, and we were forgetting
to set it in a few cases.
Also eliminate CXXDestructorDecl::IsImplicitlyDefined, for the same reasons.
No intended functionality change.
llvm-svn: 187891
We process autorelease counts when we exit functions, but if there's an
issue in a synthesized body the report will get dropped. Just skip the
processing for now and let it get handled when the caller gets around to
processing autoreleases.
(This is still suboptimal: objects autoreleased in the caller context
should never be warned about when exiting a callee context, synthesized
or not.)
Second half of <rdar://problem/14611722>
llvm-svn: 187625
Much of our diagnostic machinery is set up to assume that the report
end path location is valid. Moreover, the user may be quite confused
when something goes wrong in our BodyFarm-synthesized function bodies,
which may be simplified or modified from the real implementations.
Rather than try to make this all work somehow, just drop the report so
that we don't try to go on with an invalid source location.
Note that we still handle reports whose /paths/ go through invalid
locations, just not those that are reported in one.
We do have to be careful not to lose warnings because of this.
The impetus for this change was an autorelease being processed within
the synthesized body, and there may be other possible issues that are
worth reporting in some way. We'll take these as they come, however.
<rdar://problem/14611722>
llvm-svn: 187624
Summary:
When binding a temporary object to a static local variable, the analyzer would
complain about a dangling reference even though the temporary's lifetime should
be extended past the end of the function. This commit tries to detect these
cases and construct them in a global memory region instead of a local one.
Reviewers: jordan_rose
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1133
llvm-svn: 187196
Previously, we tried to avoid creating new temporary object regions if
the value to be materialized itself came from a temporary object region.
However, once we became more strict about lvalues vs. rvalues (months
ago), this optimization became dead code, because the input to this
function will always be an rvalue (i.e. a symbolic value or compound
value rather than a region, at least for structs).
This would be a nice optimization to keep, but removing it makes it
simpler to reason about temporary regions.
llvm-svn: 187160
These are cases where a scalar type is "destructed", usually due to
template instantiation (e.g. "obj.~T()", where 'T' is 'int'). This has
no actual effect and the analyzer should just skip over it.
llvm-svn: 186927
The analyzer doesn't currently expect CFG blocks with terminators to be
empty, but this can happen when generating conditional destructors for
a complex logical expression, such as (a && (b || Temp{})). Moreover,
the branch conditions for these expressions are not persisted in the
state. Even for handling noreturn destructors this needs more work.
This reverts r186498.
llvm-svn: 186925
This is the same way GenericSelectionExpr works, and it's generally a
more consistent approach.
A large part of this patch is devoted to caching the value of the condition
of a ChooseExpr; it's needed to avoid threading an ASTContext into
IgnoreParens().
Fixes <rdar://problem/14438917>.
llvm-svn: 186738
Previously, we would simply abort the path when we saw a default member
initialization; now, we actually attempt to evaluate it. Like default
arguments, the contents of these expressions are not actually part of the
current function, so we fall back to constant evaluation.
llvm-svn: 186521
Previously, SValBuilder knew how to evaluate StringLiterals, but couldn't
handle an array-to-pointer decay for constant values. Additionally,
RegionStore was being too strict about loading from an array, refusing to
return a 'char' value from a 'const char' array. Both of these have been
fixed.
llvm-svn: 186520
Previously, the use of a std::initializer_list (actually, a
CXXStdInitializerListExpr) would cause the analyzer to give up on the rest
of the path. Now, it just uses an opaque symbolic value for the
initializer_list and continues on.
At some point in the future we can add proper support for initializer_list,
with access to the elements in the InitListExpr.
<rdar://problem/14340207>
llvm-svn: 186519
Summary:
This patch enables ExprEndgine to reason about temporary object destructors.
However, these destructor calls are never inlined, since this feature is still
broken. Still, this is sufficient to properly handle noreturn temporary
destructors and close bug #15599. I have also enabled the cfg-temporary-dtors
analyzer option by default.
Reviewers: jordan_rose
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1131
llvm-svn: 186498
Previously, we asserted that whenever 'new' did not include a constructor
call, the type must be a non-record type. In C++11, however, uniform
initialization syntax (braces) allow 'new' to construct records with
list-initialization: "new Point{1, 2}".
Removing this assertion should be perfectly safe; the code here matches
what VisitDeclStmt does for regions allocated on the stack.
<rdar://problem/14403437>
llvm-svn: 186028
list is the name of a class, not a namespace. Change the test as well - the previous
version did not test properly.
Fixes radar://14317928.
llvm-svn: 185898
We should not be asking unique_file to prepend the system temporary directory
when creating the html report. Unfortunately I don't think we can test this
with the current infrastructure since unique_file ignores MakeAbsolute if the
directory is already absolute and the paths provided by lit are.
I will take a quick look at making this api a bit less error prone.
llvm-svn: 185707
The motivation is to suppresses false use-after-free reports that occur when calling
std::list::pop_front() or std::list::pop_back() twice. The analyzer does not
reason about the internal invariants of the list implementation, so just do not report
any of warnings in std::list.
Fixes radar://14317928.
llvm-svn: 185609
Summary:
The analyzer incorrectly handled noreturn destructors which were hidden inside
function calls. This happened because NoReturnFunctionChecker only listened for
PostStmt events, which are not executed for destructor calls. I've changed it to
listen to PostCall events, which should catch both cases.
Reviewers: jordan_rose
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1056
llvm-svn: 185522
While we don't model pointers-to-members besides "null" and "non-null",
we were using Loc symbols for valid pointers and NonLoc integers for the
null case. This hit the assert committed in r185401.
Fixed by using a true (Loc) null for null member pointers.
llvm-svn: 185444
Summary:
Static analyzer used to abort when encountering AttributedStmts, because it
asserted that the statements should not appear in the CFG. This is however not
the case, since at least the clang::fallthrough annotation makes it through.
This commit simply makes the analyzer ignore the statement attributes.
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1030
llvm-svn: 185417
The one bit of code that was using this is gone, and neither C nor C++
actually allows this. Add an assertion and remove dead code.
Found by Matthew Dempsky!
llvm-svn: 185401
Re-apply r184511, reverted in r184561, with the trivial default constructor
fast path removed -- it turned out not to be necessary here.
Certain expressions can cause a constructor invocation to zero-initialize
its object even if the constructor itself does no initialization. The
analyzer now handles that before evaluating the call to the constructor,
using the same "default binding" mechanism that calloc() uses, rather
than simply ignoring the zero-initialization flag.
<rdar://problem/14212563>
llvm-svn: 184815
In order to make sure virtual base classes are always initialized once,
the AST contains initializers for the base class in /all/ of its
descendents, not just the immediate descendents. However, at runtime,
the most-derived object is responsible for initializing all the virtual
base classes; all the other initializers will be ignored.
The analyzer now checks to see if it's being called from another base
constructor, and if so does not perform virtual base initialization.
<rdar://problem/14236851>
llvm-svn: 184814
Add a debug checker that is useful to understand how the ExplodedGraph is
built; it can be triggered using the following command:
clang -cc1 -analyze -analyzer-checker=debug.ViewExplodedGraph my_program.c
A patch by Béatrice Creusillet!
llvm-svn: 184768
This fixes false positives by allowing us to know that a loop is always entered if
the collection count method returns a positive value and vice versa.
Addresses radar://14169391.
llvm-svn: 184618
Per review from Anna, this really should have been two commits, and besides
it's causing problems on our internal buildbot. Reverting until these have
been worked out.
This reverts r184511 / 98123284826bb4ce422775563ff1a01580ec5766.
llvm-svn: 184561
Certain expressions can cause a constructor invocation to zero-initialize
its object even if the constructor itself does no initialization. The
analyzer now handles that before evaluating the call to the constructor,
using the same "default binding" mechanism that calloc() uses, rather
than simply ignoring the zero-initialization flag.
As a bonus, trivial default constructors are now no longer inlined; they
are instead processed explicitly by ExprEngine. This has a (positive)
effect on the generated path edges: they no longer stop at a default
constructor call unless there's a user-provided implementation.
<rdar://problem/14212563>
llvm-svn: 184511
Summary:
When doing a reinterpret+dynamic cast from an incomplete type, the analyzer
would crash (bug #16308). This fix makes the dynamic cast evaluator ignore
incomplete types, as they can never be used in a dynamic_cast. Also adding a
regression test.
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1006
llvm-svn: 184403
Summary:
When processing a call to a function, which got passed less arguments than it
expects, the analyzer would crash.
I've also added a test for that and a analyzer warning which detects these
cases.
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D994
llvm-svn: 184288
This silences warnings that could occur when one is swapping partially initialized structs. We suppress
not only the assignments of uninitialized members, but any values inside swap because swap could
potentially be used as a subroutine to swap class members.
This silences a warning from std::try::function::swap() on partially initialized objects.
llvm-svn: 184256
The untemplated implementation of getParents() doesn't need to be in a
header file.
RecursiveASTVisitor.h is full of repeated macro expansion. Moving this
include to ASTContext.cpp speeds up compilation of
LambdaMangleContext.cpp, a small C++ file with few includes, from 3.7s
to 2.8s for me locally. I haven't measured a full build, but it can't
hurt.
I had to fix a few static analyzer files that were depending on
transitive includes of C++ AST headers.
Reviewers: rsmith, klimek
Differential Revision: http://llvm-reviews.chandlerc.com/D982
llvm-svn: 184075
Introduce CXXStdInitializerListExpr node, representing the implicit
construction of a std::initializer_list<T> object from its underlying array.
The AST representation of such an expression goes from an InitListExpr with a
flag set, to a CXXStdInitializerListExpr containing a MaterializeTemporaryExpr
containing an InitListExpr (possibly wrapped in a CXXBindTemporaryExpr).
This more detailed representation has several advantages, the most important of
which is that the new MaterializeTemporaryExpr allows us to directly model
lifetime extension of the underlying temporary array. Using that, this patch
*drastically* simplifies the IR generation of this construct, provides IR
generation support for nested global initializer_list objects, fixes several
bugs where the destructors for the underlying array would accidentally not get
invoked, and provides constant expression evaluation support for
std::initializer_list objects.
llvm-svn: 183872
Summary:
"register" functions for the checker were caching the checker objects in a
static variable. This caused problems when the function is called with a
different CheckerManager.
Reviewers: klimek
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D955
llvm-svn: 183823
We drew the diagnostic edges to wrong statements in cases the note was on a macro.
The fix is simple, but seems to work just fine for a whole bunch of test cases (plist-macros.cpp).
Also, removes an unnecessary edge in edges-new.mm, when function signature starts with a macro.
llvm-svn: 183599
The function in which we were doing it used to be conditionalized. Add a new unconditional
cleanup step.
This fixes PR16227 (radar://14073870) - a crash when generating html output for one of the test files.
llvm-svn: 183451
Previously our edges were completely broken here; now, the final result
is a very simple set of edges in most cases: one up to the "for" keyword
for context, and one into the body of the loop. This matches the behavior
for ObjC for-in loops.
In the AST, however, CXXForRangeStmts are handled very differently from
ObjCForCollectionStmts. Since they are specified in terms of equivalent
statements in the C++ standard, we actually have implicit AST nodes for
all of the semantic statements. This makes evaluation very easy, but
diagnostic locations a bit trickier. Fortunately, the problem can be
generally defined away by marking all of the implicit statements as
part of the top-level for-range statement.
One of the implicit statements in a for-range statement is the declaration
of implicit iterators __begin and __end. The CFG synthesizes two
separate DeclStmts to match each of these decls, but until now these
synthetic DeclStmts weren't in the function's ParentMap. Now, the CFG
keeps track of its synthetic statements, and the AnalysisDeclContext will
make sure to add them to the ParentMap.
<rdar://problem/14038483>
llvm-svn: 183449
When processing ArrayToPointerDecay, we expect the array to be a location, not a LazyCompoundVal.
Special case the rvalue arrays by using a location to represent them. This case is handled similarly
elsewhere in the code.
Fixes PR16206.
llvm-svn: 183359
We previously asserted that there was a top-level function entry edge, but
if the function decl's location is invalid (or within a macro) this edge
might not exist. Change the assertion to an actual check, and don't drop
the first path piece if it doesn't match.
<rdar://problem/14070304>
llvm-svn: 183358
The edge optimizer needs to see edges for, say, implicit casts (which have
the same source location as their operand) to uniformly simplify the
entire path. However, we still don't want to produce edges from a statement
to /itself/, which could occur when two nodes in a row have the same
statement location.
This necessitated moving the check for redundant notes to after edge
optimization, since the check relies on notes being adjacent in the path.
<rdar://problem/14061675>
llvm-svn: 183357
...but don't yet migrate over the existing plist tests. Some of these
would be trivial to migrate; others could use a bit of inspection first.
In any case, though, the new edge algorithm seems to have proven itself,
and we'd like more coverage (and more usage) of it going forwards.
llvm-svn: 183165
A.1 -> A -> B
becomes
A.1 -> B
This only applies if there's an edge from a subexpression to its parent
expression, and that is immediately followed by another edge from the
parent expression to a subsequent expression. Normally this is useful for
bringing the edges back to the left side of the code, but when the
subexpression is on a different line the backedge ends up looking strange,
and may even obscure code. In these cases, it's better to just continue
to the next top-level statement.
llvm-svn: 183164
Specifically, if the line is over 80 characters, or if the top-level
statement spans mulitple lines, we should preserve sub-expression edges
even if they form a simple cycle as described in the last commit, because
it's harder to infer what's going on than it is for shorter lines.
llvm-svn: 183163
Generating context arrows can result in quite a few arrows surrounding a
relatively simple expression, often containing only a single path note.
|
1 +--2---+
v/ v
auto m = new m // 3 (the path note)
|\ |
5 +--4---+
v
Note also that 5 and 1 are two ends of the "same" arrow, i.e. they go from
event to event. 3 is not an arrow but the path note itself.
Now, if we see a pair of edges like 2 and 4---where 4 is the reverse of 2
and there is optionally a single path note between them---we will
eliminate /both/ edges. Anything more complicated will be left as is
(more edges involved, an inlined call, etc).
The next commit will refine this to preserve the arrows in a larger
expression, so that we don't lose all context.
llvm-svn: 183162
The old edge builder didn't have a notion of nested statement contexts,
so there was no special treatment of a logical operator inside an if
(or inside another logical operator). The new edge builder always tries
to establish the full context up to the top-level statement, so it's
important to know how much context has been established already rather
than just checking the innermost context.
This restores some of the old behavior for the old edge generation:
the context of a logical operator's non-controlling expression is the
subexpression in the old edge algorithm, but the entire operator
expression in the new algorithm.
llvm-svn: 183160
The current edge-generation algorithm sometimes creates edges from a
top-level statement A to a sub-expression B.1 that's not at the start of B.
This creates a "swoosh" effect where the arrow is drawn on top of the
text at the start of B. In these cases, the results are clearer if we see
an edge from A to B, then another one from B to B.1.
Admittedly, this does create a /lot/ of arrows, some of which merely hop
into a subexpression and then out again for a single note. The next commit
will eliminate these if the subexpression is simple enough.
This updates and reuses some of the infrastructure from the old edge-
generation algorithm to find the "enclosing statement" context for a
given expression. One change in particular marks the context of the
LHS or RHS of a logical binary operator (&&, ||) as the entire operator
expression, rather than the subexpression itself. This matches our behavior
for ?:, and allows us to handle nested context information.
<rdar://problem/13902816>
llvm-svn: 183159
Although we don't want to show a function entry edge for a top-level path,
having it makes optimizing edges a little more uniform.
This does not affect any edges now, but will affect context edge generation
(next commit).
llvm-svn: 183158
Jordan has pointed out that it is valuable to warn in cases when the arguments to init escape.
For example, NSData initWithBytes id not going to free the memory.
llvm-svn: 183062
In many cases, the edge from the "if" to the condition, followed by an edge from the branch condition to the target code, is uninteresting.
In such cases, we should fold the two edges into one from the "if" to the target.
This also applies to loops.
Implements <rdar://problem/14034763>.
llvm-svn: 183018
...and make this work correctly in the current codebase.
After living on this for a while, it turns out to look very strange for
inlined functions that have only a single statement, and somewhat strange
for inlined functions in general (since they are still conceptually in the
middle of the path, and there is a function-entry path note).
It's worth noting that this only affects inlined functions; in the new
arrow generation algorithm, the top-level function still starts at the
first real statement in the function body, not the enclosing CompoundStmt.
This reverts r182078 / dbfa950abe0e55b173286a306ee620eff5f72ea.
llvm-svn: 182963
It is okay to declare a block without an argument list: ^ {} or ^void {}.
In these cases, the BlockDecl's signature-as-written will just contain
the return type, rather than the entire function type. It is unclear if
this is intentional, but the analyzer shouldn't crash because of it.
<rdar://problem/14018351>
llvm-svn: 182948
Most loop notes (like "entering loop body") are attached to the condition
expression guarding a loop or its equivalent. For loops may not have a
condition expression, though. Rather than crashing, just use the entire
ForStmt as the location. This is probably the best we can do.
<rdar://problem/14016063>
llvm-svn: 182904
In C, 'void' is treated like any other incomplete type, and though it is
never completed, you can cast the address of a void-typed variable to do
something useful. (In C++ it's illegal to declare a variable with void type.)
Previously we asserted on this code; now we just treat it like any other
incomplete type.
And speaking of incomplete types, we don't know their extent. Actually
check that in TypedValueRegion::getExtent, though that's not being used
by any checkers that are on by default.
llvm-svn: 182880
This gives slightly better precision, specifically, in cases where a non-typed region represents the array
or when the type is a non-array type, which can happen when an array is a result of a reinterpret_cast.
llvm-svn: 182810
It’s important for us to reason about the cast as it is used in std::addressof. The reason we did not
handle the cast previously was a crash on a test case (see commit r157478). The crash was in
processing array to pointer decay when the region type was not an array. Address the issue, by
just returning an unknown in that case.
llvm-svn: 182808
In addition to enabling more code reuse, this suppresses some false positives by allowing us to
compare an element region to its base. See the ptr-arith.cpp test cases for an example.
llvm-svn: 182780
When generating path notes, implicit function bodies are shown at the call
site, so that, say, copying a POD type in C++ doesn't jump you to a header
file. This is especially important when the synthesized function itself
calls another function (or block), in which case we should try to jump the
user around as little as possible.
By checking whether a called function has a body in the AST, we can tell
if the analyzer synthesized the body, and if we should therefore collapse
the call down to the call site like a true implicitly-defined function.
<rdar://problem/13978414>
llvm-svn: 182677
The new edge algorithm would keep track of the previous location in each
location context, so that it could draw arrows coming in and out of each
inlined call. However, it tried to access the location of the call before
it was actually set (at the CallEnter node). This only affected
unterminated calls at the end of a path; calls with visible exit nodes
already had a valid location.
This patch ditches the location context map, since we're processing the
nodes in order anyway, and just unconditionally updates the PrevLoc
variable after popping out of an inlined call.
<rdar://problem/13983470>
llvm-svn: 182676
Currently, blocks instantiated in templates lose their "signature as
written"; it's not clear if this is intentional. Change the analyzer's
use of BlockDecl::getSignatureAsWritten to check whether or not the
signature is actually there.
<rdar://problem/13954714>
llvm-svn: 182497
The crash is triggered by the newly added option (-analyzer-config report-in-main-source-file=true) introduced in r182058.
Note, ideally, we’d like to report the issue within the main source file here as well.
For now, just do not crash.
llvm-svn: 182445
Ted and I spent a long time discussing this today and found out that neither
the existing code nor the new code was doing what either of us thought it
was, which is never good. The good news is we found a much simpler way to
fix the motivating test case (an ObjCSubscriptExpr).
This reverts r182083, but pieces of it will come back in subsequent commits.
llvm-svn: 182185
This optimizes some spurious edges resulting from PseudoObjectExprs.
This required far more changes than I anticipated. The current
ParentMap does not record any hierarchy information between
a PseudoObjectExpr and its *semantic* expressions that may be
wrapped in OpaqueValueExprs, which are the expressions actually
laid out in the CFG. This means the arrow pruning logic could
not map from an expression to its containing PseudoObjectExprs.
To solve this, this patch adds a variant of ParentMap that
returns the "semantic" parentage of expressions (essentially
as they are viewed by the CFG). This alternate ParentMap is then
used by the arrow reducing logic to identify edges into pseudo
object expressions, and then eliminate them.
llvm-svn: 182083
The analyzer can't see the reference count for shared_ptr, so it doesn't
know whether a given destruction is going to delete the referenced object.
This leads to spurious leak and use-after-free warnings.
For now, just ban destructors named '~shared_ptr', which catches
std::shared_ptr, std::tr1::shared_ptr, and boost::shared_ptr.
PR15987
llvm-svn: 182071
Previously, we’ve used the last location of the analyzer issue path as the location of the
report. This might not provide the best user experience, when one analyzer a source
file and the issue appears in the header. Introduce an option to use the last location
of the path that is in the main source file as the report location.
New option can be enabled with -analyzer-config report-in-main-source-file=true.
llvm-svn: 182058
This class is a StmtVisitor that distinguishes between block-level and
non-block-level statements in a CFG. However, it does so using a hard-coded
idea of which statements might be block-level, which probably isn't accurate
anymore. The only implementer of the CFGStmtVisitor hierarchy was the
analyzer's DeadStoresChecker, and the analyzer creates a linearized CFG
anyway (every non-trivial statement is a block-level statement).
This also allows us to remove the block-expr map ("BlkExprMap"), which
mapped statements to positions in the CFG. Apart from having a helper type
that really should have just been Optional<unsigned>, it was only being
used to ask /if/ a particular expression was block-level, for traversal
purposes in CFGStmtVisitor.
llvm-svn: 181945
ASTDumper was already trying to do this & instead got an implicit bool
conversion by surprise (thus printing out 0 or 1 instead of the name of
the declaration). To avoid that issue & simplify call sites, simply make
it the normal/expected operator<<(raw_ostream&, ...) overload & simplify
all the existing call sites. (bonus: this function doesn't need to be a
member or friend, it's just using public API in DeclarationName)
llvm-svn: 181832
This patch renames getLinkage to getLinkageInternal. Only code that
needs to handle UniqueExternalLinkage specially should call this.
Linkage, as defined in the c++ standard, is provided by
getFormalLinkage. It maps UniqueExternalLinkage to ExternalLinkage.
Most places in the compiler actually want isExternallyVisible, which
handles UniqueExternalLinkage as internal.
llvm-svn: 181677
In most cases it is, by just looking at the name. Also, this check prevents the heuristic from working in strange user settings.
radar://13839692
llvm-svn: 181615
Consider this example:
char *p = malloc(sizeof(char));
systemFunction(&p);
free(p);
In this case, when we call systemFunction, we know (because it's a system
function) that it won't free 'p'. However, we /don't/ know whether or not
it will /change/ 'p', so the analyzer is forced to invalidate 'p', wiping
out any bindings it contains. But now the malloc'd region looks like a
leak, since there are no more bindings pointing to it, and we'll get a
spurious leak warning.
The fix for this is to notice when something is becoming inaccessible due
to invalidation (i.e. an imperfect model, as opposed to being explicitly
overwritten) and stop tracking it at that point. Currently, the best way
to determine this for a call is the "indirect escape" pointer-escape kind.
In practice, all the patch does is take the "system functions don't free
memory" special case and limit it to direct parameters, i.e. just the
arguments to a call and not other regions accessible to them. This is a
conservative change that should only cause us to escape regions more
eagerly, which means fewer leak warnings.
This isn't perfect for several reasons, the main one being that this
example is treated the same as the one above:
char **p = malloc(sizeof(char *));
systemFunction(p + 1);
// leak
Currently, "addresses accessible by offsets of the starting region" and
"addresses accessible through bindings of the starting region" are both
considered "indirect" regions, hence this uniform treatment.
Another issue is our longstanding problem of not distinguishing const and
non-const bindings; if in the first example systemFunction's parameter were
a char * const *, we should know that the function will not overwrite 'p',
and thus we can safely report the leak.
<rdar://problem/13758386>
llvm-svn: 181607
The one user has been changed to use getLValue on the compound literal
expression and then use the normal bindLoc to assign a value. No need
to special case this in the StoreManager.
llvm-svn: 181214
This occurs because in C++11 the compound literal syntax can trigger a
constructor call via list-initialization. That is, "Point{x, y}" and
"(Point){x, y}" end up being equivalent. If this occurs, the inner
CXXConstructExpr will have already handled the object construction; the
CompoundLiteralExpr just needs to propagate that value forwards.
<rdar://problem/13804098>
llvm-svn: 181213
This change required some minor changes to LocationContextMap to have it map
from PathPieces to LocationContexts instead of PathDiagnosticCallPieces to
LocationContexts. These changes are in the other diagnostic
generation logic as well, but are functionally equivalent.
Interestingly, this optimize requires delaying "cleanUpLocation()" until
later; possibly after all edges have been optimized. This is because
we need PathDiagnosticLocations to refer to the semantic entity (e.g. a statement)
as long as possible. Raw source locations tell us nothing about
the semantic relationship between two locations in a path.
llvm-svn: 181084
FindLastStoreBRVisitor is responsible for finding where a particular region
gets its value; if the region is a VarRegion, it's possible that value was
assigned at initialization, i.e. at its DeclStmt. However, if a function is
called recursively, the same DeclStmt may be evaluated multiple times in
multiple stack frames. FindLastStoreBRVisitor was not taking this into
account and just picking the first one it saw.
<rdar://problem/13787723>
llvm-svn: 180997
There were actually two bugs here:
- if we decided to look for an interesting lvalue or call expression, we
wouldn't go find its node if we also knew we were at a (different) call.
- if we looked through one message send with a nil receiver, we thought we
were still looking at an argument to the original call.
Put together, this kept us from being able to track the right values, which
means sub-par diagnostics and worse false-positive suppression.
Noticed by inspection.
llvm-svn: 180996