The only use of AggExprVisitor was in #if 0 code (the analyzer's incomplete C++ support), so there is no actual behavioral change anyway.
llvm-svn: 152856
BugVisitor DiagnosticPieces.
When checkers create a DiagnosticPieceEvent, they can supply an extra
string, which will be concatenated with the call exit message for every
call on the stack between the diagnostic event and the final bug report.
(This is a simple version, which could be/will be further enhanced.)
For example, this is used in Malloc checker to produce the ",
which allocated memory" in the following example:
static char *malloc_wrapper() { // 2. Entered call from 'use'
return malloc(12); // 3. Memory is allocated
}
void use() {
char *v;
v = malloc_wrapper(); // 1. Calling 'malloc_wrappers'
// 4. Returning from 'malloc_wrapper', which allocated memory
} // 5. Memory is never released; potential
memory leak
llvm-svn: 152837
inlining to be the reverse of their declaration.
This optimizes running time under inlining up to 20% since we do not
re-analyze the utility functions which are usually defined first in the
translation unit if they have already been analyzed while inlined into
the root functions.
llvm-svn: 152653
BFS should give slightly better performance. Ex: Suppose, we have two
roots R1 and R2. A callee function C is reachable through both. However,
C is not inlined when analyzing R1 due to inline stack depth limit. With
DFS, C will be analyzed as top level even though it would be analyzed as
inlined through R2. On the other hand, BFS could avoid analyzing C as
top level.
llvm-svn: 152652
AnalysisConsumer.
As a result:
- We now analyze the C++ methods which are defined within the
class body. These were completely skipped before.
- Ensure that AST checkers are called on functions in the
order they are defined in the Translation unit.
llvm-svn: 152650
track whether the referenced declaration comes from an enclosing
local context. I'm amenable to suggestions about the exact meaning
of this bit.
llvm-svn: 152491
as aborted, but didn't treat such cases as sinks in the ExplodedGraph.
Along the way, add basic support for CXXCatchStmt, expanding the set of code we actually analyze (hopefully correctly).
Fixes: <rdar://problem/10892489>
llvm-svn: 152468
We do not reanalyze a function, which has already been analyzed as an
inlined callee. As per PRELIMINARY testing, this gives over
50% run time reduction on some benchmarks without decreasing of the
number of bugs found.
Turning the mode on by default.
llvm-svn: 152440
Essentially, a bug centers around a story for various symbols and regions. We should only include
the path diagnostic events that relate to those symbols and regions.
The pruning is done by associating a set of interesting symbols and regions with a BugReporter, which
can be modified at BugReport creation or by BugReporterVisitors.
This patch reduces the diagnostics emitted in several of our test cases. I've vetted these as
having desired behavior. The only regression is a missing null check diagnostic for the return
value of realloc() in test/Analysis/malloc-plist.c. This will require some investigation to fix,
and I have added a FIXME to the test case.
llvm-svn: 152361
analyzed.
The CallGraph is used when inlining is on, which is the current default.
This alone does not bring any performance improvement. It's a
stepping stone for the upcoming optimization in which we do not
re-analyze a function that has already been analyzed while inlined in
other functions. Using the call graph makes it easier to play with
the order of functions to minimize redundant analyzes.
llvm-svn: 152352
The final graph contains a single root node, which is a parent of all externally available functions(and 'main'). As well as a list of Parentless/Unreachable functions, which are either truly unreachable or are unreachable due to our analyses imprecision.
The analyzer checkers debug.DumpCallGraph or debug.ViewGraph can be used to look at the produced graph.
Currently, the graph is not very precise, for example, it entirely skips edges resulted from ObjC method calls.
llvm-svn: 152272
analysis to make the AST representation testable. They are represented by a
new UserDefinedLiteral AST node, which is a sugared CallExpr. All semantic
properties, including full CodeGen support, are achieved for free by this
representation.
UserDefinedLiterals can never be dependent, so no custom instantiation
behavior is required. They are mangled as if they were direct calls to the
underlying literal operator. This matches g++'s apparent behavior (but not its
actual mangling, which is broken for literal-operator-ids).
User-defined *string* literals are now fully-operational, but the semantic
analysis is quite hacky and needs more work. No other forms of user-defined
literal are created yet, but the AST support for them is present.
This patch committed after midnight because we had already hit the quota for
new kinds of literal yesterday.
llvm-svn: 152211
command line options for inlining tuning.
This adds the option for stack depth bound as well as function size
bound.
+ minor doxygenification
llvm-svn: 151930
funopen, setvbuf.
Teach the checker and the engine about these APIs to resolve malloc
false positives. As I am adding more of these APIs, it is clear that all
this should be factored out into a separate callback (for example,
region escapes). Malloc, KeyChainAPI and RetainRelease checkers could
all use it.
llvm-svn: 151737
This introduces a concept of a "prunable" PathDiagnosticEvent. Currently this is a flag, but
we may evolve the concept to make this more dynamically inferred.
llvm-svn: 151663
When allocated buffer is passed to CF/NS..NoCopy functions, the
ownership is transfered unless the deallocator argument is set to
'kCFAllocatorNull'.
llvm-svn: 151608
Assume none of the ObjC messages defined in system headers free memory,
except for the ones containing 'freeWhenDone' selector. Currently, just
assume that the region escapes to the messages with 'freeWhenDone'
(ideally, we want to treat it as 'free()').
For now, always assume that regions escape when passed to C++ methods.
llvm-svn: 151410
that provides the behavior of the C++11 library trait
std::is_trivially_constructible<T, Args...>, which can't be
implemented purely as a library.
Since __is_trivially_constructible can have zero or more arguments, I
needed to add Yet Another Type Trait Expression Class, this one
handling arbitrary arguments. The next step will be to migrate
UnaryTypeTrait and BinaryTypeTrait over to this new, more general
TypeTrait class.
Fixes the Clang side of <rdar://problem/10895483> / PR12038.
llvm-svn: 151352
When we find two leak reports with the same allocation site, report only
one of them.
Provide a helper method to BugReporter to facilitate this.
llvm-svn: 151287
Make this call an exception in ExprEngine::invalidateArguments:
'int pthread_setspecific(ptheread_key k, const void *)' stores
a value into thread local storage. The value can later be retrieved
with 'void *ptheread_getspecific(pthread_key)'. So even thought the
parameter is 'const void *', the region escapes through the
call.
(Here we just blacklist the call in the ExprEngine's default
logic. Another option would be to add a checker which evaluates
the call and triggers the call to invalidate regions.)
Teach the Malloc Checker, which treats all system calls as safe about
the API.
llvm-svn: 151220
- We should not evaluate strdup in the Malloc Checker, it's the job of
CString checker, so just update the RefState to reflect allocated
memory.
- Refactor to reduce LOC: remove some wrapper auxiliary functions, make
all functions return the state and add the transition in one place
(instead of in each auxiliary function).
llvm-svn: 151188
block pointer that returns a block literal which captures (by copy)
the lambda closure itself. Some aspects of the block literal are left
unspecified, namely the capture variable (which doesn't actually
exist) and the body (which will be filled in by IRgen because it can't
be written as an AST).
Because we're switching to this model, this patch also eliminates
tracking the copy-initialization expression for the block capture of
the conversion function, since that information is now embedded in the
synthesized block literal. -1 side tables FTW.
llvm-svn: 151131
it aware of CString APIs that return the input parameter.
Malloc Checker needs to know how the 'strcpy' function is
evaluated. Introduce the dependency on CStringChecker for that.
CStringChecker knows all about these APIs.
Addresses radar://10864450
llvm-svn: 150846
Holding the constructor directly makes no sense when list-initialized arrays come into play. The constructor is now held in a CXXConstructExpr, if construction is what is done. The new design can also distinguish properly between list-initialization and direct-initialization, as well as implicit default-initialization constructors and explicit value-initialization constructors. Finally, doing it this way removes redundance from the AST because CXXNewExpr doesn't try to handle both the allocation and the initialization responsibilities.
This breaks the static analysis of new expressions. I've filed PR12014 to track this.
llvm-svn: 150682
piece can always be generated.
The default end of diagnostic path piece was failing to generate on a
BlockEdge that was outgoing from a basic block without a terminator,
resulting in a very simple diagnostic being rendered (ex: no path
highlighting or custom visitors). Reuse another function, which is
essentially doing the same thing and correct it not to fail when a block
has no terminator.
llvm-svn: 150659
We are not properly handling the memory regions that escape into struct
fields, which led to a bunch of false positives. Be conservative here
and give up when a pointer escapes into a struct.
llvm-svn: 150658
is general goodness because representations of member pointers are
not always equivalent across member pointer types on all ABIs
(even though this isn't really standard-endorsed).
Take advantage of the new information to teach IR-generation how
to do these reinterprets in constant initializers. Make sure this
works when intermingled with hierarchy conversions (although
this is not part of our motivating use case). Doing this in the
constant-evaluator would probably have been better, but that would
require a *lot* of extra structure in the representation of
constant member pointers: you'd really have to track an arbitrary
chain of hierarchy conversions and reinterpretations in order to
get this right. Ultimately, this seems less complex. I also
wasn't quite sure how to extend the constant evaluator to handle
foldings that we don't actually want to treat as extended
constant expressions.
llvm-svn: 150551
(In response of Ted's review of r150112.)
This moves the logic which checked if a symbol escapes through a
parameter to invalidateRegionCallback (instead of post CallExpr visit.)
To accommodate the change, added a CallOrObjCMessage parameter to
checkRegionChanges callback.
llvm-svn: 150513
in realloc map.
If there is no dependency, the reallocated ptr will get garbage
collected before we know that realloc failed, which would lead us to
missing a memory leak warning.
Also added new test cases, which we can handle now.
Plus minor cleanups.
llvm-svn: 150446
1) Support the case when realloc fails to reduce False Positives. (We
essentially need to restore the state of the pointer being reallocated.)
2) Realloc behaves differently under special conditions (from pointer is
null, size is 0). When detecting these cases, we should consider
under-constrained states (size might or might not be 0). The
old version handled this in a very hacky way. The code did not
differentiate between definite and possible (no consideration for
under-constrained states). Further, after processing each special case,
the realloc processing function did not return but chained to the next
special case processing. So you could end up in an execution in which
you first see the states in which size is 0 and realloc ~ free(),
followed by the states corresponding to size is not 0 followed by the
evaluation of the regular realloc behavior.
llvm-svn: 150402
memory.
(As per one test case, the existing checker thought that this could
cause a lot of false positives - not sure if that's valid, to be
verified.)
llvm-svn: 150313
which allows values to escape through unknown calls.
Assumes all calls but the malloc family are unknown.
Also, catch a use-after-free when a pointer is passed to a
function after a call to free (previously, you had to explicitly
dereference the pointer value).
llvm-svn: 150112
optimistic.
TODO: actually implement the pessimistic version of the checker. Ex: it
needs to assume that any function that takes a pointer might free it.
The optimistic version relies on annotations to tell us which functions
can free the pointer.
llvm-svn: 150111
This seems to negatively affect compile time onsome ObjC tests
(which use a lot of partial diagnostics I assume). I have to come
up with a way to keep them inline without including Diagnostic.h
everywhere. Now adding a new diagnostic requires a full rebuild
of e.g. the static analyzer which doesn't even use those diagnostics.
This reverts commit 6496bd10dc3a6d5e3266348f08b6e35f8184bc99.
This reverts commit 7af19b817ba964ac560b50c1ed6183235f699789.
This reverts commit fdd15602a42bbe26185978ef1e17019f6d969aa7.
This reverts commit 00bd44d5677783527d7517c1ffe45e4d75a0f56f.
This reverts commit ef9b60ffed980864a8db26ad30344be429e58ff5.
llvm-svn: 150006
- Capturing variables by-reference and by-copy within a lambda
- The representation of lambda captures
- The creation of the non-static data members in the lambda class
that store the captured variables
- The initialization of the non-static data members from the
captured variables
- Pretty-printing lambda expressions
There are a number of FIXMEs, both explicit and implied, including:
- Creating a field for a capture of 'this'
- Improved diagnostics for initialization failures when capturing
variables by copy
- Dealing with temporaries created during said initialization
- Template instantiation
- AST (de-)serialization
- Binding and returning the lambda expression; turning it into a
proper temporary
- Lots and lots of semantic constraints
- Parameter pack captures
llvm-svn: 149977
- Move the offending methods out of line and fix transitive includers.
- This required changing an enum in the PPCallback API into an unsigned.
llvm-svn: 149782
Fix all the files that depended on transitive includes of Diagnostic.h.
With this patch in place changing a diagnostic no longer requires a full rebuild of the StaticAnalyzer.
llvm-svn: 149781
the the code like this (due to x and &x being the same value but
different size):
void* x[] = { ptr1, ptr2, ptr3 };
CFArrayCreate(NULL, (const void **) &x, count, NULL);
llvm-svn: 149579
Original log:
Convert ProgramStateRef to a smart pointer for managing the reference counts of ProgramStates. This leads to a slight memory
improvement, and a simplification of the logic for managing ProgramState objects.
# Please enter the commit message for your changes. Lines starting
llvm-svn: 149339
Original log:
Convert ProgramStateRef to a smart pointer for managing the reference counts of ProgramStates. This leads to a slight memory
improvement, and a simplification of the logic for managing ProgramState objects.
llvm-svn: 149336
At this point this is largely cosmetic, but it opens the door to replace
ProgramStateRef with a smart pointer that more eagerly acts in the role
of reclaiming unused ProgramState objects.
llvm-svn: 149081
using CFArrayCreate & family.
Specifically, CFArrayCreate's input should be:
'A C array of the pointer-sized values to be in the new array.'
(radar://10717339)
llvm-svn: 149008
to the underlying consumer implementation. This allows us to unique reports across analyses to multiple functions (which
shows up with inlining).
llvm-svn: 148997
This is accomplished by periodically reclaiming nodes in the graph. This was an optimization
done before the CFG was linearized, but the CFG linearization destroyed that optimization since each
freshly created node couldn't be reclaimed and we only looked at a window of nodes created between
each ProcessStmt. This optimization can be reclaimed my merely expanding the window to N number of nodes.
llvm-svn: 148888
Also, slightly modify the diagnostic message in ArrayBound and DivZero (still use 'taint', which might not mean much to the user, but plan on changing it later).
llvm-svn: 148626
size (Ex: in malloc, memcpy, strncpy..)
(Maybe some of this could migrate to the CString checker. One issue
with that is that we might want to separate security issues from
regular API misuse.)
llvm-svn: 148371
- Add atomic-to/from-nonatomic cast types
- Emit atomic operations for arithmetic on atomic types
- Emit non-atomic stores for initialisation of atomic types, but atomic stores and loads for every other store / load
- Add a __atomic_init() intrinsic which does a non-atomic store to an _Atomic() type. This is needed for the corresponding C11 stdatomic.h function.
- Enables the relevant __has_feature() checks. The feature isn't 100% complete yet, but it's done enough that we want people testing it.
Still to do:
- Make the arithmetic operations on atomic types (e.g. Atomic(int) foo = 1; foo++;) use the correct LLVM intrinsic if one exists, not a loop with a cmpxchg.
- Add a signal fence builtin
- Properly set the fenv state in atomic operations on floating point values
- Correctly handle things like _Atomic(_Complex double) which are too large for an atomic cmpxchg on some platforms (this requires working out what 'correctly' means in this context)
- Fix the many remaining corner cases
llvm-svn: 148242
To simplify the process:
Refactor taint generation checker to simplify passing the
information on which arguments need to be tainted from pre to post
visit.
Todo: We need to factor out the code that sema is using to identify the
string and memcpy functions and use it here and in the CString checker.
llvm-svn: 148010
the common *alloc functions as well as a few tiny wibbles (adds a note
to CWE/CERT advisory numbers in the bug output, and fixes a couple
80-column-wide violations.)"
Patch by Austin Seipp!
llvm-svn: 147931
My hope is to reimplement this from first principles based on the simplifications of removing unneeded node builders
and re-evaluating how C++ calls are handled in the CFG. The hope is to turn inlining "on-by-default" as soon as possible
with a core set of things working well, and then expand over time.
llvm-svn: 147904
A patch by Dmitri Gribenko!
The attached patch fixes a use-after-free in AnalysisConsumer::HandleTranslationUnit. The problem is that
BugReporter's destructor runs after AnalysisManager has been already
deleted. The fix introduces a scope to force correct destruction
order.
A crash happens only when reports have been added in AnalysisConsumer::HandleTranslationUnit's BugReporter. We don't have such checkers in clang so no test.
llvm-svn: 147732
We already have a more conservative check in the compiler (if the
format string is not a literal, we warn). Still adding it here for
completeness and since this check is stronger - only triggered if the
format string is tainted.
llvm-svn: 147714
(Stmt*,LocationContext*) pairs to SVals instead of Stmt* to SVals.
This is needed to support basic IPA via inlining. Without this, we cannot tell
if a Stmt* binding is part of the current analysis scope (StackFrameContext) or
part of a parent context.
This change introduces an uglification of the use of getSVal(), and thus takes
two steps forward and one step back. There are also potential performance implications
of enlarging the Environment. Both can be addressed going forward by refactoring the
APIs and optimizing the internal representation of Environment. This patch
mainly introduces the functionality upon when we want to build upon (and clean up).
llvm-svn: 147688
as a result of a call.
Problem:
Global variables, which come in from system libraries should not be
invalidated by all calls. Also, non-system globals should not be
invalidated by system calls.
Solution:
The following solution to invalidation of globals seems flexible enough
for taint (does not invalidate stdin) and should not lead to too
many false positives. We split globals into 3 classes:
* immutable - values are preserved by calls (unless the specific
global is passed in as a parameter):
A : Most system globals and const scalars
* invalidated by functions defined in system headers:
B: errno
* invalidated by all other functions (note, these functions may in
turn contain system calls):
B: errno
C: all other globals (which are not in A nor B)
llvm-svn: 147569
type is a pointer to const. (radar://10595327)
The regions corresponding to the pointer and reference arguments to
a function get invalidated by the calls since a function call can
possibly modify the pointed to data. With this change, we are not going
to invalidate the data if the argument is a pointer to const. This
change makes the analyzer more optimistic in reporting errors.
(Support for C, C++ and Obj C)
llvm-svn: 147002
Check if the input parameters are tainted (or point to tainted data) on
a checkPreStmt<CallExpr>. If the output should be tainted, record it in
the state. On post visit (checkPostStmt<CallExpr>), use the state to
make decisions (in addition to the existing logic). Use this logic for
atoi and fscanf.
llvm-svn: 146793
Some of the test cases do not currently work because the analyzer core
does not seem to call checkers for pre/post DeclRefExpr visits.
(Opened radar://10573500. To be fixed later on.)
llvm-svn: 146536
We are now often generating expressions even if the solver is not known to be able to simplify it. This is another cleanup of the existing code, where the rest of the analyzer and checkers should not base their logic on knowing ahead of the time what the solver can reason about.
In this case, CStringChecker is performing a check for overflow of 'left+right' operation. The overflow can be checked with either 'maxVal-left' or 'maxVal-right'. Previously, the decision was based on whether the expresion evaluated to undef or not. With this patch, we check if one of the arguments is a constant, in which case we know that 'maxVal-const' is easily simplified. (Another option is to use canReasonAbout() method of the solver here, however, it's currently is protected.)
This patch also contains 2 small bug fixes:
- swap the order of operators inside SValBuilder::makeGenericVal.
- handle a case when AddeVal is unknown in GenericTaintChecker::getPointedToSymbol.
llvm-svn: 146343
Fix a bug in SimpleSValBuilder, where we should swap lhs and rhs when calling generateUnknownVal(), - the function which creates symbolic expressions when data is tainted. The issue is not visible when we only create the expressions for taint since all expressions are commutative from taint perspective.
Refactor SymExpr::symbol_iterator::expand() to use a switch instead of a chain of ifs.
llvm-svn: 146336
types are equivalent.
+ A taint test which tests bitwise operations and which was
triggering an assertion due to presence of the integer to integer cast.
llvm-svn: 146240
between the casted type of the return value of a malloc/calloc/realloc
call and the operand of any sizeof expressions contained within
its argument(s).
llvm-svn: 146144
- Created a new SymExpr type - SymbolCast.
- SymbolCast is created when we don't know how to simplify a NonLoc to
NonLoc casts.
- A bit of code refactoring: introduced dispatchCast to have better
code reuse, remove a goto.
- Updated the test case to showcase the new taint flow.
llvm-svn: 145985
class.
We are going into the direction of handling SymbolData and other SymExpr
uniformly, so it makes less sense to keep two different SVal classes.
For example, the checkers would have to take an extra step to reason
about each type separately.
The classes have the same members, we were just using the SVal kind
field for easy differentiation in 3 switch statements. The switch
statements look more ugly now, but we can make the code more readable in
other ways, for example, moving some code into separate functions.
llvm-svn: 145833
ExprEngine.
Teach SimpleConstraintManager::assumeSymRel() to propagate constraints
to symbolic expressions.
+ One extra warning (real bug) is now generated due to enhanced
assumeSymRel().
llvm-svn: 145832
ConstraintManager::canReasonAbout() from the ExprEngine.
ExprEngine should not care if the constraint solver can reason about
something or not. The solver should be able to handle all the SymExprs.
To do this, the solver should be able to keep track of not only the
SymbolData but of all SymExprs. This is why we change SymbolRef to be an
alias of SymExpr*. When encountering an expression it cannot simplify,
the solver should just add the constraints to it.
llvm-svn: 145831