54 Commits

Author SHA1 Message Date
Anna Zaks d2a807d831 [analyzer] Fix an infinite recursion in region invalidation by adding block count to the BlockDataRegion.
llvm-svn: 195174
2013-11-20 00:11:42 +00:00
Jordan Rose 36bc6b4559 [analyzer] Don't even try to convert floats to booleans for now.
We now have symbols with floating-point type to make sure that
(double)x == (double)x comes out true, but we still can't do much with
these. For now, don't even bother trying to create a floating-point zero
value; just give up on conversion to bool.

PR14634, C++ edition.

llvm-svn: 190953
2013-09-18 18:58:58 +00:00
Jordan Rose acd080b956 [analyzer] Add support for testing the presence of weak functions.
When casting the address of a FunctionTextRegion to bool, or when adding
constraints to such an address, use a stand-in symbol to represent the
presence or absence of the function if the function is weakly linked.
This is groundwork for possible simple availability testing checks, and
can already catch mistakes involving inverted null checks for
weakly-linked functions.

Currently, the implementation reuses the "extent" symbols, originally created
for tracking the size of a malloc region. Since FunctionTextRegions cannot
be dereferenced, the extent symbol will never be used for anything else.
Still, this probably deserves a refactoring in the future.

This patch does not attempt to support testing the presence of weak
/variables/ (global variables), which would likely require much more of
a change and a generalization of "region structure metadata", like the
current "extents", vs. "region contents metadata", like CStringChecker's
"string length".

Patch by Richard <tarka.t.otter@googlemail.com>!

llvm-svn: 189492
2013-08-28 17:07:04 +00:00
Jordan Rose 783b11b5df [analyzer] Weaken assertion to account for pointer-to-integer casts.
PR16690

llvm-svn: 187132
2013-07-25 17:22:02 +00:00
Jordan Rose 5fded08403 [analyzer] Handle C string default values for const char * arguments.
Previously, SValBuilder knew how to evaluate StringLiterals, but couldn't
handle an array-to-pointer decay for constant values. Additionally,
RegionStore was being too strict about loading from an array, refusing to
return a 'char' value from a 'const char' array. Both of these have been
fixed.

llvm-svn: 186520
2013-07-17 17:16:38 +00:00
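As a rough illustration of the kind of code the commit above is about (the function names here are hypothetical and not taken from the patch), consider a `const char *` parameter whose default value is a string literal. When the default is used, the literal undergoes array-to-pointer decay, and reading a character loads a `char` from a `const char` array:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical example: the default argument is a string literal, which
// decays from 'const char[6]' to 'const char *' when the default is used.
std::size_t greetingLength(const char *text = "hello") {
  // Reading text[0] loads a 'char' value out of a 'const char' array.
  char first = text[0];
  return first == '\0' ? 0 : std::strlen(text);
}

// Small self-check exercising both the default and an explicit argument.
void checkGreetingLength() {
  assert(greetingLength() == 5);
  assert(greetingLength("") == 0);
}
```

This only shows the source pattern; it does not model how SValBuilder or RegionStore evaluate it.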
Anna Zaks 5416ab0156 [analyzer] Use the expression’s type instead of region’s type in ArrayToPointer decay evaluation
This gives slightly better precision, specifically, in cases where a non-typed region represents the array
or when the type is a non-array type, which can happen when an array is a result of a reinterpret_cast.

llvm-svn: 182810
2013-05-28 23:24:01 +00:00
Jordan Rose c76d7e3d96 [analyzer] Don't try to evaluate MaterializeTemporaryExpr as a constant.
...and don't consider '0' to be a null pointer constant if it's the
initializer for a float!

Apparently null pointer constant evaluation looks through both
MaterializeTemporaryExpr and ImplicitCastExpr, so we have to be more
careful about types in the callers. For RegionStore this just means giving
up a little more; for ExprEngine this means handling the
MaterializeTemporaryExpr case explicitly.

Follow-up to r180894.

llvm-svn: 180944
2013-05-02 19:51:20 +00:00
Jordan Rose 89bbd1fb64 [analyzer] Consolidate constant evaluation logic in SValBuilder.
Previously, this was scattered across Environment (literal expressions),
ExprEngine (default arguments), and RegionStore (global constants). The
former special-cased several kinds of simple constant expressions, while
the latter two deferred to the AST's constant evaluator.

Now, these are all unified as SValBuilder::getConstantVal(). To keep
Environment fast, the special cases for simple constant expressions have
been left in, but the main benefits are that (a) unusual constants like
ObjCStringLiterals now work as default arguments and global constant
initializers, and (b) we're not duplicating code between ExprEngine and
RegionStore.

This actually caught a bug in our test suite, which is awesome: we stop
tracking allocated memory if it's passed as an argument along with some
kind of callback, but not if the callback is 0. We were testing this in
a case where the callback parameter had a default value, but that value
was 0. After this change, the analyzer now (correctly) flags that as a
leak!

<rdar://problem/13773117>

llvm-svn: 180894
2013-05-01 23:10:44 +00:00
Jordan Rose dc16628c93 Re-apply "[analyzer] Model casts to bool differently from other numbers."
This doesn't appear to be the cause of the slowdown. I'll have to try a
manual bisect to see if there's really anything there, or if it's just
the bot itself taking on additional load. Meanwhile, this change helps
with correctness.

This changes an assertion and adds a test case, then re-applies r180638,
which was reverted in r180714.

<rdar://problem/13296133> and PR15863

llvm-svn: 180864
2013-05-01 18:19:59 +00:00
Jordan Rose 49f888bbab Revert "[analyzer] Model casts to bool differently from other numbers."
This seems to be causing quite a slowdown on our internal analyzer bot,
and I'm not sure why. Needs further investigation.

This reverts r180638 / 9e161ea981f22ae017b6af09d660bfc3ddf16a09.

llvm-svn: 180714
2013-04-29 17:23:03 +00:00
Jordan Rose 9661c1d18a [analyzer] Model casts to bool differently from other numbers.
Casts to bool (and _Bool) are equivalent to checks against zero,
not truncations to 1 bit or 8 bits.

This improved reasoning does cause a change in the behavior of the alpha
BoolAssignment checker. Previously, this checker complained about statements
like "bool x = y" if 'y' was known not to be 0 or 1. Now it does not, since
that conversion is well-defined. It's hard to say what the "best" behavior
here is: this conversion is safe, but might be better written as an explicit
comparison against zero.

More usefully, besides improving our model of booleans, this fixes spurious
warnings when returning the address of a local variable cast to bool.

<rdar://problem/13296133>

llvm-svn: 180638
2013-04-26 21:42:55 +00:00
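The distinction drawn in the commit above can be seen in plain C++. This is a minimal sketch with hypothetical helper names, assuming an 8-bit `unsigned char`: a conversion to bool compares against zero, while a conversion to a narrow integer type keeps only the low bits.

```cpp
#include <cassert>

// Conversion to bool is a comparison against zero.
bool toBool(int value) { return static_cast<bool>(value); }

// Conversion to an 8-bit unsigned type keeps only the low 8 bits.
unsigned char toByte(int value) { return static_cast<unsigned char>(value); }

void checkBoolCast() {
  assert(toBool(256) == true);  // 256 != 0
  assert(toByte(256) == 0);     // 256 modulo 256 is 0 (assuming 8-bit char)
  assert(toBool(2) == true);    // not truncated to 1 bit, which would give 0
}
```

A model that treated `(bool)x` as a truncation would therefore get `toBool(256)` and `toBool(2)` wrong.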
Anna Zaks 8591aa78db [analyzer] Do not crash when processing binary "?:" in C++
When computing the value of a ?: expression, we rely on the last expression in
the previous basic block being the resulting value of the expression. This is
not the case for the binary "?:" operator (GNU extension) in C++, because the
last basic block holds the expression for the condition subexpression, which is
an R-value, whereas the true subexpression is an L-value.

Note that the operator evaluation just happens to work in C since the true
subexpression is an R-value (like the condition subexpression). The CFG is the
same in the C and C++ cases, but the AST nodes are different, with the LValue to
Rvalue conversion happening after the BinaryConditionalOperator evaluation.

Changed the logic to use the last expression from the predecessor only
if it matches either the true or the false subexpression. Note, the logic needed
fortification anyway: L and R were passed but not even used by the function.

Also, change the conjureSymbolVal to correctly compute the type, when the
expression is an LG-value.

llvm-svn: 179574
2013-04-15 22:38:07 +00:00
Jordan Rose 61e221f68d [analyzer] Replace isIntegerType() with isIntegerOrEnumerationType().
Previously, the analyzer used isIntegerType() everywhere, which uses the C
definition of "integer". The C++ predicate with the same behavior is
isIntegerOrUnscopedEnumerationType().

However, the analyzer is /really/ using this to ask if it's some sort of
"integrally representable" type, i.e. it should include C++11 scoped
enumerations as well. hasIntegerRepresentation() sounds like the right
predicate, but that includes vectors, which the analyzer represents by its
elements.

This commit audits all uses of isIntegerType() and replaces them with the
general isIntegerOrEnumerationType(), except in some specific cases where
it makes sense to exclude scoped enumerations, or any enumerations. These
cases now use isIntegerOrUnscopedEnumerationType() and getAs<BuiltinType>()
plus BuiltinType::isInteger().

isIntegerType() is hereby banned in the analyzer - lib/StaticAnalysis and
include/clang/StaticAnalysis. :-)

Fixes real assertion failures. PR15703 / <rdar://problem/12350701>

llvm-svn: 179081
2013-04-09 02:30:33 +00:00
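The three categories discussed in the commit above can be approximated with standard library type traits. This is only an analogy under stated assumptions: the variable templates below are hypothetical stand-ins written against `<type_traits>` (C++14 or later), not Clang's `Type` predicates, and they are named after those predicates purely for readability.

```cpp
#include <type_traits>

enum Unscoped { UnscopedValue };
enum class Scoped { ScopedValue };

// Hypothetical analogue: integers plus any enumeration, scoped or unscoped.
template <typename T>
constexpr bool isIntegerOrEnumeration =
    std::is_integral<T>::value || std::is_enum<T>::value;

// Hypothetical analogue: integers plus enumerations that convert implicitly
// to int, which is true of unscoped enumerations but not of scoped ones.
template <typename T>
constexpr bool isIntegerOrUnscopedEnumeration =
    std::is_integral<T>::value ||
    (std::is_enum<T>::value && std::is_convertible<T, int>::value);

static_assert(isIntegerOrEnumeration<Scoped>, "scoped enums are included");
static_assert(!isIntegerOrUnscopedEnumeration<Scoped>, "scoped enums excluded");
static_assert(isIntegerOrUnscopedEnumeration<Unscoped>, "unscoped enums included");
```

The point of the sketch is that a C++11 scoped enumeration is "integrally representable" yet fails the narrower, C-style notion of an integer type.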
David Blaikie 05785d1622 Include llvm::Optional in clang/Basic/LLVM.h
Post-commit CR feedback from Jordan Rose regarding r175594.

llvm-svn: 175679
2013-02-20 22:23:23 +00:00
David Blaikie 2fdacbc5b0 Replace SVal llvm::cast support to be well-defined.
See r175462 for another example/more details.

llvm-svn: 175594
2013-02-20 05:52:05 +00:00
Anna Zaks fe9c7c87c9 [analyzer] Teach the analyzer to use a symbol for p when evaluating
(void*)p.

Addresses the false positives similar to the test case.

llvm-svn: 174436
2013-02-05 19:52:28 +00:00
Chandler Carruth 3a02247dc9 Sort all of Clang's files under 'lib', and fix up the broken headers
uncovered.

This required manually correcting all of the incorrect main-module
headers I could find, and running the new llvm/utils/sort_includes.py
script over the files.

I also manually added quite a few missing headers that were uncovered by
shuffling the order or moving headers up to be main-module-headers.

llvm-svn: 169237
2012-12-04 09:13:33 +00:00
Ted Kremenek d227833cba Rename 'getConjuredSymbol*' to 'conjureSymbol*'.
No need to have the "get", the word "conjure" is a verb too!
Getting a conjured symbol is the same as conjuring one up.

This shortening is largely cosmetic, but just this simple change
cleaned up a handful of lines, making them less verbose.

llvm-svn: 162348
2012-08-22 06:26:06 +00:00
Ted Kremenek 72b3452c2b Implement initial static analysis inlining support for C++ methods.
llvm-svn: 159047
2012-06-22 23:55:50 +00:00
Anna Zaks 3563fde6a0 [analyzer] Anti-aliasing: different heap allocations do not alias
Add a concept of symbolic memory region belonging to heap memory space.
When comparing symbolic regions allocated on the heap, assume that they
do not alias. 

Use symbolic heap region to suppress a common false positive pattern in
the malloc checker, in code that relies on malloc not returning the
memory aliased to other malloc allocations or to the stack.

llvm-svn: 158136
2012-06-07 03:57:32 +00:00
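The assumption described in the commit above matches what programs can observe at run time. Below is a minimal sketch (the function name is hypothetical) of the pattern: two live `malloc` allocations never share an address, and neither aliases a stack variable.

```cpp
#include <cassert>
#include <cstdlib>

// Returns true when two live heap allocations are distinct from each other
// and from a local (stack) variable.
bool liveAllocationsAreDistinct() {
  int local = 0;
  void *first = std::malloc(sizeof(int));
  void *second = std::malloc(sizeof(int));
  bool distinct = true;
  // Only compare when both allocations succeeded.
  if (first != nullptr && second != nullptr) {
    distinct = first != second &&
               first != static_cast<void *>(&local) &&
               second != static_cast<void *>(&local);
  }
  std::free(first);   // free(nullptr) is a no-op
  std::free(second);
  return distinct;
}
```

Code that branches on `first == second` in such a situation has an infeasible path, which is the kind of path a false positive would otherwise be reported on.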
Anna Zaks d0867105f4 [analyzer] Treat cast of array to reference in the same way as array to
pointer.

Fixes one of the crashes reported in PR12874.

llvm-svn: 157401
2012-05-24 17:31:57 +00:00
Anna Zaks f0e9ca8604 [analyzer] Do not assert on constructing SymSymExpr with diff types.
The resulting type info is stored in the SymSymExpr, so no reason not to
support construction of expression with different subexpression types.

llvm-svn: 156051
2012-05-03 02:13:53 +00:00
Anna Zaks 1d3d51a6e6 [analyzer] Add a complexity bound on history tracking.
(Currently, this is only relevant for tainted data.)

llvm-svn: 156050
2012-05-03 02:13:50 +00:00
Anna Zaks 7124b4b124 [analyzer] Revert the functional part of r155944.
The change resulted in multiple issues on the buildbot, so it's not
ready for prime time. Only enable history tracking for tainted
data (which is experimental) for now.

llvm-svn: 156049
2012-05-03 02:13:46 +00:00
Anna Zaks 06be9117bf [analyzer] Fix an assertion failure triggered by the analyzer buildbot.
llvm-svn: 155964
2012-05-02 00:05:23 +00:00
Ted Kremenek f56d4f2991 Teach SValBuilder to handle casts of symbolic pointer values to an integer twice. Fixes <rdar://problem/11212866>.
llvm-svn: 155950
2012-05-01 21:58:29 +00:00
Anna Zaks b35437a85e [analyzer] Construct a SymExpr even when the constraint solver cannot
reason about the expression.

This essentially keeps more history about how symbolic values were
constructed. As an optimization, previous to this commit, we only kept
the history if one of the symbols was tainted, but it's valuable to keep
the history around for other purposes as well: it allows us to avoid
constructing conjured symbols.

Specifically, we need to identify the value of ptr as
ElementRegion (result of pointer arithmetic) in the following code.
However, before this commit '(2-x)' evaluated to an Unknown value, and as
a result, 'p + (2-x)' evaluated to an Unknown value as well.

int *p = malloc(sizeof(int));
ptr = p + (2-x);

This change brings 2% slowdown on sqlite. Fixes radar://11329382.

llvm-svn: 155944
2012-05-01 21:10:26 +00:00
Ted Kremenek 8fdb59f979 [analyzer] Fix a regression in which the analyzer did NOT actually abort on Stmts it doesn't understand. We registered
the analysis as aborted, but didn't treat such cases as sinks in the ExplodedGraph.

Along the way, add basic support for CXXCatchStmt, expanding the set of code we actually analyze (hopefully correctly).

Fixes: <rdar://problem/10892489>
llvm-svn: 152468
2012-03-10 01:34:17 +00:00
Ted Kremenek d519cae8aa Have conjured symbols depend on LocationContext, to add context sensitivity for functions called more than once.
llvm-svn: 150849
2012-02-17 23:13:45 +00:00
Benjamin Kramer 11764ab4c0 StaticAnalyzer: Move ObjC- and CXX-specific methods out of line so checkers that don't care about the language don't have to pull in all the headers.
llvm-svn: 149178
2012-01-28 12:06:22 +00:00
Ted Kremenek 49b1e38e4b Change references to 'const ProgramState *' to typedef 'ProgramStateRef'.
At this point this is largely cosmetic, but it opens the door to replace
ProgramStateRef with a smart pointer that more eagerly acts in the role
of reclaiming unused ProgramState objects.

llvm-svn: 149081
2012-01-26 21:29:00 +00:00
Anna Zaks cb6d4ee793 [analyzer] Unwrap the pointers when ignoring the const cast.
radar://10686991

llvm-svn: 148081
2012-01-13 00:56:55 +00:00
David Blaikie 68e081d606 Unweaken vtables as per http://llvm.org/docs/CodingStandards.html#ll_virtual_anch
llvm-svn: 146959
2011-12-20 02:48:34 +00:00
Anna Zaks c95a6c4c9f [analyzer] Address Jordy's comments for r145985.
llvm-svn: 146683
2011-12-15 21:33:26 +00:00
Anna Zaks 7c96b7db96 [analyzer] CStringChecker should not rely on the analyzer generating UndefOrUnknown value when it cannot reason about the expression.
We are now often generating expressions even if the solver is not known to be able to simplify them. This is another cleanup of the existing code: the rest of the analyzer and the checkers should not base their logic on knowing ahead of time what the solver can reason about.

In this case, CStringChecker is performing a check for overflow of the 'left+right' operation. The overflow can be checked with either 'maxVal-left' or 'maxVal-right'. Previously, the decision was based on whether the expression evaluated to undef or not. With this patch, we check if one of the arguments is a constant, in which case we know that 'maxVal-const' is easily simplified. (Another option is to use the canReasonAbout() method of the solver here; however, it is currently protected.)

This patch also contains 2 small bug fixes:
 - swap the order of operands inside SValBuilder::makeGenericVal.
 - handle a case when AddrVal is unknown in GenericTaintChecker::getPointedToSymbol.

llvm-svn: 146343
2011-12-11 18:43:40 +00:00
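The overflow test mentioned in the commit above can be written without ever computing `left + right`. A minimal sketch for unsigned sizes follows; the function names are hypothetical and this is not CStringChecker's code, which works on symbolic values rather than concrete ones.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// left + right overflows exactly when left > maxVal - right.
// The subtraction itself cannot overflow because right <= maxVal.
bool additionOverflows(std::size_t left, std::size_t right) {
  const std::size_t maxVal = std::numeric_limits<std::size_t>::max();
  return left > maxVal - right;
}

// The symmetric form gives the same answer; which one is cheaper to reason
// about depends on which operand is a known constant.
bool additionOverflowsSymmetric(std::size_t left, std::size_t right) {
  const std::size_t maxVal = std::numeric_limits<std::size_t>::max();
  return right > maxVal - left;
}

void checkAdditionOverflows() {
  const std::size_t maxVal = std::numeric_limits<std::size_t>::max();
  assert(!additionOverflows(1, 2));
  assert(additionOverflows(maxVal, 1));
  assert(additionOverflows(maxVal, 1) == additionOverflowsSymmetric(maxVal, 1));
}
```

When one operand is a constant, subtracting that constant from `maxVal` folds to another constant, which is why the patch prefers that side.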
Anna Zaks 170fdf1b5a [analyzer] Fixup r146336.
Forgot to commit the header files.
Rename generateUnknownVal -> makeGenericVal.

llvm-svn: 146337
2011-12-10 23:42:38 +00:00
Anna Zaks ecd730085d [analyzer] Introduce IntSymExpr, where the integer is on the lhs.
Fix a bug in SimpleSValBuilder, where we should swap lhs and rhs when calling generateUnknownVal(), the function which creates symbolic expressions when data is tainted. The issue is not visible when we only create the expressions for taint, since all expressions are commutative from the taint perspective.

Refactor SymExpr::symbol_iterator::expand() to use a switch instead of a chain of ifs.

llvm-svn: 146336
2011-12-10 23:36:51 +00:00
Anna Zaks 6af472aa3b [analyzer] Fix inconsistency on when SValBuilder assumes that 2
types are equivalent.

+ A taint test which tests bitwise operations and which was
triggering an assertion due to presence of the integer to integer cast.

llvm-svn: 146240
2011-12-09 03:34:02 +00:00
Anna Zaks c25efccc8b [analyzer] Propagate taint through NonLoc to NonLoc casts.
- Created a new SymExpr type: SymbolCast.
- SymbolCast is created when we don't know how to simplify a NonLoc to
  NonLoc cast.
- A bit of code refactoring: introduced dispatchCast to have better
  code reuse and to remove a goto.
- Updated the test case to showcase the new taint flow.

llvm-svn: 145985
2011-12-06 23:12:27 +00:00
Anna Zaks d066f79c80 [analyzer] Unify SymbolVal and SymExprVal under a single SymbolVal
class.

We are going into the direction of handling SymbolData and other SymExpr
uniformly, so it makes less sense to keep two different SVal classes.
For example, the checkers would have to take an extra step to reason
about each type separately.

The classes have the same members, we were just using the SVal kind
field for easy differentiation in 3 switch statements. The switch
statements look more ugly now, but we can make the code more readable in
other ways, for example, moving some code into separate functions.

llvm-svn: 145833
2011-12-05 18:58:30 +00:00
Anna Zaks 951d205aec [analyzer] Minor cleanup of SValBuilder: Comments + code reuse.
llvm-svn: 145274
2011-11-28 20:43:37 +00:00
Anna Zaks 457c68726c [analyzer] Warn when non-pointer arguments are passed to scanf (only when running taint checker).
There is an open radar to implement better scanf checking as a Sema warning. However, a bit of redundancy is fine in this case.

llvm-svn: 144964
2011-11-18 02:26:36 +00:00
Anna Zaks 040ddfedc0 [analyzer] Do not conjure a symbol when we need to propagate taint.
When the solver and SValBuilder cannot reason about a symbolic expression (e.g., (x+1)*y), the analyzer conjures a new symbol with no ties to the past. This helps it to recover some path-sensitivity. However, this breaks the taint propagation.

With this commit, if an operand is tainted, we construct the expression even when we cannot reason about it later on.

Also added some comments and asserts.

llvm-svn: 144932
2011-11-17 23:07:28 +00:00
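Why conjuring breaks taint propagation can be shown with a toy model. Everything below is a hypothetical sketch (the `Sym` struct and helper names are invented for illustration and are unrelated to the analyzer's real SymExpr classes): a conjured symbol has no operands to inspect, whereas a constructed expression still reaches its tainted leaf.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Toy symbolic value: a leaf symbol, or a binary expression over two values.
struct Sym {
  std::string name;
  bool tainted = false;           // only meaningful for leaves
  std::shared_ptr<Sym> lhs, rhs;  // both null for leaves
};
using SymRef = std::shared_ptr<Sym>;

SymRef leaf(const std::string &name, bool tainted) {
  auto s = std::make_shared<Sym>();
  s->name = name;
  s->tainted = tainted;
  return s;
}

// Option 1: conjure a fresh symbol with no ties to the operands.
SymRef conjure() { return leaf("conj", false); }

// Option 2: keep the expression, even if nothing can simplify it.
SymRef makeExpr(const std::string &op, SymRef l, SymRef r) {
  auto s = std::make_shared<Sym>();
  s->name = op;
  s->lhs = std::move(l);
  s->rhs = std::move(r);
  return s;
}

// A value is tainted if it is a tainted leaf or has a tainted operand.
bool isTainted(const SymRef &s) {
  if (!s) return false;
  if (!s->lhs && !s->rhs) return s->tainted;
  return isTainted(s->lhs) || isTainted(s->rhs);
}

void checkTaintPropagation() {
  SymRef x = leaf("x", true);   // tainted input
  SymRef y = leaf("y", false);
  assert(isTainted(makeExpr("*", x, y)));  // history kept, taint visible
  assert(!isTainted(conjure()));           // history lost, taint lost
}
```

In this model the trade-off is also visible: keeping the expression costs memory and traversal time, which is consistent with the later commits in this log that bound or restrict history tracking.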
Ted Kremenek 81ce1c8a99 Rename AnalysisContext to AnalysisDeclContext. Not only is this name more accurate, but it frees up the name AnalysisContext for other uses.
llvm-svn: 142782
2011-10-24 01:32:45 +00:00
Anna Zaks 295208d744 [analyzer] Fix a new failure encountered while building Adium, exposed as a result of r138196 (radar://10087620). An Objective-C property of type int has a value of type ObjCPropRef, which is a Loc.
llvm-svn: 139507
2011-09-12 17:56:08 +00:00
Ted Kremenek 001fd5b498 Rename GRState to ProgramState, and clean up some code formatting along the way.
llvm-svn: 137665
2011-08-15 22:09:50 +00:00
Ted Kremenek 5ef32dbf2a Clean up various declarations of 'Stmt*' to be 'Stmt *', etc. in libAnalyzer and libStaticAnalyzer[*]. It was highly inconsistent, and very ugly to look at.
llvm-svn: 137537
2011-08-12 23:37:29 +00:00
Ted Kremenek 8df44b2632 [analyzer] Introduce new MemRegion, "TypedValueRegion", so that we can separate TypedRegions that implement getValueType() from those that don't.
Patch by Olaf Krzikalla!

llvm-svn: 137498
2011-08-12 20:02:48 +00:00
Zhanyong Wan 5ad574c096 Improves the coding style in SValBuilder. This patch:
- renames evalCastNL and evalCastL to evalCastFromNonLoc and
  evalCastFromLoc (avoid abbreviations that aren't well known).

- makes all function parameter names start with a lower case letter
  for consistency and distinction from member variables.

- avoids abbreviations in function parameter names.

Reviewed by kremenek@apple.com.

llvm-svn: 126722
2011-03-01 00:45:32 +00:00
Argyrios Kyrtzidis eb8357c1d8 [analyzer] Fix crash when analyzing C++ code.
llvm-svn: 126025
2011-02-19 08:03:18 +00:00