handling the parsing of scanf format strings and hooking the checking into Sema.
Most of this checking logic piggybacks on what was already there for checking printf format
strings, but the checking logic has been refactored to support both.
What is left to be done is to support argument type checking in format strings and of course
fix the usual tail of bugs that will follow.
llvm-svn: 108500
types, updating callers of both isFloatingType() and
isRealFloatingType() accordingly. Caught at least one issue where we
allowed one to declare a vector of vectors (!), along with cleaning up
the standard-conversion logic for C++.
llvm-svn: 106595
- Precision toStrings shouldn't print a dot when they have no value.
- Length of char length modifier is now returned correctly.
- Added several fixit tests.
Note: fixit tests are currently broken due to a bug in HighlightRange. Marking as XFAIL for now.
M test/Sema/format-strings-fixit.c
M include/clang/Analysis/Analyses/PrintfFormatString.h
M lib/Analysis/PrintfFormatString.cpp
llvm-svn: 106275
- Added warning for undefined behavior when using field specifier
- Added warning for undefined behavior when using length modifier
- Fixed warnings for invalid flags
- Added warning for ignored flags
- Added fixits for the above warnings
- Fixed accuracy of detecting several undefined behavior conditions
- Receive normal warnings in addition to security warnings when using %n
- Fix bug where '+' flag would remain on unsigned conversion suggestions
Summary of changes:
- Added expanded tests
- Added/expanded warnings
- Added position info to OptionalAmounts for fixits
- Extracted optional flags to a wrapper class with position info for fixits
- Added several methods to validate a FormatSpecifier by component, each checking for undefined behavior
- Fixed conversion specifier checking to conform to C99 standard
- Added hooks to detect the invalid states in CheckPrintfHandler::HandleFormatSpecifier
Note: warnings involving the ' ' (space) flag are temporarily disabled until whitespace highlighting no longer triggers assertions. I will make a post about this on cfe-dev shortly.
M test/Sema/format-strings.c
M include/clang/Basic/DiagnosticSemaKinds.td
M include/clang/Analysis/Analyses/PrintfFormatString.h
M lib/Analysis/PrintfFormatString.cpp
M lib/Sema/SemaChecking.cpp
llvm-svn: 106233
- Added some handling of flags that become invalid when changing the conversion specifier.
- Changed fixit behavior to remove unnecessary length modifiers.
- Separated some tests out and added some comments.
modified:
lib/Analysis/PrintfFormatString.cpp
test/Sema/format-strings-fixit.c
llvm-svn: 105807
- Refactored LengthModifier to be a class.
- Added toString methods in all member classes of FormatSpecifier.
- FixIt suggestions keep user specified flags unless incorrect.
Limitations:
- The suggestions are not conversion specifier sensitive. For example, if we have a 'pad with zeroes' flag, and the correction is a string conversion specifier, we do not remove the flag. Clang will warn us on the next compilation.
A test/Sema/format-strings-fixit.c
M include/clang/Analysis/Analyses/PrintfFormatString.h
M lib/Analysis/PrintfFormatString.cpp
M lib/Sema/SemaChecking.cpp
llvm-svn: 105680
The macros required for DeclNodes use have changed to match the use of
StmtNodes. The FooFirst enumerator constants have been named firstFoo
to match usage elsewhere.
llvm-svn: 105165
This introduces FunctionType::ExtInfo to hold the calling convention and the
noreturn attribute. The next patch will extend it to include the regparm
attribute and fix the bug.
llvm-svn: 99920
After discussion with Zhongxing, don't force the initializer of DeclStmts to be
block-level expressions.
This led to some interesting fallout:
[UninitializedValues]
Always visit the initializer of DeclStmts (do not assume they are block-level expressions).
[BasicStore]
With initializers of DeclStmts no longer block-level expressions, this causes self-referencing initializers (e.g. 'int x = x') to no longer cause the initialized variable to be live before the DeclStmt. While this is correct, it caused BasicStore::RemoveDeadBindings() to prune off the values of these variables from the initial store (where they are set to uninitialized). The fix is to back-port some (and only some) of the lazy-binding logic from RegionStore to
BasicStore. Now the default values of local variables are determined lazily as opposed
to explicitly initialized.
llvm-svn: 97591
Sema and into analyze_printf::ParseFormatString(). Also use a bitvector to determine
what arguments have been covered (instead of just checking to see if the last argument consumed is the max argument). This is prep. for support positional arguments (an IEEE extension).
llvm-svn: 97248
- Add ConversionSpecifier::consumesDataArgument() as a helper method
to determine if a conversion specifier requires a matching argument.
- Add support for glibc-specific '%m' conversion
- Add an extra callback to HandleNull() for locations within the
format specifier that have a null character
llvm-svn: 94834
In addition, move ParseFormatString() and FormatStringHandler() from
the clang::analyze_printf to the clang namespace. Hopefully this will
resolve some link errors on Linux.
llvm-svn: 94794
strings than what we currently have in Sema. This is both an
experiment and a WIP.
The idea is simple: parse the format string incrementally,
constructing a well-structure representation of each format specifier.
Each format specifier is then handed back one-by-one to a client via a
callback. Malformed format strings are also handled with callbacks.
The idea is to separate the parsing of the format string from the
emission of diagnostics. Currently what we have in Sema for handling
format strings is a mongrel of both that is hard to follow and
difficult to modify (I can apply this label since I'm the original
author of that code).
This is in libAnalysis as it is reasonable generic and can potentially
be used both by libSema and libChecker.
Comments welcome.
llvm-svn: 94702
(1) libAnalysis is a generic analysis library that can be used by
Sema. It defines the CFG, basic dataflow analysis primitives, and
inexpensive flow-sensitive analyses (e.g. LiveVariables).
(2) libChecker contains the guts of the static analyzer, incuding the
path-sensitive analysis engine and domain-specific checks.
Now any clients that want to use the frontend to build their own tools
don't need to link in the entire static analyzer.
This change exposes various obvious cleanups that can be made to the
layout of files and headers in libChecker. More changes pending. :)
This change also exposed a layering violation between AnalysisContext
and MemRegion. BlockInvocationContext shouldn't explicitly know about
BlockDataRegions. For now I've removed the BlockDataRegion* from
BlockInvocationContext (removing context-sensitivity; although this
wasn't used yet). We need to have a better way to extend
BlockInvocationContext (and any LocationContext) to add
context-sensitivty.
llvm-svn: 94406
CallExprs as those edges help cause a n^2 explosion in the number of
destructor calls. Other consumers, such as static analysis, that
would like to have more a more complete CFG can select the inclusion
of those edges as CFG build time.
This also fixes up the two compilation users of CFGs to be tolerant of
having or not having those edges. All catch code is assumed be to
live if we didn't generate the exceptional edges for CallExprs.
llvm-svn: 94074
"ASTContext::getTypeSize() / 8". Replace [u]int64_t variables with CharUnits
ones as appropriate.
Also rename RawType, fromRaw(), and getRaw() in CharUnits to QuantityType,
fromQuantity(), and getQuantity() for clarity.
llvm-svn: 93153
value bindings. Along with a small change to OSAtomicChecker, this
resolves <rdar://problem/7527292> and resolves some long-standing
issues with how values can be bound to the same physical address by
not have the same "key". This change is only a beginning; logically
RegionStore needs to better handle loads from addresses where the
stored value is larger/smaller/different type than the loaded value.
We handle these cases in an approximate fashion now (via
CastRetrievedVal and help in SimpleSValuator), but it could be made
much smarter.
llvm-svn: 93137
(1) Introduce a new 'BindingKey' class to match 'BindingValue'. This
gives us the flexibility to change the current key value from 'const
MemRegion*' to something more interesting.
(2) Rework additions/removals/lookups from the store to use new
'Remove', 'Add', 'Lookup' utility methods.
No "real" functionality change; just prep work and abstraction.
llvm-svn: 93136
CallExpr/ObjCMessageExpr can be visited in an "lvalue" context if it
returns a struct temporary. Currently the analyzer doesn't reason
about struct temporary returned by function calls, but we shouldn't
crash here either.
llvm-svn: 93081
when the default case is winnowed down to be infeasible. When all
cases were ruled out (and the analysis state for the default case
would be infeasible) we would still consider the default case
possible. This fixes PR 5969.
llvm-svn: 93017
GRStateManager. Having these references was an abstraction violation,
as they really should only be known about GRExprEngine.
This change required adding a new 'ProcessAssume' callback in
GRSubEngine. GRExprEngine implements this callback by calling
'EvalAssume' on all registered Checker objects as well as the
registered GRTransferFunc object.
llvm-svn: 92549
Add new states for symbolic regions tracked by malloc checker. This enables us
to do malloc checking more accurately. See test case.
Based on Lei Zhang's patch and discussion.
llvm-svn: 92342
This change was a lot bigger than I originally anticipated; among
other things it requires us storing more information in the CFG to
record what block-level expressions need to be evaluated as lvalues.
The big change is that CFGBlocks no longer contain Stmt*'s by
CFGElements. Currently CFGElements just wrap Stmt*, but they also
store a bit indicating whether the block-level expression should be
evalauted as an lvalue. DeclStmts involving the initialization of a
reference require us treating the initialization expression as an
lvalue, even though that information isn't recorded in the AST.
Conceptually this change isn't that complicated, but it required
bubbling up the data through the CFGBuilder, to GRCoreEngine, and
eventually to GRExprEngine.
The addition of CFGElement is also useful for when we want to handle
more control-flow constructs or other data we want to keep in the CFG
that isn't represented well with just a block of statements.
In GRExprEngine, this patch introduces logic for evaluating the
lvalues of references, which currently retrieves the internal "pointer
value" that the reference represents. EvalLoad does a two stage load
to catch null dereferences involving an invalid reference (although
this could possibly be caught earlier during the initialization of a
reference).
Symbols are currently symbolicated using the reference type, instead
of a pointer type, and special handling is required creating
ElementRegions that layer on SymbolicRegions (see the changes to
RegionStoreManager).
Along the way, the DeadStoresChecker also silences warnings involving
dead stores to references. This was the original change I introduced
(which I wrote test cases for) that I realized caused GRExprEngine to
crash.
llvm-svn: 91501
Remove isPod() from DenseMapInfo, splitting it out to its own
isPodLike type trait. This is a generally useful type trait for
more than just DenseMap, and we really care about whether something
acts like a pod, not whether it really is a pod.
llvm-svn: 91422
now, don't construct CFGs that contain C++ try/catch statements, and
have GRExprEngine abort a path if it encounters a C++ construct it
doesn't understand (which is mostly everything at this point).
llvm-svn: 91389
Otherwise, even when real evaluation occurs, the previous fake auto
transitions would still be in the destination set, causing fake state
bifurcation.
llvm-svn: 90967
by the test case in PR 5627. Essentially we shouldn't clear the
ExplodedNodeSet where we deposit newly constructed nodes if that set
is the 'Dst' set passed in. It is not okay to clear that set because
it may already contain nodes.
llvm-svn: 90931
- Refactor the MemRegion hierarchy to distinguish between different StackSpaceRegions for locals and parameters.
- VarRegions for "captured" variables now have the BlockDataRegion as their super region (except those passed by reference)
- Add transfer function support to GRExprEngine for BlockDeclRefExprs.
This change also supports analyzing blocks as an analysis entry point
(top-of-the-stack), which required pushing more context-sensitivity
around in the MemRegion hierarchy via the use of LocationContext
objects. Functionally almost everything is the same, except we track
LocationContexts in a few more areas and StackSpaceRegions now refer
to a StackFrameContext object. In the future we will need to modify
MemRegionManager to allow multiple StackSpaceRegions in flight at once
(for the analysis of multiple stack frames).
llvm-svn: 90809
handler to this interface.
GRExprEngine::CheckerEvalCall() will return true if one of the checkers has
processed the node. In the future this might return void when we have some
default checker.
llvm-svn: 90755
we don't need to use the DoneEvaluation hack when check for
ObjCMessageExpr.
PreVisitObjCMessageExpr() only checks for undefined receiver or arguments.
Add checker interface EvalNilReceiver(). This is a 'once-and-done' interface.
llvm-svn: 90296