we are encountering some scalability issues with memory usage. The
appropriate long term fix is to make the analysis more scalable, but
this will at least prevent the analyzer swapping when
analyzing very large functions.
llvm-svn: 159578
properly if there is a join point in the control flow graph that involves
a trylock. Also changes the source locations of some warnings to be
more consistent.
llvm-svn: 159008
express library-level dependencies within Clang.
This is no more verbose really, and plays nicer with the rest of the
CMake facilities. It should also have no change in functionality.
llvm-svn: 158888
In addition, I've made the pointer and reference typedef 'void' rather than T*
just so they can't get misused. I would've omitted them entirely but
std::distance likes them to be there even if it doesn't use them.
This rolls back r155808 and r155869.
Review by Doug Gregor incorporating feedback from Chandler Carruth.
llvm-svn: 158104
-Wsometimes-uninitialized. This detects cases where an explicitly-written branch
inevitably leads to an uninitialized variable use (so either the branch is dead
code or there is an uninitialized use bug).
This chunk of warnings tentatively lives within -Wuninitialized, in order to
give it more visibility to existing Clang users.
llvm-svn: 157458
For "%hhx", printf expects an unsigned char. This makes Clang
accept a 'char' argument for that also when using -funsigned-char.
This fixes PR12761.
llvm-svn: 156388
Teach ASTContext about WIntType, and have it taken from TargetInfo like WCharType. Should fix test/Sema/format-strings.c for ARM, with the exception of one subtest which will fail if wint_t and wchar_t are the same size and wint_t is signed, wchar_t is unsigned.
There'll be a followup commit to fix that.
Reviewed by Chandler and Hans at http://llvm.org/reviews/r/8
llvm-svn: 156165
cases in switch statements. Also add a [[clang::fallthrough]] attribute, which
can be used to suppress the warning in the case of intentional fallthrough.
Patch by Alexander Kornienko!
The handling of C++11 attribute namespaces in this patch is temporary, and will
be replaced with a cleaner mechanism in a subsequent patch.
llvm-svn: 156086
filter_decl_iterator had a weird mismatch where both op* and op-> returned T*
making it difficult to generalize this filtering behavior into a reusable
library of any kind.
This change errs on the side of value, making op-> return T* and op* return
T&.
(reviewed by Richard Smith)
llvm-svn: 155808
g++4.7, which reuses stack space allocated for temporaries. CFGElement::getAs
returns a suitably-cast version of 'this'. Patch by Markus Trippelsdorf!
No test: this code has the same observable behavior as the old code when built
with most compilers, and the tests were already failing when built with a
compiler for which this produced a broken binary.
llvm-svn: 155803
This is needed to ensure that we always report issues in the correct
function. For example, leaks are identified when we call remove dead
bindings. In order to make sure we report a callee's leak in the callee,
we have to run the operation in the callee's context.
This change required quite a bit of infrastructure work since:
- We used to only run remove dead bindings before a given statement;
here we need to run it after the last statement in the function. For
this, we added additional Program Point and special mode in the
SymbolReaper to remove all symbols in context lower than the current
one.
- The call exit operation turned into a sequence of nodes, which are
now guarded by CallExitBegin and CallExitEnd nodes for clarity and
convenience.
(Sorry for the long diff.)
llvm-svn: 155244
We have a new flavor of exception specification, EST_Uninstantiated. A function
type with this exception specification carries a pointer to a FunctionDecl, and
the exception specification for that FunctionDecl is instantiated (if needed)
and used in the place of the function type's exception specification.
When a function template declaration with a non-trivial exception specification
is instantiated, the specialization's exception specification is set to this
new 'uninstantiated' kind rather than being instantiated immediately.
Expr::CanThrow has migrated onto Sema, so it can instantiate exception specs
on-demand. Also, any odr-use of a function triggers the instantiation of its
exception specification (the exception specification could be needed by IRGen).
In passing, fix two places where a DeclRefExpr was created but the corresponding
function was not actually marked odr-used. We used to get away with this, but
don't any more.
Also fix a bug where instantiating an exception specification which refers to
function parameters resulted in a crash. We still have the same bug in default
arguments, which I'll be looking into next.
This, plus a tiny patch to fix libstdc++'s common_type, is enough for clang to
parse (and, in very limited testing, support) all of libstdc++4.7's standard
headers.
llvm-svn: 154886
attached. Since we do not support any attributes which appertain to a statement
(yet), testing of this is necessarily quite minimal.
Patch by Alexander Kornienko!
llvm-svn: 154723
We should not deserialize unused declarations from the PCH file. Achieve
this by storing the top level declarations during parsing
(HandleTopLevelDecl ASTConsumer callback) and analyzing/building a call
graph only for those.
Tested the patch on a sample ObjC file that uses PCH. With the patch,
the analyzes is 17.5% faster and clang consumes 40% less memory.
Got about 10% overall build/analyzes time decrease on a large Objective
C project.
A bit of CallGraph refactoring/cleanup as well..
llvm-svn: 154625
during construction of branches for chained logical operators.
This makes -fsyntax-only for test/Sema/many-logical-ops.c about 32x times faster.
With measuring SemaExpr.cpp I see differences below the noise level.
llvm-svn: 153297
Specifically, we use the last store of the leaked symbol in the leak diagnostic.
(No support for struct fields since the malloc checker doesn't track those
yet.)
+ Infrastructure to track the regions used in store evaluations.
This approach is more precise than iterating the store to
obtain the region bound to the symbol, which is used in RetainCount
checker. The region corresponds to what is uttered in the code in the
last store and we do not rely on the store implementation to support
this functionality.
llvm-svn: 153212
The one difference between ObjCMethodDecl::getMethodFamily and Selector::getMethodFamily is that the former will do some additional sanity checking, and since CoreFoundation types don't look like Objective-C objects, an otherwise interesting method will get a method family of OMF_None. Future clients that use method families should consider how they want to handle CF types.
llvm-svn: 153000
track whether the referenced declaration comes from an enclosing
local context. I'm amenable to suggestions about the exact meaning
of this bit.
llvm-svn: 152491
as aborted, but didn't treat such cases as sinks in the ExplodedGraph.
Along the way, add basic support for CXXCatchStmt, expanding the set of code we actually analyze (hopefully correctly).
Fixes: <rdar://problem/10892489>
llvm-svn: 152468
This renames the -Wformat-non-standard flag to -Wformat-non-iso,
rewords the current warnings a bit (pointing out that a format string
is not supported by ISO C rather than being "non standard"),
and adds a warning about positional arguments.
llvm-svn: 152403
The final graph contains a single root node, which is a parent of all externally available functions(and 'main'). As well as a list of Parentless/Unreachable functions, which are either truly unreachable or are unreachable due to our analyses imprecision.
The analyzer checkers debug.DumpCallGraph or debug.ViewGraph can be used to look at the produced graph.
Currently, the graph is not very precise, for example, it entirely skips edges resulted from ObjC method calls.
llvm-svn: 152272
analysis to make the AST representation testable. They are represented by a
new UserDefinedLiteral AST node, which is a sugared CallExpr. All semantic
properties, including full CodeGen support, are achieved for free by this
representation.
UserDefinedLiterals can never be dependent, so no custom instantiation
behavior is required. They are mangled as if they were direct calls to the
underlying literal operator. This matches g++'s apparent behavior (but not its
actual mangling, which is broken for literal-operator-ids).
User-defined *string* literals are now fully-operational, but the semantic
analysis is quite hacky and needs more work. No other forms of user-defined
literal are created yet, but the AST support for them is present.
This patch committed after midnight because we had already hit the quota for
new kinds of literal yesterday.
llvm-svn: 152211
This adds the -Wformat-non-standard flag (off by default,
enabled by -pedantic), which warns about non-standard
things in format strings (such as the 'q' length modifier,
the 'S' conversion specifier, etc.)
llvm-svn: 151154
This is in preparation for being able to warn about 'q' and other
non-standard format string features.
It also allows us to print its name correctly.
llvm-svn: 150697
This commit makes PrintfSpecifier::fixType() and ScanfSpecifier::fixType()
only fix a conversion specification enough that Clang wouldn't warn about it,
as opposed to always changing it to use the "canonical" conversion specifier.
(PR11975)
This preserves the user's choice of conversion specifier in cases like:
printf("%a", (long double)1);
where we previously suggested "%Lf", we now suggest "%La"
printf("%x", (long)1);
where we previously suggested "%ld", we now suggest "%lx".
llvm-svn: 150578
* When we detect that a CFG block has inconsistent lock sets, point the
diagnostic at the location where we found the inconsistency, and point a note
at somewhere the inconsistently-locked mutex was locked.
* Fix the wording of the normal (non-loop, non-end-of-function) case of this
diagnostic to not suggest that the mutex is going out of scope.
* Fix the diagnostic emission code to keep a warning and its note together when
sorting the diagnostics into source location order.
llvm-svn: 149669
This fixes the case where Clang would output:
error: format specifies type 'wchar_t *' (aka 'wchar_t *')
ArgTypeResult::getRepresentativeTypeName needs to take into account
that wchar_t can be a built-in type (as opposed to in C, where it is a
typedef).
llvm-svn: 149387
block-level expr. Currently CXXConstructExpr is always added as a block-level
expr. This caused two problems for the analyzer (and potentially for the
CFG-based codegen).
1. We have no way to know whether a ctor call is base or complete.
2. We have no way to know the destination object being contructed.
llvm-svn: 147306
declarations and definitions) as ObjCInterfaceDecls within the same
redeclaration chain. This new representation matches what we do for
C/C++ variables/functions/classes/templates/etc., and makes it
possible to answer the query "where are all of the declarations of
this class?"
llvm-svn: 146679
in addition to underlying type.
For example, the warning for printf("%zu", 42.0);
changes from "conversion specifies type 'unsigned long'" to "conversion
specifies type 'size_t' (aka 'unsigned long')"
(This is a second attempt after r145697, which got reverted.)
llvm-svn: 146032
For example, the warning for printf("%zu", 42.0);
changes from "conversion specifies type 'unsigned long'" to "conversion
specifies type 'size_t' (aka 'unsigned long')"
llvm-svn: 145697
lifetimes have been extended via reference binding. The type of the
reference and the type of the temporary are not necessarily the same,
which could cause a crash. Fixes <rdar://problem/10398199>.
llvm-svn: 144646
property references to use a new PseudoObjectExpr
expression which pairs a syntactic form of the expression
with a set of semantic expressions implementing it.
This should significantly reduce the complexity required
elsewhere in the compiler to deal with these kinds of
expressions (e.g. IR generation's special l-value kind,
the static analyzer's Message abstraction), at the lower
cost of specifically dealing with the odd AST structure
of these expressions. It should also greatly simplify
efforts to implement similar language features in the
future, most notably Managed C++'s properties and indexed
properties.
Most of the effort here is in dealing with the various
clients of the AST. I've gone ahead and simplified the
ObjC rewriter's use of properties; other clients, like
IR-gen and the static analyzer, have all the old
complexity *and* all the new complexity, at least
temporarily. Many thanks to Ted for writing and advising
on the necessary changes to the static analyzer.
I've xfailed a small diagnostics regression in the static
analyzer at Ted's request.
llvm-svn: 143867
implicitly perform an lvalue-to-rvalue conversion if used on an lvalue
expression. Also improve the documentation of Expr::Evaluate* to indicate which
of them will accept expressions with side-effects.
llvm-svn: 143263
The code had it backwards, thinking size_t was signed, and using that for "%zd".
Also let the analysis get the types for (u)intmax_t while we are at it.
llvm-svn: 143099
For PR11152. Make PrintSpecifier::fixType() suggest "%zu" for size_t, etc.
rather than looking at the underlying type and suggesting "%llu" or other
platform-specific length modifiers. Applies to C99 and C++11.
llvm-svn: 142342
to take a FunctionDecl* instead of an llvm::StringRef. Eventually
we might push more logic in there, like using slightly different
conventions for C++ methods.
Also, fix a bug where 'copy' and 'create' were being caught in
non-camel-cased strings. We want copyFoo and CopyFoo and XCopy
but not Xcopy or xcopy.
llvm-svn: 140911
- Speed of "merge()", which merged data flow facts. This was doing a set canonicalization on every insertion, which was super slow.
To fix this, we use ImmutableSetRef.
- Visit CFGBlocks in reverse postorder. This is a huge speedup, as on some test cases the algorithm would take many iterations
to converge.
This contains a bunch of copy-paste from UninitializedValues.cpp and ThreadSafety.cpp. The idea
was to get something working first, and then refactor the common logic for all three files into
a separate analysis/library entry point.
llvm-svn: 139968
temporary objects and local variables. When detected, these split the
block, marking the new one as having only the exit block as a successor.
This prevents a large number of false positives in warnings sensitive to
no-return constructs such as -Wreturn-type, and fixes the remainder of
PR10063 along with several variations of this bug that had not been
reported. The test cases are extended across the board to cover these
patterns.
This also checks in a stress test for these types of CFGs. The stress
test declares some 32k variables, a mixture of no-return and normal
destructors. Previously, this resulted in roughly 2500 CFG blocks, but
didn't model any of the no-return destructors. With this patch, it
results in over 33k blocks, many of them now unreachable.
The nice thing about how the analyzer is set up? This causes *no*
regression in performance of building the CFG. It actually in some cases
makes it faster, as best I can benchmark. The analysis for -Wreturn-type
(and any other that cares about no-return code paths) is technically
slower now as it has to look at many more candidate blocks, but it
computes the correct answer. I have more test cases to follow, I think
they all work now. Also I have further work that should dramatically
simplify analyses in the presence of no-return.
llvm-svn: 139586
and case statements. Use this to make the logic in the CFG builder more
robust at finding the actual statements within a compound statement,
even when there are many layers of labels obscuring it.
Also extend the test cases for a large chunk of PR10063. Still more work
to do here though.
llvm-svn: 139437
incorrectly in the CFG, and also the static analyzer. This patch regresses the analyzer a bit, but
that needs to be followed up with a better solution.
Fixes <rdar://problem/10008112>.
llvm-svn: 138372
Having a notion of an actual ProgramPointTag will aid in introspection of the analyzer's behavior.
For example, the GraphViz output of the analyzer will pretty-print the tags in a useful manner.
llvm-svn: 137529
The motivation of this large change is to drastically simplify the logic in ExprEngine going forward.
Some fallout is that the output of some BugReporterVisitors is not as accurate as before; those will
need to be fixed over time. There is also some possible performance regression as RemoveDeadBindings
will be called frequently; this can also be improved over time.
llvm-svn: 136419
AnalysisBasedWarnings Sema layer and out of the Analysis library itself.
This returns the uninitialized values analysis to a more pure form,
allowing its original logic to correctly detect some categories of
definitely uninitialized values. Fixes PR10358 (again).
Thanks to Ted for reviewing and updating this patch after his rewrite of
several portions of this analysis.
llvm-svn: 135748
This is accomplished by forcing the needed expressions for -Wuninitialized to always be CFGElements in the CFG.
This allows us to remove a fair amount of the code for -Wuninitialized.
Some fallout:
- AnalysisBasedWarnings.cpp now specifically toggles the CFGBuilder to create a CFG that is suitable for -Wuninitialized. This
is a layering violation, since the logic for -Wuninitialized is in libAnalysis. This can be fixed with the proper refactoring.
- Some of the source locations for -Wunreachable-code warnings have shifted. While not ideal, this is okay because that analysis
already needs some serious reworking.
llvm-svn: 135480
patch, we actually move the state-machine for the value set backwards
one step. This can pretty easily lead to infinite loops where we
continually try to propagate a bit, succeed for one iteration, but then
back up because we find an uninitialized use.
A reduced test case from PR10379 is included.
llvm-svn: 135359
Previously, despite the names 'enqueue' and 'dequeue', it behaved as
a stack and visited blocks in a LIFO fashion. This interacts badly with
extremely broad CFGs *inside* of a loop (such as a large switch inside
a state machine) where every block updates a different variable.
When encountering such a CFG, the checker visited blocks in essentially
a "depth first" order due to the stack-like behavior of the work list.
Combined with each block updating a different variable, the saturation
logic of the checker caused it to re-traverse blocks [1,N-1] of the
broad CFG inside the loop after traversing block N. These re-traversals
were to propagate the variable values derived from block N. Assuming
approximately the same number of variables as inner blocks exist, the
end result is O(N^2) updates. By making this a queue, we also make the
traversal essentially "breadth-first" across each of the N inner blocks
of the loop. Then all of this state is propagated around to all N inner
blocks of the loop. The result is O(N) updates.
The truth is in the numbers:
Before, gcc.c: 96409 block visits (max: 61546, avg: 591)
After, gcc.c: 69958 block visits (max: 33090, avg: 429)
Before, PR10183: 2540494 block vists (max: 2536495, avg: 37360)
After, PR10183: 137803 block visits (max: 134406, avg: 2026)
The nearly 20x reduction in work for PR10183 corresponds to a roughly
100x speedup in compile time.
I've tested it on all the code I can get my hands on, and I've seen no
slowdowns due to this change. Where I've collected stats, the ammount of
work done is on average less. I'll also commit shortly some synthetic
test cases useful in analyzing the performance of CFG-based warnings.
Submitting this based on Doug's feedback that post-commit review should
be good. Ted, please review! Hopefully this helps compile times until
then.
llvm-svn: 134697
Special detail is added for uninitialized variable analysis as this has
serious performance problems than need to be tracked.
Computing some of this data is expensive, for example walking the CFG to
determine its size. To avoid doing that unless the stats data is going
to be used, we thread a bit into the Sema object to track whether
detailed stats should be collected or not. This bit is used to avoid
computations whereever the computations are likely to be more expensive
than checking the state of the flag. Thus, counters are in some cases
unconditionally updated, but the more expensive (and less frequent)
aggregation steps are skipped.
With this patch, we're able to see that for 'gcc.c':
*** Analysis Based Warnings Stats:
232 functions analyzed (0 w/o CFGs).
7151 CFG blocks built.
30 average CFG blocks per function.
1167 max CFG blocks per function.
163 functions analyzed for uninitialiazed variables
640 variables analyzed.
3 average variables per function.
94 max variables per function.
96409 block visits.
591 average block visits per function.
61546 max block visits per function.
And for the reduced testcase in PR10183:
*** Analysis Based Warnings Stats:
98 functions analyzed (0 w/o CFGs).
8526 CFG blocks built.
87 average CFG blocks per function.
7277 max CFG blocks per function.
68 functions analyzed for uninitialiazed variables
1359 variables analyzed.
19 average variables per function.
1196 max variables per function.
2540494 block visits.
37360 average block visits per function.
2536495 max block visits per function.
That last number is the somewhat scary one that indicates the problem in
PR10183.
llvm-svn: 134494
specifiers. Fixes <rdar://problem/9607158>." because it causes false positives
on some code that uses CF toll free bridging.
- I'll let Doug or Ted figure out the right fix here, possibly just to accept
any pointer type.
llvm-svn: 134041
MaterializeTemporaryExpr captures a reference binding to a temporary
value, making explicit that the temporary value (a prvalue) needs to
be materialized into memory so that its address can be used. The
intended AST invariant here is that a reference will always bind to a
glvalue, and MaterializeTemporaryExpr will be used to convert prvalues
into glvalues for that binding to happen. For example, given
const int& r = 1.0;
The initializer of "r" will be a MaterializeTemporaryExpr whose
subexpression is an implicit conversion from the double literal "1.0"
to an integer value.
IR generation benefits most from this new node, since it was
previously guessing (badly) when to materialize temporaries for the
purposes of reference binding. There are likely more refactoring and
cleanups we could perform there, but the introduction of
MaterializeTemporaryExpr fixes PR9565, a case where IR generation
would effectively bind a const reference directly to a bitfield in a
struct. Addresses <rdar://problem/9552231>.
llvm-svn: 133521
Language-design credit goes to a lot of people, but I particularly want
to single out Blaine Garst and Patrick Beard for their contributions.
Compiler implementation credit goes to Argyrios, Doug, Fariborz, and myself,
in no particular order.
llvm-svn: 133103
Related result types apply Cocoa conventions to the type of message
sends and property accesses to Objective-C methods that are known to
always return objects whose type is the same as the type of the
receiving class (or a subclass thereof), such as +alloc and
-init. This tightens up static type safety for Objective-C, so that we
now diagnose mistakes like this:
t.m:4:10: warning: incompatible pointer types initializing 'NSSet *'
with an
expression of type 'NSArray *' [-Wincompatible-pointer-types]
NSSet *array = [[NSArray alloc] init];
^ ~~~~~~~~~~~~~~~~~~~~~~
/System/Library/Frameworks/Foundation.framework/Headers/NSObject.h:72:1:
note:
instance method 'init' is assumed to return an instance of its
receiver
type ('NSArray *')
- (id)init;
^
It also means that we get decent type inference when writing code in
Objective-C++0x:
auto array = [[NSMutableArray alloc] initWithObjects:@"one", @"two",nil];
// ^ now infers NSMutableArray* rather than id
llvm-svn: 132868
Also, have Environment stop looking through NoOp casts; it didn't match the behavior of LiveVariables. And once that's gone, the whole cast block of that switch is unnecessary.
llvm-svn: 132840
This introduces a generic base class for the expression evaluator
classes, which handles a few common expression types which were
previously handled separately in each class. Also, the expression
evaluator now uses ConstStmtVisitor.
llvm-svn: 131281
instantiation), be sure to add the transformed declaration into the
current DeclContext. Also, remove the -Wuninitialized hack that works
around this bug. Fixes <rdar://problem/9200676>.
llvm-svn: 129544
evaluated and unevaluated contexts. Add some testing of sizeof and
typeid.
Both of the typeid tests added here were triggering warnings previously.
Now the one false positive is suppressed without suppressing the warning
on actually buggy code.
llvm-svn: 129431
marked explicitly as uninitialized through direct self initialization:
int x = x;
With r128894 we prevented warnings about this code, and this patch
teaches the analysis engine to continue analyzing subsequent uses of
'x'. This should wrap up PR9624.
There is still an open question of whether we should suppress the
maybe-uninitialized warnings resulting from variables initialized in
this fashion. The definitely-uninitialized uses should always be warned.
llvm-svn: 128932
1) Change the CFG to include the DeclStmt for conditional variables, instead of using the condition itself as a faux DeclStmt.
2) Update ExprEngine (the static analyzer) to understand (1), so not to regress.
3) Update UninitializedValues.cpp to initialize all tracked variables to Uninitialized at the start of the function/method.
4) Only use the SelfReferenceChecker (SemaDecl.cpp) on global variables, leaving the dataflow analysis to handle other cases.
The combination of (1) and (3) allows the dataflow-based -Wuninitialized to find self-init problems when the initializer
contained control-flow.
llvm-svn: 128858
Note this can potentially be enhanced to detect if the __block variable
is actually written by the block, or only when the block "escapes" or
is actually used, but that requires more analysis than it is probably worth
for this simple check.
llvm-svn: 128681
my expertise on the template instantiation logic isn't good enough to fix this problem for real. This patch worksaround the
problem in -Wuninitialized, but we should fix it for real later.
llvm-svn: 128443
This rename serves two purposes:
- It reflects the actual functionality of this analysis.
- We will have more than one reachability analysis.
llvm-svn: 127930