Commit Graph

9 Commits

Author SHA1 Message Date
Artem Dergachev 5657486854 [analyzer] Use faster hashing (MD5) in CloneDetector.
This replaces the old approach of fingerprinting every AST node into a string,
which avoided collisions and was simple to implement, but turned out to be
extremely ineffective with respect to both performance and memory.

The collisions are now dealt with in a separate pass, which no longer causes
performance problems because collisions are rare.

Patch by Raphael Isemann!

Differential Revision: https://reviews.llvm.org/D22515

llvm-svn: 279378
2016-08-20 17:35:53 +00:00
Artem Dergachev 51b9a0e8e8 [analyzer] Make CloneDetector consider macro expansions.
So far macro-generated code was treated by the CloneDetector as normal code.
This caused that some macros where reported as false-positive clones because
large chunks of code coming from otherwise concise macro expansions were treated
as copy-pasted code.

This patch ensures that macros are treated in the same way as literals/function
calls. This prevents macros that expand into multiple statements
from being reported as clones.

Patch by Raphael Isemann!

Differential Revision: https://reviews.llvm.org/D23316

llvm-svn: 279367
2016-08-20 10:06:59 +00:00
Artem Dergachev 5183888813 [analyzer] Make CloneDetector consider template arguments.
For example, code samples `isa<Stmt>(S)' and `isa<Expr>(S)'
are no longer considered to be clones.

Patch by Raphael Isemann!

Differential Revision: https://reviews.llvm.org/D23555

llvm-svn: 279366
2016-08-20 09:57:21 +00:00
Artem Dergachev 2fc1985db3 [analyzer] Teach CloneDetector to find clones that look like copy-paste errors.
The original clone checker tries to find copy-pasted code that is exactly
identical to the original code, up to minor details.

As an example, if the copy-pasted code has all references to variable 'a'
replaced with references to variable 'b', it is still considered to be
an exact clone.

The new check finds copy-pasted code in which exactly one variable seems
out of place compared to the original code, which likely indicates
a copy-paste error (a variable was forgotten to be renamed in one place).

Patch by Raphael Isemann!

Differential Revision: https://reviews.llvm.org/D23314

llvm-svn: 279056
2016-08-18 12:29:41 +00:00
Artem Dergachev cad151491e [analyzer] Fix a crash in CloneDetector when calling functions by pointers.
CallExpr may have a null direct callee when the callee function is not
known in compile-time. Do not try to take callee name in this case.

Patch by Raphael Isemann!

Differential Revision: https://reviews.llvm.org/D23320

llvm-svn: 278238
2016-08-10 16:25:16 +00:00
Vassil Vassilev 5721e0f37a [analyzer] Try to fix coverity CID 1360469.
Patch by Raphael Isemann!

llvm-svn: 278110
2016-08-09 10:00:23 +00:00
Artem Dergachev 7a0088bbae [analyzer] Make CloneDetector recognize different variable patterns.
CloneDetector should be able to detect clones with renamed variables.
However, if variables are referenced multiple times around the code sample,
the usage patterns need to be recognized.

For example, (x < y ? y : x) and (y < x ? y : x) are no longer clones,
however (a < b ? b : a) is still a clone of the former.

Variable patterns are computed and compared during a separate filtering pass.

Patch by Raphael Isemann!

Differential Revision: https://reviews.llvm.org/D22982

llvm-svn: 277757
2016-08-04 19:37:00 +00:00
Artem Dergachev 78692ea590 [analyzer] Respect statement-specific data in CloneDetection.
So far the CloneDetector only respected the kind of each statement when
searching for clones. This patch refines the way the CloneDetector collects data
from each statement by providing methods for each statement kind,
that will read the kind-specific attributes.

For example, statements 'a < b' and 'a > b' are no longer considered to be
clones, because they are different in operation code, which is an attribute
specific to the BinaryOperator statement kind.

Patch by Raphael Isemann!

Differential Revision: https://reviews.llvm.org/D22514

llvm-svn: 277449
2016-08-02 12:21:09 +00:00
Artem Dergachev ba816326f3 [analyzer] Add basic capabilities to detect source code clones.
This patch adds the CloneDetector class which allows searching source code
for clones.

For every statement or group of statements within a compound statement,
CloneDetector computes a hash value, and finds clones by detecting
identical hash values.

This initial patch only provides a simple hashing mechanism
that hashes the kind of each sub-statement.

This patch also adds CloneChecker - a simple static analyzer checker
that uses CloneDetector to report copy-pasted code.

Patch by Raphael Isemann!

Differential Revision: https://reviews.llvm.org/D20795

llvm-svn: 276782
2016-07-26 18:13:12 +00:00