Commit Graph

40 Commits

Author SHA1 Message Date
Borsik Gabor 89bc4c662c [analyzer] Add custom filter functions for GenericTaintChecker
This patch is the last of the series of patches which allow the user to
annotate their functions with taint propagation rules.

I implemented the use of the configured filtering functions. These
functions can remove taintedness from the symbols which are passed at
the specified arguments to the filters.

Differential Revision: https://reviews.llvm.org/D59516
2019-11-23 20:12:15 +01:00
Gabor Borsik 080ecafdd8 Move prop-sink branch to monorepo.
llvm-svn: 371342
2019-09-08 19:23:43 +00:00
Reid Kleckner 2336c1b872 Fix taint-generic.c on Windows, handle case in OS error
llvm-svn: 367249
2019-07-29 18:48:50 +00:00
Gabor Borsik 4bde15fe1e [analyzer] Add yaml parser to GenericTaintChecker
While we implemented taint propagation rules for several
builtin/standard functions, there's a natural desire for users to add
such rules to custom functions.

A series of patches will implement an option that allows users to
annotate their functions with taint propagation rules through a YAML
file. This one adds parsing of the configuration file, which may be
specified in the commands line with the analyzer config:
alpha.security.taint.TaintPropagation:Config. The configuration may
contain propagation rules, filter functions (remove taint) and sink
functions (give a warning if it gets a tainted value).

I also added a new header for future checkers to conveniently read YAML
files as checker options.

Differential Revision: https://reviews.llvm.org/D59555

llvm-svn: 367190
2019-07-28 13:38:04 +00:00
Artem Dergachev 3d90e7e8db Revert "[analyzer] Toning down invalidation a bit".
This reverts commit r352473.

The overall idea is great, but it seems to cause unintented consequences
when not only Region Store invalidation but also pointer escape mechanism
was accidentally affected.

Based on discussions in https://reviews.llvm.org/D58121#1452483
and https://reviews.llvm.org/D57230#1434161

Differential Revision: https://reviews.llvm.org/D57230

llvm-svn: 357620
2019-04-03 18:21:16 +00:00
Kristof Umann 855478328b [analyzer] Fix taint propagation in GenericTaintChecker
The gets function has no SrcArgs. Because the default value for isTainted was
false, it didn't mark its DstArgs as tainted.

Patch by Gábor Borsik!

Differential Revision: https://reviews.llvm.org/D58828

llvm-svn: 355396
2019-03-05 12:42:59 +00:00
Gabor Horvath f41e3d0873 [analyzer] Toning down invalidation a bit
When a function takes the address of a field the analyzer will no longer
assume that the function will change other fields of the enclosing structs.

Differential Revision: https://reviews.llvm.org/D57230

llvm-svn: 352473
2019-01-29 10:27:14 +00:00
Henry Wong cb2ad24c5c [analyzer] Improves the logic of GenericTaintChecker identifying stdin.
Summary:
GenericTaintChecker can't recognize stdin in some cases. The reason is that `if (PtrTy->getPointeeType() == C.getASTContext().getFILEType()` does not hold when stdin is encountered.

My platform is ubuntu16.04 64bit, gcc 5.4.0, glibc 2.23. The definition of stdin is as follows:
```
__BEGIN_NAMESPACE_STD
/* The opaque type of streams.  This is the definition used elsewhere.  */
typedef struct _IO_FILE FILE;
___END_NAMESPACE_STD

  ...

/* The opaque type of streams.  This is the definition used elsewhere.  */
typedef struct _IO_FILE __FILE;   

  ...

/* Standard streams.  */
extern struct _IO_FILE *stdin;      /* Standard input stream.  */
extern struct _IO_FILE *stdout;     /* Standard output stream.  */
extern struct _IO_FILE *stderr;     /* Standard error output stream.  */
```

The type of stdin is as follows AST:
```
ElaboratedType 0xc911170'struct _IO_FILE'sugar
`-RecordType 0xc911150'struct _IO_FILE'
 `-CXXRecord 0xc923ff0'_IO_FILE'
```

`C.getASTContext().GetFILEType()` is as follows AST:
```
TypedefType 0xc932710 'FILE' sugar
|-Typedef 0xc9111c0 'FILE'
`-ElaboratedType 0xc911170 'struct _IO_FILE' sugar
  `-RecordType 0xc911150 'struct _IO_FILE'
      `-CXXRecord 0xc923ff0 '_IO_FILE'
```

So I think it's better to use `getCanonicalType()`.

Reviewers: zaks.anna, NoQ, george.karpenkov, a.sidorin

Reviewed By: zaks.anna, a.sidorin

Subscribers: a.sidorin, cfe-commits, xazax.hun, szepet, MTC

Differential Revision: https://reviews.llvm.org/D39159

llvm-svn: 326709
2018-03-05 15:41:15 +00:00
Artem Dergachev eed7a3102c [analyzer] Support partially tainted records.
The analyzer's taint analysis can now reason about structures or arrays
originating from taint sources in which only certain sections are tainted.

In particular, it also benefits modeling functions like read(), which may
read tainted data into a section of a structure, but RegionStore is incapable of
expressing the fact that the rest of the structure remains intact, even if we
try to model read() directly.

Patch by Vlad Tsyrklevich!

Differential revision: https://reviews.llvm.org/D28445

llvm-svn: 304162
2017-05-29 15:42:56 +00:00
Anna Zaks 12d0c8d662 [analyzer] Extend taint propagation and checking to support LazyCompoundVal
A patch by Vlad Tsyrklevich!

Differential Revision: https://reviews.llvm.org/D28445

llvm-svn: 297326
2017-03-09 00:01:16 +00:00
Dominic Chen 184c6242fa Reland 4: [analyzer] NFC: Update test infrastructure to support multiple constraint managers
Summary: Replace calls to %clang/%clang_cc1 with %clang_analyze_cc1 when invoking static analyzer, and perform runtime substitution to select the appropriate constraint manager, per D28952.

Reviewers: xazax.hun, NoQ, zaks.anna, dcoughlin

Subscribers: mgorny, rgov, mikhail.ramalho, a.sidorin, cfe-commits

Differential Revision: https://reviews.llvm.org/D30373

llvm-svn: 296895
2017-03-03 18:02:02 +00:00
Dominic Chen 09d66f7528 Revert "Reland 3: [analyzer] NFC: Update test infrastructure to support multiple constraint managers"
This reverts commit ea36f1406e1f36bf456c3f3929839b024128e468.

llvm-svn: 296841
2017-03-02 23:30:53 +00:00
Dominic Chen feaf9ff5ee Reland 3: [analyzer] NFC: Update test infrastructure to support multiple constraint managers
Summary: Replace calls to %clang/%clang_cc1 with %clang_analyze_cc1 when invoking static analyzer, and perform runtime substitution to select the appropriate constraint manager, per D28952.

Reviewers: xazax.hun, NoQ, zaks.anna, dcoughlin

Subscribers: mgorny, rgov, mikhail.ramalho, a.sidorin, cfe-commits

Differential Revision: https://reviews.llvm.org/D30373

llvm-svn: 296837
2017-03-02 23:05:45 +00:00
Dominic Chen 4a90bf8c3f Revert "Reland 2: [analyzer] NFC: Update test infrastructure to support multiple constraint managers"
This reverts commit f93343c099fff646a2314cc7f4925833708298b1.

llvm-svn: 296836
2017-03-02 22:58:06 +00:00
Dominic Chen 1cb0256a3c Reland 2: [analyzer] NFC: Update test infrastructure to support multiple constraint managers
Summary: Replace calls to %clang/%clang_cc1 with %clang_analyze_cc1 when invoking static analyzer, and perform runtime substitution to select the appropriate constraint manager, per D28952.

Reviewers: xazax.hun, NoQ, zaks.anna, dcoughlin

Subscribers: mgorny, rgov, mikhail.ramalho, a.sidorin, cfe-commits

Differential Revision: https://reviews.llvm.org/D30373

llvm-svn: 296835
2017-03-02 22:45:24 +00:00
Dominic Chen 00355a51d0 Revert "Reland: [analyzer] NFC: Update test infrastructure to support multiple constraint managers"
This reverts commit 1b28d0b10e1c8feccb971abb6ef7a18bee589830.

llvm-svn: 296422
2017-02-28 01:50:23 +00:00
Dominic Chen 59cd893320 Reland: [analyzer] NFC: Update test infrastructure to support multiple constraint managers
Summary: Replace calls to %clang/%clang_cc1 with %clang_analyze_cc1 when invoking static analyzer, and perform runtime substitution to select the appropriate constraint manager, per D28952.

Reviewers: xazax.hun, NoQ, zaks.anna, dcoughlin

Subscribers: mgorny, rgov, mikhail.ramalho, a.sidorin, cfe-commits

Differential Revision: https://reviews.llvm.org/D30373

llvm-svn: 296414
2017-02-28 00:02:36 +00:00
Dominic Chen 8589e10c30 Revert "[analyzer] NFC: Update test infrastructure to support multiple constraint managers"
This reverts commit 8e7780b9e59ddaad1800baf533058d2c064d4787.

llvm-svn: 296317
2017-02-27 03:29:25 +00:00
Dominic Chen 02064a3076 [analyzer] NFC: Update test infrastructure to support multiple constraint managers
Summary: Replace calls to %clang/%clang_cc1 with %clang_analyze_cc1 when invoking static analyzer, and perform runtime substitution to select the appropriate constraint manager, per D28952.

Reviewers: xazax.hun, NoQ, zaks.anna, dcoughlin

Subscribers: mgorny, rgov, mikhail.ramalho, a.sidorin, cfe-commits

Differential Revision: https://reviews.llvm.org/D30373

llvm-svn: 296312
2017-02-27 02:36:15 +00:00
Jordan Rose d1d8929370 [analyzer] Teach ConstraintManager to ignore NonLoc <> NonLoc comparisons.
These aren't generated by default, but they are needed when either side of
the comparison is tainted.

Should fix our internal buildbot.

llvm-svn: 177846
2013-03-24 20:25:22 +00:00
Ted Kremenek 722398f1d4 Fix analyzer tests.
llvm-svn: 162588
2012-08-24 20:39:55 +00:00
Anna Zaks f0e9ca8604 [analyzer] Do not assert on constructing SymSymExpr with diff types.
The resulting type info is stored in the SymSymExpr, so no reason not to
support construction of expression with different subexpression types.

llvm-svn: 156051
2012-05-03 02:13:53 +00:00
Anna Zaks 1d3d51a6e6 [analyzer] Add a complexity bound on history tracking.
(Currently, this is only relevant for tainted data.)

llvm-svn: 156050
2012-05-03 02:13:50 +00:00
Anna Zaks 3705a1ee10 [analyzer] Change naming in bug reports "tainted" -> "untrusted"
llvm-svn: 151120
2012-02-22 02:35:58 +00:00
Anna Zaks b7eac9fbef [analyzer] Make VLA checker taint aware.
Also, slightly modify the diagnostic message in ArrayBound and DivZero (still use 'taint', which might not mean much to the user, but plan on changing it later).

llvm-svn: 148626
2012-01-21 05:07:33 +00:00
Ted Kremenek e7b9d4342b Tighten format string diagnostic and make it a bit clearer (and a bit closer to GCC's).
llvm-svn: 148579
2012-01-20 21:52:58 +00:00
Anna Zaks 8298af85a6 [analyzer] Add taint awareness to DivZeroChecker.
llvm-svn: 148566
2012-01-20 20:28:31 +00:00
Anna Zaks 3b754b25bd [analyzer] Add socket API as a source of taint.
llvm-svn: 148518
2012-01-20 00:11:19 +00:00
Anna Zaks 560dbe9ac9 [analyzer] Taint: warn when tainted data is used to specify a buffer
size (Ex: in malloc, memcpy, strncpy..)

(Maybe some of this could migrate to the CString checker. One issue
with that is that we might want to separate security issues from
regular API misuse.)

llvm-svn: 148371
2012-01-18 02:45:11 +00:00
Anna Zaks 5d324e509c [analyzer] Taint: add taint propagation rules for string and memory copy
functions.

llvm-svn: 148370
2012-01-18 02:45:07 +00:00
Anna Zaks 0244cd7450 [analyzer] Taint: add system and popen as undesirable sinks for taint
data.

llvm-svn: 148176
2012-01-14 02:48:40 +00:00
Anna Zaks cb6d4ee793 [analyzer] Unwrap the pointers when ignoring the const cast.
radar://10686991

llvm-svn: 148081
2012-01-13 00:56:55 +00:00
Anna Zaks b3fa8d7dd1 [analyzer] Add taint transfer by strcpy & others (part 1).
To simplify the process:
Refactor taint generation checker to simplify passing the
information on which arguments need to be tainted from pre to post
visit.

Todo: We need to factor out the code that sema is using to identify the
string and memcpy functions and use it here and in the CString checker.

llvm-svn: 148010
2012-01-12 02:22:34 +00:00
Anna Zaks 126a2ef920 [analyzer] Add basic format string vulnerability checking.
We already have a more conservative check in the compiler (if the
format string is not a literal, we warn). Still adding it here for
completeness and since this check is stronger - only triggered if the
format string is tainted.

llvm-svn: 147714
2012-01-07 02:33:10 +00:00
Anna Zaks 7c96b7db96 [analyzer] CStringChecker should not rely on the analyzer generating UndefOrUnknown value when it cannot reason about the expression.
We are now often generating expressions even if the solver is not known to be able to simplify it. This is another cleanup of the existing code, where the rest of the analyzer and checkers should not base their logic on knowing ahead of the time what the solver can reason about. 

In this case, CStringChecker is performing a check for overflow of 'left+right' operation. The overflow can be checked with either 'maxVal-left' or 'maxVal-right'. Previously, the decision was based on whether the expresion evaluated to undef or not. With this patch, we check if one of the arguments is a constant, in which case we know that 'maxVal-const' is easily simplified. (Another option is to use canReasonAbout() method of the solver here, however, it's currently is protected.)

This patch also contains 2 small bug fixes:
 - swap the order of operators inside SValBuilder::makeGenericVal.
 - handle a case when AddeVal is unknown in GenericTaintChecker::getPointedToSymbol.

llvm-svn: 146343
2011-12-11 18:43:40 +00:00
Hans Wennborg b1a5e09f6f Check that arguments to a scanf call match the format specifier,
and offer fixits when there is a mismatch.

llvm-svn: 146326
2011-12-10 13:20:11 +00:00
Anna Zaks ff029b6953 [analyzer] Add more simple taint tests.
llvm-svn: 145275
2011-11-28 20:43:40 +00:00
Anna Zaks 457c68726c [analyzer] Warn when non pointer arguments are passed to scanf (only when running taint checker).
There is an open radar to implement better scanf checking as a Sema warning. However, a bit of redundancy is fine in this case.

llvm-svn: 144964
2011-11-18 02:26:36 +00:00
Anna Zaks 040ddfedc0 [analyzer] Do not conjure a symbol when we need to propagate taint.
When the solver and SValBuilder cannot reason about symbolic expressions (ex: (x+1)*y ), the analyzer conjures a new symbol with no ties to the past. This helps it to recover some path-sensitivity. However, this breaks the taint propagation.

With this commit, we are going to construct the expression even if we cannot reason about it later on if an operand is tainted.

Also added some comments and asserts.

llvm-svn: 144932
2011-11-17 23:07:28 +00:00
Anna Zaks 20829c90be [analyzer] Catch the first taint propagation implied buffer overflow.
Change the ArrayBoundCheckerV2 to be more aggressive in reporting buffer overflows
when the offset is tainted. Previously, we did not report bugs when the state was
underconstrained (not enough information about the bound to determine if there is
an overflow) to avoid false positives. However, if we know that the buffer
offset is tainted - comes in from the user space and can be anything, we should
report it as a bug.

+ The very first example of us catching a taint related bug.
This is the only example we can currently handle. More to come...

llvm-svn: 144826
2011-11-16 19:58:17 +00:00