llvm-project/clang/lib
Roman Lebedev b69ba22773 [clang][ubsan] Implicit Conversion Sanitizer - integer truncation - clang part
Summary:
C and C++ are interesting languages. They are statically typed, but weakly.
The implicit conversions are allowed. This is nice, allows to write code
while balancing between getting drowned in everything being convertible,
and nothing being convertible. As usual, this comes with a price:

```
unsigned char store = 0;

bool consume(unsigned int val);

void test(unsigned long val) {
  if (consume(val)) {
    // the 'val' is `unsigned long`, but `consume()` takes `unsigned int`.
    // If their bit widths are different on this platform, the implicit
    // truncation happens. And if that `unsigned long` had a value bigger
    // than UINT_MAX, then you may or may not have a bug.

    // Similarly, integer addition happens on `int`s, so `store` will
    // be promoted to an `int`, the sum calculated (0+768=768),
    // and the result demoted to `unsigned char`, and stored to `store`.
    // In this case, the `store` will still be 0. Again, not always intended.
    store = store + 768; // before addition, 'store' was promoted to int.
  }

  // But yes, sometimes this is intentional.
  // You can either make the conversion explicit
  (void)consume((unsigned int)val);
  // or mask the value so no bits will be *implicitly* lost.
  (void)consume((~((unsigned int)0)) & val);
}
```

Yes, there is a `-Wconversion`` diagnostic group, but first, it is kinda
noisy, since it warns on everything (unlike sanitizers, warning on an
actual issues), and second, there are cases where it does **not** warn.
So a Sanitizer is needed. I don't have any motivational numbers, but i know
i had this kind of problem 10-20 times, and it was never easy to track down.

The logic to detect whether an truncation has happened is pretty simple
if you think about it - https://godbolt.org/g/NEzXbb - basically, just
extend (using the new, not original!, signedness) the 'truncated' value
back to it's original width, and equality-compare it with the original value.

The most non-trivial thing here is the logic to detect whether this
`ImplicitCastExpr` AST node is **actually** an implicit conversion, //or//
part of an explicit cast. Because the explicit casts are modeled as an outer
`ExplicitCastExpr` with some `ImplicitCastExpr`'s as **direct** children.
https://godbolt.org/g/eE1GkJ

Nowadays, we can just use the new `part_of_explicit_cast` flag, which is set
on all the implicitly-added `ImplicitCastExpr`'s of an `ExplicitCastExpr`.
So if that flag is **not** set, then it is an actual implicit conversion.

As you may have noted, this isn't just named `-fsanitize=implicit-integer-truncation`.
There are potentially some more implicit conversions to be warned about.
Namely, implicit conversions that result in sign change; implicit conversion
between different floating point types, or between fp and an integer,
when again, that conversion is lossy.

One thing i know isn't handled is bitfields.

This is a clang part.
The compiler-rt part is D48959.

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=21530 | PR21530 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=37552 | PR37552 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=35409 | PR35409 ]].
Partially fixes [[ https://bugs.llvm.org/show_bug.cgi?id=9821 | PR9821 ]].
Fixes https://github.com/google/sanitizers/issues/940. (other than sign-changing implicit conversions)

Reviewers: rjmccall, rsmith, samsonov, pcc, vsk, eugenis, efriedma, kcc, erichkeane

Reviewed By: rsmith, vsk, erichkeane

Subscribers: erichkeane, klimek, #sanitizers, aaron.ballman, RKSimon, dtzWill, filcab, danielaustin, ygribov, dvyukov, milianw, mclow.lists, cfe-commits, regehr

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D48958

llvm-svn: 338288
2018-07-30 18:58:30 +00:00
..
ARCMigrate Remove \brief commands from doxygen comments. 2018-05-09 01:00:01 +00:00
AST Delete some unreachable AST printing code. 2018-07-30 18:05:19 +00:00
ASTMatchers [ASTMatchers] Introduce a matcher for `ObjCIvarExpr`, support getting it's declaration. 2018-07-27 17:26:11 +00:00
Analysis [AST] Add a convenient getter from QualType to RecordDecl 2018-07-28 02:16:13 +00:00
Basic Revert r338057 "[VirtualFileSystem] InMemoryFileSystem::status: Return a Status with the requested name" 2018-07-26 23:21:51 +00:00
CodeGen [clang][ubsan] Implicit Conversion Sanitizer - integer truncation - clang part 2018-07-30 18:58:30 +00:00
CrossTU [CrossTU] Fix handling of Cross Translation Unit directory path 2017-10-27 12:53:37 +00:00
Driver [clang][ubsan] Implicit Conversion Sanitizer - integer truncation - clang part 2018-07-30 18:58:30 +00:00
Edit Remove \brief commands from doxygen comments. 2018-05-09 01:00:01 +00:00
Format [clang-format] Silence -Wdocumentation warnings 2018-07-30 12:22:41 +00:00
Frontend [OPENMP] Force OpenMP 4.5 when compiling for offloading. 2018-07-26 15:17:38 +00:00
FrontendTool Add a new driver mode to dump compiler feature and extension options. 2018-05-31 13:57:09 +00:00
Headers [ms] Add __shiftleft128 / __shiftright128 intrinsics 2018-07-20 21:02:09 +00:00
Index [Index] Set OrigD before D is changed. 2018-07-20 08:08:56 +00:00
Lex [Preprocessor] Stop entering included files after hitting a fatal error. 2018-07-25 19:16:26 +00:00
Parse Parse a possible trailing postfix expression suffix after a fold expression 2018-07-27 21:55:12 +00:00
Rewrite Remove \brief commands from doxygen comments. 2018-05-09 01:00:01 +00:00
Sema [CodeComplete] Fix the crash in code completion on access checking 2018-07-30 15:19:05 +00:00
Serialization [AST] Sink 'part of explicit cast' down into ImplicitCastExpr 2018-07-27 07:27:14 +00:00
StaticAnalyzer [analyzer] Add missing state transition in IteratorChecker. 2018-07-30 16:14:59 +00:00
Tooling [Tooling] Use UniqueStringSaver. NFC 2018-07-23 11:25:25 +00:00
CMakeLists.txt Add Cross Translation Unit support library 2017-09-22 11:11:01 +00:00