[clang][ubsan] Implicit Conversion Sanitizer - integer truncation - clang part
Summary:
C and C++ are interesting languages. They are statically typed, but weakly.
Implicit conversions are allowed. This is nice: it lets you write code
while balancing between drowning in everything being convertible
and nothing being convertible. As usual, this comes at a price:
```
unsigned char store = 0;
bool consume(unsigned int val);
void test(unsigned long val) {
  if (consume(val)) {
    // the 'val' is `unsigned long`, but `consume()` takes `unsigned int`.
    // If their bit widths are different on this platform, the implicit
    // truncation happens. And if that `unsigned long` had a value bigger
    // than UINT_MAX, then you may or may not have a bug.
    // Similarly, integer addition happens on `int`s, so `store` will
    // be promoted to an `int`, the sum calculated (0+768=768),
    // and the result demoted to `unsigned char`, and stored to `store`.
    // In this case, the `store` will still be 0. Again, not always intended.
    store = store + 768; // before addition, 'store' was promoted to int.
  }
  // But yes, sometimes this is intentional.
  // You can either make the conversion explicit
  (void)consume((unsigned int)val);
  // or mask the value so no bits will be *implicitly* lost.
  (void)consume((~((unsigned int)0)) & val);
}
```
Yes, there is a `-Wconversion` diagnostic group, but first, it is rather
noisy, since it warns on everything (unlike sanitizers, which warn only on
actual issues), and second, there are cases where it does **not** warn.
So a sanitizer is needed. I don't have any motivational numbers, but I know
I have had this kind of problem 10-20 times, and it was never easy to track down.
The logic to detect whether a truncation has happened is pretty simple
if you think about it - https://godbolt.org/g/NEzXbb - basically, just
extend (using the new, not original!, signedness) the 'truncated' value
back to its original width, and equality-compare it with the original value.
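As a minimal sketch of that check in plain C++ (the helper name
`truncation_changes_value` is made up for illustration and is not what clang
emits; it also assumes the usual two's-complement narrowing behavior):

```cpp
// Hypothetical helper, for illustration only: not part of clang or compiler-rt.
// Reports whether converting `src` to the narrower type `Dst` changes the value.
template <typename Dst, typename Src>
bool truncation_changes_value(Src src) {
  // The (possibly lossy) narrowing conversion that is being checked.
  Dst truncated = static_cast<Dst>(src);
  // Widen back to the original type. The extension is a sign- or
  // zero-extension depending on the signedness of Dst (the *new* type).
  Src extended = static_cast<Src>(truncated);
  // If the round trip did not reproduce the original, bits were lost.
  return extended != src;
}
```

With this sketch, `truncation_changes_value<unsigned char>(768u)` reports a
lossy conversion, while `truncation_changes_value<unsigned char>(255u)` does not.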
The most non-trivial thing here is the logic to detect whether this
`ImplicitCastExpr` AST node is **actually** an implicit conversion, //or//
part of an explicit cast, because explicit casts are modeled as an outer
`ExplicitCastExpr` with some `ImplicitCastExpr`'s as **direct** children.
https://godbolt.org/g/eE1GkJ
Nowadays, we can just use the new `part_of_explicit_cast` flag, which is set
on all the implicitly-added `ImplicitCastExpr`'s of an `ExplicitCastExpr`.
So if that flag is **not** set, then it is an actual implicit conversion.
As you may have noted, this isn't just named `-fsanitize=implicit-integer-truncation`.
There are potentially some more implicit conversions to be warned about.
Namely, implicit conversions that result in a sign change; implicit conversions
between different floating point types, or between fp and an integer,
when, again, that conversion is lossy.
One thing I know isn't handled is bitfields.
This is the clang part.
The compiler-rt part is D48959.
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=21530 | PR21530 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=37552 | PR37552 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=35409 | PR35409 ]].
Partially fixes [[ https://bugs.llvm.org/show_bug.cgi?id=9821 | PR9821 ]].
Fixes https://github.com/google/sanitizers/issues/940. (other than sign-changing implicit conversions)
Reviewers: rjmccall, rsmith, samsonov, pcc, vsk, eugenis, efriedma, kcc, erichkeane
Reviewed By: rsmith, vsk, erichkeane
Subscribers: erichkeane, klimek, #sanitizers, aaron.ballman, RKSimon, dtzWill, filcab, danielaustin, ygribov, dvyukov, milianw, mclow.lists, cfe-commits, regehr
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D48958
llvm-svn: 338288
2018-07-31 02:58:30 +08:00
// RUN: %clang_cc1 -emit-llvm %s -o - -triple x86_64-linux-gnu | FileCheck %s --check-prefix=CHECK
2018-10-11 17:09:50 +08:00
// RUN: %clang_cc1 -fsanitize=implicit-unsigned-integer-truncation,implicit-signed-integer-truncation -fno-sanitize-recover=implicit-unsigned-integer-truncation,implicit-signed-integer-truncation -emit-llvm %s -o - -triple x86_64-linux-gnu | FileCheck %s --check-prefixes=CHECK,CHECK-SANITIZE,CHECK-SANITIZE-ANYRECOVER,CHECK-SANITIZE-NORECOVER
// RUN: %clang_cc1 -fsanitize=implicit-unsigned-integer-truncation,implicit-signed-integer-truncation -fsanitize-recover=implicit-unsigned-integer-truncation,implicit-signed-integer-truncation -emit-llvm %s -o - -triple x86_64-linux-gnu | FileCheck %s --check-prefixes=CHECK,CHECK-SANITIZE,CHECK-SANITIZE-ANYRECOVER,CHECK-SANITIZE-RECOVER
// RUN: %clang_cc1 -fsanitize=implicit-unsigned-integer-truncation,implicit-signed-integer-truncation -fsanitize-trap=implicit-unsigned-integer-truncation,implicit-signed-integer-truncation -emit-llvm %s -o - -triple x86_64-linux-gnu | FileCheck %s --check-prefixes=CHECK,CHECK-SANITIZE,CHECK-SANITIZE-TRAP
extern "C" { // Disable name mangling.

// ========================================================================== //
// Check that explicit cast does not interfere with implicit conversion
// ========================================================================== //
// These contain one implicit truncating conversion, and one explicit truncating cast.
// We want to make sure that we still diagnose the implicit conversion.

// Implicit truncation after explicit truncation.
// CHECK-LABEL: @explicit_cast_interference0
unsigned char explicit_cast_interference0(unsigned int c) {
  // CHECK-SANITIZE: %[[ANYEXT:.*]] = zext i8 %[[DST:.*]] to i16, !nosanitize
  // CHECK-SANITIZE: call
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return (unsigned short)c;
}

// Implicit truncation before explicit truncation.
// CHECK-LABEL: @explicit_cast_interference1
unsigned char explicit_cast_interference1(unsigned int c) {
  // CHECK-SANITIZE: %[[ANYEXT:.*]] = zext i16 %[[DST:.*]] to i32, !nosanitize
  // CHECK-SANITIZE: call
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  unsigned short b;
  return (unsigned char)(b = c);
}

// ========================================================================== //
// The expected true-negatives.
// ========================================================================== //

// Explicit truncating casts.
// ========================================================================== //

// CHECK-LABEL: @explicit_unsigned_int_to_unsigned_char
unsigned char explicit_unsigned_int_to_unsigned_char(unsigned int src) {
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return (unsigned char)src;
}

// CHECK-LABEL: @explicit_signed_int_to_unsigned_char
unsigned char explicit_signed_int_to_unsigned_char(signed int src) {
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return (unsigned char)src;
}

// CHECK-LABEL: @explicit_unsigned_int_to_signed_char
signed char explicit_unsigned_int_to_signed_char(unsigned int src) {
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return (signed char)src;
}

// CHECK-LABEL: @explicit_signed_int_to_signed_char
signed char explicit_signed_int_to_signed_char(signed int src) {
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return (signed char)src;
}

// Explicit NOP casts.
// ========================================================================== //

// CHECK-LABEL: @explicit_unsigned_int_to_unsigned_int
unsigned int explicit_unsigned_int_to_unsigned_int(unsigned int src) {
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return (unsigned int)src;
}

// CHECK-LABEL: @explicit_signed_int_to_signed_int
signed int explicit_signed_int_to_signed_int(signed int src) {
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return (signed int)src;
}

// CHECK-LABEL: @explicit_unsigned_char_to_signed_char
unsigned char explicit_unsigned_char_to_signed_char(unsigned char src) {
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return (unsigned char)src;
}

// CHECK-LABEL: @explicit_signed_char_to_signed_char
signed char explicit_signed_char_to_signed_char(signed char src) {
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return (signed char)src;
}

// Explicit functional truncating casts.
// ========================================================================== //

using UnsignedChar = unsigned char;
using SignedChar = signed char;
using UnsignedInt = unsigned int;
using SignedInt = signed int;

// CHECK-LABEL: @explicit_functional_unsigned_int_to_unsigned_char
unsigned char explicit_functional_unsigned_int_to_unsigned_char(unsigned int src) {
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return UnsignedChar(src);
}

// CHECK-LABEL: @explicit_functional_signed_int_to_unsigned_char
unsigned char explicit_functional_signed_int_to_unsigned_char(signed int src) {
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return UnsignedChar(src);
}

// CHECK-LABEL: @explicit_functional_unsigned_int_to_signed_char
signed char explicit_functional_unsigned_int_to_signed_char(unsigned int src) {
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return SignedChar(src);
}

// CHECK-LABEL: @explicit_functional_signed_int_to_signed_char
signed char explicit_functional_signed_int_to_signed_char(signed int src) {
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return SignedChar(src);
}

// Explicit functional NOP casts.
// ========================================================================== //

// CHECK-LABEL: @explicit_functional_unsigned_int_to_unsigned_int
unsigned int explicit_functional_unsigned_int_to_unsigned_int(unsigned int src) {
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return UnsignedInt(src);
}

// CHECK-LABEL: @explicit_functional_signed_int_to_signed_int
signed int explicit_functional_signed_int_to_signed_int(signed int src) {
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return SignedInt(src);
}

// CHECK-LABEL: @explicit_functional_unsigned_char_to_signed_char
unsigned char explicit_functional_unsigned_char_to_signed_char(unsigned char src) {
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return UnsignedChar(src);
}

// CHECK-LABEL: @explicit_functional_signed_char_to_signed_char
signed char explicit_functional_signed_char_to_signed_char(signed char src) {
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return SignedChar(src);
}

// Explicit C++-style truncating casts.
// ========================================================================== //

// CHECK-LABEL: @explicit_cppstyleunsigned_int_to_unsigned_char
unsigned char explicit_cppstyleunsigned_int_to_unsigned_char(unsigned int src) {
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return static_cast<unsigned char>(src);
}

// CHECK-LABEL: @explicit_cppstylesigned_int_to_unsigned_char
unsigned char explicit_cppstylesigned_int_to_unsigned_char(signed int src) {
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return static_cast<unsigned char>(src);
}

// CHECK-LABEL: @explicit_cppstyleunsigned_int_to_signed_char
signed char explicit_cppstyleunsigned_int_to_signed_char(unsigned int src) {
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return static_cast<signed char>(src);
}

// CHECK-LABEL: @explicit_cppstylesigned_int_to_signed_char
signed char explicit_cppstylesigned_int_to_signed_char(signed int src) {
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return static_cast<signed char>(src);
}

// Explicit C++-style NOP casts.
// ========================================================================== //

// CHECK-LABEL: @explicit_cppstyleunsigned_int_to_unsigned_int
unsigned int explicit_cppstyleunsigned_int_to_unsigned_int(unsigned int src) {
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return static_cast<unsigned int>(src);
}

// CHECK-LABEL: @explicit_cppstylesigned_int_to_signed_int
signed int explicit_cppstylesigned_int_to_signed_int(signed int src) {
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return static_cast<signed int>(src);
}

// CHECK-LABEL: @explicit_cppstyleunsigned_char_to_signed_char
unsigned char explicit_cppstyleunsigned_char_to_signed_char(unsigned char src) {
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return static_cast<unsigned char>(src);
}

// CHECK-LABEL: @explicit_cppstylesigned_char_to_signed_char
signed char explicit_cppstylesigned_char_to_signed_char(signed char src) {
  // CHECK-SANITIZE-NOT: call
  // CHECK: }
  return static_cast<signed char>(src);
}

} // extern "C"

// ---------------------------------------------------------------------------//
// A problematic true-negative involving simple C++ code.
// The problem is that the NoOp ExplicitCast is directly within MaterializeTemporaryExpr(),
// so special care is needed.
// See https://reviews.llvm.org/D48958#1161345
template <typename a>
a b(a c, const a &d) {
  if (d)
    ;
  return c;
}

extern "C" { // Disable name mangling.

// CHECK-LABEL: @false_positive_with_MaterializeTemporaryExpr
int false_positive_with_MaterializeTemporaryExpr() {
  // CHECK-SANITIZE-NOT: call{{.*}}ubsan
  // CHECK: }
  int e = b<unsigned>(4, static_cast<unsigned>(4294967296));
  return e;
}

// ---------------------------------------------------------------------------//

} // extern "C"