[UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour
Summary:
Quote from http://eel.is/c++draft/expr.add#4:
```
4 When an expression J that has integral type is added to or subtracted
from an expression P of pointer type, the result has the type of P.
(4.1) If P evaluates to a null pointer value and J evaluates to 0,
the result is a null pointer value.
(4.2) Otherwise, if P points to an array element i of an array object x with n
elements ([dcl.array]), the expressions P + J and J + P
(where J has the value j) point to the (possibly-hypothetical) array
element i+j of x if 0≤i+j≤n and the expression P - J points to the
(possibly-hypothetical) array element i−j of x if 0≤i−j≤n.
(4.3) Otherwise, the behavior is undefined.
```
Therefore, as per the standard, applying non-zero offset to `nullptr`
(or making non-`nullptr` a `nullptr`, by subtracting pointer's integral value
from the pointer itself) is undefined behavior. (*if* `nullptr` is not defined,
i.e. e.g. `-fno-delete-null-pointer-checks` was *not* specified.)
To make things more fun, in C (6.5.6p8), applying *any* offset to null pointer
is undefined, although Clang front-end pessimizes the code by not lowering
that info, so this UB is "harmless".
Since rL369789 (D66608 `[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null`)
LLVM middle-end uses those guarantees for transformations.
If the source contains such UB's, said code may now be miscompiled.
Such miscompilations were already observed:
* https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190826/687838.html
* https://github.com/google/filament/pull/1566
Surprisingly, UBSan does not catch those issues
... until now. This diff teaches UBSan about these UB's.
`getelementpointer inbounds` is a pretty frequent instruction,
so this does have a measurable impact on performance;
I've addressed most of the obvious missing folds (and thus decreased the performance impact by ~5%),
and then re-performed some performance measurements using my [[ https://github.com/darktable-org/rawspeed | RawSpeed ]] benchmark:
(all measurements done with LLVM ToT, the sanitizer never fired.)
* no sanitization vs. existing check: average `+21.62%` slowdown
* existing check vs. check after this patch: average `22.04%` slowdown
* no sanitization vs. this patch: average `48.42%` slowdown
Reviewers: vsk, filcab, rsmith, aaron.ballman, vitalybuka, rjmccall, #sanitizers
Reviewed By: rsmith
Subscribers: kristof.beyls, nickdesaulniers, nikic, ychen, dtzWill, xbolva00, dberris, arphaman, rupprecht, reames, regehr, llvm-commits, cfe-commits
Tags: #clang, #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D67122
llvm-svn: 374293
2019-10-10 17:25:02 +08:00
|
|
|
// RUN: %clang_cc1 -x c -emit-llvm %s -o - -triple x86_64-linux-gnu | FileCheck %s
|
|
|
|
// RUN: %clang_cc1 -x c -fsanitize=pointer-overflow -fno-sanitize-recover=pointer-overflow -emit-llvm %s -o - -triple x86_64-linux-gnu | FileCheck %s
|
|
|
|
|
|
|
|
// RUN: %clang_cc1 -x c++ -emit-llvm %s -o - -triple x86_64-linux-gnu | FileCheck %s
|
|
|
|
// RUN: %clang_cc1 -x c++ -fsanitize=pointer-overflow -fno-sanitize-recover=pointer-overflow -emit-llvm %s -o - -triple x86_64-linux-gnu | FileCheck %s
|
|
|
|
|
|
|
|
#include <stdint.h>
|
|
|
|
|
|
|
|
#ifdef __cplusplus
|
|
|
|
extern "C" {
|
|
|
|
#endif
|
|
|
|
|
|
|
|
struct S {
|
|
|
|
int x, y;
|
|
|
|
};
|
|
|
|
|
2020-02-04 02:09:39 +08:00
|
|
|
// CHECK-LABEL: define i64 @{{.*}}get_offset_of_y_naively{{.*}}(
|
[UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour
Summary:
Quote from http://eel.is/c++draft/expr.add#4:
```
4 When an expression J that has integral type is added to or subtracted
from an expression P of pointer type, the result has the type of P.
(4.1) If P evaluates to a null pointer value and J evaluates to 0,
the result is a null pointer value.
(4.2) Otherwise, if P points to an array element i of an array object x with n
elements ([dcl.array]), the expressions P + J and J + P
(where J has the value j) point to the (possibly-hypothetical) array
element i+j of x if 0≤i+j≤n and the expression P - J points to the
(possibly-hypothetical) array element i−j of x if 0≤i−j≤n.
(4.3) Otherwise, the behavior is undefined.
```
Therefore, as per the standard, applying non-zero offset to `nullptr`
(or making non-`nullptr` a `nullptr`, by subtracting pointer's integral value
from the pointer itself) is undefined behavior. (*if* `nullptr` is not defined,
i.e. e.g. `-fno-delete-null-pointer-checks` was *not* specified.)
To make things more fun, in C (6.5.6p8), applying *any* offset to null pointer
is undefined, although Clang front-end pessimizes the code by not lowering
that info, so this UB is "harmless".
Since rL369789 (D66608 `[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null`)
LLVM middle-end uses those guarantees for transformations.
If the source contains such UB's, said code may now be miscompiled.
Such miscompilations were already observed:
* https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190826/687838.html
* https://github.com/google/filament/pull/1566
Surprisingly, UBSan does not catch those issues
... until now. This diff teaches UBSan about these UB's.
`getelementpointer inbounds` is a pretty frequent instruction,
so this does have a measurable impact on performance;
I've addressed most of the obvious missing folds (and thus decreased the performance impact by ~5%),
and then re-performed some performance measurements using my [[ https://github.com/darktable-org/rawspeed | RawSpeed ]] benchmark:
(all measurements done with LLVM ToT, the sanitizer never fired.)
* no sanitization vs. existing check: average `+21.62%` slowdown
* existing check vs. check after this patch: average `22.04%` slowdown
* no sanitization vs. this patch: average `48.42%` slowdown
Reviewers: vsk, filcab, rsmith, aaron.ballman, vitalybuka, rjmccall, #sanitizers
Reviewed By: rsmith
Subscribers: kristof.beyls, nickdesaulniers, nikic, ychen, dtzWill, xbolva00, dberris, arphaman, rupprecht, reames, regehr, llvm-commits, cfe-commits
Tags: #clang, #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D67122
llvm-svn: 374293
2019-10-10 17:25:02 +08:00
|
|
|
uintptr_t get_offset_of_y_naively() {
|
|
|
|
// CHECK: [[ENTRY:.*]]:
|
|
|
|
// CHECK-NEXT: ret i64 ptrtoint (i32* getelementptr (i32, i32* null, i32 1) to i64)
|
|
|
|
// CHECK-NEXT: }
|
|
|
|
return ((uintptr_t)(&(((struct S *)0)->y)));
|
|
|
|
}
|
|
|
|
|
2020-02-04 02:09:39 +08:00
|
|
|
// CHECK-LABEL: define i64 @{{.*}}get_offset_of_y_via_builtin{{.*}}(
|
[UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour
Summary:
Quote from http://eel.is/c++draft/expr.add#4:
```
4 When an expression J that has integral type is added to or subtracted
from an expression P of pointer type, the result has the type of P.
(4.1) If P evaluates to a null pointer value and J evaluates to 0,
the result is a null pointer value.
(4.2) Otherwise, if P points to an array element i of an array object x with n
elements ([dcl.array]), the expressions P + J and J + P
(where J has the value j) point to the (possibly-hypothetical) array
element i+j of x if 0≤i+j≤n and the expression P - J points to the
(possibly-hypothetical) array element i−j of x if 0≤i−j≤n.
(4.3) Otherwise, the behavior is undefined.
```
Therefore, as per the standard, applying non-zero offset to `nullptr`
(or making non-`nullptr` a `nullptr`, by subtracting pointer's integral value
from the pointer itself) is undefined behavior. (*if* `nullptr` is not defined,
i.e. e.g. `-fno-delete-null-pointer-checks` was *not* specified.)
To make things more fun, in C (6.5.6p8), applying *any* offset to null pointer
is undefined, although Clang front-end pessimizes the code by not lowering
that info, so this UB is "harmless".
Since rL369789 (D66608 `[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null`)
LLVM middle-end uses those guarantees for transformations.
If the source contains such UB's, said code may now be miscompiled.
Such miscompilations were already observed:
* https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190826/687838.html
* https://github.com/google/filament/pull/1566
Surprisingly, UBSan does not catch those issues
... until now. This diff teaches UBSan about these UB's.
`getelementpointer inbounds` is a pretty frequent instruction,
so this does have a measurable impact on performance;
I've addressed most of the obvious missing folds (and thus decreased the performance impact by ~5%),
and then re-performed some performance measurements using my [[ https://github.com/darktable-org/rawspeed | RawSpeed ]] benchmark:
(all measurements done with LLVM ToT, the sanitizer never fired.)
* no sanitization vs. existing check: average `+21.62%` slowdown
* existing check vs. check after this patch: average `22.04%` slowdown
* no sanitization vs. this patch: average `48.42%` slowdown
Reviewers: vsk, filcab, rsmith, aaron.ballman, vitalybuka, rjmccall, #sanitizers
Reviewed By: rsmith
Subscribers: kristof.beyls, nickdesaulniers, nikic, ychen, dtzWill, xbolva00, dberris, arphaman, rupprecht, reames, regehr, llvm-commits, cfe-commits
Tags: #clang, #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D67122
llvm-svn: 374293
2019-10-10 17:25:02 +08:00
|
|
|
uintptr_t get_offset_of_y_via_builtin() {
|
|
|
|
// CHECK: [[ENTRY:.*]]:
|
|
|
|
// CHECK-NEXT: ret i64 4
|
|
|
|
// CHECK-NEXT: }
|
|
|
|
return __builtin_offsetof(struct S, y);
|
|
|
|
}
|
|
|
|
|
|
|
|
#ifdef __cplusplus
|
|
|
|
}
|
|
|
|
#endif
|