Summary:
The `if (*cstr_end == '\0')` in the previous code checked if the previous loop terminated because it
found a null terminator or because it reached the end of the data. However, in the case that we hit
the end of the data before finding a null terminator, `cstr_end` points behind the last byte in our
data and `*cstr_end` reads the memory behind the array (which may be uninitialised)
This patch just rewrites that function use `std::find` and adds the relevant unit tests.
Reviewers: labath
Reviewed By: labath
Subscribers: abidh, JDevlieghere, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D68773
llvm-svn: 374311
Currently, the heuristics the if-conversion pass uses for diamond if-conversion
are based on execution time, with no consideration for code size. This adds a
new set of heuristics to be used when optimising for code size.
This is mostly target-independent, because the if-conversion pass can
see the code size of the instructions which it is removing. For thumb,
there are a few passes (insertion of IT instructions, selection of
narrow branches, and selection of CBZ instructions) which are run after
if conversion and affect these heuristics, so I've added target hooks to
better predict the code-size effect of a proposed if-conversion.
Differential revision: https://reviews.llvm.org/D67350
llvm-svn: 374301
The test isn't using @expectedFailure correctly, which causes weird
errors, at least with python2, at least with linux. Possibly that
function shouldn't even be public as it's main use is as a backed for
other decorators.
llvm-svn: 374299
Other options which create output files don't produce output messages.
Improve documentation to help find trace file.
Differential Revision: https://reviews.llvm.org/D68710
llvm-svn: 374294
Summary:
Quote from http://eel.is/c++draft/expr.add#4:
```
4 When an expression J that has integral type is added to or subtracted
from an expression P of pointer type, the result has the type of P.
(4.1) If P evaluates to a null pointer value and J evaluates to 0,
the result is a null pointer value.
(4.2) Otherwise, if P points to an array element i of an array object x with n
elements ([dcl.array]), the expressions P + J and J + P
(where J has the value j) point to the (possibly-hypothetical) array
element i+j of x if 0≤i+j≤n and the expression P - J points to the
(possibly-hypothetical) array element i−j of x if 0≤i−j≤n.
(4.3) Otherwise, the behavior is undefined.
```
Therefore, as per the standard, applying non-zero offset to `nullptr`
(or making non-`nullptr` a `nullptr`, by subtracting pointer's integral value
from the pointer itself) is undefined behavior. (*if* `nullptr` is not defined,
i.e. e.g. `-fno-delete-null-pointer-checks` was *not* specified.)
To make things more fun, in C (6.5.6p8), applying *any* offset to null pointer
is undefined, although Clang front-end pessimizes the code by not lowering
that info, so this UB is "harmless".
Since rL369789 (D66608 `[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null`)
LLVM middle-end uses those guarantees for transformations.
If the source contains such UB's, said code may now be miscompiled.
Such miscompilations were already observed:
* https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190826/687838.html
* https://github.com/google/filament/pull/1566
Surprisingly, UBSan does not catch those issues
... until now. This diff teaches UBSan about these UB's.
`getelementpointer inbounds` is a pretty frequent instruction,
so this does have a measurable impact on performance;
I've addressed most of the obvious missing folds (and thus decreased the performance impact by ~5%),
and then re-performed some performance measurements using my [[ https://github.com/darktable-org/rawspeed | RawSpeed ]] benchmark:
(all measurements done with LLVM ToT, the sanitizer never fired.)
* no sanitization vs. existing check: average `+21.62%` slowdown
* existing check vs. check after this patch: average `22.04%` slowdown
* no sanitization vs. this patch: average `48.42%` slowdown
Reviewers: vsk, filcab, rsmith, aaron.ballman, vitalybuka, rjmccall, #sanitizers
Reviewed By: rsmith
Subscribers: kristof.beyls, nickdesaulniers, nikic, ychen, dtzWill, xbolva00, dberris, arphaman, rupprecht, reames, regehr, llvm-commits, cfe-commits
Tags: #clang, #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D67122
llvm-svn: 374293
GNU ld looks for a number of other patterns than just lib<name>.dll.a
and lib<name>.a.
GNU ld does support linking directly against a DLL without using an
import library. If that's the only match for a -l argument, point out
that the user needs to use an import library, instead of leaving the
user with a puzzling message about the -l argument not being found
at all.
Also convert an existing case of fatal() into error().
Differential Revision: https://reviews.llvm.org/D68689
llvm-svn: 374292
This was further discussed at the llvm dev list:
http://lists.llvm.org/pipermail/llvm-dev/2019-October/135602.html
I think the brief summary of that is that this change is an improvement,
this is the behaviour that we expect and promise in ours docs, and also
as a result there are cases where we now emit diagnostics whereas before
pragmas were silently ignored. Two areas where we can improve: 1) the
diagnostic message itself, and 2) and in some cases (e.g. -Os and -Oz)
the vectoriser is (quite understandably) not triggering.
Original commit message:
Specifying the vectorization width was supposed to implicitly enable
vectorization, except that it wasn't really doing this. It was only
setting the vectorize.width metadata, but not vectorize.enable.
This should fix PR27643.
llvm-svn: 374288
Some clang lit tests use a pipeline of the form
// RUN: %clang [args] -O0 %s | opt [specific optimizations] | FileCheck %s
to make the expected test output depend on as few optimization phases
as possible, for stability. But when you write a RUN line of this
form, you lose the ability to use update_cc_test_checks.py to
automatically generate the expected output, because it only supports
two-stage pipelines consisting of '%clang | FileCheck' (or %clang_cc1).
This change extends the set of supported RUN lines so that pipelines
with an invocation of `opt` in the middle can still be automatically
handled.
To implement it, I've adjusted `get_function_body()` so that it can
cope with an arbitrary sequence of intermediate pipeline commands. But
the code that decides which RUN lines to consider is more
conservative: it only adds clang | opt | FileCheck to the set of
supported lines, because I didn't want to accidentally include some
other kind of line that doesn't output IR at all.
(Also in this commit is the minimal change to make this script work at
all, after r373912 added an extra parameter to `add_ir_checks`.)
Reviewers: MaskRay, xbolva00
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68406
llvm-svn: 374287
SGPR_128 only includes the real allocatable SGPRs, and SReg_128 adds
the additional non-allocatable TTMP registers. There's no point in
allocating SReg_128 vregs. This shrinks the size of the classes
regalloc needs to consider, which is usually good.
llvm-svn: 374284
Summary:
`null` in the default address space (=AS 0) cannot be captured nor can
it alias anything. We make this clear now as it can be important for
callbacks and other cases later on. In addition, this patch improves the
debug output for noalias deduction.
Reviewers: sstefan1, uenoku
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68624
llvm-svn: 374280
Original Patch broke for compilations w/ gcc and exposed asan fail.
This reland repairs those bugs.
Differential Revision: https://reviews.llvm.org/D67529
llvm-svn: 374277
This pattern matches the ELF implementation add if also useful as
part of a planned change where running `mark` more than once is needed.
Differential Revision: https://reviews.llvm.org/D68749
llvm-svn: 374275
Summary:
The Transformer library has been growing inside of
lib/Tooling/Refactoring. However, it's not really related to anything else in
that directory. This revision moves all Transformer-related files into their own
include & lib directories. A followup revision will (temporarily) add
forwarding headers to help any users migrate their code to the new location.
Reviewers: gribozavr
Subscribers: mgorny, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D68637
llvm-svn: 374271
This reverts r374268 (git commit c34385d07c)
I think I reverted this by mistake, so I'm relanding it. While my bisect
found this revision, I think the crashes I'm seeing locally must be
environmental. Maybe the version of clang I'm using miscompiles tot
clang.
llvm-svn: 374269
Summary:
Visual Studio doesn't like it while stepping. It kicks you out of the
source view of the file being stepped through and tries to fall back to
the disassembly view.
Fixes PR43530
The fix is incomplete, because it's possible to have a basic block with
no source locations at all. In this case, we don't emit a .cv_loc, but
that will result in wrong stepping behavior in the debugger if the
layout predecessor of the location-less BB has an unrelated source
location. We could try harder to find a valid location that dominates or
post-dominates the current BB, but in general it's a dataflow problem,
and one still might not exist. I left a FIXME about this.
As an alternative, we might want to consider having the middle-end check
if its emitting codeview and get it to stop using line zero.
Reviewers: akhuang
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68747
llvm-svn: 374267
Dereferences with addresses above the 48-bit hardware addressable range
produce "invalid instruction" (instead of "invalid access") hardware
exceptions (there is no hardware address decoding logic for those bits),
and the address provided by this exception is the address of the
instruction (not the faulting address). The kernel maps the "invalid
instruction" to SEGV, but fails to provide the real fault address.
Because of this ASan lies and says that those cases are null
dereferences. This downgrades the severity of a found bug in terms of
security. In the ASan signal handler, we can not provide the real
faulting address, but at least we can try not to lie.
rdar://50366151
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D68676
llvm-svn: 374265
debugserver had been using an instruction that would work
for armv7 or aarch64 processes, but we don't have armv7 code
running on arm64 devices any more so this is unnecessary.
<rdar://problem/56133118>
llvm-svn: 374264
CUDA/HIP program may be compiled with -fopenmp. In this case, -fopenmp is only passed to host compilation
to take advantages of multi-threads computation.
CUDA/HIP and OpenMP both use Sema::DeviceCallGraph to store functions to be analyzed and remove them
once they decide the function is sure to be emitted. CUDA/HIP and OpenMP have different functions to determine
if a function is sure to be emitted.
To check host/device correctly for CUDA/HIP when -fopenmp is enabled, there needs a unified logic to determine
whether a function is to be emitted. The logic needs to be aware of both CUDA and OpenMP logic.
Differential Revision: https://reviews.llvm.org/D67837
llvm-svn: 374263
As background, starting in D66309, I'm working on support unordered atomics analogous to volatile flags on normal LoadSDNode/StoreSDNodes for X86.
As part of that, I spent some time going through usages of LoadSDNode and StoreSDNode looking for cases where we might have missed a volatility check or need an atomic check. I couldn't find any cases that clearly miscompile - i.e. no test cases - but a couple of pieces in code loop suspicious though I can't figure out how to exercise them.
This patch adds defensive checks and asserts in the places my manual audit found. If anyone has any ideas on how to either a) disprove any of the checks, or b) hit the bug they might be fixing, I welcome suggestions.
Differential Revision: https://reviews.llvm.org/D68419
llvm-svn: 374261
In a future patch, this will help cleanup m0 handling.
The register coalescer handles copies from a register that
materializes an immediate, but doesn't handle move immediates
itself. The virtual register uses will often be allocated to the same
register, so there end up being no real copy.
llvm-svn: 374257
This old test used a completely hand-rolled Makefile. Modernize so that
it's able to cross-compile. And XFAIL the test as it fails on embedded
targets...
llvm-svn: 374256
This was ignoring the register bank of the input pointer, and
isUniformMMO seems overly aggressive.
This will now conservatively assume a VGPR in cases where the incoming
bank hasn't been determined yet (i.e. is from a loop phi).
llvm-svn: 374255