X86 has some calling conventions where bits 127:0 of a vector register are callee saved, but the upper bits aren't. Previously we could detect that the full ymm register was clobbered when the xmm portion was really preserved. This patch checks the subregisters to make sure they aren't preserved.
Fixes PR44140
Differential Revision: https://reviews.llvm.org/D70699
I'm not sure what the effect of this change will be on all of the affected
tests or a larger benchmark, but it fixes the horizontal add/sub problems
noted here:
https://reviews.llvm.org/D59710?vs=227972&id=228095&whitespace=ignore-most#toc
The costs are based on reciprocal throughput numbers in Agner's tables for
PEXTR*; these appear to be very slow ops on Silvermont.
This is a small step towards the larger motivation discussed in PR43605:
https://bugs.llvm.org/show_bug.cgi?id=43605
Also, it seems likely that insert/extract is the source of perf regressions on
other CPUs (up to 30%) that were cited as part of the reason to revert D59710,
so maybe we'll extend the table-based approach to other subtargets.
Differential Revision: https://reviews.llvm.org/D70607
Summary:
Implicit Conversion Sanitizer is *almost* feature complete.
There aren't *that* many unsanitized things left,
two major ones are increment/decrement (this patch) and bit fields.
As it was discussed in
[[ https://bugs.llvm.org/show_bug.cgi?id=39519 | PR39519 ]],
unlike `CompoundAssignOperator` (which is promoted internally),
or `BinaryOperator` (for which we always have promotion/demotion in AST)
or parts of `UnaryOperator` (we have promotion/demotion but only for
certain operations), for inc/dec, clang omits promotion/demotion
altogether, under as-if rule.
This is technically correct: https://rise4fun.com/Alive/zPgD
As it can be seen in `InstCombineCasts.cpp` `canEvaluateTruncated()`,
`add`/`sub`/`mul`/`and`/`or`/`xor` operators can all arbitrarily
be extended or truncated:
901cd3b3f6/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp (L1320-L1334)
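To make the promotion/demotion concrete, here is a minimal sketch of what `x++` on a narrow unsigned type means once the implicit casts are written out. It is only an illustration, not clang's codegen: `IncResult` and `incrementWithCheck` are hypothetical names, and `uint8_t` is assumed to promote to `int`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of `x++` on a uint8_t with the casts made explicit.
struct IncResult {
  int wide;            // value after promotion to int and the wide add
  std::uint8_t narrow; // value after demotion back to uint8_t
  bool lossy;          // true when the demotion changed the value
};

inline IncResult incrementWithCheck(std::uint8_t x) {
  int wide = static_cast<int>(x) + 1;                    // promotion + wide add
  std::uint8_t narrow = static_cast<std::uint8_t>(wide); // demotion (truncation)
  // The check compares the demoted value against the wide result.
  return {wide, narrow, static_cast<int>(narrow) != wide};
}
```

With the wide result available, the lossy-demotion check is a single comparison; if the add is performed directly in the narrow type there is no wide value left to compare against.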
But that has serious implications:
1. Since we no longer model implicit casts, do we pessimise
their AST representation and everything that uses it?
2. There is no demotion, so lossy demotion sanitizer does not trigger :]
Now, i'm not going to argue about the first problem here,
but the second one **needs** to be addressed. As it was stated
in the report, this is done intentionally, so changing
this in all modes would be considered a penalization/regression.
Which means, the sanitization-less codegen must not be altered.
It was also suggested to not change the sanitized codegen
to the one with demotion, but i quite strongly believe
that will not be the wise choice here:
1. One will need to re-engineer the check that the inc/dec was lossy
in terms of `@llvm.{u,s}{add,sub}.with.overflow` builtins
2. We will still need to compute the result we would lossily demote.
(i.e. the result of wide `add`ition/`sub`traction)
3. I suspect it would need to be done right here, in sanitization.
Which kinda defeats the point of
using `@llvm.{u,s}{add,sub}.with.overflow` builtins:
we'd have two `add`s with basically the same arguments,
one of which is used for check+error-less codepath and other one
for the error reporting. That seems worse than a single wide op+check.
4. OR, we would need to do that in the compiler-rt handler.
Which means we'll need a whole new handler.
But then what about the `CompoundAssignOperator`,
it would also be applicable for it.
So this also doesn't really seem like the right path to me.
5. At least X86 (but likely others) pessimizes all sub-`i32` operations
(due to partial register stalls), so even if we avoid promotion+demotion,
the computations will //likely// be performed in `i32` anyways.
So i'm not really seeing much benefit of
not doing the straight-forward thing.
While looking into this, i have noticed a few more LLVM middle-end
missed canonicalizations, and filed
[[ https://bugs.llvm.org/show_bug.cgi?id=44100 | PR44100 ]],
[[ https://bugs.llvm.org/show_bug.cgi?id=44102 | PR44102 ]].
Those are not specific to inc/dec, we also have them for
`CompoundAssignOperator`, and it can happen for normal arithmetics, too.
But if we take some other path in the patch, it will not be applicable
here, and we will have most likely played ourselves.
TLDR: front-end should emit canonical, easy-to-optimize yet
un-optimized code. It is middle-end's job to make it optimal.
I'm really hoping reviewers agree with my personal assessment
of the path this patch should take..
This originally landed in 9872ea4ed1
but got immediately reverted in cbfa237892
because the assertion was faulty. That fault ended up being caused
by the enum - while there will be promotion, both types are unsigned,
with same width. So we still don't need to sanitize non-signed cases.
So far. Maybe the assert will tell us this isn't so.
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=44054 | PR44054 ]].
Refs. https://github.com/google/sanitizers/issues/940
Reviewers: rjmccall, erichkeane, rsmith, vsk
Reviewed By: erichkeane
Subscribers: mehdi_amini, dexonsmith, cfe-commits, #sanitizers, llvm-commits, aaron.ballman, t.p.northover, efriedma, regehr
Tags: #llvm, #clang, #sanitizers
Differential Revision: https://reviews.llvm.org/D70539
This is based on what's required for softening fp128 operations on 32-bit X86 assuming f32/f64/f80 are legal. So there could be some things missing.
Differential Revision: https://reviews.llvm.org/D70654
Summary:
While updatePostDominatedByUnreachable attempts to find basic blocks that are post-dominated by unreachable blocks, it currently cannot handle loops precisely, because it doesn't use the actual post-dominator tree analysis but relies on the heuristic of visiting basic blocks in post-order. More precisely, when an entire loop is post-dominated by an unreachable block, the current algorithm fails to detect this: when it reaches the loop latch, it cannot tell that all of the latch's successors (including the loop header) will "eventually" be post-dominated by the unreachable block, because it hasn't visited the loop header yet. This makes BPI assume for the loop latch that loop backedges are taken with 100% probability. Because of this, block frequency info sometimes marks virtually dead loops (which are post-dominated by unreachable blocks) as super hot, since a 100% backedge-taken probability makes the loop iteration count the max value. updatePostDominatedByColdCall has exactly the same problem.
To address this problem, this patch computes PostDominatedByUnreachable/PostDominatedByColdCall using the actual post-dominator tree.
Reviewers: skatkov, chandlerc, manmanren
Reviewed By: skatkov
Subscribers: manmanren, vsk, apilipenko, Carrot, qcolombet, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70104
This test was previously effectively doing:
P = malloc(X); write X bytes to P; P = realloc(P, X - Y); P = realloc(P, X)
and expecting that all X bytes stored to P would still be identical after
the final realloc.
This happens to be true for the current scudo implementation of realloc,
but is not guaranteed to be true by the C standard ("Any bytes in the new
object beyond the size of the old object have indeterminate values.").
This implementation detail will change with the new memory tagging support,
which unconditionally zeros newly allocated granules when memory tagging
is enabled. Fix this by limiting the number of bytes that we test to the
minimum size that we realloc the allocation to.
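A minimal sketch of the corrected check, assuming a plain `malloc`/`realloc` interface: only the first `min(Size, Shrunk)` bytes are compared after shrinking and growing again. `reallocPreservesPrefix` is a hypothetical helper written for illustration, not the scudo test itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: fill Size bytes, shrink to Shrunk, grow back to Size,
// then verify only the bytes the C standard guarantees are preserved.
inline bool reallocPreservesPrefix(std::size_t Size, std::size_t Shrunk) {
  assert(Size > 0 && Shrunk > 0 && Shrunk <= Size);
  const unsigned char Marker = 0xAB;

  void *P = std::malloc(Size);
  if (!P)
    return false;
  std::memset(P, Marker, Size);

  void *Q = std::realloc(P, Shrunk); // shrink
  if (!Q) {
    std::free(P);
    return false;
  }
  P = Q;

  Q = std::realloc(P, Size); // grow back; bytes past Shrunk are indeterminate
  if (!Q) {
    std::free(P);
    return false;
  }
  P = Q;

  // Compare only up to the minimum size the allocation ever had.
  const std::size_t Guaranteed = std::min(Size, Shrunk);
  const unsigned char *Bytes = static_cast<const unsigned char *>(P);
  bool Ok = true;
  for (std::size_t I = 0; I < Guaranteed; ++I)
    Ok = Ok && Bytes[I] == Marker;

  std::free(P);
  return Ok;
}
```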
Differential Revision: https://reviews.llvm.org/D70761
The macros INLINE and COMPILER_CHECK always expand to the same thing (inline
and static_assert respectively). Both expansions are standards compliant C++
and are used consistently in the rest of LLVM, so let's improve consistency
with the rest of LLVM by replacing them with the expansions.
Differential Revision: https://reviews.llvm.org/D70793
Otherwise, we will hit a use-after-free when testing multiple instances of
the same allocator on the same thread. This only recently became a problem
with D70552 which caused us to run both ScudoCombinedTest.BasicCombined and
ScudoCombinedTest.ReleaseToOS on the unit tests' main thread.
Differential Revision: https://reviews.llvm.org/D70760
Shadow memory (and short granules) are not prepended with a memory
address, and the arrow at the end of the line is removed.
Differential Revision: https://reviews.llvm.org/D70707
Summary:
This CL makes unit tests compatible with Fuchsia's zxtest. This
required a few changes here and there, but also unearthed some
incompatibilities that had to be addressed.
A header is introduced to account for the zxtest/gtest
differences, and some `#if SCUDO_FUCHSIA` are used to disable incompatible
code (the 32-bit primary, or the exclusive TSD).
It also brought to my attention that I was using
`__scudo_default_options` in different tests, which ended up in a
single binary, and I am not sure how that ever worked. So move
this to the main cpp.
Additionally, fully disable the secondary freelist on Fuchsia as we do
not track VMOs for secondary allocations, so no release is possible.
With some modifications to Scudo's BUILD.gn in Fuchsia:
```
[==========] 79 tests from 23 test cases ran (10280 ms total).
[ PASSED ] 79 tests
```
Reviewers: mcgrathr, phosek, hctim, pcc, eugenis, cferris
Subscribers: srhines, jfb, #sanitizers, llvm-commits
Tags: #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D70682
References need somewhat special treatment. While copying a gsl::Pointer
will propagate the points-to set, creating an object from a reference
often behaves more like a dereference operation.
Differential Revision: https://reviews.llvm.org/D70755
ThunkCreator::getThunk and ThunkCreator::normalizeExistingThunk
currently assume that the implicit addends are -8 for ARM and -4 for
Thumb. In D70637, ThunkCreator::getThunk will need to take care of the
relocation addend explicitly.
Add the utility function getPCBias() as a prerequisite so that the getThunk change in D70637
can be more general.
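As a rough sketch of the idea (not the lld implementation): the PC reads 8 bytes ahead of the current instruction in ARM state and 4 bytes ahead in Thumb state, which is why the implicit addends are -8 and -4. The `BranchState` enum and `branchDestination` helper below are hypothetical names for illustration, and this `getPCBias` takes the enum rather than whatever parameter the real function uses.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the instruction set of the branch being relocated.
enum class BranchState { ARM, Thumb };

// PC-relative branches see the PC ahead of the instruction:
// +8 in ARM state, +4 in Thumb state.
inline std::int64_t getPCBias(BranchState State) {
  return State == BranchState::Thumb ? 4 : 8;
}

// Hypothetical helper: the address a branch really targets, given the
// symbol address and the relocation addend (-8 or -4 when implicit).
inline std::int64_t branchDestination(std::int64_t SymbolVA,
                                      std::int64_t Addend,
                                      BranchState State) {
  return SymbolVA + Addend + getPCBias(State);
}
```

Adding the bias back cancels the implicit addend, so the destination equals the symbol address in the common case while still honouring any other explicit addend.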
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D70690
According to OpenMP 5.0, the if clause can be used in the parallel for simd directive. If the condition in the if clause is false, the non-vectorized version of the
loop must be executed.
See PR43425:
https://bugs.llvm.org/show_bug.cgi?id=43425
When writing profile data on Windows we were opening the profile file with
exclusive read/write access.
If we try to write to the file from multiple processes
simultaneously, subsequent calls to CreateFileA return
INVALID_HANDLE_VALUE.
To fix this, I changed it to open the file without exclusive access and then
take a lock.
Patch by Michael Holman!
Differential revision: https://reviews.llvm.org/D70330
The assertion that was added does not hold; it
breaks on test-suite/MultiSource/Applications/SPASS/analyze.c
Will reduce the testcase and revisit.
This reverts commit 9872ea4ed1, 870f3542d3.
This replaces the A32 NEON vqadds, vqaddu, vqsubs and vqsubu intrinsics
with the target independent sadd_sat, uadd_sat, ssub_sat and usub_sat.
This helps generate vqadds from standard IR nodes, which might be
produced from the vectoriser. The old variants are removed in the
process.
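For reference, a scalar sketch of the saturating semantics behind these four operations, shown for 8-bit lanes. The helper names are hypothetical and this only models one lane; the intrinsics themselves operate on whole vectors.

```cpp
#include <cassert>
#include <cstdint>

// Signed saturating add: clamp the wide result to [-128, 127].
inline std::int8_t saddSat8(std::int8_t A, std::int8_t B) {
  int Wide = static_cast<int>(A) + static_cast<int>(B);
  if (Wide > 127) return 127;
  if (Wide < -128) return -128;
  return static_cast<std::int8_t>(Wide);
}

// Unsigned saturating add: clamp the wide result to [0, 255].
inline std::uint8_t uaddSat8(std::uint8_t A, std::uint8_t B) {
  int Wide = static_cast<int>(A) + static_cast<int>(B);
  return Wide > 255 ? 255 : static_cast<std::uint8_t>(Wide);
}

// Signed saturating subtract: clamp the wide result to [-128, 127].
inline std::int8_t ssubSat8(std::int8_t A, std::int8_t B) {
  int Wide = static_cast<int>(A) - static_cast<int>(B);
  if (Wide > 127) return 127;
  if (Wide < -128) return -128;
  return static_cast<std::int8_t>(Wide);
}

// Unsigned saturating subtract: results below zero clamp to 0.
inline std::uint8_t usubSat8(std::uint8_t A, std::uint8_t B) {
  return A > B ? static_cast<std::uint8_t>(A - B) : 0;
}
```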
Differential Revision: https://reviews.llvm.org/D69350
In order to simplify implementation we are moving addr space
deduction into Sema while constructing variable declaration
and on template instantiation. Pointees are deduced to generic
addr space during creation of types.
This commit also
- fixes addr space deduction for auto type;
- factors out OpenCL specific logic from type diagnostics
in var decl into a separate helper function.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D65744
Summary:
Implicit Conversion Sanitizer is *almost* feature complete.
There aren't *that* many unsanitized things left,
two major ones are increment/decrement (this patch) and bit fields.
As it was discussed in
[[ https://bugs.llvm.org/show_bug.cgi?id=39519 | PR39519 ]],
unlike `CompoundAssignOperator` (which is promoted internally),
or `BinaryOperator` (for which we always have promotion/demotion in AST)
or parts of `UnaryOperator` (we have promotion/demotion but only for
certain operations), for inc/dec, clang omits promotion/demotion
altogether, under as-if rule.
This is technically correct: https://rise4fun.com/Alive/zPgD
As it can be seen in `InstCombineCasts.cpp` `canEvaluateTruncated()`,
`add`/`sub`/`mul`/`and`/`or`/`xor` operators can all arbitrarily
be extended or truncated:
901cd3b3f6/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp (L1320-L1334)
But that has serious implications:
1. Since we no longer model implicit casts, do we pessimise
their AST representation and everything that uses it?
2. There is no demotion, so lossy demotion sanitizer does not trigger :]
Now, i'm not going to argue about the first problem here,
but the second one **needs** to be addressed. As it was stated
in the report, this is done intentionally, so changing
this in all modes would be considered a penalization/regression.
Which means, the sanitization-less codegen must not be altered.
It was also suggested to not change the sanitized codegen
to the one with demotion, but i quite strongly believe
that will not be the wise choice here:
1. One will need to re-engineer the check that the inc/dec was lossy
in terms of `@llvm.{u,s}{add,sub}.with.overflow` builtins
2. We will still need to compute the result we would lossily demote.
(i.e. the result of wide `add`ition/`sub`traction)
3. I suspect it would need to be done right here, in sanitization.
Which kinda defeats the point of
using `@llvm.{u,s}{add,sub}.with.overflow` builtins:
we'd have two `add`s with basically the same arguments,
one of which is used for check+error-less codepath and other one
for the error reporting. That seems worse than a single wide op+check.
4. OR, we would need to do that in the compiler-rt handler.
Which means we'll need a whole new handler.
But then what about the `CompoundAssignOperator`,
it would also be applicable for it.
So this also doesn't really seem like the right path to me.
5. At least X86 (but likely others) pessimizes all sub-`i32` operations
(due to partial register stalls), so even if we avoid promotion+demotion,
the computations will //likely// be performed in `i32` anyways.
So i'm not really seeing much benefit of
not doing the straight-forward thing.
While looking into this, i have noticed a few more LLVM middle-end
missed canonicalizations, and filed
[[ https://bugs.llvm.org/show_bug.cgi?id=44100 | PR44100 ]],
[[ https://bugs.llvm.org/show_bug.cgi?id=44102 | PR44102 ]].
Those are not specific to inc/dec, we also have them for
`CompoundAssignOperator`, and it can happen for normal arithmetics, too.
But if we take some other path in the patch, it will not be applicable
here, and we will have most likely played ourselves.
TLDR: front-end should emit canonical, easy-to-optimize yet
un-optimized code. It is middle-end's job to make it optimal.
I'm really hoping reviewers agree with my personal assessment
of the path this patch should take..
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=44054 | PR44054 ]].
Reviewers: rjmccall, erichkeane, rsmith, vsk
Reviewed By: erichkeane
Subscribers: mehdi_amini, dexonsmith, cfe-commits, #sanitizers, llvm-commits, aaron.ballman, t.p.northover, efriedma, regehr
Tags: #llvm, #clang, #sanitizers
Differential Revision: https://reviews.llvm.org/D70539
Summary:
This avoids leaking PCH files if editors don't use the LSP shutdown protocol.
This is one fix for https://github.com/clangd/clangd/issues/209
(Though I think we should *also* be unlinking the files)
Reviewers: kadircet, jfb
Subscribers: mgorny, ilya-biryukov, MaskRay, jkorous, arphaman, jfb, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D70684
This is a follow-up discussed in the D70495 thread.
The current logic is unusual for llvm-readobj. It doesn't print the predecessors
list when it is empty. This is not good for machine parsers.
D70495 had to add this condition during refactoring to reduce the amount of changes
in tests, because the original code also had similar logic.
Now it seems it is time to get rid of it. This patch does it.
Differential revision: https://reviews.llvm.org/D70717
Use UTF-8 for communication with clang-format and convert the
replacements' offset/length to character position/count.
Internally VisualStudio.Text.Editor.IWpfTextView uses a sequence of Unicode
characters encoded using UTF-16 and uses character position/count for
manipulating text.
Resolved "Error while running clang-format: Specified argument was out
of the range of valid values. Parameter name: replaceSpan".
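The offset conversion can be sketched as follows. This is an illustrative C++ version, not the extension's own code: `utf8OffsetToUtf16Units` is a hypothetical helper, and it assumes the input is valid UTF-8 and that the byte offset falls on a character boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: how many UTF-16 code units correspond to the first
// ByteOffset bytes of a valid UTF-8 string.
inline std::size_t utf8OffsetToUtf16Units(const std::string &Utf8,
                                          std::size_t ByteOffset) {
  assert(ByteOffset <= Utf8.size());
  std::size_t Units = 0;
  for (std::size_t I = 0; I < ByteOffset; ++I) {
    unsigned char C = static_cast<unsigned char>(Utf8[I]);
    if ((C & 0xC0) == 0x80)
      continue; // continuation byte, already counted with its lead byte
    // 4-byte sequences encode code points above U+FFFF, which need a
    // surrogate pair (2 units) in UTF-16; everything else is 1 unit.
    Units += (C >= 0xF0) ? 2 : 1;
  }
  return Units;
}
```

A replacement's length can be converted the same way, by taking the difference between the converted end and start offsets.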
Patch by empty2fill!
Differential revision: https://reviews.llvm.org/D70633
Summary:
All these functions are unused from what I can see. Unless I'm missing something here, this code
can go the way of the Dodo.
Reviewers: labath
Reviewed By: labath
Subscribers: abidh, JDevlieghere, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D70770
The third parameter to Streamer.EmitSymbolValue() is "bool
IsSectionRelative = false".
For ELF, these debug sections are mapped to address zero, so a normal,
absolute address relocation works just fine, but COFF needs a section
relative relocation, and COFF is the only target where
needsDwarfSectionOffsetDirective() returns true. This matches how
EmitSymbolValue is called elsewhere in the same source file.
Differential Revision: https://reviews.llvm.org/D70661
InitializeContext is useful for allocating a (potentially variable
size) CONTEXT struct in an unaligned byte buffer. In this case, we
already have a fixed size CONTEXT we want to initialize, and we only
used this as a very roundabout way of zero initializing it.
Instead just memset the CONTEXT we have, and set the ContextFlags field
manually.
This matches how it is done in NativeRegisterContextWindows_*.cpp.
This also makes LLDB run successfully in Wine (for a trivial tested
case at least), as Wine hasn't implemented the InitializeContext
function.
Differential Revision: https://reviews.llvm.org/D70742
This reapplies: 8ff85ed905
Original commit message:
As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there.
This change doesn't include any change to move from selection dag to fast isel
and that will come with other numbers that should help inform that decision.
There also haven't been any real debuggability studies with this pipeline yet,
this is just the initial start done so that people could see it and we could start
tweaking after.
Test updates: Outside of the newpm tests most of the updates are coming from either
optimization passes not run anymore (and without a compelling argument at the moment)
that were largely used for canonicalization in clang.
Original post:
http://lists.llvm.org/pipermail/llvm-dev/2019-April/131494.html
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65410
This reverts commit c9ddb02659.
http://45.33.8.238/win/3052/step_6.txt
C:\src\llvm-project\clang\test\Preprocessor\file_test.c:9:11: error: CHECK: expected string not found in input
// CHECK: filename: "/UNLIKELY_PATH/empty{{/|\\\\}}file_test.c"
^
<stdin>:1:1: note: scanning from here
^
<stdin>:1:28: note: possible intended match here
^
When selecting the set of default sanitizers, don't fail for unknown
architectures. This may be the case e.g. with x86_64-unknown-fuchsia
-m32 target that's used to build the bootloader.
Differential Revision: https://reviews.llvm.org/D70747
The current EvalInfo ctor causes EnableNewConstInterp to be true even though
it is supposed to be false on MSVC 2017. This is because a virtual function
getLangOpts() is called in member initializer lists, whereas on MSVC
member ctors are called before virtual function pointers are
initialized.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D70729