We should represent the store directly in IR instead. This gives the middle end a chance to remove it if it can see a load from the same address.
Differential Revision: https://reviews.llvm.org/D51769
llvm-svn: 341677
Previously we only handled loads in operand 0, but nothing guarantees the load will be operand 0 for commutable operations.
Differential Revision: https://reviews.llvm.org/D51768
llvm-svn: 341675
If the ~X wasn't able to simplify above the max/min, we might be able to simplify it by moving it below the max/min.
I had to modify the ~(min/max ~X, Y) transform to prevent getting stuck in a loop when we saw the new ~(max/min X, ~Y) before the ~Y had been folded away to remove the new not.
Differential Revision: https://reviews.llvm.org/D51398
llvm-svn: 341674
Fix a latent bug in loop vectorizer which generates incorrect code for
memory accesses that are executed conditionally. As pointed in review,
this bug definitely affects uniform loads and may affect conditional
stores that should have turned into scatters as well).
The code gen for conditionally executed uniform loads on architectures
that support masked gather instructions is broken.
Without this patch, we were unconditionally executing the *conditional*
load in the vectorized version.
This patch does the following:
1. Uniform conditional loads on architectures with gather support will
have correct code generated. In particular, the cost model
(setCostBasedWideningDecision) is fixed.
2. For the recipes which are handled after the widening decision is set,
we use the isScalarWithPredication(I, VF) form which is added in the
patch.
3. Fix the vectorization cost model for scalarization
(getMemInstScalarizationCost): implement and use isPredicatedInst to
identify *all* predicated instructions, not just scalar+predicated. So,
now the cost for scalarization will be increased for maskedloads/stores
and gather/scatter operations. In short, we should be choosing the
gather/scatter in place of scalarization on archs where it is
profitable.
4. We needed to weaken the assert in useEmulatedMaskMemRefHack.
Reviewers: Ayal, hsaito, mkuper
Differential Revision: https://reviews.llvm.org/D51313
llvm-svn: 341673
Find cold blocks based on profile information (or optionally with static analysis).
Forward propagate profile information to all cold-blocks.
Outline a cold region.
Set calling conv and prof hint for the callsite of the outlined function.
Worked in collaboration with: Sebastian Pop <s.pop@samsung.com>
Differential Revision: https://reviews.llvm.org/D50658
llvm-svn: 341669
If OtherOpT or OtherOpF have scalar types and the condition is a vector,
we would create an invalid select.
Reviewers: spatel, john.brawn, mssimpso, craig.topper
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D51781
llvm-svn: 341666
Summary:
Code completion in clang is actually a mix of two features:
- Code completion is a familiar feature. Results are exposed via the
CodeCompleteConsumer::ProcessCodeCompleteResults callback.
- Signature help figures out if the current expression is an argument of
some function call and shows corresponding signatures if so.
Results are exposed via CodeCompleteConsumer::ProcessOverloadCandidates.
This patch refactors the implementation to untangle those two from each
other and makes some naming tweaks to avoid confusion when reading the
code.
The refactoring is required for signature help fixes, see D51038.
The only intended behavior change is the order of callbacks.
ProcessOverloadCandidates is now called before ProcessCodeCompleteResults.
Reviewers: sammccall, kadircet
Reviewed By: sammccall
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D51782
llvm-svn: 341660
Summary:
Extend LDV so that stack slot offsets for spilled sub-registers
are added to the emitted debug locations. This is accomplished
by querying InstrInfo::getStackSlotRange().
With this change, LDV will add a DW_OP_plus_uconst operation to
the expression if a sub-register is spilled. Later on, PEI will
add an offset operation for the stack slot, meaning that we will
get expressions of the forms:
* {DW_OP_constu #fp-offset, DW_OP_minus,
DW_OP_plus_uconst #subreg-offset}
* {DW_OP_plus_const #fp-offset,
DW_OP_minus, DW_OP_plus_uconst #subreg-offset}
The two offset operations should ideally be merged.
Reviewers: rnk, aprantl, stoklund
Reviewed By: aprantl
Subscribers: dblaikie, bjope, nemanjai, JDevlieghere, llvm-commits
Tags: #debug-info
Differential Revision: https://reviews.llvm.org/D51612
llvm-svn: 341659
[RISCV] Add support for computing sysroot for riscv32-unknown-elf
Extends r338385 to allow the driver to compute the sysroot when an explicit path is not provided. This allows the linker to find C runtime files and the correct include directory for header files.
Patch by lewis-revill (Lewis Revill)
llvm-svn: 341655
The test was missing '--' on mac as pointed out by -Wslash-u-filename:
<stdin>:5:69: note: possible intended match here
clang: warning: '/Users/thakis/src/llvm-mono/clang/test/Driver/msvc-link.c' treated as the '/U' option [-Wslash-u-filename]
Differential Revision: https://reviews.llvm.org/D51635
llvm-svn: 341654
Add support for bitcasts from float type to an integer type of the same element bitwidth.
There maybe cases where we need to support different widths (e.g. as SSE __m128i is treated as v2i64) - but I haven't seen cases of this in the wild yet.
llvm-svn: 341652
Currently eliminateInstructions only returns true if any instruction got
replaced. In the test case for this patch, we eliminate the trivially
dead calls, for which eliminateInstructions not do a replacement and the
function is not marked as changed, which is why the inliner crashes
while traversing the call graph.
Alternatively we could also change eliminateInstructions to return true
in case we mark instructions for deletion, but that's slightly more code
and doing it at the place where the replacement happens seems safer.
Fixes PR37517.
Reviewers: davide, mcrosier, efriedma, bjope
Reviewed By: bjope
Differential Revision: https://reviews.llvm.org/D51169
llvm-svn: 341651
Before this patch, analyzeContext called getCanonicalDIEOffset(), for
which the result depends on the timings of the setCanonicalDIEOffset()
calls in the cloneLambda. This can lead to slightly different output
between runs due to threading.
To prevent this from happening, we now record the output debug info size
after importing the modules (before any concurrent processing takes
place). This value, named the ModulesEndOffset is used to compare the
canonical DIE offset against. If the value is greater than this offset,
the canonical DIE offset has been updated during cloning, and should
therefore not be considered for pruning.
Differential revision: https://reviews.llvm.org/D51443
llvm-svn: 341649
Summary:
In this change we apply `XRAY_NEVER_INSTRUMENT` to more functions in the
profiling implementation to ensure that these never get instrumented if
the compiler used to build the library is capable of doing XRay
instrumentation.
We also consolidate all the allocators into a single header
(xray_allocator.h) which sidestep the use of the internal allocator
implementation in sanitizer_common.
This addresses more cases mentioned in llvm.org/PR38577.
Reviewers: mboerger, eizan
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D51776
llvm-svn: 341647
Because t2LDREX (& t2STREX) were marked as AddrModeNone, but did allow a
FrameIndex operand, rewriteT2FrameIndex asserted. This gives them a
proper addressing-mode and tells the rewriter about it so that encodable
offsets are exploited and others are rejected.
Should fix PR38828.
llvm-svn: 341642
Boilerplate code for using KMSAN instrumentation in Clang.
We add a new command line flag, -fsanitize=kernel-memory, with a
corresponding SanitizerKind::KernelMemory, which, along with
SanitizerKind::Memory, maps to the memory_sanitizer feature.
KMSAN is only supported on x86_64 Linux.
It's incompatible with other sanitizers, but supports code coverage
instrumentation.
llvm-svn: 341641
`URIDistance` constructor should mention that `Sources` must contain
*absolute paths*, not URIs. This is not very clear when looking at the
interface, especially given that `distance(...)` accepts `URI`, not an
absolute path which can give the wrong impression.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D51691
llvm-svn: 341639
Introduce the -msan-kernel flag, which enables the kernel instrumentation.
The main differences between KMSAN and MSan instrumentations are:
- KMSAN implies msan-track-origins=2, msan-keep-going=true;
- there're no explicit accesses to shadow and origin memory.
Shadow and origin values for a particular X-byte memory location are
read and written via pointers returned by
__msan_metadata_ptr_for_load_X(u8 *addr) and
__msan_store_shadow_origin_X(u8 *addr, uptr shadow, uptr origin);
- TLS variables are stored in a single struct in per-task storage. A call
to a function returning that struct is inserted into every instrumented
function before the entry block;
- __msan_warning() takes a 32-bit origin parameter;
- local variables are poisoned with __msan_poison_alloca() upon function
entry and unpoisoned with __msan_unpoison_alloca() before leaving the
function;
- the pass doesn't declare any global variables or add global constructors
to the translation unit.
llvm-svn: 341637
Third Attempt:
- Alignment issues resolved.
- zlib::isAvailable() detected.
- ArrayRef misuse fixed.
Usage:
llvm-objcopy --compress-debug-sections=zlib foo.o
llvm-objcopy --compress-debug-sections=zlib-gnu foo.o
In both cases the debug section contents is compressed with zlib. In the GNU
style case the header is the "ZLIB" magic string followed by the uint64 big-
endian decompressed size. In the non-GNU mode the header is the
Elf(32|64)_Chdr.
Decompression support is coming soon.
Differential Revision: https://reviews.llvm.org/D49678
llvm-svn: 341635
IndVars does not set `Changed` flag when it eliminates dead instructions. As result,
it may make IR modifications and report that it has done nothing. It leads to inconsistent
preserved analyzes results.
Differential Revision: https://reviews.llvm.org/D51770
Reviewed By: skatkov
llvm-svn: 341633
Summary:
Enables trace-malloc-unbalanced.test on Windows, fixing two problems it had with Windows before.
The first fix is specifying python instead of relying on a script's shebang since they can't be used on Windows.
The second fix is making the regex tolerate windows' implementation of the "%p" format string.
Reviewers: Dor1s
Reviewed By: Dor1s
Subscribers: morehouse
Differential Revision: https://reviews.llvm.org/D51760
llvm-svn: 341632
accessible from the context where aggregate initialization occurs.
rdar://problem/38168772
Differential Revision: https://reviews.llvm.org/D45898
llvm-svn: 341629
Summary:
This patch implements a `BlockVerifier` type which enforces the
invariants of the log structure of FDR mode logs on a per-block basis.
This ensures that the data we encounter from an FDR mode log
semantically correct (i.e. that records follow the documented "grammar"
for FDR mode log records).
This is another part of the refactoring of D50441.
Reviewers: mboerger, eizan
Subscribers: mgorny, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D51723
llvm-svn: 341628
code. This will enable disassembly of the optional subset of
neon that some Cortex cores support. Add a unit test to check
that a few of these instructions disassemble as expected.
<rdar://problem/26674303>
llvm-svn: 341623
Summary:
When targeting MSVC: compile using clang's cl driver mode (this is needed for
libfuzzer's exit_on_src_pos feature). Don't use -lstdc++ when linking,
it isn't needed and causes a warning.
On Windows: Fix exit_on_src_pos.test by making sure debug info isn't
overwritten during compilation of second binary by using .exe extension.
Reviewers: morehouse
Reviewed By: morehouse
Subscribers: aprantl, JDevlieghere
Differential Revision: https://reviews.llvm.org/D51757
llvm-svn: 341622