Summary:
We don't pattern match pairwise shuffles in SelectionDAG. So we
should only return the optimized costs if its not a pairwise
shuffle.
I think SLP vectorizer gives priority to non pairwise shuffle if
the cost is the same. And the look up for reduction intrinsics
passes false for the pairwise flag. So this probably has no real
effect today.
Reviewers: RKSimon
Reviewed By: RKSimon
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69083
Summary:
Compare two values, and if they are different, return the position of the
most significant bit that is different in the values.
Needed for D69387.
Reviewers: nikic, spatel, sanjoy, RKSimon
Reviewed By: nikic
Subscribers: xbolva00, hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69439
PTEST and especially the MOVMSK instructions are slow on Knights Landing
or later. As a bonus, this patch increases instruction parallelism by
emitting:
KORTEST(PCMPNEQ(a, b), PCMPNEQ(c, d)) == 0
Instead of:
KORTEST(AND(PCMPEQ(a, b), PCMPEQ(c, d))) == ~0
https://reviews.llvm.org/D69157
I've accidentally reverted one of my previous patches.
It was not catched by bots because (I guess) they do not
build in debug (we have a test case which triggers an assert
in MSVS when runs without this change).
More info: https://reviews.llvm.org/D68983#inline-624235
Reported by Jordan Rupprecht.
There is a minor flaw in the implementation of function lowerPhis.
This function replaces values of regclass Vreg_1 (boolean values)
involved in PHIs into an SGPR. Currently it iterates over the MBBs
and performs an inplace lowering of PHIs and fails to lower any
incoming value that itself is another PHI of Vreg_1 regclass.
The failure occurs only when the MBB where the incoming PHI value
belongs is not visited/lowered yet.
To fix this problem, collect all Vreg_1 PHIs upfront and then
perform the lowering.
Differential Revision: https://reviews.llvm.org/D69182
- Changed FileHandler read/write methods to return llvm::Error
- Using unified way of reporting errors
- Removed trailing '.' from the error messages
Differential Revision: https://reviews.llvm.org/D67031
Using `?` as an optional marker is very useful in Clang's AST-node
emitters because otherwise we need a separate class just to encode
the presence or absence of a base node reference.
- Changed FileHandler read/write methods to return llvm::Error
- Using unified way of reporting errors
- Removed trailing '.' from the error messages
Differential Revision: https://reviews.llvm.org/D67031
This makes the DAG behavior consistent with IR's extractelement after:
rGb32e4664a715
https://bugs.llvm.org/show_bug.cgi?id=42689
I've tried to maintain test intent for WebAssembly.
The AMDGPU test is trying to test for crashing or other bad behavior,
but I'm not sure if that's possible after this change.
MSVC supports it. Fixes the major MSVC compile time regression
introduced in r369961. Now
clang/lib/StaticAnalyzer/Frontend/CheckerRegistry.cpp compiles in 18s
instead of 7+ minutes.
Fixes PR43369
This prevents unused variable warning/error in -DNDEBUG builds. The
variable was introduced in 5934cd11ea.
Patch by: Shu-Chun Weng
Differential revision: https://reviews.llvm.org/D69451
The goal of this refactor is to enable ProcessMinidump to take into
account the loaded modules and their sections when computing the
permissions of various ranges of memory, as discussed in D66638.
This patch moves some of the responsibility for computing the ranges
from MinidumpParser into ProcessMinidump. MinidumpParser still does the
parsing, but ProcessMinidump becomes responsible for answering the
actual queries about memory ranges. This will enable it (in a follow-up
patch) to augment the information obtained from the parser with data
obtained from actual object files.
The changes in the actual code are fairly straight-forward and just
involve moving code around. MinidumpParser::GetMemoryRegions is renamed
to BuildMemoryRegions to emphasize that it does no caching. The only new
thing is the additional bool flag returned from this function. This
indicates whether the returned regions describe all memory mapped into
the target process. Data obtained from /proc/maps and the MemoryInfoList
stream is considered to be exhaustive. Data obtained from Memory(64)List
is not. This will be used to determine whether we need to augment the
data or not.
This reshuffle means that it is no longer possible/easy to test some of
this code via unit tests, as constructing a ProcessMinidump instance is
hard. Instead, I update the unit tests to only test the parsing of the
actual data, and test the answering of queries through a lit test using
the "memory region" command. The patch also includes some tweaks to the
MemoryRegion class to make the unit tests easier to write.
Reviewers: amccarth, clayborg
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D69035
In an attempt to ensure that every part of the module's memory image is
accounted for, D56537 created a special "container section" spanning the
entire image. While that seemed reasonable at the time (and it still
mostly does), it did create a problem of what to put as the "file size"
of the section, because the image is not continuous on disk, as we
generally assume (which is why I put zero there). Additionally, this
arrangement makes it unclear what kind of permissions should be assigned
to that section (which is what my next patch does).
To get around these, this patch partially reverts D56537, and goes back
to top-level sections. Instead, what I do is create a new "section" for
the object file header, which is also being loaded into memory, though
its not considered to be a section in the strictest sense. This makes it
possible to correctly assign file size section, and we can later assign
permissions to it as well.
Reviewers: amccarth, mstorsjo
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D69100
That used to fail in the last testcase function because after
%0:sreg_64.sub0 was folded into %3:sreg_32_xm0_xexec COPY, it
was further folded into S_STORE_DWORD_IMM. Its legal effective
subreg class is SReg_32 while instruction expects more restricted
SReg_32_XM0_EXEC. However, SIInstrInfo::isLegalRegOperand()
passed the legality check and it was caught in the verifier.
Borrowed code from the verifier to check for RC legality.
Differential Revision: https://reviews.llvm.org/D69445
Ilya Leoshkevich (<iii@linux.ibm.com>) reported an issue that
with -mattr=+alu32 CO-RE has a segfault in BPF MISimplifyPatchable
pass.
The pattern will be transformed by MISimplifyPatchable
pass looks like below:
r5 = ld_imm64 @"b:0:0$0:0"
r2 = ldw r5, 0
... r2 ... // use r2
The pass will remove the intermediate 'ldw' instruction
and replacing all r2 with r5 likes below:
r5 = ld_imm64 @"b:0:0$0:0"
... r5 ... // use r5
Later, the ld_imm64 insn will be replaced with
r5 = <patched immediate>
for field relocation purpose.
With -mattr=+alu32, the input code may become
r5 = ld_imm64 @"b:0:0$0:0"
w2 = ldw32 r5, 0
... w2 ... // use w2
Replacing "w2" with "r5" is incorrect and will
trigger compiler internal errors.
To fix the problem, if the register class of ldw* dest
register is sub_32, we just replace the original ldw*
register with:
w2 = w5
Directly replacing all uses of w2 with in-place
constructed w5 for the use operand seems not working in all cases.
The latest kernel will have -mattr=+alu32 on by default,
so added this flag to all CORE tests.
Tested with latest kernel bpf-next branch as well with this patch.
Differential Revision: https://reviews.llvm.org/D69438
Summary:
The version number has come out of sync with what is in CMakeLists.txt,
causing loading the bindings to fail.
Reviewers: AustinWells, abhina.sree
Reviewed By: AustinWells
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69436
For example, it is pretty easy to write a breakpoint command that implements "stop when my caller is Foo", and
it is pretty easy to write a breakpoint command that implements "stop when my caller is Bar". But there's no
way to write a generic "stop when my caller is..." function, and then specify the caller when you add the
command to a breakpoint.
With this patch, you can pass this data in a SBStructuredData dictionary. That will get stored in
the PythonCommandBaton for the breakpoint, and passed to the implementation function (if it has the right
signature) when the breakpoint is hit. Then in lldb, you can say:
(lldb) break com add -F caller_is -k caller_name -v Foo
More generally this will allow us to write reusable Python breakpoint commands.
Differential Revision: https://reviews.llvm.org/D68671
Both tryFoldOMod() and tryFoldClamp() remove original instruction,
so the check MI.modifiesRegister() may use a deleted MI.
Differential Revision: https://reviews.llvm.org/D69448
Custom lower this to a target instruction with the merge operands. I
think it might be better to directly select this and emit a
REG_SEQUENCE, but this would be more work since it would require
splitting the tablegen patterns for these cases from the other
atomics.
Summary:
In loadSRsrcFromVGPR, if MBB is the same as Succ, Remiander is not the immediate dominator of Succ.
Reviewer:
arsenm
Differential Revision:
https://reviews.llvm.org/D69358
Summary:
If there are a GUID collision between two globals checking the
summarylist from the import index to make assumption can be dangerous.
Do not assume that a GlobalValue that has a GlobalVarSummary
actually is a GlobalVariable as it can be another GlobalValue with
the same GUID that the summary is connected to.
Patch by Joel Klinghed (the_jk@opera.com)
Reviewers: evgeny777, tejohnson
Reviewed By: tejohnson
Subscribers: tejohnson, dblaikie, MaskRay, mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67322
This extension point is not needed. Provide the equivalent option
through `CMAKE_CXX_STANDARD` which mirrors the previous extension point. Rely on
CMake to provide the check for the compiler instead.