Most of the functions emitted here should probably be convergent, but
only barriers are currently marked. Introduce this helper before
adding convergent to more functions.
Summary:
The PATCHABLE_EVENT_CALL uses i32 in the intrinsic. This
results in the register allocator picking a 32-bit register. We
need to use the 64-bit register when forming the MOV64rr
instructions. Otherwise we print illegal assembly in the text
output.
I think prior to this it was impossible for SrcReg to be equal
to DstReg so the NOP code was not reachable.
While there use Register instead of unsigned.
Also add a FIXME for what looks like a bug.
Reviewers: dberris
Reviewed By: dberris
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69365
Summary:
Move breakpoints from the old, bad ArgInfo::count to the new, better
ArgInfo::max_positional_args. Soon ArgInfo::count will be no more.
This functionality is tested in `TestFormatters.py`, `TestDataFormatterSynthVal.py`,
`TestDataFormatterSynthType.py`.
You may notice that the old code was passing 0 arguments when count was 1, and passing
1 argument when count is 2.
This is no longer necessary because max_positional_args counts the self pointer
correctly.
Reviewers: labath, jingham, JDevlieghere
Reviewed By: labath
Subscribers: lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D69469
Do not add user-site packages directory to the python search path.
This avoids test failures if there's an incompatible lit module installed
inside the user-site packages directory, as it gets prioritized over the lit
from the PYTHONPATH.
Similar to:
rG4c47617627fb
This makes the DAG behavior consistent with IR's insertelement.
https://bugs.llvm.org/show_bug.cgi?id=42689
I've tried to maintain test intent for AArch64 and WebAssembly
by replacing undef index operands with something else.
If the target's preferred shift amount VT can't hold any shift
amount for the promoted VT, we should use i32. The specific shift
amount shouldn't matter. The type will be adjusted later when the
shift itself is type legalized. This avoids an assert in getNode.
Fixes PR43820.
This combine is only valid if the inner setcc produces a 0/1 result
or the inner type is MVT::i1.
I haven't seen this cause any issues, just happened to notice it
while reviewing combines in this function.
While there also fix another call to use the value type from the
SDValue for the operand instead of calling SDNode::getValueType(0).
Though its likely the use is result 0, its not guaranteed.
Summary:
[nfc][libomptarget] Decrease coupling between files
debug.h used the symbol omptarget_device_environment so implicitly required
an include of omptarget-nvptx.h to compile. Similarly interface.h uses size_t.
Moving this declaration to a new header means cancel, critical can now build
without omptarget-nvptx.h. After this change, debug.h, cancel.cu, critical.cu
could move under a common source directory.
Reviewers: ABataev, jdoerfert, grokos
Subscribers: openmp-commits
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D69473
Summary:
[nfc][libomptarget] Inline option into target_impl
Subset of D69423. The macros that were in option.h are all target dependent.
Inlining the header simplifies the dependency graph when looking to move code
into a common subdir.
Reviewers: ABataev, jdoerfert, grokos
Subscribers: openmp-commits
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D69472
Summary:
We don't pattern match pairwise shuffles in SelectionDAG. So we
should only return the optimized costs if its not a pairwise
shuffle.
I think SLP vectorizer gives priority to non pairwise shuffle if
the cost is the same. And the look up for reduction intrinsics
passes false for the pairwise flag. So this probably has no real
effect today.
Reviewers: RKSimon
Reviewed By: RKSimon
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69083
Summary:
Compare two values, and if they are different, return the position of the
most significant bit that is different in the values.
Needed for D69387.
Reviewers: nikic, spatel, sanjoy, RKSimon
Reviewed By: nikic
Subscribers: xbolva00, hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69439
PTEST and especially the MOVMSK instructions are slow on Knights Landing
or later. As a bonus, this patch increases instruction parallelism by
emitting:
KORTEST(PCMPNEQ(a, b), PCMPNEQ(c, d)) == 0
Instead of:
KORTEST(AND(PCMPEQ(a, b), PCMPEQ(c, d))) == ~0
https://reviews.llvm.org/D69157
I've accidentally reverted one of my previous patches.
It was not catched by bots because (I guess) they do not
build in debug (we have a test case which triggers an assert
in MSVS when runs without this change).
More info: https://reviews.llvm.org/D68983#inline-624235
Reported by Jordan Rupprecht.
There is a minor flaw in the implementation of function lowerPhis.
This function replaces values of regclass Vreg_1 (boolean values)
involved in PHIs into an SGPR. Currently it iterates over the MBBs
and performs an inplace lowering of PHIs and fails to lower any
incoming value that itself is another PHI of Vreg_1 regclass.
The failure occurs only when the MBB where the incoming PHI value
belongs is not visited/lowered yet.
To fix this problem, collect all Vreg_1 PHIs upfront and then
perform the lowering.
Differential Revision: https://reviews.llvm.org/D69182
- Changed FileHandler read/write methods to return llvm::Error
- Using unified way of reporting errors
- Removed trailing '.' from the error messages
Differential Revision: https://reviews.llvm.org/D67031
Using `?` as an optional marker is very useful in Clang's AST-node
emitters because otherwise we need a separate class just to encode
the presence or absence of a base node reference.
- Changed FileHandler read/write methods to return llvm::Error
- Using unified way of reporting errors
- Removed trailing '.' from the error messages
Differential Revision: https://reviews.llvm.org/D67031
This makes the DAG behavior consistent with IR's extractelement after:
rGb32e4664a715
https://bugs.llvm.org/show_bug.cgi?id=42689
I've tried to maintain test intent for WebAssembly.
The AMDGPU test is trying to test for crashing or other bad behavior,
but I'm not sure if that's possible after this change.
MSVC supports it. Fixes the major MSVC compile time regression
introduced in r369961. Now
clang/lib/StaticAnalyzer/Frontend/CheckerRegistry.cpp compiles in 18s
instead of 7+ minutes.
Fixes PR43369
This prevents unused variable warning/error in -DNDEBUG builds. The
variable was introduced in 5934cd11ea.
Patch by: Shu-Chun Weng
Differential revision: https://reviews.llvm.org/D69451
The goal of this refactor is to enable ProcessMinidump to take into
account the loaded modules and their sections when computing the
permissions of various ranges of memory, as discussed in D66638.
This patch moves some of the responsibility for computing the ranges
from MinidumpParser into ProcessMinidump. MinidumpParser still does the
parsing, but ProcessMinidump becomes responsible for answering the
actual queries about memory ranges. This will enable it (in a follow-up
patch) to augment the information obtained from the parser with data
obtained from actual object files.
The changes in the actual code are fairly straight-forward and just
involve moving code around. MinidumpParser::GetMemoryRegions is renamed
to BuildMemoryRegions to emphasize that it does no caching. The only new
thing is the additional bool flag returned from this function. This
indicates whether the returned regions describe all memory mapped into
the target process. Data obtained from /proc/maps and the MemoryInfoList
stream is considered to be exhaustive. Data obtained from Memory(64)List
is not. This will be used to determine whether we need to augment the
data or not.
This reshuffle means that it is no longer possible/easy to test some of
this code via unit tests, as constructing a ProcessMinidump instance is
hard. Instead, I update the unit tests to only test the parsing of the
actual data, and test the answering of queries through a lit test using
the "memory region" command. The patch also includes some tweaks to the
MemoryRegion class to make the unit tests easier to write.
Reviewers: amccarth, clayborg
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D69035
In an attempt to ensure that every part of the module's memory image is
accounted for, D56537 created a special "container section" spanning the
entire image. While that seemed reasonable at the time (and it still
mostly does), it did create a problem of what to put as the "file size"
of the section, because the image is not continuous on disk, as we
generally assume (which is why I put zero there). Additionally, this
arrangement makes it unclear what kind of permissions should be assigned
to that section (which is what my next patch does).
To get around these, this patch partially reverts D56537, and goes back
to top-level sections. Instead, what I do is create a new "section" for
the object file header, which is also being loaded into memory, though
its not considered to be a section in the strictest sense. This makes it
possible to correctly assign file size section, and we can later assign
permissions to it as well.
Reviewers: amccarth, mstorsjo
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D69100
That used to fail in the last testcase function because after
%0:sreg_64.sub0 was folded into %3:sreg_32_xm0_xexec COPY, it
was further folded into S_STORE_DWORD_IMM. Its legal effective
subreg class is SReg_32 while instruction expects more restricted
SReg_32_XM0_EXEC. However, SIInstrInfo::isLegalRegOperand()
passed the legality check and it was caught in the verifier.
Borrowed code from the verifier to check for RC legality.
Differential Revision: https://reviews.llvm.org/D69445
Ilya Leoshkevich (<iii@linux.ibm.com>) reported an issue that
with -mattr=+alu32 CO-RE has a segfault in BPF MISimplifyPatchable
pass.
The pattern will be transformed by MISimplifyPatchable
pass looks like below:
r5 = ld_imm64 @"b:0:0$0:0"
r2 = ldw r5, 0
... r2 ... // use r2
The pass will remove the intermediate 'ldw' instruction
and replacing all r2 with r5 likes below:
r5 = ld_imm64 @"b:0:0$0:0"
... r5 ... // use r5
Later, the ld_imm64 insn will be replaced with
r5 = <patched immediate>
for field relocation purpose.
With -mattr=+alu32, the input code may become
r5 = ld_imm64 @"b:0:0$0:0"
w2 = ldw32 r5, 0
... w2 ... // use w2
Replacing "w2" with "r5" is incorrect and will
trigger compiler internal errors.
To fix the problem, if the register class of ldw* dest
register is sub_32, we just replace the original ldw*
register with:
w2 = w5
Directly replacing all uses of w2 with in-place
constructed w5 for the use operand seems not working in all cases.
The latest kernel will have -mattr=+alu32 on by default,
so added this flag to all CORE tests.
Tested with latest kernel bpf-next branch as well with this patch.
Differential Revision: https://reviews.llvm.org/D69438
Summary:
The version number has come out of sync with what is in CMakeLists.txt,
causing loading the bindings to fail.
Reviewers: AustinWells, abhina.sree
Reviewed By: AustinWells
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69436