The current implementation has a discrepancy between how char pointers
are serialized and deserialized. The latter treats it like a const char*
while the former serializes it as a pointer to a basic type.
Both are potentially wrong, as char pointers are mostly used in
combination with a size, and nothing guarantees that the string's length
(its first null byte to be more precise) is greater or equal to its
size. The real solution is to have a custom (de)serializer that uses
both pieces of infromation.
However, the implementation should be consistent between serialization
and deserialization and I believe treating char* as const char* is the
better alternative.
The legalizer produces a lot of these, and they make reading legalized
MIR annoying. For some reason, this does seem to sometimes introduce
copies of implicit def, which is dumb.
D73542 made a typo (`rel.type == R_PLT_PC`; should be `rel.expr`) and introduced a regression:
BL->BLX substitution was disabled when the target symbol is preemptible
(expr is R_PLT_PC).
The two added bl instructions in arm-thumb-interwork-shared.s check that
we patch BL to BLX.
Fixes https://bugs.chromium.org/p/chromium/issues/detail?id=1047531
The "{=v0}" constraint did not result in the expected error message in the
abscence of the vector facility, because 'v0' matches as a string into the
AnyRegBitRegClass in common code.
This patch adds checks for vector support in case of "{v" and soft-float in
case of "{f" to remedy this.
Review: Ulrich Weigand.
The load ports need a cycle for each potentially loaded element just like Haswell and Skylake. Unlike Haswell and Broadwell, the number of uops does not scale with the number of elements. Instead the load uops run for multiple cycles.
I've taken the latency number from the uops.info. The port binding for the non-load uops is taken from the original IACA data I have.
Differential Revision: https://reviews.llvm.org/D74000
Summary: This patch fixes https://bugs.llvm.org/show_bug.cgi?id=44388 which incorrectly assigns an ABI alignment to memset when there was no explicit alignment given.
Reviewers: gchatelet, lenary, nikic
Reviewed By: nikic
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74083
Summary: Optional regions are supported in the generic op print/parse form, update the docs to match.
Differential Revision: https://reviews.llvm.org/D74061
The problem was noticed by the Chrome OS toolchain folks
(crbug.com/1048445) because llvm-objcopy --add-gnu-debuglink would
insert the wrong checksum when processing a binary larger than 4 GB.
That use case regressed in 1e1e3ba252 when we started using
llvm::crc32() in more places.
Differential revision: https://reviews.llvm.org/D74039
Really the intrinsic definition is wrong, but work around this
here. The DAG lowering introduces an MMO. We have to introduce a new
operation to avoid the verifier complaining about the missing mayLoad.
I was debug stepping through an x86 shuffle lowering and
noticed we were doing an N^2 search for splat index. I
didn't find the equivalent functionality anywhere else in
LLVM, so here's a helper that takes an array of int and
returns a splatted index while ignoring undefs (any
negative value).
This might also be used inside existing
ShuffleVectorInst/ShuffleVectorSDNode functions and/or
help with D72467.
Differential Revision: https://reviews.llvm.org/D74064
Removed some #ifdefs specific to Windows handling of VFS paths. This
eliminates most of the differences between the Windows and non-Windows
code paths.
Making this work required some changes to account for the fact that VFS
file paths can be Posix style or Windows style, so you cannot just assume
that they use the host's native path style. In one case, this means
implementing our own version of make_absolute, since the filesystem code
in Support doesn't have styles in the sense that the path code does.
Differential Review: https://reviews.llvm.org/D71092
Use cmp ord instead of cmp_class compared to the DAG version for the
nan check, but mostly try to match the existsing pattern.
I think the sign doesn't matter for fract, so we could do a little
better with the source modifier matching.
I think this is also still broken as in D22898, but I'm leaving it
as-is for now while I don't have an SI system to test on.
Summary:
MLIRAnalysis depended on MLIRVectorOps
MLIRVectorOps depended on MLIRAnalysis for Loop information.
Both of these can be solved by factoring out libraries related to loop
analysis into their own library. The new MLIRLoopAnalysis might be
better off with the Loop Dialect in the future.
Reviewers: nicolasvasilache, rriddle!, mehdi_amini
Reviewed By: mehdi_amini
Subscribers: Joonsoo, vchuravy, merge_guards_bot, mgorny, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73655
Summary:
This breaks a cyclic library dependency where MLIRPass used the verifier
in MLIRAnalysis, but MLIRAnalysis also contained passes used for testing.
The presence of the test passes here is archaeology, predating
test/lib/Transform.
Reviewers: rriddle
Reviewed By: rriddle
Subscribers: merge_guards_bot, mgorny, mehdi_amini, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74067
Currently when generating debug-info for a BlockDecl we are setting the Name to the mangled name and not setting the LinkageName.
This means we see the mangled name for block invcations ends up in DW_AT_Name and not in DW_AT_linkage_name.
This patch fixes this case so that we also set the LinkageName as well.
Differential Revision: https://reviews.llvm.org/D73282
Summary:
I tried to move the `madvise` calls outside of one of the secondary
mutexes, but this backfired. There is situation when a low release
interval is set combined with secondary pressure that leads to a race:
a thread can get a block from the cache, while another thread is
`madvise`'ing that block, resulting in a null header.
I changed the secondary race test so that this situation would be
triggered, and moved the release into the cache mutex scope.
Reviewers: cferris, pcc, eugenis, hctim, morehouse
Subscribers: jfb, #sanitizers, llvm-commits
Tags: #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D74072
contractCrossBankCopyIntoStore() finds the instruction defines the
source register and uses its output to replace the register. There are,
however, instructions that have multiple outputs, e.g. G_UNMERGE_VALUES.
Current implementation hardcodes to operand 0 and has no way of knowing
which output should be used.
This change adds another function to directly return the register that
is the source of the register and use that for folding.
This fixes https://bugs.llvm.org/show_bug.cgi?id=44783
Differential Revision: https://reviews.llvm.org/D74005
This is safer in case anyone tries to run MI optimization passes on
pre-selected MIR. If there turns out to be a real reason to do this,
we might need to add separate convergent intrinsic opcodes.
Summary:
The output from llvm-reduce still has significantly more attributes than
bugpoint does. Teach llvm-reduce to remove attributes.
Reviewers: diegotf, dblaikie, george.burgess.iv
Subscribers: mgorny, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73853
This implements walking over G_ASHR in the same way as `getTestBitOperand` in
AArch64ISelLowering.
```
(tbz (ashr x, c), b) -> (tbz x, b+c) or (tbz x, msb) if b+c is > # bits in x
```
Differential Revision: https://reviews.llvm.org/D73933
These don't build on MSVC at the moment, so filter these out altogether
from the list of runtimes and print a warning.
Differential Revision: https://reviews.llvm.org/D73812
(1) The check needs to be on the 0th operand of whatever we're folding
(2) Checks for validity should happen before we change the bit
Fixes a bug which caused MultiSource/Applications/JM/lencod to fail at -O3.
Differential Revision: https://reviews.llvm.org/D74002
Rewrite the result register pair into the expected sinigle register
format in the legalizer.
I'm also operating under the assumption that TFE doesn't apply to
stores or atomics, but don't know if this is true or not.
Summary: Skip access specifiers before record definitions when deciding whether
or not to wrap lines so that C# class definitions do not get wrapped into a
single line.
Reviewers: krasimir, MyDeveloperDay
Reviewed By: krasimir
Tags: #clang-format
Differential Revision: https://reviews.llvm.org/D74050
Summary:
Tune the profile threshold flag value for instrumentation PGO based on internal
benchmarks.
Also, add flags to allow profile guided size optimizations for non-cold code
to be enabled separately for instrumentation and sample PGSO.
Neither changes the default behavior (yet) as it's disabled for non-cold code.
Reviewers: davidxl
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72937
Pass the correct library directory from CMake to dotest.py when linking
liblldb, instead of trying to reconstruct the path from executable path.
This fixes link failures on platforms having non-null
LLVM_LIBDIR_SUFFIX.
Differential Revision: https://reviews.llvm.org/D73767
The mask results of these should be uniform. The trickier part is the
dummy booleans used as IR glue need to be treated as divergent. This
should make the divergence analysis results correct for the IR the DAG
is constructed from.
This should allow us to eliminate requiresUniformRegister, which has
an expensive, recursive scan over all users looking for control flow
intrinsics. This should avoid recent compile time regressions.
Summary: Merge '[', 'target' , ':' into a single token for C# attributes to
prevent the target from being seen as a label.
Reviewers: MyDeveloperDay, krasimir
Reviewed By: krasimir
Tags: #clang-format
Differential Revision: https://reviews.llvm.org/D74043
This reverts commit 41784bed01.
Since the original revision ead815924e,
this revision fixes three issues:
- This revision fixes the Windows build. My original patch improperly
copied EH pads on Windows. This patch disregards jump threading
opportunities having to do with EH pads.
- This revision fixes jump threading to a wrong destination.
Specifically, my original patch treated any Constant other than 0 as 1
while evaluating the branch condition. This bug led to treating
constant expressions like:
icmp ugt i8* null, inttoptr (i64 4 to i8*)
to "true". This patch fixes the bug by calling isOneValue.
- This revision fixes the cost calculation of two basic blocks being
threaded through. Note that getJumpThreadDuplicationCost returns
"(unsigned)~0" for those basic blocks that cannot be duplicated. If
we sum of two return values from getJumpThreadDuplicationCost, we
could have an unsigned overflow like:
(unsigned)~0 + 5 = 4
and mistakenly determine that it's safe and profitable to proceed
with the jump threading opportunity. The patch fixes the bug by
checking each return value before summing them up.
[JumpThreading] Thread jumps through two basic blocks
Summary:
This patch teaches JumpThreading.cpp to thread through two basic
blocks like:
bb3:
%var = phi i32* [ null, %bb1 ], [ @a, %bb2 ]
%tobool = icmp eq i32 %cond, 0
br i1 %tobool, label %bb4, label ...
bb4:
%cmp = icmp eq i32* %var, null
br i1 %cmp, label bb5, label bb6
by duplicating basic blocks like bb3 above. Once we duplicate bb3 as
bb3.dup and redirect edge bb2->bb3 to bb2->bb3.dup, we have:
bb3:
%var = phi i32* [ @a, %bb2 ]
%tobool = icmp eq i32 %cond, 0
br i1 %tobool, label %bb4, label ...
bb3.dup:
%var = phi i32* [ null, %bb1 ]
%tobool = icmp eq i32 %cond, 0
br i1 %tobool, label %bb4, label ...
bb4:
%cmp = icmp eq i32* %var, null
br i1 %cmp, label bb5, label bb6
Then the existing code in JumpThreading.cpp can thread edge
bb3.dup->bb4 through bb4 and eventually create bb3.dup->bb5.
Reviewers: wmi
Subscribers: hiraditya, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70247