This changes the custom syntax of the emitc.include operation for standard includes.
Reviewed By: marbre
Differential Revision: https://reviews.llvm.org/D105281
This follows up patches for the unsigned siblings:
0c400e8953c7b658aeb5
We are translating an offset signed compare to its
unsigned equivalent when one end of the range is
at the limit (zero or unsigned max).
(X + C2) >s C --> X <u (SMAX - C) (if C == C2 - 1)
(X + C2) <s C --> X >u (C ^ SMAX) (if C == C2)
This probably does not show up much in IR derived
from C/C++ source because that would likely have
'nsw', and we have folds for that already.
As with the previous unsigned transforms, the folds
could be generalized to handle non-constant patterns:
https://alive2.llvm.org/ce/z/Y8Xrrm
; sgt
define i1 @src(i8 %a, i8 %c) {
%c2 = add i8 %c, 1
%t = add i8 %a, %c2
%ov = icmp sgt i8 %t, %c
ret i1 %ov
}
define i1 @tgt(i8 %a, i8 %c) {
%c_off = sub i8 127, %c ; SMAX
%ov = icmp ult i8 %a, %c_off
ret i1 %ov
}
https://alive2.llvm.org/ce/z/c8uhnk
; slt
define i1 @src(i8 %a, i8 %c) {
%t = add i8 %a, %c
%ov = icmp slt i8 %t, %c
ret i1 %ov
}
define i1 @tgt(i8 %a, i8 %c) {
%c_offnot = xor i8 %c, 127 ; SMAX
%ov = icmp ugt i8 %a, %c_offnot
ret i1 %ov
}
This patch adds a new ShuffleKind SK_Splice and then handle the cost in
getShuffleCost, as in experimental.vector.reverse.
Differential Revision: https://reviews.llvm.org/D104630
This patch fixes PR50823.
The shuffle mask should be twisted twice before gotten the correct one due to the difference between inner HOP and outer.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D104903
We already have a fold for variable index with constant vector,
but if we can determine a scalar splat value, then it does not
matter whether that value is constant or not.
We overlooked this fold in D102404 and earlier patches,
but the fixed vector variant is shown in:
https://llvm.org/PR50817
Alive2 agrees on that:
https://alive2.llvm.org/ce/z/HpijPC
The same logic applies to scalable vectors.
Differential Revision: https://reviews.llvm.org/D104867
The function vectorizeChainsInBlock does not support scalable vector,
because function like canReuseExtract and isCommutative in the code
path assert with scalable vectors.
This patch avoids vectorizing blocks that have extract instructions with scalable
vector..
Differential Revision: https://reviews.llvm.org/D104809
Ignore top-level qualifiers in casts, which fixes issues in reinterpret_cast.
This rule comes from [expr.type]/8.2.2 which explains that casting to a
pr-qualified type should actually cast to the unqualified type. In C++
this is only done for types that aren't classes or arrays.
Fixes: PR49221
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D102689
Improve codegen when lowering the common vector shuffle case from the
vectorizer (op1[last]:op2[0:last-1]). This patch only handles this
common case as it is difficult to handle this more generally when using
fixed length vectors, due to being unable to use the SVE ext instruction.
Differential Revision: https://reviews.llvm.org/D105289
Loads of <4 x i8> vectors were modeled as extremely expensive. And while we
don't have a load instruction that supports this, it isn't that expensive to
create a vector of i8 elements. The codegen for this was fixed/optimised in
D105110. This now tweaks the cost model and enables SLP vectorisation of my
motivating case loadi8.ll.
Differential Revision: https://reviews.llvm.org/D103629
Opaque attributes that currently contain string literals can't currently be properly roundtripped as they are not printed as escaped strings. This leads to incorrect tokens being generated and the parser to almost certainly fail. This patch simply uses llvm::printEscapedString from LLVM. It escapes all non printable characters and quotes to \xx hex literals, and backslashes to two backslashes. This syntax is supported by MLIRs Lexer as well. The same function is also currently in use for the same purpose in printSymbolReference, printAttribute for StringAttr and many more in AsmPrinter.cpp.
Differential Revision: https://reviews.llvm.org/D105405
This patch fixes an issue which occurred in CodeGenPrepare and
HWAddressSanitizer, which both at some point create a map of Old->New
instructions and update dbg.value uses of these. They did this by
iterating over the dbg.value's location operands, and if an instance of
the old instruction was found, replaceVariableLocationOp would be
called on that dbg.value. This would cause an error if the same operand
appeared multiple times as a location operand, as the first call to
replaceVariableLocationOp would update all uses of the old instruction,
invalidating the old iterator and eventually hitting an assertion.
This has been fixed by no longer iterating over the dbg.value's location
operands directly, but by first collecting them into a set and then
iterating over that, ensuring that we never attempt to replace a
duplicated operand multiple times.
Differential Revision: https://reviews.llvm.org/D105129
Added support to check if architecture supports s_mulhi which is used as part of
the decision whether or not to use valu 24 bit mul (if the mulhi gets
transformed to a valu op anyway, then may as well use it).
This is an extension of the work in D97063
Differential Revision: https://reviews.llvm.org/D103321
Change-Id: I80b1323de640a52623d69ac005a97d06a5d42a14
Update CMakeLists.txt in the tutorial to reflect the latest changes in
LLVM. The demo project cannot be linked without added libraries.
Reviewed By: xgupta
Differential Revision: https://reviews.llvm.org/D105409
FeatureBitset is 4 64-bit values in an array. It's better passed by
reference rather than copying it.
I may be adding FeatureBitset as an argument to another function
and noticed this while working on that.
clang and gcc both seem to emit relocations in reverse order of
address. That means we can match relocations to their containing
subsections in `O(relocs + subsections)` rather than the `O(relocs *
log(subsections))` that our previous binary search implementation
required.
Unfortunately, `ld -r` can still emit unsorted relocations, so we have a
fallback code path for that (less common) case.
Numbers for linking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W:
N Min Max Median Avg Stddev
x 20 4.04 4.11 4.075 4.0775 0.018027756
+ 20 3.95 4.02 3.98 3.985 0.020900768
Difference at 95.0% confidence
-0.0925 +/- 0.0124919
-2.26855% +/- 0.306361%
(Student's t, pooled s = 0.0195172)
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D105410
Summary: The patch adds the StringTable dumping to
llvm-readobj. Currently only XCOFF is supported.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D104613
Different constraints may share the same predicate, in this case, we
will generate duplicate ODS verification function.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D104369
Remove `getDynOperands` and `createOrFoldDimOp` from MemRef.h to decouple MemRef a bit from Tensor. These two functions are used in other dialects/transforms.
Differential Revision: https://reviews.llvm.org/D105260
Two bugs:
1. This tries to take the address of the last symbol plus the length
of the last symbol. However, the sorted vector is cuPtrVector,
not cuVector. Also, cuPtrVector has tombstone values removed
and cuVector doesn't. If there was a stripped value at the end,
the "last" element's value was UINT64_MAX, which meant the
sentinel value was one less than the length of that "last"
dead symbol.
2. We have to subtract in.header->addr. For 64-bit binaries that's
(1 << 32) and functionAddress is 32-bit so this is a no-op, but
for 32-bit binaries the sentinel's value was too large.
I believe this has no effect in practice since the first-level
binary search code in libunwind (in UnwindCursor.hpp) does:
uint32_t low = 0;
uint32_t high = sectionHeader.indexCount();
uint32_t last = high - 1;
while (low < high) {
uint32_t mid = (low + high) / 2;
if ((mid == last) ||
(topIndex.functionOffset(mid + 1) > targetFunctionOffset)) {
low = mid;
break;
} else {
low = mid + 1;
}
So the address of the last entry in the first-level table isn't really
checked -- except for the very end, but the check against `last` means
we just run the loop once more than necessary. But it makes `unwinddump` output
look less confusing, and it's what it looks was the intention here.
(No test since I can't think of a way to make FileCheck check that one
number is larger than another.)
Differential Revision: https://reviews.llvm.org/D105404
"bad second level page" and "second level compressed unwind table"
can now be grepped for.
(Also remove one of the two spaces between "second" and "level"
in the second message.)
Basically every kind of parseOptional* method in DialectAsmParser has a corresponding parse* method which will emit an error if the requested token has not been found. An odd one out of this rule is parseOptionalString which does not have a corresponding parseString method.
This patch adds that method and implements it in basically the same fashion as parseKeyword, by first going through parseOptionalString and emitting an error on failure.
Differential Revision: https://reviews.llvm.org/D105406
This API is not compatible with opaque pointers, the method
accepting an explicit pointer element type should be used instead.
Thankfully there were few in-tree users. The BPF case still ends
up using the pointer element type for now and needs something like
D105407 to avoid doing so.
Same as other CreateLoad-style APIs, these need an explicit type
argument to support opaque pointers.
Differential Revision: https://reviews.llvm.org/D105395
Compiling LLVM with Clang modules and libc++ identified that
`Support/Printable.h` and `ADL/SmallVector.h` were using features that
live in these headers.
Differential Revision: https://reviews.llvm.org/D105402
Compiling clangd with Clang modules and libc++ revealed that
`support/Threading.h` uses `std::atomic` but wasn't including the
correct header.
Differential Revision: https://reviews.llvm.org/D105400
Fix offset calculation routines in padding checker to avoid assertion
errors described in bugzilla issue 50426. The fields that are subojbects
of zero size, marked with [[no_unique_address]] or empty bitfields will
be excluded from padding calculation routines.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D104097
Specifically the CreateMaskedStore and CreateMaskedScatter APIs.
The CreateMaskedLoad and CreateMaskedGather APIs will need an
additional type argument.