Summary:
This patch fixes pr23772 [ARM] r226200 can emit illegal thumb2 instruction: "sub sp, r12, #80".
The violation was that SUB and ADD (reg, immediate) instructions can only write to SP if the source register is also SP. So the above instructions was unpredictable.
To enforce that the instruction t2(ADD|SUB)ri does not write to SP we now enforce the destination register to be rGPR (That exclude PC and SP).
Different than the ARM specification, that defines one instruction that can read from SP, and one that can't, here we inserted one that can't write to SP, and other that can only write to SP as to reuse most of the hard-coded size optimizations.
When performing this change, it uncovered that emitting Thumb2 Reg plus Immediate could not emit all variants of ADD SP, SP #imm instructions before so it was refactored to be able to. (see test/CodeGen/Thumb2/mve-stacksplot.mir where we use a subw sp, sp, Imm12 variant )
It also uncovered a disassembly issue of adr.w instructions, that were only written as SUBW instructions (see llvm/test/MC/Disassembler/ARM/thumb2.txt).
Reviewers: eli.friedman, dmgreen, carwil, olista01, efriedma
Reviewed By: efriedma
Subscribers: john.brawn, efriedma, ostannard, kristof.beyls, hiraditya, dmgreen, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70680
Summary:
A copy-paste error in `arm_mve.td` meant that the MVE `vqrshrun`
intrinsic family was generating the `vqshrun` machine instruction,
because in the IR intrinsic call, the rounding flag argument was set
to 0 rather than 1.
Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72496
Summary:
Motivation: When formatting an array of typedefed chars, we would like to display the array as a string.
The string formatter currently does not trigger because the formatter lookup does not resolve typedefs for array elements (this behavior is inconsistent with pointers, for those we do look through pointee typedefs). This patch tries to make the array formatter lookup somewhat consistent with the pointer formatter lookup.
Reviewers: teemperor, clayborg
Reviewed By: teemperor, clayborg
Subscribers: clayborg, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D72133
Summary:
Now that D71894 has landed, we're able to run libc++abi tests remotely.
For that we can use the same CMake command as before. The tests can be run using `ninja check-cxxabi`.
Reviewers: andreil99, vvereschaka, aorlov
Reviewed By: vvereschaka, aorlov
Subscribers: mgorny, kristof.beyls, ldionne, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72459
Summary:
The `LIBCXX_CXX_ABI_LIBRARY_PATH` CMake variable is cached once in
libcxx/cmake/Modules/HandleLibCXXABI.cmake in the `setup_abi_lib` macro,
and then cached again in libcxx/test/CMakeLists.txt. There, if it is
not set to a value, it is by default set to `LIBCXX_LIBRARY_DIR`.
However, this new value is not actually cached, because the old (empty)
value has been already cached. Use the `FORCE` CMake flag so that it
is saved to the cache.
This should not break anything, because the code changed here previously
had no effect, when it should have.
Reviewers: jroelofs, bcraig, ldionne, EricWF, mclow.lists, vvereschaka, eastig
Reviewed By: vvereschaka
Subscribers: mgorny, christof, dexonsmith, libcxx-commits
Tags: #libc
Differential Revision: https://reviews.llvm.org/D69169
Teach SCEV about the @loop.decrement.reg intrinsic, which has exactly the same
semantics as a sub expression. This allows us to query hardware-loops, which
contain this @loop.decrement.reg intrinsic, so that we can calculate iteration
counts, exit values, etc. of hardwareloops.
This "int_loop_decrement_reg" intrinsic is defined as "IntrNoDuplicate". Thus,
while hardware-loops and tripcounts now become analysable by SCEV, this
prevents the usual loop transformations from applying transformations on
hardware-loops, which is what we want at this point, for which I have added
test cases for loopunrolling and IndVarSimplify and LFTR.
Differential Revision: https://reviews.llvm.org/D71563
- Update documentation now that the move to monorepo has been made
- Do not tie compiler extension testing to LLVM_BUILD_EXAMPLES
- No need to specify LLVM libraries for plugins
- Add NO_MODULE option to match Polly specific requirements (i.e. building the
module *and* linking it statically)
- Issue a warning when building the compiler extension with
LLVM_BYE_LINK_INTO_TOOLS=ON, as it modifies the behavior of clang, which only
makes sense for testing purpose.
Still mark llvm/test/Feature/load_extension.ll as XFAIL because of a
ManagedStatic dependency that's going to be fixed in a seperate commit.
Differential Revision: https://reviews.llvm.org/D72327
Summary:
Eventough it is OK to have a new line without any preceding spaces in
some markdown specifications, VSCode requires two spaces before a new line to
break a line inside a paragraph.
Reviewers: sammccall, ilya-biryukov
Subscribers: MaskRay, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72462
Summary:
Do not include tag keywords when printing types for symbol names, as it
will come from SymbolKind.
Also suppress them while printing definitions to prevent them occuring in
template arguments.
Make use of `getAsString`, instead of `print` in all places to have a consistent
style across the file.
Reviewers: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72450
PowerPC uses a dedicated method to check if the machine instr is
predicable by opcode. However, there's a bit `isPredicable` in instr
definition. This patch removes the method and set the bit only to
opcodes referenced in it.
Differential Revision: https://reviews.llvm.org/D71921
If a system header provides an (inline) implementation of some of their
function, clang still matches on the function name and generate the appropriate
llvm builtin, e.g. memcpy. This behavior is in line with glibc recommendation «
users may not provide their own version of symbols » but doesn't account for the
fact that glibc itself can provide inline version of some functions.
It is the case for the memcpy function when -D_FORTIFY_SOURCE=1 is on. In that
case an inline version of memcpy calls __memcpy_chk, a function that performs
extra runtime checks. Clang currently ignores the inline version and thus
provides no runtime check.
This code fixes the issue by detecting functions whose name is a builtin name
but also have an inline implementation.
Differential Revision: https://reviews.llvm.org/D71082
Major changes are introduction of subsubsections to prevent people
putting new entries in wrong places. I also polished line length and
highlighting.
Patch by Eugene Zelenko!
Memory instruction widening recipes use the pointer operand of their load/store
ingredient for generating the needed GEPs, making it difficult to feed these
recipes with pointers based on other ingredients or none at all.
This patch modifies these recipes to use a VPValue for the pointer instead, in
order to reduce ingredient def-use usage by ILV as a step towards full
VPlan-based def-use relations. The recipes are constructed with VPValues bound
to these ingredients, maintaining current behavior.
Differential revision: https://reviews.llvm.org/D70865
Currently running the xray tools generates a number of errors:
$ ./bin/llvm-xray
: for the -k option: cl::alias must not have cl::sub(), aliased option's cl::sub() will be used!
: for the -d option: cl::alias must not have cl::sub(), aliased option's cl::sub() will be used!
: for the -o option: cl::alias must not have cl::sub(), aliased option's cl::sub() will be used!
: for the -f option: cl::alias must not have cl::sub(), aliased option's cl::sub() will be used!
: for the -s option: cl::alias must not have cl::sub(), aliased option's cl::sub() will be used!
: for the -r option: cl::alias must not have cl::sub(), aliased option's cl::sub() will be used!
: for the -p option: cl::alias must not have cl::sub(), aliased option's cl::sub() will be used!
: for the -m option: cl::alias must not have cl::sub(), aliased option's cl::sub() will be used!
<snip>
Patch by Ryan Mansfield.
Differential Revision: https://reviews.llvm.org/D69386
down to pass builder in ltobackend.
Currently CodeGenOpts like UnrollLoops/VectorizeLoop/VectorizeSLP in clang
are not passed down to pass builder in ltobackend when new pass manager is
used. This is inconsistent with the behavior when new pass manager is used
and thinlto is not used. Such inconsistency causes slp vectorization pass
not being enabled in ltobackend for O3 + thinlto right now. This patch
fixes that.
Differential Revision: https://reviews.llvm.org/D72386
The language wording change forgot to update overload resolution to rank
implicit conversion sequences based on qualification conversions in
reference bindings. The anticipated resolution for that oversight is
implemented here -- we order candidates based on qualification
conversion, not only on top-level cv-qualifiers, including ranking
reference bindings against non-reference bindings if they differ in
non-top-level qualification conversions.
For OpenCL/C++, this allows reference binding between pointers with
differing (nested) address spaces. This makes the behavior of reference
binding consistent with that of implicit pointer conversions, as is the
purpose of this change, but that pre-existing behavior for pointer
conversions is itself probably not correct. In any case, it's now
consistently the same behavior and implemented in only one place.
This reinstates commit de21704ba9,
reverted in commit d8018233d1, with
workarounds for some overload resolution ordering problems introduced by
CWG2352.
If an SGPR vector is indexed with a VGPR, the actual indexing will be
done on the SGPR and produce an SGPR. A copy needs to be inserted
inside the waterwall loop to the VGPR result.
An undefined weak does not fetch the lazy definition. A lazy weak symbol
should be considered undefined, and thus preemptible if .dynsym exists.
D71795 is not quite an NFC. It errors on an R_X86_64_PLT32 referencing
an undefined weak symbol. isPreemptible is false (incorrect) => R_PLT_PC
is optimized to R_PC => in isStaticLinkTimeConstant, an error is emitted
when an R_PC is applied on an undefined weak (considered absolute).
qemu has a very small maximum packet size (4096) and it actually
only uses half of that buffer for some implementation reason,
so when lldb asks for the register target definitions, the x86_64
definition is larger than 4096/2 and we need to fetch it in two parts.
This patch and test is fixing a bug in
GDBRemoteCommunicationClient::ReadExtFeature when reading a target
file in multiple parts. lldb was assuming that it would always
get back the maximum packet size response (4096) instead of
using the actual size received and asking for the next group of
bytes.
We now have two tests in gdb_remote_client for unique features
of qemu - TestNestedRegDefinitions.py would test the ability
of lldb to follow multiple levels of xml includes; I opted to
create a separate TestRegDefinitionInParts.py test to test this
wrinkle in qemu's gdb remote serial protocol stub implementation.
Instead of combining both tests into a single test file.
<rdar://problem/49537922>
explicit functions that are not candidates.
It's not always obvious that the reason a conversion was not possible is
because the function you wanted to call is 'explicit', so explicitly say
if that's the case.
It would be nice to rank the explicit candidates higher in the
diagnostic if an implicit conversion sequence exists for their
arguments, but unfortunately we can't determine that without potentially
triggering non-immediate-context errors that we're not permitted to
produce.
Summary: Some data values have a different storage width than the corresponding MLIR type, e.g. bfloat is currently stored as a double.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D72478
For arguments that are not expected to be materialized with
G_CONSTANT, this was emitting predicates which could never match. It
was first adding a meaningless LLT check, which would always fail due
to the operand not being a register.
Infer the cases where a literal should check for an immediate operand,
instead of a register This avoids needing to invent a special way of
representing timm literal values.
Also handle immediate arguments in GIM_CheckLiteralInt. The comments
stated it handled isImm() and isCImm(), but that wasn't really true.
This unblocks work on the selection of all of the complicated AMDGPU
intrinsics in future commits.
The current implementation assumes there is an instruction associated
with the transform, but this is not the case for
timm/TargetConstant/immarg values. These transforms should directly
operate on a specific MachineOperand in the source
instruction. TableGen would assert if you attempted to define an
equivalent GISDNodeXFormEquiv using timm when it failed to find the
instruction matcher.
Specially recognize SDNodeXForms on timm, and pass the operand index
to the render function.
Ideally this would be a separate render function type that looks like
void renderFoo(MachineInstrBuilder, const MachineOperand&), but this
proved to be somewhat mechanically painful. Add an optional operand
index which will only be passed if the transform should only look at
the one source operand.
Theoretically it would also be possible to only ever pass the
MachineOperand, and the existing renderers would check the parent. I
think that would be somewhat ugly for the standard usage which may
want to inspect other operands, and I also think MachineOperand should
eventually not carry a pointer to the parent instruction.
Use it in one sample pattern. This isn't a great example, since the
transform exists to satisfy DAG type constraints. This could also be
avoided by just changing the MachineInstr's arbitrary choice of
operand type from i16 to i32. Other patterns have nontrivial uses, but
this serves as the simplest example.
One flaw this still has is if you try to use an SDNodeXForm defined
for imm, but the source pattern uses timm, you still see the "Failed
to lookup instruction" assert. However, there is now a way to avoid
it.
Only PPC seems to be using it, and only checks some simple cases and
doesn't distinguish between FP. Just switch to using LLT to simplify
use from GlobalISel.
We were seeing some occasional build failures that would come and go.
It appeared to be this missing dependence.
Differential Revision: https://reviews.llvm.org/D72419