Recent work has introduced support for constructing loops via `::build` with
callbacks that construct loop bodies using only the core OpBuilder. This is now
supported on all loop types that Linalg lowers to. Refactor LoopNestBuilder in
Linalg to rely on this functionality instead of using a custom EDSC-based
approach to creating loop nests.
The specialization targeting parallel loops is also simplified by factoring out
the recursive call into a separate static function and considering only two
alternatives: top-level loop is parallel or sequential.
This removes the last remaining in-tree use of edsc::LoopBuilder, which is now
deprecated and will be removed soon.
Differential Revision: https://reviews.llvm.org/D81873
Similarly to `scf::ForOp`, introduce additional `function_ref` arguments to
`::build` functions of SCF `ParallelOp` and `ReduceOp`. The provided functions
will be called to construct the body of the respective operations while
constructing the operation itself. Exercise them in LoopUtils.
Differential Revision: https://reviews.llvm.org/D81872
Summary: we use the alias attribute, similar to what is done for ELF.
Reviewers: ZarkoCA, jasonliu, hubert.reinterpretcast, sfertile
Reviewed By: jasonliu
Subscribers: dberris, aheejin, mstorsjo, #sanitizers
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D81120
Executing commands below will get you bombarded by a wall of Python
command prompts (>>> ).
$ echo 'foo' | ./bin/lldb -o script
$ cat /tmp/script
script
print("foo")
$ lldb --source /tmp/script
The issue is that our custom input reader doesn't handle EOF. According
to the Python documentation, file.readline always includes a trailing
newline character unless the file ends with an incomplete line. An empty
string signals EOF. This patch raises an EOFError when that happens.
[1] https://docs.python.org/2/library/stdtypes.html#file.readline
Differential revision: https://reviews.llvm.org/D81898
Generalize scalarization (recently enhanced with D80885)
to allow compares as well as binops.
Similar to binops, we are avoiding scalarization of a loaded
value because that could avoid a register transfer in codegen.
This requires 1 extra predicate that I am aware of: we do not
want to scalarize the condition value of a vector select. That
might also invert a transform that we do in instcombine that
prefers a vector condition operand for a vector select.
I think this is the final step in solving PR37463:
https://bugs.llvm.org/show_bug.cgi?id=37463
Differential Revision: https://reviews.llvm.org/D81661
The Standard documents the signature of std::advance as
template <class Iter, class Distance>
constexpr void advance(Iter& i, Distance n);
Furthermore, it does not appear to put any restriction on what the type
of Distance should be. While it is understood that it should usually
be std::iterator_traits::difference_type, I couldn't find any wording
that mandates that. Similarly, I couldn't find wording that forces the
distance to be a signed type.
This patch changes std::advance to accept any type in the second argument,
which appears to be what the Standard mandates. We then coerce it to the
iterator's difference type, but that's an implementation detail.
Differential Revision: https://reviews.llvm.org/D81425
When selecting 32 b -> 64 b G_ZEXTs, we don't have to always emit the extend.
If the instruction feeding into the G_ZEXT implicitly zero extends the high
half of the register, we can just emit a SUBREG_TO_REG instead.
Differential Revision: https://reviews.llvm.org/D81897
https://bugs.llvm.org/show_bug.cgi?id=46253
This is an obvious hack because realloc isn't any more affected than other
functions modeled by MallocChecker (or any user of CallDescription really),
but the nice solution will take some time to implement.
Differential Revision: https://reviews.llvm.org/D81745
Compiling assembly files when newlines are reduced to line markers within a `.macro` context will generate wrong information in `.debug_line` section.
This patch fixes this issue by evaluating line markers within the macro scope but not when they are used and evaluated.
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D80381
Adds the callbacks for ordered with source/sink dependencies.
The test for task dependencies changed, because callbach.h now actually prints
the passed dependencies and the test also checks for the address.
Reviewed by: hbae
Differential Revision: https://reviews.llvm.org/D81807
Summary:
This revision replaces MatmulOp, now that DRR rules have been dropped.
This revision also fixes minor parsing bugs and a plugs a few holes to get e2e paths working (e.g. library call emission).
During the replacement the i32 version had to be dropped because only the EDSC operators +, *, etc support type inference.
Deciding on a type-polymorphic behavior, and implementing it, is left for future work.
Reviewers: aartbik
Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, msifontes
Tags: #mlir
Differential Revision: https://reviews.llvm.org/D81935
Some tests were missing alignment info. Subsequent changes properly
preserve the set alignment. Set it properly beforehand, to avoid
unnecessary test changes.
It also updates cases where an alignment of 16 was specified, instead of
the vector element type alignment.
This patch upstreams support for BFloat Matrix Multiplication Intrinsics
and Code Generation from __bf16 to AArch64. This includes IR intrinsics. Unittests are
provided as needed. AArch32 Intrinsics + CodeGen will come after this
patch.
This patch is part of a series implementing the Bfloat16 extension of
the
Armv8.6-a architecture, as detailed here:
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
The bfloat type, and its properties are specified in the Arm
Architecture
Reference Manual:
https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile
The following people contributed to this patch:
Luke Geeson
- Momchil Velikov
- Mikhail Maltsev
- Luke Cheeseman
Reviewers: SjoerdMeijer, t.p.northover, sdesmalen, labrinea, miyuki,
stuij
Reviewed By: miyuki, stuij
Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits,
llvm-commits, miyuki, chill, pbarrio, stuij
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D80752
Change-Id: I174f0fd0f600d04e3799b06a7da88973c6c0703f
Summary:
EOF macro token coming from a PCH file on macOS while marked as literal,
doesn't contain any literal data. This causes crash on every project
using PCHs.
This commit doesn't resolve the problem with PCH (maybe it was
designed like this for a purpose) or with `tryExpandAsInteger`, but
rather simply shoots off a crash itself.
Differential Revision: https://reviews.llvm.org/D81916
Previously we only printed a symbol value when it has a non-empty name
or non-zero value.
This patch changes the behavior. Now we only omit a symbols value when
a relocation does not reference a symbol (i.e. symbol index == 0).
Seems it is what GNU readelf does, looking on its output.
Differential revision: https://reviews.llvm.org/D81842
Currently, llvm-readelf crashes when there is a STT_SECTION symbol for the null section
and this symbol is used in a relocation.
Differential revision: https://reviews.llvm.org/D81840
An instruction like this will need to allocate some stack space for the
last parameter:
%x = call addrspace(1) i16 @bar(i64 undef, i64 undef, i16 undef, i16 0)
This worked fine when passing an actual value (in this case 0). However,
when passing undef, no value was pushed to the stack and therefore no
push instructions were created. This caused an unbalanced stack leading
to interesting results.
This commit fixes that by replacing the push logic with a regular stack
adjustment and stack-relative load/stores. This is less efficient but at
least it correctly compiles the code.
I can think of a few improvements in the future:
* The stack should have been adjusted in the function prologue when
there are no allocas in the function.
* Many (if not most) stack adjustments can be replaced by
pushing/popping the values directly. Exactly like the previous code
attempted but didn't do correctly.
* Small stack adjustments can be done more efficiently with a few
push/pop instructions (pushing/popping bogus values), both for code
size and for speed.
All in all, as long as there are no allocas in the function I think that
it is almost always more efficient to emit regular push/pop
instructions. This is however left for future optimizations.
Differential Revision: https://reviews.llvm.org/D78581
This patch fixes a bug in stack save/restore code. Because the frame
pointer was saved/restored manually (not by marking it as clobbered) the
StackSize variable was not updated accordingly. Most code still worked,
but code that tried to load a parameter passed on the stack did not.
This commit fixes this by marking the frame pointer as a
callee-clobbered register. This will let it be saved without any effort
in prolog/epilog code and will make sure the correct address is
calculated for loading parameters that are passed on the stack.
This approach is used by most other targets (such as X86, AArch64 and
RISC-V).
Differential Revision: https://reviews.llvm.org/D78579
These code patterns attempt to call isVMOVModifiedImm on a splat of i1
values, leading to an unreachable being hit. I've guarded the call on a
more specific set of sizes, as i1 vectors are legal under MVE.
Differential Revision: https://reviews.llvm.org/D81860
Summary:
this reduces significantly the number of assumes generated without aftecting too much
the information that is preserved. this improves the compile-time cost
of enable-knowledge-retention significantly.
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: hiraditya, asbirlea, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79650
Fix a crash in clangd caused by an (admittidly incorrect) Remark diagnositic being emitted from readability-else-after-return.
This crash doesn't occur in clang-tidy so there are no tests there for this.
Reviewed By: hokein
Differential Revision: https://reviews.llvm.org/D81785
GCC 7 was reporting "enumeral and non-enumeral type in conditional expression"
as a warning.
The code casts an instruction opcode enum to unsigned implicitly, in
line with intentions; so this commit silences the warning by making the
cast to unsigned explicit.