Summary:
This matches ELF.
This makes the number of ASan failures under the new pass manager on
Windows go from 18 to 1.
Under the old pass manager, the ASan module pass was one of the very
last things run, so these globals didn't get removed by GlobalOpt.
But with the NPM, the ASan module pass that adds these globals is run
much earlier in the pipeline and GlobalOpt ends up removing them.
Reviewers: vitalybuka, hans
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81175
Summary:
As explained in https://bugs.llvm.org/show_bug.cgi?id=46208,
symbolization on Windows after inlining and around
lambdas/std::functions doesn't work very well. Under the new pass
manager, there is inlining at -O1.
use-after-scope-capture.cpp checks that the symbolization points to the
line containing "return x;", but the combination of
Windows/inlining/lambdas makes the symbolization point to the line
"f = [&x]() {".
Mark the lambda as noinline since this test is not a test for
symbolization.
Reviewers: hans, dblaikie, vitalybuka
Subscribers: #sanitizers
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D81193
An initial patch adding combineSetCCMOVMSK to simplify MOVMSK and its vector input based on the comparison of the MOVMSK result.
This first stage just adds support for some simple MOVMSK(PACKSSBW()) cases where we remove the PACKSS if we're comparing ne/eq zero (any_of patterns), allowing us to directly compare against the v8i16 source vector(s) bitcast to v16i8, with suitable masking to take into account which sign bits are valid.
Future combines could peek through further PACKSS, target shuffles, handle all_of patterns (ne/eq -1), optimize to a PTEST op, etc.
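To illustrate why the masking is needed, here is a small Python model of the any_of case. It is only a sketch: it assumes the single-source form PACKSS(X, X), models lanes as Python integers, and the helper names are hypothetical, not the names used in the patch.

```python
def packsswb(a, b):
    """Model PACKSSWB: saturate the signed i16 lanes of a and b to i8."""
    return [max(-128, min(127, v)) for v in a + b]

def movmsk_bytes(byte_lanes):
    """Model a byte-wise MOVMSK: bit i is the sign bit of byte lane i."""
    mask = 0
    for i, v in enumerate(byte_lanes):
        if v & 0x80:
            mask |= 1 << i
    return mask

def bitcast_v8i16_to_v16i8(x):
    """Reinterpret eight i16 lanes as sixteen bytes (little-endian)."""
    out = []
    for v in x:
        u = v & 0xFFFF
        out += [u & 0xFF, u >> 8]
    return out

def any_of_via_pack(x):
    # Original pattern: MOVMSK(PACKSS(X, X)) != 0
    return movmsk_bytes(packsswb(x, x)) != 0

def any_of_direct(x):
    # Combined pattern: only the high byte of each i16 carries a valid
    # sign bit, so the odd mask bits (0xAAAA) are the ones to keep.
    return (movmsk_bytes(bitcast_v8i16_to_v16i8(x)) & 0xAAAA) != 0
```

A lane such as 0x00FF shows why the mask matters: its low byte has the top bit set, but the i16 value itself is positive, so that bit must be ignored.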
Differential Revision: https://reviews.llvm.org/D81171
Summary:
According to the comments, we want to convert the profile into two binary formats, and then into the md5text format.
We seem to have ignored the intermediate files.
This patch uses them to complete the full roundtrips.
Reviewers: wmi, wenlei
Reviewed By: wmi
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81202
Summary:
If you create an expression with parse errors, the `parser::Expr.typedExpr`
will be empty, which causes a compiler crash. The crash is caused by the
check in check-do-forall.cpp that scans all expressions to see if `DO`
variables are being modified.
It turned out that the problem was that I was fetching subexpressions of type
`parser::Expr`, which are not guaranteed to have a non-null `typedExpr`. I
fixed this by only grabbing the top-level expression from which to gather
arguments as part of the DO loop analysis. This, in turn, exposed a problem
where I wasn't collecting all of the actual arguments in an expression. This
was caused by the fact that I wasn't recursing through the rest of the
expression after finding an argument. I fixed this by recursing through the
argument in the member function in `CollectActualArgumentsHelper`.
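The recursion fix can be pictured with a toy expression tree in Python. This is a sketch only: the `Var`, `Call`, and `BinOp` classes and the `collect_actual_arguments` function are hypothetical stand-ins, not the flang data structures or `CollectActualArgumentsHelper` itself.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Var:
    name: str

@dataclass
class Call:
    callee: str
    args: List[object] = field(default_factory=list)

@dataclass
class BinOp:
    lhs: object
    rhs: object

def collect_actual_arguments(expr, out=None):
    """Gather every actual argument reachable from a top-level expression."""
    if out is None:
        out = []
    if isinstance(expr, Call):
        for arg in expr.args:
            out.append(arg)
            # Keep walking into the argument; stopping here would miss
            # arguments of calls nested inside it.
            collect_actual_arguments(arg, out)
    elif isinstance(expr, BinOp):
        collect_actual_arguments(expr.lhs, out)
        collect_actual_arguments(expr.rhs, out)
    return out
```

For `f(g(i), j + h(k))`, the walk records `g(i)`, `i`, `j + h(k)`, and `k`. Without the recursive call inside the loop, `i` and `k` would not be collected.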
Reviewers: klausler, tskeith, DavidTruby
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81101
If both a.a and b.so define foo
```
ld.bfd -u foo a.a b.so # foo is defined
ld.bfd a.a b.so -u foo # foo is defined
ld.bfd -u foo b.so a.a # foo is undefined (provided at runtime by b.so)
ld.bfd b.so a.a -u foo # foo is undefined (provided at runtime by b.so)
```
In all cases we make foo undefined in the output. I tend to think the
GNU ld behavior makes more sense.
* In their model, they have to treat -u as a fake object file with an
undefined symbol before all input files, otherwise the first archive would not be fetched.
* Following their behavior allows us to drop a --warn-backrefs special case.
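The "fake object file before all input files" model can be sketched as a toy resolver in Python. The symbol states and the `resolve` function are hypothetical simplifications for illustration; they are not how either linker is implemented.

```python
def resolve(inputs, undefined=()):
    """Toy symbol resolution.

    inputs: ordered list of (kind, name, symbols), where kind is
            "archive" or "shared".
    undefined: names passed via -u, treated as undefined references
               from a fake object placed before all inputs.
    """
    table = {name: "undefined" for name in undefined}
    for kind, _name, symbols in inputs:
        for sym in symbols:
            state = table.get(sym)
            if kind == "archive":
                if state == "undefined":
                    table[sym] = "defined"  # member is fetched and linked in
                elif state is None:
                    table[sym] = "lazy"     # available, but nothing needs it yet
            elif kind == "shared":
                if state in (None, "undefined"):
                    table[sym] = "shared"   # satisfied at runtime by the DSO
    return table

a = ("archive", "a.a", ["foo"])
b = ("shared", "b.so", ["foo"])
print(resolve([a, b], undefined=["foo"]))  # a.a comes first: foo is defined
print(resolve([b, a], undefined=["foo"]))  # b.so comes first: foo stays a shared reference
```

In this model the position of `-u foo` on the command line does not matter, only the relative order of `a.a` and `b.so`, which matches the four `ld.bfd` cases listed above.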
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D81052
Treat N_AST symbol table entries like other debug entries and don't emit
them in the linked binary.
Differential revision: https://reviews.llvm.org/D81205
Improve consistency when printing test results:
Previously we were using different labels for group names (the header
for the list of, e.g., failing tests) and summary count lines. For
example, "Failing Tests"/"Unexpected Failures". This commit changes lit
to label things consistently.
Improve wording of labels:
When talking about individual test results, the first word in
"Unexpected Failures", "Expected Passes", and "Individual Timeouts" is
superfluous. Some labels contain the word "Tests" and some don't.
Let's simplify the names.
Before:
```
Failing Tests (1):
...
Expected Passes : 3
Unexpected Failures: 1
```
After:
```
Failed Tests (1):
...
Passed: 3
Failed: 1
```
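One way to keep group headers and summary lines consistent is to derive both from a single label per result code. The following is a sketch under that assumption; `LABELS` and `format_results` are hypothetical names and this is not lit's actual implementation.

```python
from collections import Counter

# One label per result code, reused for the group header and the summary line.
LABELS = {"PASS": "Passed", "FAIL": "Failed"}

def format_results(results, show_groups=("FAIL",)):
    """results: list of (test_name, result_code) pairs."""
    lines = []
    for code in show_groups:
        names = [name for name, c in results if c == code]
        if names:
            lines.append(f"{LABELS[code]} Tests ({len(names)}):")
            lines.extend(f"  {name}" for name in names)
    counts = Counter(code for _, code in results)
    for code, label in LABELS.items():
        if counts[code]:
            lines.append(f"{label}: {counts[code]}")
    return lines

results = [("a", "PASS"), ("b", "PASS"), ("c", "PASS"), ("d", "FAIL")]
print("\n".join(format_results(results)))
```

Because the header and the count share the same label, a mismatch like "Failing Tests" versus "Unexpected Failures" cannot arise.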
Reviewed By: ldionne
Differential Revision: https://reviews.llvm.org/D77708
Summary:
Add regression tests of asmparser, mccodeemitter, and disassembler for
logical operation instructions. Also change asmparser to support the CMOV
instruction, and add the new EQV/MRG/NND instructions.
Differential Revision: https://reviews.llvm.org/D81219
Summary:
`mlir-rocm-runner` is introduced in this commit to execute GPU modules on ROCm
platform. A small wrapper to encapsulate ROCm's HIP runtime API is also inside
the commit.
Due to behavior of ROCm, raw pointers inside memrefs passed to `gpu.launch`
must be modified on the host side to properly capture the pointer values
addressable on the GPU.
LLVM MC is used to assemble AMD GCN ISA coming out from
`ConvertGPUKernelToBlobPass` to binary form, and LLD is used to produce a shared
ELF object which could be loaded by ROCm HIP runtime.
gfx900 is the default target used right now, although it could be altered via
an option in `mlir-rocm-runner`. Future revisions may consider using ROCm Agent
Enumerator to detect the right target on the system.
Note that AMDGPU Code Object V2 is used in this revision. Future enhancements may
upgrade to AMDGPU Code Object V3.
Bitcode libraries in ROCm-Device-Libs, which implement math routines exposed in
the `rocdl` dialect, are not yet linked; this is left as a TODO in the logic.
Reviewers: herhut
Subscribers: mgorny, tpr, dexonsmith, mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, llvm-commits
Tags: #mlir, #llvm
Differential Revision: https://reviews.llvm.org/D80676
This patch updates TargetLoweringBase::computeRegisterProperties and
TargetLoweringBase::getTypeConversion to support scalable vectors,
and make the right calls on how to legalise them. These changes are required
to legalise both MVTs and EVTs.
Reviewers: efriedma, david-arm, ctetreau
Reviewed By: efriedma
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80640
Summary:
Add regression tests of asmparser, mccodeemitter, and disassembler for
branch instructions. In order to support them, we enhance asmparser
by adding a mnemonic-splitting mechanism, e.g. "bgt.l.t" into "b", "gt",
and ".l.t", and a parsing mechanism for AS style memory addressing.
We also implement encoding and decoding mechanisms for branch instructions.
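The mnemonic split can be sketched in a few lines of Python. This is an illustration only: `split_branch_mnemonic` is a hypothetical helper that assumes the mnemonic starts with "b" and that the suffix begins at the first dot; the real asmparser logic may differ.

```python
def split_branch_mnemonic(mnemonic):
    """Split e.g. "bgt.l.t" into ("b", "gt", ".l.t")."""
    if not mnemonic.startswith("b"):
        raise ValueError(f"not a branch mnemonic: {mnemonic!r}")
    head, dot, tail = mnemonic.partition(".")
    condition = head[1:]   # text between the leading "b" and the first "."
    suffix = dot + tail    # empty when the mnemonic has no suffix
    return "b", condition, suffix

print(split_branch_mnemonic("bgt.l.t"))
```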
Differential Revision: https://reviews.llvm.org/D81215
Before this patch, we tried detecting whether small atomics were available
without linking against libatomic. However, that's not really what we want
to know -- instead, we want to know what's required in order to support
atomics fully, which is to link against libatomic when it's provided.
That is both much simpler, and it doesn't suffer the problem that we would
not link against libatomic when small atomics didn't require it, which
led to non-lockfree atomics never working.
Furthermore, because we understand that some platforms might not want to
(or be able to) ship non-lockfree atomics, we add that notion to the test
suite, independently of a potential external library.
After this patch, we therefore:
(1) Link against libatomic when it is provided
(2) Independently detect whether non-lockfree atomics are supported in
the test suite, regardless of whether that means we're linking against
an external library or not (which is an implementation detail).
Differential Revision: https://reviews.llvm.org/D81190
Add support for flat, location, and noperspective decorations in the
serializer and deserializer to be able to process basic shader files
for graphics applications.
Differential Revision: https://reviews.llvm.org/D80837
The recently introduced allocation hoisting is quite conservative about the cases in which it triggers.
This revision makes it such that the allocations for vector transfer lowerings are hoisted
to the top of the function.
This should be revisited in the context of parallelism and is a temporary workaround.
Differential Revision: https://reviews.llvm.org/D81253
Summary:
The poly64 types are guarded with ifdefs for AArch64 only. This is wrong. This
was also incorrectly documented in the ACLE spec, but this has been rectified in
the latest release. See paragraph 13.1.2 "Vector data types":
https://developer.arm.com/docs/101028/latest
This patch was written by Alexandros Lamprineas.
Reviewers: ostannard, sdesmalen, fpetrogalli, labrinea, t.p.northover, LukeGeeson
Reviewed By: ostannard
Subscribers: pbarrio, LukeGeeson, kristof.beyls, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79711
The current implementation of emitPatchpoint() is very inefficient:
for every FrameIndex operand it creates a new MachineInstr with
that operand expanded and all others copied as is.
Since PATCHPOINT/STATEPOINT instructions may have *a lot* of
FrameIndex operands, we end up creating and erasing many
machine instructions. But we can do it in a single pass, with only
one new machine instruction generated.
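The difference between the two strategies can be modelled in Python with operand lists standing in for machine instructions. This is a sketch: the operand tuples and the three-operand expansion of a frame index are hypothetical placeholders, not the actual operand encoding.

```python
def expand_fi(value):
    # Hypothetical expansion of one frame-index operand.
    return [("marker", "memref"), ("fi", value), ("imm", 0)]

def expand_per_operand(operands):
    """Old shape: rebuild the whole operand list once per frame index."""
    current = list(operands)
    rebuilt = 0
    i = 0
    while i < len(current):
        kind, value = current[i]
        if kind == "fi":
            current = current[:i] + expand_fi(value) + current[i + 1:]
            rebuilt += 1   # one temporary "instruction" per frame index
            i += 3         # skip over the operands just inserted
        else:
            i += 1
    return current, rebuilt

def expand_single_pass(operands):
    """New shape: build exactly one new operand list."""
    new_operands = []
    for kind, value in operands:
        if kind == "fi":
            new_operands.extend(expand_fi(value))
        else:
            new_operands.append((kind, value))
    return new_operands
```

Both functions produce the same operand list; the per-operand version copies every operand once for each frame index, while the single-pass version copies each operand once.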
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D81181
Calls that are marked as @notoc do not require the extra nop after the call
for the TOC restore.
Differential Revision: https://reviews.llvm.org/D81081
Summary:
This patch adds legalisation of extensions where the operand
of the extend is a legal scalable type but the result is not.
EXTRACT_SUBVECTOR is used to split the result, before
being replaced by target-specific [S|U]UNPK[HI|LO] operations.
For example:
```
zext <vscale x 16 x i8> %a to <vscale x 16 x i16>
```
should emit:
```
uunpklo z2.h, z0.b
uunpkhi z1.h, z0.b
```
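The effect of the two unpack instructions can be modelled in Python on a fixed-length list. This sketch assumes vscale = 1 (16 byte lanes) purely for illustration, and the function names simply mirror the mnemonics.

```python
def uunpklo(z, src_bits=8):
    """Zero-extend the low half of the lanes to the wider element type."""
    mask = (1 << src_bits) - 1
    return [v & mask for v in z[: len(z) // 2]]

def uunpkhi(z, src_bits=8):
    """Zero-extend the high half of the lanes to the wider element type."""
    mask = (1 << src_bits) - 1
    return [v & mask for v in z[len(z) // 2 :]]

# 16 x i8 input; -1 models the byte 0xFF.
a = [-1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, -128]
lo, hi = uunpklo(a), uunpkhi(a)
# Concatenating the two halves gives the full zero-extended result.
zext = lo + hi
```

Each half holds eight widened lanes, so the illegal 16 x i16 result is represented by two legal 8 x i16 vectors.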
Reviewers: sdesmalen, efriedma, david-arm
Reviewed By: efriedma
Subscribers: tschuett, hiraditya, rkruppe, psnobl, huihuiz, cfe-commits, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79587