It looks like there's some old version of gcc that doesn't like this
static_assert (I couldn't reproduce the issue with gcc 8, 9 or 10).
Work around the issue by only checking the static_assert under clang,
which should provide sufficient coverage.
Should hopefully fix this buildbot:
https://lab.llvm.org/buildbot/#/builders/112/builds/5356
Remove the dependence on standard C++ header
for overloaded math functions in HIP header
since standard C++ header is not available for hipRTC.
Reviewed by: Artem Belevich, Justin Lebar
Differential Revision: https://reviews.llvm.org/D100794
With AndroidSizeClassMap all of the LSBs are in the range 4-6 so we
only need 2 bits of information per size class. Furthermore we have
32 size classes, which conveniently lets us fit all of the information
into a 64-bit integer. Do so if possible so that we can avoid a table
lookup entirely.
Differential Revision: https://reviews.llvm.org/D101105
Background:
CFGStackify's [[ 398f253400/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp (L1481-L1540) | fixEndsAtEndOfFunction ]] fixes block/loop/try's return
type when the end of function is unreachable and the function return
type is not void. So if a function returns i32 and `block`-`end` wraps the
whole function, i.e., the `block`'s `end` is the last instruction of the
function, the `block`'s return type should be i32 too:
```
block i32
...
end
end_function
```
If there are consecutive `end`s, this signature has to be propagate to
those blocks too, like:
```
block i32
...
block i32
...
end
end
end_function
```
This applies to `try`-`end` too:
```
try i32
...
catch
...
end
end_function
```
In case of `try`, we not only follow consecutive `end`s but also follow
`catch`, because for the type of the whole `try` to be i32, both `try`
and `catch` parts have to be i32:
```
try i32
...
block i32
...
end
catch
...
block i32
...
end
end
end_function
```
---
Previously we only handled consecutive `end`s or `end` before a `catch`.
But now we have `delegate`, which serves like `end` for
`try`-`delegate`. So we have to follow `delegate` too and mark its
corresponding `try` as i32 (the function's return type):
```
try i32
...
catch
...
try i32 ;; Here
...
delegate N
end
end_function
```
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D101036
This adds support for YAML serialization of `Params` and `Results`
fields in `WebAssemblyMachineFunctionInfo`. Types are printed as `MVT`'s
string representation. This is for writing MIR tests easier.
The tests added are testing simple parsing and printing of `params` /
`results` fields under `machineFunctionInfo`.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D101029
This CL
1. Creates Utils/ directory under lib/Target/WebAssembly
2. Moves existing WebAssemblyUtilities.cpp|h into the Utils/ directory
3. Creates Utils/WebAssemblyTypeUtilities.cpp|h and put type
declarataions and type conversion functions scattered in various
places into this single place.
It has been suggested several times that it is not easy to share utility
functions between subdirectories (AsmParser, DIsassembler, MCTargetDesc,
...). Sometimes we ended up [[ https://reviews.llvm.org/D92840#2478863 | duplicating ]] the same function because of
this.
There are already other targets doing this: AArch64, AMDGPU, and ARM
have Utils/ subdirectory under their target directory.
This extracts the utility functions into a single directory Utils/ and
make them sharable among all passes in WebAssembly/ and its
subdirectories. Also I believe gathering all type-related conversion
functionalities into a single place makes it more usable. (Actually I
was working on another CL that uses various type conversion functions
scattered in multiple places, which became the motivation for this CL.)
Reviewed By: dschuff, aardappel
Differential Revision: https://reviews.llvm.org/D100995
We had got it backwards... the minimum version of the target
should be higher than the min version of the object files, presumably
since new platforms are backwards-compatible with older formats.
Fixes PR50078.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D101114
The single source file reduction.cpp is a little large in
terms of both source lines and generated text bytes, so
split SUM, PRODUCT, FINDLOC, and MAXLOC/MAXVAL/MINLOC/MINVAL
off into their own C++ source files that share a set of
implementation function templates now in a common header.
Differential Revision: https://reviews.llvm.org/D101111
Add PromoteIntOp_FP_TO_XINT_SAT to type legalize the bit width
operand from i32 to i64 for RV64.
Add test cases for the saturating intrinsics for half/float/double
and i32/i64. CodeGen is definitely not optimal. We can probably
make use of the native behavior of fcvt instructions in many cases.
Fixes PR50083
This patch is supposed to solve: https://bugs.llvm.org/show_bug.cgi?id=50075
The function `__dfsan_mem_transfer_callback` takes a `Len` argument of type `i64`; however, when processing a `MemTransferInst` such as `llvm.memcpy.p0i8.p0i8.i32`, the `len` argument has type `i32`. In order to make the type of `len` compatible with the one of the callback argument, this change zero-extends it when necessary.
Reviewed By: stephan.yichao.zhao, gbalats
Differential Revision: https://reviews.llvm.org/D101048
'not' expands to checking for an xor with a -1 constant. Since
this looks for a ConstantSDNode it will never match for a vector.
Co-authored-by: Craig Topper <craig.topper@sifive.com>
Differential Revision: https://reviews.llvm.org/D100687
A couple of our Instrumentation runtimes were gathering backtraces,
storing it in a StructuredData array and later creating a HistoryThread
using this data. By deafult HistoryThread will consider the history PCs
as return addresses and thus will substract 1 from them to go to the
call address.
This is usually correct, but it's also wasteful as when we gather the
backtraces ourselves, we have much better information to decide how
to backtrace and symbolicate. This patch uses the new
GetFrameCodeAddressForSymbolication() to gather the PCs that should
be used for symbolication and configures the HistoryThread to just
use those PCs as-is.
(The MTC plugin was actaully applying a -1 itself and then the
HistoryThread would do it again, so this actaully fixes a bug there.)
rdar://77027680
Differential Revision: https://reviews.llvm.org/D101094
Both the alias and aliasee linkage are important.
PR27866 provides some background.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D99629
This improves the lowering of v8i16 and v16i8 vector reverse shuffles.
Instead of going via a generic tbl it uses a rev64; ext pair, as already
happens for v4i32.
Differential Revision: https://reviews.llvm.org/D100882
The profiling runtime was designed to work without static initializers
or a a filesystem (see 117cf2bd1f and
others). The no-static-initializers part was already documented but this
part got missed before.
Differential Revision: https://reviews.llvm.org/D101000
lookupTarget() can update the passed triple argument. This happens
when no triple is given on the command line, and the architecture
argument does not match the architecture in the default triple.
For example, passing -march=aarch64 on the command line, and the
default triple being x86_64-windows-msvc, the triple is changed
to aarch64-windows-msvc.
However, this triple is not saved, and later in the code, the
triple is constructed again from the triple name, which is the
default triple at this point. Thus the default triple is passed
to constructor of MCSubtargetInfo instance.
The triple is only used determine the object file format, and by
chance, the AArch64 target also uses the COFF file format, and
all is fine. Obviously, the AArch64 target does not support all
available binary file formats, e.g. XCOFF and GOFF, and llvm-mca
crashes in this case.
The fix is to update the triple name with the changed triple
name for the target lookup. Then the default object file format
for the architecture is used, in the example ELF.
Reviewed By: andreadb, abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D100992
In the most common case we call computeOddEvenMaskForPointerMaybe()
from quarantineOrDeallocateChunk(), in which case we need to look up
the class size from the SizeClassMap in order to compute the LSB. Since
we need to do a lookup anyway, we may as well look up the LSB itself
and avoid computing it every time.
While here, switch to a slightly more efficient way of computing the
odd/even mask.
Differential Revision: https://reviews.llvm.org/D101018
This avoids test failures where extra files exist in the tree, such
as the standard library built using the runtimes build.
Differential Revision: https://reviews.llvm.org/D101023
This change effectively reverts 86664638, but since there have been some changes on top and I wanted to leave the tests in, it's not a mechanical revert.
Why revert this now? Two main reasons:
1) There are continuing discussion around what the semantics of nofree. I am getting increasing uncomfortable with the seeming possibility we might redefine nofree in a way incompatible with these changes.
2) There was a reported miscompile triggered by this change (https://github.com/emscripten-core/emscripten/issues/9443). At first, I was making good progress on tracking down the issues exposed and those issues appeared to be unrelated latent bugs. Now that we've found at least one bug in the original change, and the investigation has stalled, I'm no longer comfortable leaving this in tree. In retrospect, I probably should have reverted this earlier and investigated the issues once the triggering change was out of tree.
These instructions don't really exist, but we have ways we can
emulate them.
.vv will swap operands and use vmsle().vv. .vi will adjust the
immediate and use .vmsgt(u).vi when possible. For .vx we need to
use some of the multiple instruction sequences from the V extension
spec.
For unmasked vmsge(u).vx we use:
vmslt{u}.vx vd, va, x; vmnand.mm vd, vd, vd
For cases where mask and maskedoff are the same value then we have
vmsge{u}.vx v0, va, x, v0.t which is the vd==v0 case that
requires a temporary so we use:
vmslt{u}.vx vt, va, x; vmandnot.mm vd, vd, vt
For other masked cases we use this sequence:
vmslt{u}.vx vd, va, x, v0.t; vmxor.mm vd, vd, v0
We trust that register allocation will prevent vd in vmslt{u}.vx
from being v0 since v0 is still needed by the vmxor.
Differential Revision: https://reviews.llvm.org/D100925
Refactor to use new multiclass instead of individual patterns.
We already supported this due to SEW=64 on RV32, but we didn't have
test cases for all the types we supported.
Part of D100925
We don't have instructions for these, but can swap the operands
to use vmle/vmflt. This makes the IR interface more consistent and
simplifies the frontend implementation.
Part of D100925
Switching from `%f18` to `%flang_fc1` in LIT tests added in
https://reviews.llvm.org/D91159. This way these tests are run with the
new driver, `flang-new`, when enabled (i.e. when
`FLANG_BUILD_NEW_DRIVER` is set).
Differential Revision: https://reviews.llvm.org/D101078
Implementations are allowed to optimize an x0 stride to perform
less memory accesses. This is the case in SiFive cores.
No idea if this is the case in other implementations. We might
need a tuning flag for this.
Reviewed By: frasercrmck, arcbbb
Differential Revision: https://reviews.llvm.org/D100815
The `--allow-jit` flag allows the user to force the IR interpreter to run the
provided expression.
The `--top-level` flag parses and injects the code as if its in the top level
scope of a source file.
Both flags just change the ExecutionPolicy of the expression:
* `--allow-jit true` -> doesn't change anything (its the default)
* `--allow-jit false` -> ExecutionPolicyNever
* `--top-level` -> ExecutionPolicyTopLevel
Passing `--allow-jit false` and `--top-level` currently causes the `--top-level`
to silently overwrite the ExecutionPolicy value that was set by `--allow-jit
false`. There isn't any ExecutionPolicy value that says "top-level but only
interpret", so I would say we reject this combination of flags until someone
finds time to refactor top-level feature out of the ExecutionPolicy enum.
The SBExpressionOptions suffer from a similar symptom as `SetTopLevel` and
`SetAllowJIT` just silently disable each other. But those functions don't have
any error handling, so not a lot we can do about this in the meantime.
Reviewed By: labath, kastiglione
Differential Revision: https://reviews.llvm.org/D91780