This fixes detection when linking isn't supported (i.e. while building
builtins the first time).
Since 8368e4d54c, after setting
CMAKE_TRY_COMPILE_TARGET_TYPE to STATIC_LIBRARY, this isn't strictly
needed, but is good for correctness anyway (and in case that commit
ends up reverted).
Differential Revision: https://reviews.llvm.org/D98737
Supporting ranges in the byte code requires additional complexity, given that a range can't easily be represented as an opaque void *, as is possible with the existing bytecode value types (Attribute, Type, Value, etc.). To enable representing a range with void *, auxiliary storage is used for the actual range itself, with the pointer being passed around in the normal byte code memory. For type ranges, a TypeRange is stored. For value ranges, a ValueRange is stored. This problem accounts for the majority of the complexity in this revision; the rest is adapting/adding byte code operations to support the changes made to the PDL interpreter in the parent revision.
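The auxiliary-storage idea can be sketched in plain C++ roughly as follows. This is only an illustration under simplifying assumptions: `Range`, `Interpreter`, `storeRange`, and `loadRange` are hypothetical names, and `std::deque` stands in for whatever storage the byte code executor actually uses.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for a TypeRange/ValueRange: a non-owning view
// (pointer + length), which does not fit in a single void *.
struct Range {
  const int *data;
  std::size_t size;
};

// Hypothetical interpreter state: opaque void * memory slots, plus
// auxiliary storage that holds the Range objects themselves.
struct Interpreter {
  std::vector<void *> memory;
  // A deque keeps element addresses stable across push_back.
  std::deque<Range> rangeStorage;

  // Keep the range in auxiliary storage and place only a pointer to it
  // in the regular memory slot.
  void storeRange(std::size_t slot, Range range) {
    rangeStorage.push_back(range);
    if (memory.size() <= slot)
      memory.resize(slot + 1, nullptr);
    memory[slot] = &rangeStorage.back();
  }

  // Read a range back by casting the opaque pointer.
  const Range &loadRange(std::size_t slot) const {
    return *static_cast<const Range *>(memory[slot]);
  }
};
```

The memory slots stay uniformly pointer-sized, at the cost of one extra indirection per range access.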
After this revision, PDL will have initial end-to-end support for variadic operands/results.
Differential Revision: https://reviews.llvm.org/D95723
This revision extends the PDL Interpreter dialect to add support for variadic operands and results, with ranges of these values represented via the recently added !pdl.range type. To support this extension, five new operations have been added that closely match their single-value variants:
* pdl_interp.check_types : Compare a range of types with a known range.
* pdl_interp.create_types : Create a constant range of types.
* pdl_interp.get_operands : Get a range of operands from an operation.
* pdl_interp.get_results : Get a range of results from an operation.
* pdl_interp.switch_types : Switch on a range of types.
This revision handles adding support in the interpreter dialect and the conversion from PDL to PDLInterp. Support for variadic operands and results in the bytecode will be added in a followup revision.
Differential Revision: https://reviews.llvm.org/D95722
This revision extends the PDL dialect to add support for variadic operands and results, with ranges of these values represented via the recently added !pdl.range type. To support this extension, three new operations have been added that closely match the single variant:
* pdl.operands : Define a range of input operands.
* pdl.results : Extract a result group from an operation.
* pdl.types : Define a handle to a range of types.
Support for these in the pdl interpreter dialect and byte code will be added in followup revisions.
Differential Revision: https://reviews.llvm.org/D95721
This has numerous benefits, given the overly clunky nature of CreateNativeOp:
* Users can now call into arbitrary rewrite functions from inside of PDL, allowing for more natural interleaving of PDL/C++ and enabling more of the pattern to be in PDL.
* Removes the need for an additional set of C++ functions/registry/etc. The new ApplyNativeRewriteOp will use the same PDLRewriteFunction as the existing RewriteOp. This reduces the API surface area exposed to users.
This revision also introduces a new PDLResultList class. This class is used to provide results of native rewrite functions back to PDL. We introduce a new class instead of using a SmallVector to simplify the work necessary for variadics, given that ranges will require some changes to the structure of PDLValue.
Differential Revision: https://reviews.llvm.org/D95720
Up until now, results have been represented as additional results to a pdl.operation. This is fairly clunky, as it mismatches the representation of the rest of the IR constructs (e.g. pdl.operand) and also isn't a viable representation for operations returned by pdl.create_native. This representation also creates much more difficult problems when factoring in support for variadic result groups, optional results, etc. To resolve some of these problems, and to simplify adding support for variable-length results, this revision extracts the representation for results out of pdl.operation in the form of a new `pdl.result` operation. This operation returns the result of an operation at a given index, e.g.:
```
%root = pdl.operation ...
%result = pdl.result 0 of %root
```
Differential Revision: https://reviews.llvm.org/D95719
Also use this in ReadBinaryName which currently is producing
warnings.
Keep pragmas for silencing warnings in sanitizer_unwind_win.cpp,
as that can be called more frequently.
Differential Revision: https://reviews.llvm.org/D97726
Previously we created a new node, then filled in the pieces. Now, we clone the existing node, then change the respective fields. The only change in handling is with phis since we have to handle multiple incoming edges from the same block a bit differently.
Differential Revision: https://reviews.llvm.org/D98316
A broadcast is a shufflevector where only one input is used. Because of the way we handle constants (undef is a constant), the canonical shuffle sees a meet of (some value) and (nullptr). Given this, every broadcast gets treated as a conflict and a new base pointer computation is added.
The other way to tackle this would be to change constant handling specifically for undefs, but this seems easier.
Differential Revision: https://reviews.llvm.org/D98315
Android's native bridge (i.e. AArch64 emulator) doesn't support TBI so
we need a way to disable TBI on Linux when targeting the native bridge.
This can also be used to test the no-TBI code path on Linux (currently
only used on Fuchsia), or make Scudo compatible with very old
(pre-commit d50240a5f6ceaf690a77b0fccb17be51cfa151c2 from June 2013)
Linux kernels that do not enable TBI.
Differential Revision: https://reviews.llvm.org/D98732
RS4GC needs to rewrite the IR to ensure that every relocated pointer has an associated base pointer. The existing code isn't particularly smart about avoiding duplication of existing IR when it turns out the original pointer we were asked to materialize a base pointer for is itself a base pointer.
This patch adds a stage to the algorithm which prunes nodes proven (with a simple forward dataflow fixed point) to be base pointers from the list of nodes considered for duplication. This does require changing some of the later invariants slightly; that's probably the riskiest part of the change.
Differential Revision: D98122
Add MemorySSAWrapperPass as a dependency to MemCpyOptLegacyPass,
since MemCpyOpt now uses MemorySSA by default.
Differential Revision: https://reviews.llvm.org/D98484
The main part of the patch is the change in RegAllocGreedy.cpp: Q.collectInterferringVregs()
needs to be called before iterating the interfering live ranges.
The rest of the patch helps ensure that this is the case: instead of clearing the query's
InterferingVRegs field, we invalidate it. The clearing happens when the live reg matrix
is invalidated (existing triggering mechanism).
Without the change in RegAllocGreedy.cpp, the compiler ICEs.
This patch should make it more easily discoverable by developers that
collectInterferringVregs needs to be called before iterating.
I will follow up with a subsequent patch to improve the usability and maintainability of Query.
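A rough sketch of the "invalidate rather than clear" idea, assuming a much simplified query object. The `Query` class and its members below are hypothetical and are not the actual LLVM interface; they only show how an explicit validity flag makes a missing collect call easy to detect.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified query object.
class Query {
  std::vector<unsigned> interfering;
  bool collected = false;

public:
  // Compute (here: simply record) the interfering registers.
  void collect(const std::vector<unsigned> &found) {
    interfering = found;
    collected = true;
  }

  // Mark the cached result as stale. The vector is deliberately not
  // cleared, so a stale read cannot pass for "no interference".
  void invalidate() { collected = false; }

  bool isCollected() const { return collected; }

  // Iterating without a prior collect() trips the assertion.
  const std::vector<unsigned> &interferingVRegs() const {
    assert(collected && "collect() must be called before iterating");
    return interfering;
  }
};
```

With a plain clear, a caller that forgot to collect would silently iterate over an empty list; with invalidation, the mistake is caught in assertion-enabled builds.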
Differential Revision: https://reviews.llvm.org/D98232
This is my attempt to merge D98077 (bugfix the format strings for
Windows paths, which use wchar_t not char)
and D96986 (replace C++ variadic templates with C-style varargs so that
`__attribute__((format(printf)))` can be applied, for better safety)
and D98065 (remove an unused function overload).
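For illustration, a C-style varargs function carrying the format attribute might look like the sketch below. `format_message` is a hypothetical helper, not the libc++ function touched by this patch, and the attribute is guarded because it is a GCC/Clang extension.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper. With GCC or Clang, the format attribute asks the
// compiler to check the variadic arguments against the format string
// (argument 1 is the format, checking starts at argument 2).
#if defined(__GNUC__) || defined(__clang__)
__attribute__((format(printf, 1, 2)))
#endif
std::string format_message(const char *fmt, ...) {
  va_list args;
  va_start(args, fmt);
  va_list args_copy;
  va_copy(args_copy, args);

  // First pass: measure the formatted length.
  int len = std::vsnprintf(nullptr, 0, fmt, args);
  va_end(args);

  std::string result;
  if (len > 0) {
    // Second pass: format into a buffer with room for the trailing NUL.
    result.resize(static_cast<std::size_t>(len) + 1);
    std::vsnprintf(&result[0], result.size(), fmt, args_copy);
    result.resize(static_cast<std::size_t>(len)); // drop the NUL
  }
  va_end(args_copy);
  return result;
}
```

A call such as `format_message("%d", "text")` would then be diagnosed at compile time when format warnings are enabled.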
The one intentional functional change here is in `__create_what`.
It now prints path1 and path2 in square-brackets _and_ double-quotes,
rather than just square-brackets. Prior to this patch, it would
print either path double-quoted if-and-only-if it was the empty
string. Now the double-quotes are always present. I doubt anybody's
code is relying on the current format, right?
Differential Revision: https://reviews.llvm.org/D98097
This parallels ConstantDataArray::getRaw() and can be used with ConstantDataSequential::getRawDataValues() in the base class for both types.
Update BuildConstantData{Array,Vector} tests to test the getRaw API. Also removes its unused Module.
In passing, update some comments to include the support for half and bfloat. Update tests to include testing for bfloat.
Differential Revision: https://reviews.llvm.org/D98302
If the LLVM shared library is dlopen'ed and dlclose'd several times, a memory leak can be observed, as reported by Valgrind.
This patch fixes the issue.
Reviewed By: lattner, dblaikie
Differential Revision: https://reviews.llvm.org/D83372
This was (partially) reverted in cfe8f8e0 because the conversion from readonly to readnone in Intrinsics.td exposed a couple of problems. This change has been reworked to not need that change (via some explicit checks in client code). This is being done to address the original optimization issue and simplify the testing of the readonly changes. I'm working on that piece under 49607.
Original commit message follows:
The last two operands to a gc.relocate represent indices into the associated gc.statepoint's gc bundle list. (Effectively, gc.relocates are projections from the gc.statepoint's multiple return values.)
We can use this to recognize when two gc.relocates are equivalent (and can be CSEd), even when the indices are non-equal. This is particularly useful when considering a chain of multiple statepoints as it lets us eliminate all duplicate gc.relocates in a single pass.
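The equivalence check can be sketched on a toy model as follows. `Statepoint`, `Relocate`, and `isEquivalent` are hypothetical names, and integer ids stand in for IR values; this is not the code added by the patch.

```cpp
#include <vector>

// Hypothetical, simplified model: a statepoint with its gc bundle list,
// where each entry is an integer id standing in for an IR value.
struct Statepoint {
  std::vector<int> gcBundle;
};

// A relocate projects a (base, derived) pair out of a statepoint by index.
struct Relocate {
  const Statepoint *statepoint;
  unsigned baseIndex;
  unsigned derivedIndex;
};

// Two relocates are equivalent if they refer to the same statepoint and
// their indices select the same underlying values, even when the
// indices themselves differ.
bool isEquivalent(const Relocate &a, const Relocate &b) {
  if (a.statepoint != b.statepoint)
    return false;
  const std::vector<int> &bundle = a.statepoint->gcBundle;
  return bundle[a.baseIndex] == bundle[b.baseIndex] &&
         bundle[a.derivedIndex] == bundle[b.derivedIndex];
}
```

In this model, a bundle that lists the same value at positions 0 and 2 makes relocates using base index 0 and base index 2 interchangeable.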
Differential Revision: https://reviews.llvm.org/D97974
Instead of maintaining a separate map from predicated instructions to
recipes, we can instead directly look at the VP operands. If the operand
comes from a predicated instruction, the operand will be a
VPPredInstPHIRecipe with a VPReplicateRecipe as its operand.
This is a follow-up to D98588, and fixes the inline `FIXME` about a GEP-related simplification not
preserving the provenance.
https://alive2.llvm.org/ce/z/qbQoAY
Additional tests were added in {rGf125f28afdb59eba29d2491dac0dfc0a7bf1b60b}
Depends on D98672
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D98611
In previous versions of clang, __is_signed and __is_unsigned builtins did not
correspond to is_signed and is_unsigned behaviour for enums. The builtins were
fixed in D67897 and D98104.
* Disable the fast path of is_unsigned for clang versions < 13
* Add more tests for is_signed, is_unsigned and is_arithmetic
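As a reminder of the library behaviour the builtins are expected to match, here is a minimal sketch. The enum names are made up for illustration.

```cpp
#include <type_traits>

enum SignedEnum : int { kNegative = -1 };
enum UnsignedEnum : unsigned { kPositive = 1 };

// is_signed and is_unsigned are only true for arithmetic types, and
// enumerations are not arithmetic, so all of these are false regardless
// of the underlying type.
static_assert(!std::is_arithmetic<SignedEnum>::value, "");
static_assert(!std::is_signed<SignedEnum>::value, "");
static_assert(!std::is_unsigned<SignedEnum>::value, "");
static_assert(!std::is_unsigned<UnsignedEnum>::value, "");

// The underlying type, by contrast, is arithmetic.
static_assert(
    std::is_unsigned<std::underlying_type<UnsignedEnum>::type>::value, "");
```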
Differential Revision: https://reviews.llvm.org/D97283
`CodeGenFunction::EmitRuntimeCall` automatically sets the right calling
convention for the callee so we can avoid setting it ourselves.
As requested in https://reviews.llvm.org/D98411
Reviewed by: anastasia
Differential Revision: https://reviews.llvm.org/D98705
This adds a new integration test. It also adapts the
existing tests to a recent memref.XXX change.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D98680
There are several enum values that have been added to LLVM-C that are
missing from the OCaml bindings. The types defined in
bindings/ocaml/llvm/llvm.ml should be in sync with the corresponding
enum definitions in include/llvm-c/Core.h. The enum values are passed
from C to OCaml unmodified, and clients of the OCaml bindings
interpret them as tags of the corresponding OCaml types. So the only
changes needed are to add the missing constructors to the type
definitions, and to change the name of the maximum opcode in an
assertion.
Differential Revision: https://reviews.llvm.org/D98578
This commit folds sxtw'd or uxtw'd offsets into gather loads where
possible with a DAGCombine optimization.
As an example, the following code:
#include <arm_sve.h>

svuint64_t func(svbool_t pred, const int32_t *base, svint64_t offsets) {
  return svld1sw_gather_s64offset_u64(
    pred, base, svextw_s64_x(pred, offsets)
  );
}
would previously lower to the following assembly:
sxtw z0.d, p0/m, z0.d
ld1sw { z0.d }, p0/z, [x0, z0.d]
ret
but now lowers to:
ld1sw { z0.d }, p0/z, [x0, z0.d, sxtw]
ret
Differential Revision: https://reviews.llvm.org/D97858
One of the callers of isBasicBlockEntryGuardedByCond (and the primary
one) is isKnownPredicateAt, which makes an isKnownPredicate check before
calling it. isBasicBlockEntryGuardedByCond already makes a non-recursive
check inside, so on this execution path the check is made twice. The only
other caller is isLoopEntryGuardedByCond. Moving the check there should
save some compile time.
The InstrEmitter can sometimes insert a copy after an IMPLICIT_DEF
before connecting it to the vector instruction. This occurs when
constrainRegClass reduces to a class with fewer than 4 registers.
I believe LMUL8 on masked instructions triggers this since the
result can only use the v8, v16, or v24 register group as the mask
is using v0.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D98567
The test file has embedded slashes. This is fine for normal users that
are just recording and reordering paths, but not great when the trace
data is committed back to a repository that should work on both Unix and
Windows.
This commit implements an IR-level optimization to eliminate idempotent
SVE mul/fmul intrinsic calls. Currently, the following patterns are
captured:
fmul pg (dup_x 1.0) V => V
mul pg (dup_x 1) V => V
fmul pg V (dup_x 1.0) => V
mul pg V (dup_x 1) => V
fmul pg V (dup v pg 1.0) => V
mul pg V (dup v pg 1) => V
The result of this commit is that code such as:
#include <arm_sve.h>

svfloat64_t foo(svfloat64_t a) {
  svbool_t t = svptrue_b64();
  svfloat64_t b = svdup_f64(1.0);
  return svmul_m(t, a, b);
}
will lower to a nop.
This commit does not capture all possibilities; only the simple cases
described above. There is still room for further optimisation.
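The shape of the fold can be sketched on a toy expression tree. This is not the LLVM implementation: `Node`, `Kind`, and `simplifyMul` are hypothetical, and the governing predicate is ignored entirely, which is a simplifying assumption of the sketch.

```cpp
// Hypothetical toy IR node, not the LLVM representation.
enum class Kind { Value, SplatConst, Mul };

struct Node {
  Kind kind;
  double splat;    // used when kind == SplatConst
  const Node *lhs; // used when kind == Mul
  const Node *rhs; // used when kind == Mul
};

// True if the node is a splat (dup) of the constant 1.
static bool isSplatOfOne(const Node *n) {
  return n->kind == Kind::SplatConst && n->splat == 1.0;
}

// mul (dup 1) V => V and mul V (dup 1) => V; anything else is returned
// unchanged.
const Node *simplifyMul(const Node *n) {
  if (n->kind != Kind::Mul)
    return n;
  if (isSplatOfOne(n->lhs))
    return n->rhs;
  if (isSplatOfOne(n->rhs))
    return n->lhs;
  return n;
}
```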
Differential Revision: https://reviews.llvm.org/D98033
The default promotion uses zero extends that become shifts. We
can use sign extend instead, which is better for RISCV.
I've used two different implementations based on whether we
have minu/maxu instructions.
Differential Revision: https://reviews.llvm.org/D98683
There is no syntax like {@code ...} in Doxygen, @code is a block command
that ends with @endcode, and generally these are not enclosed in braces.
The correct syntax for inline code snippets is @c <code>.
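For example, a documentation comment using both forms might look like this. `maxOf` is a made-up function used only to carry the comment.

```cpp
/// Returns the larger of @c a and @c b.
///
/// Longer snippets use the block form:
/// @code
///   int m = maxOf(1, 2);
/// @endcode
constexpr int maxOf(int a, int b) { return a > b ? a : b; }
```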
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D98665
The deprecation notice was cherry-picked to the release branch in f8b3298924, so it's safe to remove this for the 13.X release cycle.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D98612
We enumerated the cross product Domain x Scatter, but sorted only by the
scatter key. In case there are multiple statement instances per
scatter value, the order between statement instances of the same loop
iteration was undefined.
Properly enumerate and sort only by the scatter value, and group the
domains using the scatter dimension again.
Thanks to Leonard Chan for the report.
This is in preparation for D98611. The upcoming change will need to apply additional checks to `P` and `V`,
and this refactor paves the way for adding them in a less awkward way.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D98672