D106408 was doing this for all targets although it was
reverted due to couple performance regressions on some targets.
The difference for AMDGPU is the ability to rematerialize SOP
instructions with virtual register uses like we already do for VOP.
Differential Revision: https://reviews.llvm.org/D110743
bitcast (inselt (bitcast X), Y, 0) --> or (and X, MaskC), (zext Y)
https://alive2.llvm.org/ce/z/Ux-662
Similar to D111082 / db231ebdb0 :
We want to avoid relatively opaque vector ops on types that are
likely supported by the backend as scalar integers. The bitwise
logic ops are more likely to allow further combining.
We probably want to generalize this to allow a shift too, but
that would oppose instcombine's general rule of not creating
extra instructions, so that's left as a potential follow-up.
Alternatively, we could do that transform in VectorCombine
with the help of the TTI cost model.
This is part of solving:
https://llvm.org/PR52057
Enable building the ORC runtime for 64-bit and 32-bit ARM architectures,
and for all Darwin embedded platforms (iOS, tvOS, and watchOS). This
covers building the cross-platform code, but does not add TLV runtime
support for the new architectures, which can be added independently.
Incidentally, stop building the Mach-O TLS support file unnecessarily on
other platforms.
Differential Revision: https://reviews.llvm.org/D112111
We already disallow mixing SEH and C++ exceptions, and
mixing SEH and Objective-C exceptions seems to not work (see PR52233).
Emitting an error is friendlier than crashing.
Differential Revision: https://reviews.llvm.org/D112157
The path functions in this patch are unimplemented (as per the TODO comment from upstream). To avoid running into a linker error (missing symbol), this patch raises a compile error by commenting out the functions, which is more user friendly.
Differential Revision: https://reviews.llvm.org/D111892
isSpecifierType
There is no reason to have this here, (since all tests pass) and it
isn't even a specifier anyway. We can just treat it as a pointer
instead.
Differential Revision: https://reviews.llvm.org/D110068
As discussed in:
* https://reviews.llvm.org/D94166
* https://lists.llvm.org/pipermail/llvm-dev/2020-September/145031.html
The GlobalIndirectSymbol class lost most of its meaning in
https://reviews.llvm.org/D109792, which disambiguated getBaseObject
(now getAliaseeObject) between GlobalIFunc and everything else.
In addition, as long as GlobalIFunc is not a GlobalObject and
getAliaseeObject returns GlobalObjects, a GlobalAlias whose aliasee
is a GlobalIFunc cannot currently be modeled properly. Creating
aliases for GlobalIFuncs does happen in the wild (e.g. glibc). In addition,
calling getAliaseeObject on a GlobalIFunc will currently return nullptr,
which is undesirable because it should return the object itself for
non-aliases.
This patch refactors the GlobalIFunc class to inherit directly from
GlobalObject, and removes GlobalIndirectSymbol (while inlining the
relevant parts into GlobalAlias and GlobalIFunc). This allows for
calling getAliaseeObject() on a GlobalIFunc to return the GlobalIFunc
itself, making getAliaseeObject() more consistent and enabling
alias-to-ifunc to be properly modeled in the IR.
I exercised some judgement in the API clients of GlobalIndirectSymbol:
some were 'monomorphized' for GlobalAlias and GlobalIFunc, and
some remained shared (with the type adapted to become GlobalValue).
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D108872
Add relaxed. f32x4.min, f32x4.max, f64x2.min, f64x2.max. These are only
exposed as builtins, and require user opt-in.
Differential Revision: https://reviews.llvm.org/D112146
According to the OpenMP 5.0 standard, names and hints of critical operation are
closely related. The following are the restrictions on them:
- Unless the effect is as if `hint(omp_sync_hint_none)` was specified, the
critical construct must specify a name.
- If the hint clause is specified, each of the critical constructs with the
same name must have a hint clause for which the hint-expression evaluates to
the same value.
These restrictions will be enforced by design if the hint expression is a part
of the `omp.critical.declare` operation.
- Any operation with no "name" will be considered to have
`hint(omp_sync_hint_none)`.
- All the operations with the same "name" will have the same hint value.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D112134
Currently we have a way to run a plugin if specified on the command line
after the main action, and ways to unconditionally run the plugin before
or after the main action, but no way to run a plugin if specified on the
command line before the main action.
This introduces the missing option.
This is helpful because -clear-ast-before-backend clears the AST before
codegen, while some plugins may want access to the AST.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D112096
This patch fixes a crash when despeculating ctlz/cttz intrinsics with
scalable-vector types. It is not safe to speculatively get the size of
the vector type in bits in case the vector type is not a fixed-length type. As
it happens this isn't required as vector types are skipped anyway.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D112141
Previously we used builtin_alias for overloaded intrinsics, but
macros for the non-overloaded version. This patch changes the
non-overloaded versions to also use builtin_alias, but without
the overloadable attribute.
Reviewed By: khchen, HsiangKai
Differential Revision: https://reviews.llvm.org/D112020
It turns out llvm::isa<> is variadic, and we could have used this at a
lot of places.
The following patterns:
x && isa<T1>(x) || isa<T2>(x) ...
Will be replaced by:
isa_and_non_null<T1, T2, ...>(x)
Sometimes it caused further simplifications, when it would cause even
more code smell.
Aside from this, keep in mind that within `assert()` or any macro
functions, we need to wrap the isa<> expression within a parenthesis,
due to the parsing of the comma.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D111982
In Driver.cpp, addFramework used std::string instance to represent the path of a framework, which will be freed after the function returns. However, this string is stored in loadedArchive, which will be used later to compare with path of newly added frameworks. This caused https://bugs.llvm.org/show_bug.cgi?id=52133. A test is included in this commit to reproduce this bug.
Now resolveDylibPath returns a StringRef instance, and it uses StringSaver to save its data, then returns it to functions on the top. This ensures the resolved framework path is still valid after LC_LINKER_OPTION is parsed.
Reviewed By: int3, #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D111706
Removed/replaced RUN lines using legacy PM syntax in favor of using
-passes in lit tests for Float2Int, MetaRenamer, StripDeadPrototypes
and StripSymbols.
Our fallback expansion for CTLZ/CTTZ relies on CTPOP. If CTPOP
isn't legal or custom for a vector type we would scalarize the
CTLZ/CTTZ. This is different than CTPOP itself which would use a
vector expansion.
This patch teaches expandCTLZ/CTTZ to rely on the vector CTPOP
expansion instead of scalarizing. To do this I had to add additional
checks to make sure the operations used by CTPOP expansions are all
supported. Some of the operations were already needed for the CTLZ/CTTZ
expansion.
This is a huge improvement to the RISCV which doesn't have a scalar
ctlz or cttz in the base ISA.
For WebAssembly, I've added Custom lowering to keep the scalarizing
behavior. I've also extended the scalarizing to CTPOP.
Differential Revision: https://reviews.llvm.org/D111919
Includer cache could get into a bad state when a main file went bad and
added back afterwards. This patch adds a check to invalidate to prevent
that.
Differential Revision: https://reviews.llvm.org/D112130
Follow up to also use the prefixed emitters in OpFormatGen (moved
getGetterName(s) and getSetterName(s) to Operator as that is most
convenient usage wise even though it just depends on Dialect). Prefix
accessors in Test dialect and follow up on missed changes in
OpDefinitionsGen.
Differential Revision: https://reviews.llvm.org/D112118
Here's another performance patch for InstrRefBasedLDV: rather than
processing all variable values in a scope at a time, instead, process one
variable at a time. The benefits are twofold:
* It's easier to reason about one variable at a time in your mind,
* It improves performance, apparently from increased locality.
The downside is that the value-propagation code gets indented one level
further, plus there's some churn in the unit tests.
Differential Revision: https://reviews.llvm.org/D111799
This revision uses the newly refactored StructuredGenerator to create a simple vectorization for conv1d_nwc_wcf.
Note that the pattern is not specific to the op and is technically not even specific to the ConvolutionOpInterface (modulo minor details related to dilations and strides).
The overall design follows the same ideas as the lowering of vector::ContractionOp -> vector::OuterProduct: it seeks to be minimally complex, composable and extensible while avoiding inference analysis. Instead, we metaprogram the maps/indexings we expect and we match against them.
This is just a first stab and still needs to be evaluated for performance.
Other tradeoffs are possible that should be explored.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D111894
Previously, when the fuzzing loop replaced an input in the corpus, it didn't update the execution time of the input. Therefore, some schedulers (e.g. Entropic) would adjust weights based on the incorrect execution time.
This patch updates the execution time of the input when replacing it.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D111479
This temporary FIXME really belongs to the testing config, not to the
specific CMake cache that enables that configuration.
Differential Revision: https://reviews.llvm.org/D112031
gdbserver does not expose combined ymm* registers but rather XSAVE-style
split xmm* and ymm*h portions. Extend value_regs to support combining
multiple registers and use it to create user-friendly ymm* registers
that are combined from split xmm* and ymm*h portions.
Differential Revision: https://reviews.llvm.org/D108937
Fix incorrect values for value_regs, and incomplete values for
invalidate_regs in RegisterInfos_arm. The value_regs entry needs
to list only one base (i.e. larger) register that needs to be read
to get the value for this register, while invalidate_regs needs to list
all other registers (including pseudo-register) whose values would
change when this register is written to.
7a8ba4ffbe fixed a similar problem
for ARM64.
Differential Revision: https://reviews.llvm.org/D112066
Support arbitrarily-sized FPR writes on ARM in order to fix writing qN
registers directly. Currently, writing them works only by accident
due to value_regs splitting them into smaller writes via dN and sN
registers.
Differential Revision: https://reviews.llvm.org/D112131
When inserting a scalable subvector into a scalable vector through
the stack, the index to store to needs to be scaled by vscale.
Before this patch, that didn't yet happen, so it would generate the
wrong offset, thus storing a subvector to the incorrect address
and overwriting the wrong lanes.
For some insert:
nxv8f16 insert_subvector(nxv8f16 %vec, nxv2f16 %subvec, i64 2)
The offset was not scaled by vscale:
orr x8, x8, #0x4
st1h { z0.h }, p0, [sp]
st1h { z1.d }, p1, [x8]
ld1h { z0.h }, p0/z, [sp]
And is changed to:
mov x8, sp
st1h { z0.h }, p0, [sp]
st1h { z1.d }, p1, [x8, #1, mul vl]
ld1h { z0.h }, p0/z, [sp]
Differential Revision: https://reviews.llvm.org/D111633
This commit switches libunwind from using the complicated logic in
libc++'s testing configuration to a from-scratch configuration.
I tried to make sure that all cases that were handled in the old
config were handled by this one too, so hopefully this shouldn't
break anyone. However, if you encounter issues with this change,
please let me know and feel free to revert if I don't reply quickly.
This change was engineered to be easily revertable.
Differential Revision: https://reviews.llvm.org/D112082
When we added support for if consteval, we accidentally formed a discarded
statement evaluation context for the branch-not-taken. However, a discarded
statement is a property of an if constexpr statement, not an if consteval
statement (https://eel.is/c++draft/stmt.if#2.sentence-2). This turned out to
cause issues when deducing the return type from a function with a consteval if
statement -- we wouldn't consider the branch-not-taken when deducing the return
type.
This fixes PR52206.
Note, there is additional work left to be done. We need to track discarded
statement and immediate evaluation contexts separately rather than as being
mutually exclusive.
Add a FPU_QREG macro to define qN registers. This is a piece-wise
attempt of reconstructing D112066 with the goal of figuring out which
part of the larger change breaks the buildbot.
Differential Revision: https://reviews.llvm.org/D112066
Replace X86ProcFamilyEnum::IntelSLM enum with a TuningUseSLMArithCosts flag instead, matching what we already do for Goldmont.
This just leaves X86ProcFamilyEnum::IntelAtom to replace with general Tuning/Feature flags and we can finally get rid of the old X86ProcFamilyEnum enum.
Differential Revision: https://reviews.llvm.org/D112079