This change exposes the sqrt library function for HLSL scalar types,
excluding long and long long doubles. Sqrt is supported for all scalar, vector,
and matrix types. This patch only adds a subset of scalar type support.
Long and long long double support is missing in this patch because that type
doesn't exist in HLSL.
The full documentation of the HLSL asin function is available here:
https://docs.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-sqrt
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D132711
Clean up ahead of a patch to fix bugs in the AMDGPUDisassembler.
Use lit.local.cfg substitutions and more idiomatic use of split-file to
simplify and extend existing kernel-descriptor disassembly tests.
Add a comment to AMDHSAKernelDescriptor.h, as at least one small set
towards keeping all kernel-descriptor sensitive code in sync.
Reviewed By: kzhuravl, arsenm
Differential Revision: https://reviews.llvm.org/D130105
This can be used to disable vectorization and memcpy/memset
expansion for things like OS kernels. It also disables implicit
uses of scalar FP, but I don't know if we have any of those for
RISC-V.
NOTE: Without this patch you can still do -Xclang -no-implicit-float
Reviewed By: rui.zhang
Differential Revision: https://reviews.llvm.org/D134077
These patterns were previously only implemented for i1 type but can be extended for any integer type by also handling zext and sext operands.
Differential Revision: https://reviews.llvm.org/D134142
Specifying the python path here ensures that the python binary used matches the
one used by the main MLIR tests. This is useful when cmake's automatic detection
has to be overridden.
Reviewed By: stellaraccident, bondhugula
Differential Revision: https://reviews.llvm.org/D134251
If one of the operands in a matrix multiplication is negated we can optimise the equation by moving the negation to the smallest element of the operands or the result.
Reviewed By: spatel, fhahn
Differential Revision: https://reviews.llvm.org/D133300
Clang already generates code that doesn't use writeable data in executable
sections so the linker flag is all that is necessary.
-Wl,--no-execute-only can be used to turn this default off.
Differential Revision: https://reviews.llvm.org/D134289
This patch "fixes" a longstanding issue where the assembly format for
ArrayRefParameter could not handle an empty list. This is because there
was no way to generically optionally parse the first element of the
array. The only solution was to write a (relatively simple) custom parser.
This patch implements "empty" ArrayRefParameters by using
inverted optional groups and an optional ArrayRefParameter.
Depends on D133816
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D133819
This patch consolidates the notions of an optional parameter and a
default parameter. An optional parameter is a parameter equal to its
default value, which for a "purely optional" parameter is its "null"
value.
This allows the existing `comparator` and `defaultValue` fields to be
used enabled more complex "optional" parameters, such as empty arrays.
Depends on D133812
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D133816
This patch changes optional groups to allow anchors in the 'else'
element group. When printing, the optional condition is inverted to
decide which group to print. This is useful for parsing concrete
optional elements that don't have a `parseOptional*` method or some
other way to test whether it's present.
Depends on D133805
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D133812
Instead of its index. There is no benefit to storing the index instead
of the pointer.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D133805
Several different places in the code had similar computations for the parameters that were eventually passed to the TypeInfo constructor.
This centralizes that code in one function, and allows passing TypeInfo to the various other *Info structures that need it.
Remove some "auto" types and replace with the real type for getting declarations. This was making some duplicate checking difficult to see.
Reviewed By: paulkirth
Differential Revision: https://reviews.llvm.org/D134225
- Merge pairs like `eFormatCategoryItemSummary` and
`eFormatCategoryItemRegexSummary` into a single value. See explanation
below.
- Rename `eFormatCategoryItemValue` to `eFormatCategoryItemFormat`. This
makes the enum match the names used elsewhere for formatter kinds
(format, summary, filter, synth).
- Delete unused values `eFormatCategoryItemValidator` and
`eFormatCategoryItemRegexValidator`.
This enum is only used to reuse some code in CommandObjectType.cpp. For
example, instead of having separate implementations for `type summary
delete`, `type format delete`, and so on, there's a single generic
implementation that takes an enum value, and then the specific commands
derive from it and set the right flags for the specific kind of
formatter.
Even though the enum distinguishes between regular and regex matches for
every kind of formatter, this distinction is never used: enum values are
always specified in pairs like
`eFormatCategoryItemSummary | eFormatCategoryItemRegexSummary`.
This causes some ugly code duplication in TypeCategory.cpp. In order to
handle every flag combination some code appears 8 times:
{format, summary, synth, filter} x {exact, regex}
Differential Revision: https://reviews.llvm.org/D134244
Since there is no guaranteed correspondence of SDNode and MI operands, we need
getters simular to RISCVII::get*OpNum for SDNodes.
More uses of getVecPolicyOpIdx will be added in D130895.
Reviewed By: craig.topper, arcbbb
Differential Revision: https://reviews.llvm.org/D134179
This implements experimental support for the Zawrs extension as specified here: https://github.com/riscv/riscv-zawrs/releases/download/V1.0-rc3/Zawrs.pdf. Despite the 1.0 version name, this has not been ratified and there was a major change to proposed specification between rc2 and rc3. Once this is ratified, it'll move out of experimental status.
This change adds assembly support, but does not include C language or IR intrinsics. We can decide if we want them, and handle that in a separate patch.
Differential Revision: https://reviews.llvm.org/D133443
While testing a test failure of C++17 with Clang ToT it was noticed the
paper
P0602R4 variant and optional should propagate copy/move triviality
was not applied as a DR in libc++.
This was discovered while investigating the issue "caused by" D131479.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D133326
Update the formatter day tests to the new style.
Other test will be done separately.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D134031
Switch to the new granular format_functions header. Since the chrono's
format dependency in C++20 hasn't been in a release it's save to remove
it.
Depends on D133665
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D133796
Fix __builtin_assume_aligned incorrect type descriptor
example from @rsmith
struct A { int n; };
struct B { int n; };
struct C : A, B {};
void *f(C *c) {
// Incorrectly returns `c` rather than the address of the B base class.
return __builtin_assume_aligned((B*)c, 8);
}
Differential Revision: https://reviews.llvm.org/D133583
With the recent addition of new parameter MergeAttributes (D134117),
callers need to specify several default parameters before getting to
specify the new parameter.
This patch reorders the parameters so that callers do not have to
specify as many default parameters.
Differential Revision: https://reviews.llvm.org/D134125
This patch enables the LSLFast feature for Cortex-A76, Cortex-A77,
Cortex-A78, Cortex-A78C, Cortex-A710, Cortex-X1, Cortex-X2, Neoverse N1,
Neoverse N2, Neoverse V1 and the Neoverse 512TB pseudo-cpu, in-line with
the software optimization guides for those CPUs.
Differntial revision: https://reviews.llvm.org/D134273
Having the flags only pass through if you're using the dxc-driver means
that the clang driver doesn't work for HLSL, which is undesirable. This
change switches to instead passing flags based on the language mode
similar to how OpenCL does it. This allows the clang driver to be used
for HLSL source files as well.
Reviewed By: python3kgae
Differential Revision: https://reviews.llvm.org/D133958
I'm not sure why the SEXT_INREG was gated on a bitwidth check of the mask
vs element size.
This fixes a miscompile in chromium's skia library.
Differential Revision: https://reviews.llvm.org/D134236
Introduces a simple framework for runtime tests of the wide integer emulation.
In these tests, we are only interested in checking that both wide and narrow calculation
produce the same results, and do not check for exact results. This allows us to cover
more of the input space, as we do not have to hardcode each of the expected outputs.
Introduce common helper functions to check the results, print a message on mismatch,
and sample the input space.
Implement runtime comparrison tests for `arith.muli` and `arith.shrui`.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D134184
The new test cases focus on known edge cases in the current implementation.
Specifically, we check for low (0, 1), mid (7, 8, 9) and high (15) shift amounts with i16 operands.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D134182
The new test pass allows for running wide integer emulation conversion
within specified functions only.
I intend to use it in integration tests in a way that allows me print both
original and emulated results in the same format, or even compare both results
at runtime and print on mismatch only.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D134120
This is a breaking change. If you were passing one of those three runtimes
in LLVM_ENABLE_PROJECTS, you need to start passing them in LLVM_ENABLE_RUNTIMES
instead. The runtimes in LLVM_ENABLE_RUNTIMES will start being built using
the "bootstrapping build" instead, which means that they will be built
using the just-built Clang. This is usually what you wanted anyway.
If you were using LLVM_ENABLE_PROJECTS=all with the explicit goal of
building these three runtimes, you can now use LLVM_ENABLE_RUNTIMES=all
and these runtimes will be built using the bootstrapping build.
Differential Revision: https://reviews.llvm.org/D132480
Providing access to the mapping of annotations allows test helpers to
be expressive by using the annotations as expectations. For example, a
matcher could verify that all annotated points were matched by a
matcher, or that an refactoring surgically modifies specific ranges.
Differential Revision: https://reviews.llvm.org/D134072
Due to the encoding changes in GFX11, we had a hack in place that
disables the use of VGPRs above 128. This patch removes the need for
that hack.
We introduce a new register class VGPR_32_Lo128 which is used for 16-bit
operands of VOP1, VOP2, and VOPC instructions. This register class only has the
low 128 VGPRs, but is otherwise identical to VGPR_32. Therefore, 16-bit VOP1,
VOP2, and VOPC instructions are correctly limited to use the first 128
VGPRs, while the other instructions can freely use all 256.
We introduce new pseduo-instructions used on GFX11 which have the suffix
t16 (True 16) to use the VGPR_32_Lo128 register class.
Reviewed By: foad, rampitec, #amdgpu
Differential Revision: https://reviews.llvm.org/D133723