clang with -flto does not handle -foptimization-record-path=<path>
This dulicates the code from ToolChains/Clang.cpp with modifications to
support everything in the same fashion.
Adds a new data structure, ImmutableGraph, and uses RDF to find LVI gadgets and add them to a MachineGadgetGraph.
More specifically, a new X86 machine pass finds Load Value Injection (LVI) gadgets consisting of a load from memory (i.e., SOURCE), and any operation that may transmit the value loaded from memory over a covert channel, or use the value loaded from memory to determine a branch/call target (i.e., SINK).
Also adds a new target feature to X86: +lvi-load-hardening
The feature can be added via the clang CLI using -mlvi-hardening.
Differential Revision: https://reviews.llvm.org/D75936
This patch adds a test for the PowerPC fma compiler builtins, some variations
of which negate inputs and outputs. The code to generate IR for these
builtins was untested before this patch.
Originally, the code used the outdated method of subtracting floating point
values from -0.0 as floating point negation. This patch remedies that.
Patch by: Drew Wock <drew.wock@sas.com>
Differential Revision: https://reviews.llvm.org/D76949
Summary:
- `RegisterVar` has `void` return type and `size_t` in its variable size
parameter in HIP or CUDA 9.0+.
Reviewers: tra, yaxunl
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77398
Summary:
This matches llvm::VectorType.
It moves the size from the type bitfield into VectorType, increasing size by 8
bytes (including padding of 4). This is OK as we don't expect to create terribly
many of these types.
c.f. D77313 which enables large power-of-two sizes without growing VectorType.
Reviewers: efriedma, hokein
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77335
This pass replaces each indirect call/jump with a direct call to a thunk that looks like:
lfence
jmpq *%r11
This ensures that if the value in register %r11 was loaded from memory, then
the value in %r11 is (architecturally) correct prior to the jump.
Also adds a new target feature to X86: +lvi-cfi
("cfi" meaning control-flow integrity)
The feature can be added via clang CLI using -mlvi-cfi.
This is an alternate implementation to https://reviews.llvm.org/D75934 That merges the thunk insertion functionality with the existing X86 retpoline code.
Differential Revision: https://reviews.llvm.org/D76812
The problem on Windows was that the \b in "..\bin" was interpreted
as an escape sequence. Use r"" strings to prevent that.
This reverts commit ab11b9eefa,
with raw strings in the lit.site.cfg.py.in files.
Differential Revision: https://reviews.llvm.org/D77184
Currently, all generated lit.site.cfg files contain absolute paths.
This makes it impossible to build on one machine, and then transfer the
build output to another machine for test execution. Being able to do
this is useful for several use cases:
1. When running tests on an ARM machine, it would be possible to build
on a fast x86 machine and then copy build artifacts over after building.
2. It allows running several test suites (clang, llvm, lld) on 3
different machines, reducing test time from sum(each test suite time) to
max(each test suite time).
This patch makes it possible to pass a list of variables that should be
relative in the generated lit.site.cfg.py file to
configure_lit_site_cfg(). The lit.site.cfg.py.in file needs to call
`path()` on these variables, so that the paths are converted to absolute
form at lit start time.
The testers would have to have an LLVM checkout at the same revision,
and the build dir would have to be at the same relative path as on the
builder.
This does not yet cover how to figure out which files to copy from the
builder machine to the tester machines. (One idea is to look at the
`--graphviz=test.dot` output and copy all inputs of the `check-llvm`
target.)
Differential Revision: https://reviews.llvm.org/D77184
Summary:
Previously we induced a state split if there were multiple argument
constraints given for a function. This was because we called
`addTransition` inside the for loop.
The fix is to is to store the state and apply the next argument
constraint on that. And once the loop is finished we call `addTransition`.
Reviewers: NoQ, Szelethus, baloghadamsoftware
Subscribers: whisperity, xazax.hun, szepet, rnkovacs, a.sidorin, mikhail.ramalho, donat.nagy, dkrupp, gamesh411, C
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76790
Add the sub-group builtin functions from the OpenCL Extension
specification. This patch excludes the sub_group_barrier builtins
that take argument types not yet handled by the
`-fdeclare-opencl-builtins` machinery.
Co-authored-by: Pierre Gondois <pierre.gondois@arm.com>
Summary:
The construction of constants for structs/unions was conflicting the
expected memory layout for over-sized bit-fields. When building the
necessary bits for those fields, clang was ignoring the size information
computed for the struct/union memory layout and using the original data
from the AST's FieldDecl information. This caused an issue in big-endian
targets, where the field's contant was incorrectly misplaced due to
endian calculations.
This patch aims to separate the constant value from the necessary
padding bits, using the proper size information for each one of them.
With this, the layout of constants for over-sized bit-fields matches the
ABI requirements.
Reviewers: rsmith, eli.friedman, efriedma
Reviewed By: efriedma
Subscribers: efriedma, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77048
Summary:
As defined by Arm C Language Extensions (ACLE) these macro defines
should be set to specific values depending on -mbranch-protection.
Reviewers: chill
Reviewed By: chill
Subscribers: danielkiss, cfe-commits, kristof.beyls
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77134
Currently deferred diagnostic emitter checks variable decl in DeclRefExpr, which
causes infinite recursion for cases like long a = (long)&a;.
Deferred diagnostic emitter does not need check variable decls in DeclRefExpr
since reference of a variable does not cause emission of functions directly or
indirectly. Therefore there is no need to check variable decls in DeclRefExpr.
Differential Revision: https://reviews.llvm.org/D76937
Caused an assertion due to mismatched bit widths - this seems like the
right API to use for a possibly width-varying equality test. Though
certainly open to some post-commit review feedback if there's a more
suitable way to do this comparison/test.
This wasn't respecting the flush mode based on the default, and also
wasn't correctly handling the explicit
-fno-cuda-flush-denormals-to-zero overriding the mode.
Prior to this change the clang interface stubs format resembled
something ending with a symbol list like this:
Symbols:
a: { Type: Func }
This was problematic because we didn't actually want a map format and
also because we didn't like that an empty symbol list required
"Symbols: {}". That is to say without the empty {} llvm-ifs would crash
on an empty list.
With this new format it is much more clear which field is the symbol
name, and instead the [] that is used to express an empty symbol vector
is optional, ie:
Symbols:
- { Name: a, Type: Func }
or
Symbols: []
or
Symbols:
This further diverges the format from existing llvm-elftapi. This is a
good thing because although the format originally came from the same
place, they are not the same in any way.
Differential Revision: https://reviews.llvm.org/D76979
The driver enables -fdiagnostics-show-option by default, so flip the CC1
default to reduce the lengths of common CC1 command lines.
This change also makes ParseDiagnosticArgs() consistently enable
-fdiagnostics-show-option by default.
Summary: This allows instrumenting things like global initializers
Reviewers: dberris, MaskRay, smeenai
Subscribers: cfe-commits, johnislarry
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77191
adb290d974 added a new case to
err_atomic_specifier_bad_type. The diagnostic has two %select's
controlled by the same argument, but only the first was updated to have
the new case. Add the extra case for the second %select and add a
test case that exercises the last case.
Apparently HIPToolChain does not subclass from AMDGPUToolChain, so
this was not applying the new denormal attributes. I'm not sure why
this doesn't subclass. Just copy the implementation for now.
Need to allow arrayshaping expression in a list of expressions, so use
ParseAssignmentExpression() when try to parse the base of the shaping
operation.
Previously we would emit a PCH with errors, but fail the overall
compilation. If run using the driver, that would result in removing the
just-produced PCH. Instead, we should have the compilation result match
whether we were able to emit the PCH.
Differential Revision: https://reviews.llvm.org/D77159
rdar://61110294
This reverts commit 28518d9ae3.
There is a failure in MsgPackReader.cpp when built with clang. It
complains about "signext and zeroext" are incompatible. Investigating
offline if it is infact a UB in the MsgPackReader code.
Since GlobalISel is maturing and is already on at -O0 for AArch64, it's not
completely "experimental". Create a more appropriate driver flag and make
the older option an alias for it.
Differential Revision: https://reviews.llvm.org/D77103
Consider a callee function that has a call (C) within it which feeds
into the return. When we inline that callee into a callsite that has
return attributes, we can backward propagate those attributes to the
call (C) within that inlined callee body.
This is safe to do so only if we can guarantee transfer of execution to
successor in the window of instructions between return value (i.e. the
call C) and the return instruction.
See added test cases.
Reviewed-By: reames, jdoerfert
Differential Revision: https://reviews.llvm.org/D76140
Summary:
The basic idea is to walk through the concept definition, looking for
t.foo() where t has the constrained type.
In this patch:
- nested types are recognized and offered after ::
- variable/function members are recognized and offered after the correct
dot/arrow/colon trigger
- member functions are recognized (anything directly called). parameter
types are presumed to be the argument types. parameters are unnamed.
- result types are available when a requirement has a type constraint.
These are printed as constraints, except same_as<T> which prints as T.
Not in this patch:
- support for merging/overloading when two locations describe the same member.
The last one wins, for any given name. This is probably important...
- support for nested template members (T::x<int>)
- support for completing members of (instantiations of) template template parameters
Reviewers: nridge, saar.raz
Subscribers: mgrang, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73649
Summary:
This check was causing a crash in a test case where the 0th argument was
uninitialized ('Assertion `T::isKind(*this)' at line SVals.h:104). This
was happening since the argument was actually undefined, but the castAs
assumes the value is DefinedOrUnknownSVal.
The fix appears to be simply to check for an undefined value and skip
the check allowing the uninitalized value checker to detect the error.
I included a test case that I verified to produce the negative case
prior to the fix, and passes with the fix.
Reviewers: martong, NoQ
Subscribers: xazax.hun, szepet, rnkovacs, a.sidorin, mikhail.ramalho, Szelethus, donat.nagy, Charusso, ASDenysPetrov, baloghadamsoftware, dkrupp, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77012
-Wthread-safety was failing to detect certain AST patterns it should
detect. Make the pattern detection a bit more comprehensive.
Due to an unrelated bug involving template instantiation, this showed up
as a regression in 10.0 vs. 9.0 in the original bug report. The included
testcase fails on older versions of clang, though.
Fixes https://bugs.llvm.org/show_bug.cgi?id=45323 .
Differential Revision: https://reviews.llvm.org/D76943
clauses.
Implemented codegen for array shaping operation in depend clauses. The
begin of the expression is the pointer itself, while the size of the
dependence data is the mukltiplacation of all dimensions in the array
shaping expression.
Summary:
Added basic representation and parsing/sema handling of array-shaping
operations. Array shaping expression is an expression of form ([s0]..[sn])base,
where s0, ..., sn must be a positive integer, base - a pointer. This
expression is a kind of cast operation that converts pointer expression
into an array-like kind of expression.
Reviewers: rjmccall, rsmith, jdoerfert
Subscribers: guansong, arphaman, cfe-commits, caomhin, kkwli0
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74144
Summary: We mark these decls as invalid.
Reviewers: sammccall
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77037
Summary:
If the bitwith expr contains errors, we mark the field decl invalid.
This patch also tweaks the behavior of ObjCInterfaceDecl to be consistent with
existing RecordDecl -- getObjCLayout method is only called with valid decls.
Reviewers: sammccall
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76953
Summary:
The kernel kmalloc function may return a constant value ZERO_SIZE_PTR
if a zero-sized block is allocated. This special value is allowed to
be passed to kfree and should produce no warning.
This is a simple version but should be no problem. The macro is always
detected independent of if this is a kernel source code or any other
code.
Reviewers: Szelethus, martong
Reviewed By: Szelethus, martong
Subscribers: rnkovacs, xazax.hun, baloghadamsoftware, szepet, a.sidorin, mikhail.ramalho, Szelethus, donat.nagy, dkrupp, gamesh411, Charusso, martong, ASDenysPetrov, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76830
Iterator checkers (and planned container checkers) need the option
aggressive-binary-operation-simplification to be enabled. Without this
option they may cause assertions. To prevent such misuse, this patch adds
a preventive check which issues a warning and denies the registartion of
the checker if this option is disabled.
Differential Revision: https://reviews.llvm.org/D75171
The __builtin_stdarg_start is the legacy spelling of __builtin_va_start.
It should behave exactly the same, but for the last 9 years it would
behave subtly different for diagnostics. Follow the change from
29ad95b232 to require custom type checking.
Currently, bpf does not specify 128bit alignment in its
layout spec. So for a structure like
struct ipv6_key_t {
unsigned pid;
unsigned __int128 saddr;
unsigned short lport;
};
clang will generate IR type
%struct.ipv6_key_t = type { i32, [12 x i8], i128, i16, [14 x i8] }
Additional padding is to ensure later IR->MIR can generate correct
stack layout with target layout spec.
But it is common practice for a tracing program to be
first compiled with target flag (e.g., x86_64 or aarch64) through
clang to generate IR and then go through llc to generate bpf
byte code. Tracing program often refers to kernel internal
data structures which needs to be compiled with non-bpf target.
But such a compilation model may cause a problem on aarch64.
The bcc issue https://github.com/iovisor/bcc/issues/2827
reported such a problem.
For the above structure, since aarch64 has "i128:128" in its
layout string, the generated IR will have
%struct.ipv6_key_t = type { i32, i128, i16 }
Since bpf does not have "i128:128" in its spec string,
the selectionDAG assumes alignment 8 for i128 and
computes the stack storage size for the above is 32 bytes,
which leads incorrect code later.
The x86_64 does not have this issue as it does not have
"i128:128" in its layout spec as it does permits i128 to
be alignmented at 8 bytes at stack. Its IR type looks like
%struct.ipv6_key_t = type { i32, [12 x i8], i128, i16, [14 x i8] }
The fix here is add i128 support in layout spec, the same as
aarch64. The only downside is we may have less optimal stack
allocation in certain cases since we require 16byte alignment
for i128 instead of 8. But this is probably fine as i128 is
not used widely and in most cases users should already
have proper alignment.
Differential Revision: https://reviews.llvm.org/D76587
The main purpose of introducing these builtins is to add a range
metadata [1, 1025) on the work group size loaded from dispatch
ptr, which cannot be done by source code.
Differential Revision: https://reviews.llvm.org/D76772
scope.
There are a few contexts in which we assume a name is a template name;
if such a context is one where we should perform an unqualified lookup,
and lookup finds nothing, we would form a dependent template name even
if the name is not dependent. This happens in particular for the lookup
of a pseudo-destructor.
In passing, rename ActOnDependentTemplateName to just ActOnTemplateName
given that we apply it for non-dependent template names too.
Instead of bailing out of parsing when we encounter an invalid
template-name or template arguments in a template-id, produce an
annotation token describing the invalid construct.
This avoids duplicate errors and generally allows us to recover better.
In principle we should be able to extend this to store some kinds of
invalid template-id in the AST for tooling use, but that isn't handled
as part of this change.
Currently -fno-unroll-loops is ignored when doing LTO on Darwin. This
patch adds a new -lto-no-unroll-loops option to the LTO code generator
and forwards it to the linker if -fno-unroll-loops is passed.
Reviewers: thegameg, steven_wu
Reviewed By: thegameg
Differential Revision: https://reviews.llvm.org/D76916
in the debug info with -fdebug-prefix-map.
rdar://problem/55685132
This reapplies an earlier attempt to commit this without
modifications.
Differential Revision: https://reviews.llvm.org/D76385
Built-in SVE types are trivial, since they're trivially copyable
and support default construction.
Differential Revision: https://reviews.llvm.org/D76692
SVE types are trivially copyable: they can be copied simply
by reproducing the byte representation of the source object.
Differential Revision: https://reviews.llvm.org/D76691
Summary:
Unsigned types can alias the corresponding signed types. I don't see
that this is explicitly mentioned in the Embedded-C specification, but
I think it should work the same as for the integer types.
Patch by: materi
Reviewers: ebevhan, leonardchan
Reviewed By: leonardchan
Subscribers: kosarev, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76856
This is the second part loosely extracted from D71179 and cleaned up.
This patch provides semantic analysis support for `omp begin/end declare
variant`, mostly as defined in OpenMP technical report 8 (TR8) [0].
The sema handling makes code generation obsolete as we generate "the
right" calls that can just be handled as usual. This handling also
applies to the existing, albeit problematic, `omp declare variant
support`. As a consequence a lot of unneeded code generation and
complexity is removed.
A major purpose of this patch is to provide proper `math.h`/`cmath`
support for OpenMP target offloading. See PR42061, PR42798, PR42799. The
current code was developed with this feature in mind, see [1].
The logic is as follows:
If we have seen a `#pragma omp begin declare variant match(<SELECTOR>)`
but not the corresponding `end declare variant`, and we find a function
definition we will:
1) Create a function declaration for the definition we were about to generate.
2) Create a function definition but with a mangled name (according to
`<SELECTOR>`).
3) Annotate the declaration with the `OMPDeclareVariantAttr`, the same
one used already for `omp declare variant`, using and the mangled
function definition as specialization for the context defined by
`<SELECTOR>`.
When a call is created we inspect it. If the target has an
`OMPDeclareVariantAttr` attribute we try to specialize the call. To this
end, all variants are checked, the best applicable one is picked and a
new call to the specialization is created. The new call is used instead
of the original one to the base function. To keep the AST printing and
tooling possible we utilize the PseudoObjectExpr. The original call is
the syntactic expression, the specialized call is the semantic
expression.
[0] https://www.openmp.org/wp-content/uploads/openmp-TR8.pdf
[1] https://reviews.llvm.org/D61399#change-496lQkg0mhRN
Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim, aaron.ballman
Subscribers: bollu, guansong, openmp-commits, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75779
This is the first part extracted from D71179 and cleaned up.
This patch provides parsing support for `omp begin/end declare variant`,
as defined in OpenMP technical report 8 (TR8) [0].
A major purpose of this patch is to provide proper math.h/cmath support
for OpenMP target offloading. See PR42061, PR42798, PR42799. The current
code was developed with this feature in mind, see [1].
[0] https://www.openmp.org/wp-content/uploads/openmp-TR8.pdf
[1] https://reviews.llvm.org/D61399#change-496lQkg0mhRN
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D74941
Summary:
- Even though the bindless surface/texture interfaces are promoted,
there are still code using surface/texture references. For example,
[PR#26400](https://bugs.llvm.org/show_bug.cgi?id=26400) reports the
compilation issue for code using `tex2D` with texture references. For
better compatibility, this patch proposes the support of
surface/texture references.
- Due to the absent documentation and magic headers, it's believed that
`nvcc` does use builtins for texture support. From the limited NVVM
documentation[^nvvm] and NVPTX backend texture/surface related
tests[^test], it's believed that surface/texture references are
supported by replacing their reference types, which are annotated with
`device_builtin_surface_type`/`device_builtin_texture_type`, with the
corresponding handle-like object types, `cudaSurfaceObject_t` or
`cudaTextureObject_t`, in the device-side compilation. On the host
side, that global handle variables are registered and will be
established and updated later when corresponding binding/unbinding
APIs are called[^bind]. Surface/texture references are most like
device global variables but represented in different types on the host
and device sides.
- In this patch, the following changes are proposed to support that
behavior:
+ Refine `device_builtin_surface_type` and
`device_builtin_texture_type` attributes to be applied on `Type`
decl only to check whether a variable is of the surface/texture
reference type.
+ Add hooks in code generation to replace that reference types with
the correponding object types as well as all accesses to them. In
particular, `nvvm.texsurf.handle.internal` should be used to load
object handles from global reference variables[^texsurf] as well as
metadata annotations.
+ Generate host-side registration with proper template argument
parsing.
---
[^nvvm]: https://docs.nvidia.com/cuda/pdf/NVVM_IR_Specification.pdf
[^test]: https://raw.githubusercontent.com/llvm/llvm-project/master/llvm/test/CodeGen/NVPTX/tex-read-cuda.ll
[^bind]: See section 3.2.11.1.2 ``Texture reference API` in [CUDA C Programming Guide](https://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf).
[^texsurf]: According to NVVM IR, `nvvm.texsurf.handle` should be used. But, the current backend doesn't have that supported. We may revise that later.
Reviewers: tra, rjmccall, yaxunl, a.sidorin
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76365
There was a misisng space between the -march and --cuda-gpu-arch
arguments, so --cuda-gpu-arch wasn't actually being parsed. I'm not
sure what the intent of the sm_10 run lines were, but they error as an
unsupported architecture. Switch these to something else.
This reverts commit 0788acbccb.
This reverts commit c2d7a1f79cedfc9fcb518596aa839da4de0adb69: Revert "[clangd] Add test for FindTarget+RecoveryExpr (which already works). NFC"
It causes a crash on invalid code:
class X {
decltype(unresolved()) foo;
};
constexpr int s = sizeof(X);
Summary:
This patch introduces command-line support for the Armv8.6-a architecture and assembly support for BFloat16. Details can be found
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
in addition to the GCC patch for the 8..6-a CLI:
https://gcc.gnu.org/legacy-ml/gcc-patches/2019-11/msg02647.html
In detail this patch
- march options for armv8.6-a
- BFloat16 assembly
This is part of a patch series, starting with command-line and Bfloat16
assembly support. The subsequent patches will upstream intrinsics
support for BFloat16, followed by Matrix Multiplication and the
remaining Virtualization features of the armv8.6-a architecture.
Based on work by:
- labrinea
- MarkMurrayARM
- Luke Cheeseman
- Javed Asbar
- Mikhail Maltsev
- Luke Geeson
Reviewers: SjoerdMeijer, craig.topper, rjmccall, jfb, LukeGeeson
Reviewed By: SjoerdMeijer
Subscribers: stuij, kristof.beyls, hiraditya, dexonsmith, danielkiss, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D76062
If an error happens which is related to a container the Container
Modeling checker adds note tags to all the container operations along
the bug path. This may be disturbing if there are other containers
beside the one which is affected by the bug. This patch restricts the
note tags to only the affected container and adjust the debug checkers
to be able to test this change.
Differential Revision: https://reviews.llvm.org/D75514
Summary: Previously, we dropped the AST node for nonexistent member exprs.
Reviewers: sammccall
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76764
Container operations such as `push_back()`, `pop_front()`
etc. increment and decrement the abstract begin and end
symbols of containers. This patch introduces note tags
to `ContainerModeling` to track these changes. This helps
the user to better identify the source of errors related
to containers and iterators.
Differential Revision: https://reviews.llvm.org/D73720
As reported in PR45298 and PR45299, vector_size type checking would
crash when done in a situation where the scalar is dependent, such as
a member of the current instantiation.
This is because the scalar checking ensures that you can implicitly
convert a value to a vector-type as long as it doesn't require
truncation. It does this by using the constant evaluator to get the
value as a float. Unfortunately, if the scalar is dependent (such as a
member of the current instantiation), we would hit the assert in the
evaluator.
This patch suppresses the truncation- of-value check in the first phase
of translation. All values are properly errored upon instantiation. This
has one minor regression, in that previously in a non-asserts build,
template<typename T>
struct S {
float4 f(float4 f) {
return k + f;
}
static constexpr k = 1.1; // causes a truncation on conversion.
};
would error immediately. Because 'k' is value dependent (as a
member-of-the-current-instantiation), this would still be evaluatable
(despite normally asserting). Due to this patch, this diagnostic is
delayed until instantiation time.
Summary:
This patch implements the following CDE intrinsics:
T __arm_vcx1q_m(int coproc, T inactive, uint32_t imm, mve_pred_t p);
T __arm_vcx2q_m(int coproc, T inactive, U n, uint32_t imm, mve_pred_t p);
T __arm_vcx3q_m(int coproc, T inactive, U n, V m, uint32_t imm, mve_pred_t p);
T __arm_vcx1qa_m(int coproc, T acc, uint32_t imm, mve_pred_t p);
T __arm_vcx2qa_m(int coproc, T acc, U n, uint32_t imm, mve_pred_t p);
T __arm_vcx3qa_m(int coproc, T acc, U n, V m, uint32_t imm, mve_pred_t p);
The intrinsics are not part of the released ACLE spec, but internally at
Arm we have reached consensus to add them to the next ACLE release.
Reviewers: simon_tatham, MarkMurrayARM, ostannard, dmgreen
Reviewed By: simon_tatham
Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76610
SPIRV2.0 Spec only specifies Linux mangling, however our downstream has
use for a Windows mangling for these types.
Unfortunately, the SPIRV
spec specifies a single mangling for all pipe types, despite clang
allowing overloading on these types. Because of this, this patch
chooses to mangle the read/writability and element type for the windows
mangling.
The windows manglings in the test all demangle according to demangler:
"void __cdecl test1(struct __clang::ocl_pipe<int,1>)
"void __cdecl test2(struct __clang::ocl_pipe<float,0>)
"void __cdecl test2(struct __clang::ocl_pipe<int,1>)
"void __cdecl test3(struct __clang::ocl_pipe<int const,1>)
"void __cdecl test4(struct __clang::ocl_pipe<union
__clang::__vector<unsigned char,3>,1>)
"void __cdecl test5(struct __clang::ocl_pipe<union
__clang::__vector<int,4>,1>)
"void __cdecl test_reserved_read_pipe(struct __clang::_ASCLglobal<struct
Person > * __ptr64,struct __clang::ocl_pipe<struct Person,1>)
Differential Revision: https://reviews.llvm.org/D75685
In order to support non-user-named kernels, SYCL needs some way in the
integration headers to name the kernel object themselves. Initially, the
design considered just RTTI naming of the lambdas, this results in a
quite unstable situation in light of some device/host macros.
Additionally, this ends up needing to use RTTI, which is a burden on the
implementation and typically unsupported.
Instead, we've introduced a builtin, __builtin_unique_stable_name, which
takes a type or expression, and results in a constexpr constant
character array that uniquely represents the type (or type of the
expression) being passed to it.
The implementation accomplishes that simply by using a slightly modified
version of the Itanium Mangling. The one exception is when mangling
lambdas, instead of appending the index of the lambda in the function,
it appends the macro-expansion back-trace of the lambda itself in the
form LINE->COL[~LINE->COL...].
Differential Revision: https://reviews.llvm.org/D76620
Casts from an SVE type to itself aren't very useful, but they are
supposed to be valid, and could occur in things like macro expansions.
Such casts already work for C++ and are tested by sizeless-1.cpp.
This patch makes them work for C too.
Differential Revision: https://reviews.llvm.org/D76694
When compiling C, a ?: between two values of the same SVE type
currently gives an error such as:
incompatible operand types ('svint8_t' (aka '__SVInt8_t') and 'svint8_t')
It's supposed to be valid to select between (cv-qualified versions of)
the same SVE type, so this patch adds that case.
These expressions already work for C++ and are tested by
SemaCXX/sizeless-1.cpp.
Differential Revision: https://reviews.llvm.org/D76693
Summary:
These were accidentally left out of D76123. I added tests for the
other three instructions in this small cross-product family (vqdmlah,
vqrdmlah, vqrdmlash) but missed this one.
Reviewers: miyuki
Reviewed By: miyuki
Subscribers: kristof.beyls, dmgreen, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76714
The code was pretending to be doing something useful with vectors, but
really it was doing nothing: the element type of a vector is always a
scalar type, so constWithPadding would always just return the input constant.
Split off from D75661 so it can be reviewed separately.
While I'm here, also add testcase to show missing vector handling.
Differential Revision: https://reviews.llvm.org/D76528
Summary:
After we parse the switch condition, we don't do the type check for
type-dependent expr (e.g. TypoExpr) (in Sema::CheckSwitchCondition), then the
TypoExpr is corrected to an invalid-type expr (in Sema::MakeFullExpr) and passed
to the ActOnStartOfSwitchStmt, which triggers the assertion.
Fix https://github.com/clangd/clangd/issues/311
Reviewers: sammccall
Subscribers: ilya-biryukov, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76592
The argument after -Xarch_device will be added to the arguments for CUDA/HIP
device compilation and will be removed for host compilation.
The argument after -Xarch_host will be added to the arguments for CUDA/HIP
host compilation and will be removed for device compilation.
Differential Revision: https://reviews.llvm.org/D76520
Normally clang avoids creating expressions when it encounters semantic
errors, even if the parser knows which expression to produce.
This works well for the compiler. However, this is not ideal for
source-level tools that have to deal with broken code, e.g. clangd is
not able to provide navigation features even for names that compiler
knows how to resolve.
The new RecoveryExpr aims to capture the minimal set of information
useful for the tools that need to deal with incorrect code:
source range of the expression being dropped,
subexpressions of the expression.
We aim to make constructing RecoveryExprs as simple as possible to
ensure writing code to avoid dropping expressions is easy.
Producing RecoveryExprs can result in new code paths being taken in the
frontend. In particular, clang can produce some new diagnostics now and
we aim to suppress bogus ones based on Expr::containsErrors.
We deliberately produce RecoveryExprs only in the parser for now to
minimize the code affected by this patch. Producing RecoveryExprs in
Sema potentially allows to preserve more information (e.g. type of an
expression), but also results in more code being affected. E.g.
SFINAE checks will have to take presence of RecoveryExprs into account.
Initial implementation only works in C++ mode, as it relies on compiler
postponing diagnostics on dependent expressions. C and ObjC often do not
do this, so they require more work to make sure we do not produce too
many bogus diagnostics on the new expressions.
See documentation of RecoveryExpr for more details.
original patch from Ilya
This change is based on https://reviews.llvm.org/D61722
Reviewers: sammccall, rsmith
Reviewed By: sammccall, rsmith
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69330
Some methods are sometimes declared in the @implementation blocks which
can cause undiagnosed clashes.
Just write a checkObjCDirectMethodClashes() for this purpose.
Also make sure that "unavailable" selectors do not inherit
objc_direct_members.
Differential Revision: https://reviews.llvm.org/D76643
Signed-off-by: Pierre Habouzit <phabouzit@apple.com>
Radar-ID: rdar://problem/59332804, rdar://problem/59782963
On Windows, when the LLVM toolchain is in the current path (%PATH%), fusing the linker yields c:\{path}\lld.exe whereas the hip tests did not expect the .exe part.
This only happens if LLD is not present in the build folder, which consequently makes the code in Driver::GetProgramPath() to fall back to %PATH% and platform-specific search, which includes the .exe part (see https://github.com/llvm/llvm-project/blob/master/clang/lib/Driver/Driver.cpp#L4733).
Differential Revision: https://reviews.llvm.org/D76631
Upon calling one of the functions `std::advance()`, `std::prev()` and
`std::next()` iterators could get out of their valid range which leads
to undefined behavior. If all these funcions are inlined together with
the functions they call internally (e.g. `__advance()` called by
`std::advance()` in some implementations) the error is detected by
`IteratorRangeChecker` but the bug location is inside the STL
implementation. Even worse, if the budget runs out and one of the calls
is not inlined the bug remains undetected. This patch fixes this
behavior: all the bugs are detected at the point of the STL function
invocation.
Differential Revision: https://reviews.llvm.org/D76379
Whenever the analyzer budget runs out just at the point where
`std::advance()`, `std::prev()` or `std::next()` is invoked the function
are not inlined. This results in strange behavior such as
`std::prev(v.end())` equals `v.end()`. To prevent this model these
functions if they were not inlined. It may also happend that although
`std::advance()` is inlined but a function it calls inside (e.g.
`__advance()` in some implementations) is not. This case is also handled
in this patch.
Differential Revision: https://reviews.llvm.org/D76361
There's inconsistency in handling array types between the
`distributeFunctionTypeAttrXXX` functions and the
`FunctionTypeUnwrapper` in `SemaType.cpp`.
This patch lets `FunctionTypeUnwrapper` apply function type attributes
through array types.
Differential Revision: https://reviews.llvm.org/D75109
Summary: If the size parameter of `__builtin_memcpy_inline` comes from an un-instantiated template parameter current code would crash.
Reviewers: efriedma, courbet
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76504
Summary:
Since the conditional operator cannot be used with vector conditions
in C, we need a builtin to be able to express this operation in C
source.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, sunfish, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76538
The .i files in the clang tests (2 files) were not run by lit :
clang/test/CodeGen/debug-info-preprocessed-file.i
clang/test/FrontEnd/processed-input.i
Differential Revision: https://reviews.llvm.org/D75853
accept as an extension.
This attempts to accept the same cases a GCC, plus cases where a
comparison is rewritten to an operator== with an integral but non-bool
return type; this is sufficient to avoid most problems with various
major open-source projects (such as ICU) and appears to fix all but one
of the comparison-related C++20 build breaks in LLVM.
This approach is being pursued for standardization.
Before this patch a Clang module skeleton CU would have a
DW_AT_comp_dir pointing to the directory of the module map file, and
this information was not used by anyone. Even worse, LLDB actually
resolves relative DWO paths by appending it to DW_AT_comp_dir. This
patch sets it to the same directory that is used as the main CU's
compilation directory, which would make the LLDB code work.
Differential Revision: https://reviews.llvm.org/D76377
region.
According to OpenMP 5.0, exactly one scan directive must appear in the loop body of an enclosing worksharing-loop, worksharing-loop SIMD, or simd construct on which a reduction clause with the inscan modifier is present.
FinishThunk, and the invariant of setting and then unsetting
CurCodeDecl, was added in 7f416cc426 (2015). The invariant didn't
exist when I added this musttail codepath in ab2090d107 (2014).
Recently in 28328c3771, I started using this codepath on non-Windows
platforms, and users reported problems during release testing (PR44987).
The issue was already present for users of EH on i686-windows-msvc, so I
added a test for that case as well.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D76444
Discovered by a downstream user, we found that the CallGraph ignores
callees unless they are defined. This seems foolish, and prevents
combining the report with other reports to create unified reports.
Additionally, declarations contain information that is likely useful to
consumers of the CallGraph.
This patch implements this by splitting the includeInGraph function into
two versions, the current one plus one that is for callees only. The
only difference currently is that includeInGraph checks for a body, then
calls includeCalleeInGraph.
Differential Revision: https://reviews.llvm.org/D76435
Summary:
I've implemented them as target-specific IR intrinsics rather than
using `@llvm.experimental.vector.reduce.add`, on the grounds that the
'experimental' intrinsic doesn't currently have much code generation
benefit, and my replacements encapsulate the sign- or zero-extension
so that you don't expose the illegal MVE vector type (`<4 x i64>`) in
IR.
The machine instructions come in two versions: with and without an
input accumulator. My new IR intrinsics, like the 'experimental' one,
don't take an accumulator parameter: we represent that by just adding
on the input value using an ordinary i32 or i64 add. So if you write
the `vaddvaq` C-language intrinsic with an input accumulator of zero,
it can be optimised to VADDV, and conversely, if you write something
like `x += vaddvq(y)` then that can be combined into VADDVA.
Most of this is achieved in isel lowering, by converting these IR
intrinsics into the existing `ARMISD::VADDV` family of custom SDNode
types. For the difficult case (64-bit accumulators), isel lowering
already implements the optimization of folding an addition into a
VADDLV to make a VADDLVA; so once we've made a VADDLV, our job is
already done, except that I had to introduce a parallel set of ARMISD
nodes for the //predicated// forms of VADDLV.
For the simpler VADDV, we handle the predicated form by just leaving
the IR intrinsic alone and matching it in an ordinary dag pattern.
Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76491
Summary:
I've implemented these as target-specific IR intrinsics, because
they're not //quite// enough like @llvm.experimental.vector.reduce.min
(which doesn't take the extra scalar parameter). Also this keeps the
predicated and unpredicated versions looking similar, and the
floating-point minnm/maxnm versions fold into the same schema.
We had a couple of min/max reductions already implemented, from the
initial pathfinding exercise in D67158. Those were done by having
separate IR intrinsic names for the signed and unsigned integer
versions; as part of this commit, I've changed them to use a flag
parameter indicating signedness, which is how we ended up deciding
that the rest of the MVE intrinsics family ought to work. So now
hopefully the ewhole lot is consistent.
In the new llc test, the output code from the `v8f16` test functions
looks quite unpleasant, but most of it is PCS lowering (you can't pass
a `half` directly in or out of a function). In other circumstances,
where you do something else with your `half` in the same function, it
doesn't look nearly as nasty.
Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: MarkMurrayARM
Subscribers: kristof.beyls, hiraditya, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76490
Summary:
This patch implements the following CDE intrinsics:
int8x16_t __arm_vreinterpretq_s8_u8 (uint8x16_t in);
uint16x8_t __arm_vreinterpretq_u16_u8 (uint8x16_t in);
int16x8_t __arm_vreinterpretq_s16_u8 (uint8x16_t in);
uint32x4_t __arm_vreinterpretq_u32_u8 (uint8x16_t in);
int32x4_t __arm_vreinterpretq_s32_u8 (uint8x16_t in);
uint64x2_t __arm_vreinterpretq_u64_u8 (uint8x16_t in);
int64x2_t __arm_vreinterpretq_s64_u8 (uint8x16_t in);
float16x8_t __arm_vreinterpretq_f16_u8 (uint8x16_t in);
float32x4_t __arm_vreinterpretq_f32_u8 (uint8x16_t in);
These intrinsics are header-only because they reuse the existing
MVE vreinterpret clang built-ins.
This set is slightly different from the published specification
(see https://static.docs.arm.com/101028/0010/ACLE_2019Q4_release-0010.pdf):
it includes
int8x16_t __arm_vreinterpretq_s8_u8 (uint8x16_t in);
which was unintentionally ommitted from the spec, and
does not include
float64x2_t __arm_vreinterpretq_f64_u8 (uint8x16_t in);
The float64x2_t type requires additional implementation
effort, and we are not including it yet.
Reviewers: simon_tatham, MarkMurrayARM, dmgreen, ostannard
Reviewed By: MarkMurrayARM
Subscribers: kristof.beyls, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76300
Summary:
This patch implements the following intrinsics:
uint8x16_t __arm_vcx1q_u8 (int coproc, uint32_t imm);
T __arm_vcx1qa(int coproc, T acc, uint32_t imm);
T __arm_vcx2q(int coproc, T n, uint32_t imm);
uint8x16_t __arm_vcx2q_u8(int coproc, T n, uint32_t imm);
T __arm_vcx2qa(int coproc, T acc, U n, uint32_t imm);
T __arm_vcx3q(int coproc, T n, U m, uint32_t imm);
uint8x16_t __arm_vcx3q_u8(int coproc, T n, U m, uint32_t imm);
T __arm_vcx3qa(int coproc, T acc, U n, V m, uint32_t imm);
Most of them are polymorphic. Furthermore, some intrinsics are
polymorphic by 2 or 3 parameter types, such polymorphism is not
supported by the existing MVE/CDE tablegen backends, also we don't
really want to have a combinatorial explosion caused by 1000 different
combinations of 3 vector types. Because of this some intrinsics are
implemented as macros involving a cast of the polymorphic arguments to
uint8x16_t.
The IR intrinsics are even more restricted in terms of types: all MVE
vectors are cast to v16i8.
Reviewers: simon_tatham, MarkMurrayARM, dmgreen, ostannard
Reviewed By: MarkMurrayARM
Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76299
Summary:
This change implements ACLE CDE intrinsics that translate to
instructions working with general-purpose registers.
The specification is available at
https://static.docs.arm.com/101028/0010/ACLE_2019Q4_release-0010.pdf
Each ACLE intrinsic gets a corresponding LLVM IR intrinsic (because
they have distinct function prototypes). Dual-register operands are
represented as pairs of i32 values. Because of this the instruction
selection for these intrinsics cannot be represented as TableGen
patterns and requires custom C++ code.
Reviewers: simon_tatham, MarkMurrayARM, dmgreen, ostannard
Reviewed By: MarkMurrayARM
Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76296
Summary:
Changes:
- handle immediate invocations for constructors.
- add tests
after this patch i believe the implementation of consteval is nearly standard compliant, but IR-gen still needs to be taught not to emit consteval declarations.
Reviewers: rsmith
Reviewed By: rsmith
Subscribers: wchilders
Differential Revision: https://reviews.llvm.org/D74007
Passing small data limit to RISCVELFTargetObjectFile by module flag,
So the backend can set small data section threshold by the value.
The data will be put into the small data section if the data smaller than
the threshold.
Differential Revision: https://reviews.llvm.org/D57497
Suppress those diagnostics if lhs of a member expression contains
errors. Typo correction produces dependent expressions even in
non-template code, that led to spurious diagnostics before.
previous:
/tmp/t.cpp:6:17: error: use 'template' keyword to treat 'f' as a dependent template name
auto a = bilder.f<int>();
^
template
/tmp/t.cpp:6:10: error: use of undeclared identifier 'bilder'; did you mean 'builder'?
auto a = bilder.f<int>();
^~~~~~
builder
vs now:
/tmp/t.cpp:6:10: error: use of undeclared identifier 'bilder'; did you mean 'builder'?
auto a = bilder.f<int>();
^~~~~~
builder
Original patch from Ilya.
Reviewers: sammccall
Reviewed By: sammccall
Tags: #clang
Differential Revision: https://reviews.llvm.org/D65592
Summary:
The range checks performed for the vqrdmulh_lane and vqrdmulh_lane Neon
intrinsics were incorrectly using their return type as the base type for
the range check performed on their 'lane' argument.
This patch updates those intrisics to use the type of the proper reference
argument to perform the range checks.
Reviewers: jmolloy, t.p.northover, rsmith, olista01, dnsampaio
Reviewed By: dnsampaio
Subscribers: dnsampaio, kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74766
Summary:
Range checks were not properly performed in the lane arguments of Neon
intrinsics implemented based on splat operations. Calls to those
intrinsics where translated to `__builtin__shufflevector` calls directly
by the pre-processor through the arm_neon.h macros, missing the chance
for the proper range checks.
This patch enables the range check by introducing an auxiliary splat
instruction in arm_neon.td, delaying the translation to shufflevector
calls to CGBuiltin.cpp in clang after the checks were performed.
Reviewers: jmolloy, t.p.northover, rsmith, olista01, ostannard
Reviewed By: ostannard
Subscribers: ostannard, dnsampaio, danielkiss, kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74619
Summary:
Some of the `*_laneq` intrinsics defined in arm_neon.td were missing the
setting of the `isLaneQ` attribute. This patch sets the attribute on the
related definitions, as they will be required to properly perform range
checks on their lane arguments.
Reviewers: jmolloy, t.p.northover, rsmith, olista01, dnsampaio
Reviewed By: dnsampaio
Subscribers: dnsampaio, kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74616
The SVE ACLE allows using a short-form for the intrinsics, e.g.
the following two declarations generate the same code:
svuint32_t svld1(svbool_t, uint32_t const *);
svuint32_t svld1_u32(svbool_t, uint32_t const *);
using the attribute:
__clang_arm_builtin_alias
so that any call to svld1(svbool_t, uint32_t const *) will
map to __builtin_sve_svld1_u32.
Reviewers: SjoerdMeijer, miyuki, efriedma, simon_tatham, rengolin
Reviewed By: SjoerdMeijer
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75861
Add a test for UsedDeclVisitor
This test is reduced from mlir/lib/Transforms/AffineDataCopyGeneration.cpp
to make sure there is no assertion due to UsedDeclVisitor.
If the ancestor device modifier is used and the value of the device
clause is evaluated to 1, the ancestor device shall be used for the
execution.
Since the reverse offloading is not supported yet, the target construct
execution is always initiated from the host, not from the device. So, if
the ancestor modifier is specified, just execute target region on the
host.
HIPToolChain::TranslateArgs call TranslateArgs of host toolchain with
the input args to get a list of derived args called DAL, then
go through the input args by itself and append them to DAL.
This assumes that the host toolchain should not append any unchanged
args to DAL, otherwise there will be duplicates since
HIPToolChain will append it again.
This works for GNU toolchain since it returns an empty list for DAL.
However, MSVC toolchain will append unchanged args to DAL, which
causes duplicate args.
This patch let MSVC toolchain not append unchanged args for HIP
offloading kind, which fixes this issue.
Differential Revision: https://reviews.llvm.org/D76032
Summary:
This is another set of instructions too complicated to be sensibly
expressed in IR by anything short of a target-specific intrinsic.
Given input vectors a,b, the instruction generates intermediate values
2*(a[0]*b[0]+a[1]+b[1]), 2*(a[2]*b[2]+a[3]+b[3]), etc; takes the high
half of each double-width values, and overwrites half the lanes in the
output vector c, which you therefore have to provide the input value
of. Optionally you can swap the elements of b so that the are things
like a[0]*b[1]+a[1]*b[0]; optionally you can round to nearest when
taking the high half; and optionally you can take the difference
rather than sum of the two products. Finally, saturation is applied
when converting back to a single-width vector lane.
Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: miyuki
Subscribers: kristof.beyls, hiraditya, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76359
Summary:
These are complicated integer multiply+add instructions with extra
saturation, taking the high half of a double-width product, and
optional rounding. There's no sensible way to represent that in
standard IR, so I've converted the clang builtins directly to
target-specific intrinsics.
Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: miyuki
Subscribers: kristof.beyls, hiraditya, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76123
Summary:
These instructions compute multiply+add in integers, with one of the
operands being a splat of a scalar. (VMLA and VMLAS differ in whether
the splat operand is a multiplier or the addend.)
I've represented these in IR using existing standard IR operations for
the unpredicated forms. The predicated forms are done with target-
specific intrinsics, as usual.
When operating on n-bit vector lanes, only the bottom n bits of the
i32 scalar operand are used. So we have to tell that to isel lowering,
to allow it to remove a pointless sign- or zero-extension instruction
on that input register. That's done in `PerformIntrinsicCombine`, but
first I had to enable `PerformIntrinsicCombine` for MVE targets
(previously all the intrinsics it handled were for NEON), and make it
a method of `ARMTargetLowering` so that it can get at
`SimplifyDemandedBits`.
Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76122
Summary:
[Clang] Attribute to allow defining undef global variables
Initializing global variables is very cheap on hosted implementations. The
C semantics of zero initializing globals work very well there. It is not
necessarily cheap on freestanding implementations. Where there is no loader
available, code must be emitted near the start point to write the appropriate
values into memory.
At present, external variables can be declared in C++ and definitions provided
in assembly (or IR) to achive this effect. This patch provides an attribute in
order to remove this reason for writing assembly for performance sensitive
freestanding implementations.
A close analogue in tree is LDS memory for amdgcn, where the kernel is
responsible for initializing the memory after it starts executing on the gpu.
Uninitalized variables in LDS are observably cheaper than zero initialized.
Patch is loosely based on the cuda __shared__ and opencl __local variable
implementation which also produces undef global variables.
Reviewers: kcc, rjmccall, rsmith, glider, vitalybuka, pcc, eugenis, vlad.tsyrklevich, jdoerfert, gregrodgers, jfb, aaron.ballman
Reviewed By: rjmccall, aaron.ballman
Subscribers: Anastasia, aaron.ballman, davidb, Quuxplusone, dexonsmith, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74361
Sizeless types can't be used with "new", so it doesn't make sense
to use them with "delete" either. The SVE ACLE therefore doesn't
allow that.
This is slightly stronger than for normal incomplete types, since:
struct S;
void f(S *s) { delete s; }
is (by necessity) just a default-on warning rather than an error.
Differential Revision: https://reviews.llvm.org/D76219
new-expressions for a type T require sizeof(T) to be computable,
so the SVE ACLE does not allow them for sizeless types. At the moment:
auto f() { return new __SVInt8_t; }
creates a call to operator new with a zero size:
%call = call noalias nonnull i8* @_Znwm(i64 0)
This patch reports an appropriate error instead.
Differential Revision: https://reviews.llvm.org/D76218
This flag is used by avr-gcc (starting with v10) to set the width of the
double type. The double type is by default interpreted as a 32-bit
floating point number in avr-gcc instead of a 64-bit floating point
number as is common on other architectures. Starting with GCC 10, a new
option has been added to control this behavior:
https://gcc.gnu.org/wiki/avr-gcc#Deviations_from_the_Standard
This commit keeps the default double at 32 bits but adds support for the
-mdouble flag (-mdouble=32 and -mdouble=64) to control this behavior.
Differential Revision: https://reviews.llvm.org/D76181
In the current SVE ACLE spec, the usual rules for throwing and
catching incomplete types also apply to sizeless types. However,
throwing pointers to sizeless types should not pose any real difficulty,
so as an extension, the clang implementation allows that.
This patch enforces these rules for catch statements.
Differential Revision: https://reviews.llvm.org/D76090
Summary:
The same rules for throwing and catching incomplete types also apply
to sizeless types. This patch enforces that for throw statements.
It also make sure that we use "sizeless type" rather "incomplete type"
in the associated message. (Both are correct, but "sizeless type" is
more specific and hopefully more user-friendly.)
The SVE ACLE simply extends the rule for incomplete types to
sizeless types. However, throwing pointers to sizeless types
should not pose any real difficulty, so as an extension,
the clang implementation allows that.
Reviewers: sdesmalen, efriedma, rovka, rjmccall
Subscribers: tschuett, rkruppe, psnobl, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76088
In the current SVE ACLE spec, the usual rules for throwing and
catching incomplete types also apply to sizeless types. However,
throwing pointers to sizeless types should not pose any real difficulty,
so as an extension, the clang implementation allows that.
This patch enforces these rules for explicit exception specs.
Differential Revision: https://reviews.llvm.org/D76087
This patch completes a trio of changes related to arrays of
sizeless types. It rejects various forms of arithmetic on
pointers to sizeless types, in the same way as for other
incomplete types.
Differential Revision: https://reviews.llvm.org/D76086
clang currently accepts:
__SVInt8_t &foo1(__SVInt8_t *x) { return *x; }
__SVInt8_t &foo2(__SVInt8_t *x) { return x[1]; }
The first function is valid ACLE code and generates correct LLVM IR
(and assembly code). But the second function is invalid for the
same reason that arrays of sizeless types are. Trying to code-generate
the function leads to:
llvm/include/llvm/Support/TypeSize.h:126: uint64_t llvm::TypeSize::getFixedSize() const: Assertion `!IsScalable && "Request for a fixed size on a s
calable object"' failed.
Another problem is that:
template<typename T>
constexpr __SIZE_TYPE__ f(T *x) { return &x[1] - x; }
typedef int arr1[f((int *)0) - 1];
typedef int arr2[f((__SVInt8_t *)0) - 1];
produces:
a.cpp:2:48: warning: subtraction of pointers to type '__SVInt8_t' of zero size has undefined behavior [-Wpointer-arith]
constexpr __SIZE_TYPE__ f(T *x) { return &x[1] - x; }
~~~~~ ^ ~
a.cpp:4:18: note: in instantiation of function template specialization 'f<__SVInt8_t>' requested here
typedef int arr2[f((__SVInt8_t *)0) - 1];
This patch reports an appropriate diagnostic instead.
Differential Revision: https://reviews.llvm.org/D76084
Summary:
Adds the constraints described below to ensure that we
can tie variables of SVE ACLE types to operands in inline-asm:
- y: SVE registers Z0-Z7
- Upl: One of the low eight SVE predicate registers (P0-P7)
- Upa: Full range of SVE predicate registers (P0-P15)
Reviewers: sdesmalen, huntergr, rovka, cameron.mcinally, efriedma, rengolin
Reviewed By: efriedma
Subscribers: miyuki, tschuett, rkruppe, psnobl, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75690
- libclang_rt.profile should be added when -fcs-profile-generate is on thecommand line.
- OPT_fno_profile_instr_generate was used as a negative for OPT_fprofile_generate. Fix it to use OPT_fno_profile_generate.
Differential Revision: https://reviews.llvm.org/D75274
TryAnnotateTypeConstraint could annotate a template-id which doesn't end up being a type-constraint,
in which case control flow would incorrectly flow into ParseImplicitInt.
Reenter the loop in this case.
Enable relevant tests for C++20. This required disabling typo-correction during TryAnnotateTypeConstraint
and changing a test case which is broken due to a separate bug (will be reported and handled separately).
Summary:
Run StackSafetyAnalysis at the end of the IR pipeline and annotate
proven safe allocas with !stack-safe metadata. Do not instrument such
allocas in the AArch64StackTagging pass.
Reviewers: pcc, vitalybuka, ostannard
Reviewed By: vitalybuka
Subscribers: merge_guards_bot, kristof.beyls, hiraditya, cfe-commits, gilang, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D73513
This patch adds 'q' to mean 'scalable vector' in the builtin
type string, and for SVE will return the matching builtin
type as defined in the C/C++ language extensions for SVE.
This patch also adds some scaffolding to generate the arm_sve.h
header file, and some builtin definitions (+CodeGen) to be able
to implement some simple masked load intrinsics that use the
ACLE types, such as:
svint8_t test_svld1_s8(svbool_t pg, const int8_t *base) {
return svld1_s8(pg, base);
}
Reviewers: efriedma, rjmccall, rovka, rsandifo-arm, rengolin
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75298
transparent context.
(The same crash would happen if a class template with a friend was
declared in an 'export' block, but there are more issues with that
case.)
Avoid copying of the orignal variable if it is going to be marked as
firstprivate in task regions. For taskloops, still need to copy the
non-trvially copyable variables to correctly construct them upon task
creation.
This reapplies the following patch, which was reverted because it caused
neon CodeGen tests to fail:
https://reviews.llvm.org/rGa6150b48cea00ab31e9335cc73770327acc4cb3a
I've added checks to detect half precision neon vectors and avoid
promiting them to vectors of floats.
See the discussion here: https://reviews.llvm.org/rG825235c140e7
Original commit message:
This fixes an assertion in Sema::CreateBuiltinBinOp that fails when one
of the vector operand's element type is a typedef of __fp16.
rdar://problem/55983556
The SVE ACLE doesn't allow arrays of sizeless types. At the moment
clang accepts the TU:
__SVInt8_t x[2];
but trying to code-generate it triggers the LLVM assertion:
llvm/lib/IR/Type.cpp:588: static llvm::ArrayType* llvm::ArrayType::get(llvm::Type*, uint64_t): Assertion `isValidElementType(ElementType) && "Invalid type for array element!"' failed.
This patch reports an appropriate error instead.
The rules are slightly more restrictive than for general incomplete types.
For example:
struct s;
typedef struct s arr[2];
is valid as far as it goes, whereas arrays of sizeless types are
invalid in all contexts. BuildArrayType therefore needs a specific
check for isSizelessType in addition to the usual handling of
incomplete types.
Differential Revision: https://reviews.llvm.org/D76082
Since fields can't have sizeless type, it also doesn't make sense
to capture sizeless types by value in lambda expressions. This patch
makes sure that we diagnose that and that we use "sizeless type" rather
"incomplete type" in the associated message. (Both are correct, but
"sizeless type" is more specific and hopefully more user-friendly.)
Differential Revision: https://reviews.llvm.org/D75738
The SVE ACLE doesn't allow fields to have sizeless type. At the moment
clang accepts things like:
struct s { __SVInt8_t x; } y;
but trying to code-generate it leads to LLVM asserts like:
llvm/include/llvm/Support/TypeSize.h:126: uint64_t llvm::TypeSize::getFixedSize() const: Assertion `!IsScalable && "Request for a fixed size on a scalable object"' failed.
This patch adds an associated clang diagnostic.
Differential Revision: https://reviews.llvm.org/D75737
This is another intermediate step for PR44213
(https://bugs.llvm.org/show_bug.cgi?id=44213).
This stores the SDK *name* in the debug info, to make it possible to
`-fdebug-prefix-map`-replace the sysroot with a recognizable string
and allowing the debugger to find a fitting SDK relative to itself,
not the machine the executable was compiled on.
rdar://problem/51645582
MS ABI has slightly different rules for when destructors are implicitly
defined and when the 'delete this' is checked that are out of scope for
the intent of this test.
Summary:
The parsing of GNU C extended asm statements was a little brittle and
had a few issues:
- It was using Parse::ParseTypeQualifierListOpt to parse the `volatile`
qualifier. That parser is really meant for TypeQualifiers; an asm
statement doesn't really have a type qualifier. This is still maybe
nice to have, but not necessary. We now can check for the `volatile`
token by properly expanding the grammer, rather than abusing
Parse::ParseTypeQualifierListOpt.
- The parsing of `goto` was position dependent, so `asm goto volatile`
wouldn't parse. The qualifiers should be position independent to one
another. Now they are.
- We would warn on duplicate `volatile`, but the parse error for
duplicate `goto` was a generic parse error and wasn't clear.
- We need to add support for the recent GNU C extension `asm inline`.
Adding support to the parser with the above issues highlighted the
need for this refactoring.
Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html
Reviewers: aaron.ballman
Reviewed By: aaron.ballman
Subscribers: aheejin, jfb, nathanchance, cfe-commits, echristo, efriedma, rsmith, chandlerc, craig.topper, erichkeane, jyu2, void, srhines
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75563
function and an overridden function until we know whether the overriding
function is deleted.
We previously did these checks when we first built the declaration,
which was too soon in some cases. We now defer all these checks to the
end of the class.
Also add missing check that a consteval function cannot override a
non-consteval function and vice versa.
clang accepts a TU containing just:
__SVInt8_t x;
However, sizeless types are not allowed to have static or thread-local
storage duration and trying to code-generate the TU triggers an LLVM
fatal error:
Globals cannot contain scalable vectors
<vscale x 16 x i8>* @x
fatal error: error in backend: Broken module found, compilation aborted!
This patch adds an associated clang diagnostic.
Differential Revision: https://reviews.llvm.org/D75736
It would be difficult to guarantee atomicity for sizeless types,
so the SVE ACLE makes atomic sizeless types invalid. As it happens,
we already rejected them before the patch, but for the wrong reason:
error: _Atomic cannot be applied to type 'svint8_t' (aka '__SVInt8_t')
which is not trivially copyable
The SVE types should be treated as trivially copyable; a later
patch fixes that.
Differential Revision: https://reviews.llvm.org/D75734
A previous patch rejected alignof for sizeless types. This patch
extends that to cover the "aligned" attribute and _Alignas. Since
sizeless types are not meant to be used for long-term data, cannot
be used in aggregates, and cannot have static storage duration,
there shouldn't be any need to fiddle with their alignment.
Like with alignof, this is a conservative position that can be
relaxed in future if it turns out to be too restrictive.
Differential Revision: https://reviews.llvm.org/D75573
clang current accepts:
void foo1(__SVInt8_t *x, __SVInt8_t *y) { *x = *y; }
void foo2(__SVInt8_t *x, __SVInt8_t *y) {
memcpy(y, x, sizeof(__SVInt8_t));
}
The first function is valid ACLE code and generates correct LLVM IR.
However, the second function is invalid ACLE code and generates a
zero-length memcpy. The point of this patch is to reject the use
of sizeof in the second case instead.
There's no similar wrong-code bug for alignof. However, the SVE ACLE
conservatively treats alignof in the same way as sizeof, just as the
C++ standard does for incomplete types. The idea is that layout of
sizeless types is an implementation property and isn't defined at
the language level.
Implementation-wise, the patch adds a new CompleteTypeKind enum
that controls whether RequireCompleteType & friends accept sizeless
built-in types. For now the default is to maintain the status quo
and accept sizeless types. However, the end of the series will flip
the default and remove the Default enum value.
The patch also adds new ...CompleteSized... wrappers that callers can
use if they explicitly want to reject sizeless types. The callers then
use diagnostics that have an extra 0/1 parameter to indicats whether
the type is sizeless or not.
The idea is to have three cases:
1. calls that explicitly reject sizeless types, with a tweaked diagnostic
for the sizeless case
2. calls that explicitly allow sizeless types
3. normal/old-style calls that don't make an explicit choice either way
Once the default is flipped, the 3. calls will conservatively reject
sizeless types, using the same diagnostic as for other incomplete types.
Differential Revision: https://reviews.llvm.org/D75572
This fixes an issue with clang issuing a warning about unknown CUDA SDK if it's
detected during non-CUDA compilation.
Differential Revision: https://reviews.llvm.org/D76030
This patch adds C and C++ tests for various uses of SVE types.
The tests cover valid uses that are already (correctly) accepted and
invalid uses that are already (correctly) rejected. Later patches
will expand the tests as they fix other cases.[*]
Some of the tests for invalid uses aren't obviously related to
scalable vectors. Part of the reason for having them is to make
sure that the quality of the error message doesn't regress once/if
the types are treated as incomplete types.
[*] These later patches all fix invalid uses that are being incorrectly
accepted. I don't know of any cases in which valid uses are being
incorrectly rejected. In other words, this series is all about
diagnosing invalid code rather than enabling something new.
Differential Revision: https://reviews.llvm.org/D75571
Summary:
This adds the ACLE intrinsic family for the VFMA and VFMS
instructions, which perform fused multiply-add on vectors of floats.
I've represented the unpredicated versions in IR using the cross-
platform `@llvm.fma` IR intrinsic. We already had isel rules to
convert one of those into a vector VFMA in the simplest possible way;
but we didn't have rules to detect a negated argument and turn it into
VFMS, or rules to detect a splat argument and turn it into one of the
two vector/scalar forms of the instruction. Now we have all of those.
The predicated form uses a target-specific intrinsic as usual, but
I've stuck to just one, for a predicated FMA. The subtraction and
splat versions are code-generated by passing an fneg or a splat as one
of its operands, the same way as the unpredicated version.
In arm_mve_defs.h, I've had to introduce a tiny extra piece of
infrastructure: a record `id` for use in codegen dags which implements
the identity function. (Just because you can't declare a Tablegen
value of type dag which is //only// a `$varname`: you have to wrap it
in something. Now I can write `(id $varname)` to get the same effect.)
Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D75998
Device-side compilation does not support some features and we need to
filter them out when command line options enable them for the host.
We're already doing this in various places in the regular clang driver,
but clang-cl mode constructs cc1 options independently and needs to
implement the filtering, too.
Differential Revision: https://reviews.llvm.org/D75310
Summary:
The change is to fix conflict value for metadata "Objective-C Garbage Collection" in the mix of swift and Objective-C bitcode.
The purpose is to provide the support of LTO for swift and Objective-C mixed project.
Reviewers: rjmccall, ahatanak, steven_wu
Reviewed By: rjmccall, steven_wu
Subscribers: manmanren, mehdi_amini, hiraditya, dexonsmith, llvm-commits, jinlin
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71219
a dependent context.
This matches the GCC behavior.
We track the enclosing template depth when determining whether a
statement expression is within a dependent context; there doesn't appear
to be any other reliable way to determine this.
We previously assumed they were neither value- nor
instantiation-dependent under any circumstances, which would lead to
crashes and other misbehavior.
We would assign the incorrect DeclContext when transforming the RequiresExprBodyDecl, causing incorrect
handling of 'this' inside RequiresExprBodyDecls (bug #45162).
Assign the current context as the DeclContext of the transformed decl.
Fix a bug in IRGen where it wasn't destructing compound literals in C
that are ObjC pointer arrays or non-trivial structs. Also diagnose jumps
that enter or exit the lifetime of the compound literals.
rdar://problem/51867864
Differential Revision: https://reviews.llvm.org/D64464
Summary:
Outputs from an asm goto block cannot be used on the indirect branch.
It's not supported and may result in invalid code generation.
Reviewers: jyknight, nickdesaulniers, hfinkel
Reviewed By: nickdesaulniers
Subscribers: martong, cfe-commits, rnk, craig.topper, hiraditya, rsmith
Tags: #clang
Differential Revision: https://reviews.llvm.org/D71314
isSameEntity was missing constraints checking, causing constrained overloads
to not travel well accross serialization. (bug #45115)
Add constraints checking to isSameEntity.
As per comment on https://reviews.llvm.org/D72860, it is suggested to
revert this change in the meantime, since it has introduced regression.
This reverts commit 83f4c3af02.