Summary:
omp.h header file defines omp_null_allocator as a predefined allocator,
need to consider it also as a predefined allocator.
Reviewers: jdoerfert
Subscribers: jholewinski, yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79186
Summary:
Conflicting types for the same section name defined in clang section
pragmas and GNU-style section attributes were not properly captured by
Clang's Sema. The lack of diagnostics was caused by the fact the section
specification coming from attributes was handled by Sema as implicit,
even though explicitly defined by the user.
This patch enables the diagnostics for section type conflicts between
those specifications by making sure sections defined in section
attributes are correctly handled as explicit.
Reviewers: hans, rnk, javed.absar
Reviewed By: rnk
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78573
Summary:
Section names used in clang section pragmas were not validated against
previously defined sections, causing section type conflicts to be
ignored by Sema.
This patch enables Clang to capture these section type conflicts by
using the existing Sema's UnifySection method to validate section names
from clang section pragmas.
Reviewers: hans, rnk, javed.absar
Reviewed By: rnk
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78572
Summary: systemd recently added a clang-format file. One issue I
encountered in using clang-format on systemd is that systemd does
not add a space before the parens of their foreach macros but
clang-format always adds a space. This does not seem to be
configurable in clang-format. This revision adds the
ControlStatementsExceptForEachMacros option to SpaceBeforeParens
which puts a space before all control statement parens except
ForEach macros. This drastically reduces the amount of changes
when running clang-format on systemd's source code.
Reviewers: MyDeveloperDay, krasimir, mitchell-stellar
Reviewed By: MyDeveloperDay
Subscribers: cfe-commits
Tags: #clang-format, #clang
Differential Revision: https://reviews.llvm.org/D78869
The built-in SVE types are supposed to be treated as opaque types.
This means that for initialisation purposes they should be treated
as a single unit, much like a scalar type.
However, as Eli pointed out, actually using "scalar" in the diagnostics
is likely to cause confusion, given the types are logically vectors.
The patch therefore uses custom diagnostics or generalises existing
ones. Some of the messages use the word "indivisible" to try to make
it clear(er) that these types can't be initialised elementwise.
I don't think it's possible to trigger warn_braces_around_(scalar_)init
for sizeless types as things stand, since the types can't be used as
members or elements of more complex types. But it seemed better to be
consistent with ext_many_braces_around_(scalar_)init, so the patch
changes it anyway.
Differential Revision: https://reviews.llvm.org/D76689
I have a follow-on patch that uses an alternative wording for
ext_excess_initializers in some cases. This patch puts it and
a couple of related warnings under their own -W option in order
to avoid a regression in Misc/warning-flags.c.
Differential Revision: https://reviews.llvm.org/D79244
I'm not sure what the conventions are for this documentation. The
format seems limiting. I don't see how to refer to other flags, or
mark flags as deprecated. The rst I believe these generate seems to be
in source, and out of date.
Summary:
I was surprised to see the LocalOffset can exceed uint32_t, but it
does happen and lead to crashes in one of our internal huge TU with a large
preamble.
with this patch, the crash is gone.
Reviewers: sammccall
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79397
* svdupq builtins that duplicate scalars to every quadword of a vector
are defined using builtins for svld1rq (load and replicate quadword).
* svdupq builtins that duplicate boolean values to fill a predicate vector
are defined using `svcmpne`.
Reviewers: SjoerdMeijer, efriedma, ctetreau
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78750
Summary:
As described in https://github.com/WebAssembly/simd/pull/209. This is
the final reorganization of the SIMD opcode space before
standardization. It has been landed in concert with corresponding
changes in other projects in the WebAssembly SIMD ecosystem.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79224
test cases
Add support for #pragma float_control
Reviewers: rjmccall, erichkeane, sepavloff
Differential Revision: https://reviews.llvm.org/D72841
This reverts commit 85dc033cac, and makes
corrections to the test cases that failed on buildbots.
Summary:
Added basic support for 'task' modifier in the reduction clauses in
non-simd parallel and worksharing constructs.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78738
Summary:
`ClangFormatStyleOptions.rst` should ALWAYS be autogenerated from Format.h using `clang/docs/tools/dump_format_style.py` if not its liable to get removed leaving options undocumented.
This revision reworks the documentation for {D73354} {D73768} to ensure we can continue to regenerated
Fix other minor changes that ensure the documentation remains consistent (Format.h obviously got re clang-formatted after the rst had been regenerated previously)
Reviewed By: krasimir
Subscribers: cfe-commits
Tags: #clang, #clang-format
Differential Revision: https://reviews.llvm.org/D79095
Since the _ExtInt type got into the repo, we've discovered that the ABI
implications weren't completely understood. The other architectures are
going to be audited (see D79118), however downstream targets aren't
going to benefit from this audit.
This patch disables the _ExtInt type by default and makes the
target-info an opt-in. As it is audited, I'll re-enable these for all
of our default targets.
Summary:
Objective-C Class objects can be used to do a dynamic dispatch on
class methods. The analyzer had a few places where we tried to overcome
the dynamic nature of it and still guess the actual function that
is being called. That was done mostly using some simple heuristics
covering the most widespread cases (e.g. [[self class] classmethod]).
This solution introduces a way to track types represented by Class
objects and work with that instead of direct AST matching.
rdar://problem/50739539
Differential Revision: https://reviews.llvm.org/D78286
Fix a few bugs where we would fail to properly determine header to
module correspondence when determining whether to suggest a #include or
import, and suggest a #include more often in language modes where there
is no import syntax. Generally, if the target is in a header with
include guards or #pragma once, we should suggest either #including or
importing that header, and not importing a module that happens to
textually include it.
In passing, improve the notes we attach to the corresponding
diagnostics: calling an entity that we couldn't see "previous" is
confusing.
Prior to this change, for a few compiler-rt libraries such as ubsan and
the profile library, Clang would embed "-defaultlib:path/to/rt-arch.lib"
into the .drective section of every object compiled with
-finstr-profile-generate or -fsanitize=ubsan as appropriate.
These paths assume that the link step will run from the same working
directory as the compile step. There is also evidence that sometimes the
paths become absolute, such as when clang is run from a different drive
letter from the current working directory. This is fragile, and I'd like
to get away from having paths embedded in the object if possible. Long
ago it was suggested that we use this for ASan, and apparently I felt
the same way back then:
https://reviews.llvm.org/D4428#56536
This is also consistent with how all other autolinking usage works for
PS4, Mac, and Windows: they all use basenames, not paths.
To keep things working for people using the standard GCC driver
workflow, the driver now adds the resource directory to the linker
library search path when it calls the linker. This is enough to make
check-ubsan pass, and seems like a generally good thing.
Users that invoke the linker directly (most clang-cl users) will have to
add clang's resource library directory to their linker search path in
their build system. I'm not sure where I can document this. Ideally I'd
also do it in the MSBuild files, but I can't figure out where they go.
I'd like to start with this for now.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D65543
When passing a value of a struct/union type from secure to non-secure
state (that is returning from a CMSE entry function or passing an
argument to CMSE-non-secure call), there is a potential sensitive
information leak via the padding bits in the structure. It is not
possible in the general case to ensure those bits are cleared by using
Standard C/C++.
This patch makes the compiler emit code to clear such padding
bits. Since type information is lost in LLVM IR, the code generation
is done by Clang.
For each interesting record type, we build a bitmask, in which all the
bits, corresponding to user declared members, are set. Values of
record types are returned by coercing them to an integer. After the
coercion, the coerced value is masked (with bitwise AND) and then
returned by the function. In a similar manner, values of record types
are passed as arguments by coercing them to an array of integers, and
the coerced values themselves are masked.
For union types, we effectively clear only bits, which aren't part of
any member, since we don't know which is the currently active one.
The compiler will issue a warning, whenever a union is passed to
non-secure state.
Values of half-precision floating-point types are passed in the least
significant bits of a 32-bit register (GPR or FPR) with the most
significant bits unspecified. Since this is also a potential leak of
sensitive information, this patch also clears those unspecified bits.
Differential Revision: https://reviews.llvm.org/D76369
This patch adds builtins for:
- svmad, svmla, svmls, svmsb
svnmad, svnmla, svnmls, svnmsb
svmla_lane, svmls_lane
These builtins come in several flavours:
- Merge into first source vector (`_m`)
- False lanes are undef (`_x`)
- False lanes are zeroed (`_z`)
And can also have `_n` to indicate the last operand is a scalar.
For example:
svint32_t svmla[_n_s32]_z(svbool_t pg, svint32_t op1, svint32_t op2, int32_t op3)
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D78960
The svlen builtins return the number of elements in a vector
and are implemented using `llvm.vscale`.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D78755
Currently this asserts on anything other than errors. In one
workaround scenario, AMDGPU emits DiagnosticInfoUnsupported as a
warning for functions that can't be correctly codegened, but should
never be executed.
`getHasRegParm()` was working under the assumption that the RegParm
bits are the last field, which is no longer true, after adding the
`NoCfCheck` and `CmseNSCall` fields.
This causes a spurious "regparm 0" attribute to appear when a function
type is declared with either `__attribute__((nocf_check))` or
`__attribute__((cmse_nonsecure_call))`.
Differential Revision: https://reviews.llvm.org/D77270
Summary:
Before this PR, `modernize-use-using` would transform the typedef in
```
template <int A>
struct InjectedClassName {
typedef InjectedClassName b;
};
```
into `using b = InjectedClassName<A>;` and
```
template <int>
struct InjectedClassNameWithUnnamedArgument {
typedef InjectedClassNameWithUnnamedArgument b;
};
```
into `using b = InjectedClassNameWithUnnamedArgument<>;`.
The first fixit is surprising because its different than the code
before, but the second fixit doesn't even compile.
This PR adds an option to the TypePrinter to print InjectedClassNameType without
template parameters (i.e. as written).
Reviewers: aaron.ballman, alexfh, hokein, njames93
Subscribers: xazax.hun, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77979
Some ACLE builtins leave out the argument to specify the predicate
pattern, which is expected to be expanded to an SV_ALL pattern.
This patch adds the flag IsInsertOp1SVALL to insert SV_ALL as the
second operand.
Reviewers: efriedma, SjoerdMeijer
Reviewed By: SjoerdMeijer
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78401
Summary:
Add an option to enable on-demand parsing of needed ASTs during CTU analysis.
Two options are introduced. CTUOnDemandParsing enables the feature, and
CTUOnDemandParsingDatabase specifies the path to a compilation database, which
has all the necessary information to generate the ASTs.
Reviewers: martong, balazske, Szelethus, xazax.hun
Subscribers: ormris, mgorny, whisperity, xazax.hun, baloghadamsoftware, szepet, rnkovacs, a.sidorin, mikhail.ramalho, Szelethus, donat.nagy, dkrupp, Charusso, steakhal, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75665
Expose llvm fence instruction as clang builtin for AMDGPU target
__builtin_amdgcn_fence(unsigned int memoryOrdering, const char *syncScope)
The first argument of this builtin is one of the memory-ordering specifiers
__ATOMIC_ACQUIRE, __ATOMIC_RELEASE, __ATOMIC_ACQ_REL, or __ATOMIC_SEQ_CST
following C++11 memory model semantics. This is mapped to corresponding
LLVM atomic memory ordering for the fence instruction using LLVM atomic C
ABI. The second argument is an AMDGPU-specific synchronization scope
defined as string.
Reviewed By: sameerds
Differential Revision: https://reviews.llvm.org/D75917
Some ACLE builtins leave out the argument to specify the predicate
pattern, which is expected to be expanded to an SV_ALL pattern.
This patch adds the flag IsAppendSVALL to append SV_ALL as the final
operand.
Reviewers: SjoerdMeijer, efriedma, rovka, rengolin
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77597
Summary:
Specifically make some simple refactorings to get PointerUnion.h and
Twine.h out of the public includes. While here, trim out a lot of
transitive includes as well.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78870
When a category/extension doesn't repeat a type bound, corresponding
type parameter is substituted with `id` when used as a type argument. As
a result, in the added test case it was causing errors like
> type argument 'T' (aka 'id') does not satisfy the bound ('id<NSCopying>') of type parameter 'T'
We are already checking that type parameters should be consistent
everywhere (see `checkTypeParamListConsistency`) and update
`ObjCTypeParamDecl` to have correct underlying type. And when we use the
type parameter as a method return type or a method parameter type, it is
substituted to the bounded type. But when we use the type parameter as a
type argument, we check `ObjCTypeParamType` that wasn't updated and
remains `id`.
Fix by updating not only `ObjCTypeParamDecl` UnderlyingType but also
TypeForDecl as we use the underlying type to create a canonical type for
`ObjCTypeParamType` (see `ASTContext::getObjCTypeParamType`).
This is a different approach to fixing the issue. The previous one was
02c2ab3d88 which was reverted in
4c539e8da1. The problem with the previous
approach was that `ObjCTypeParamType::desugar` was returning underlying
type for `ObjCTypeParamDecl` without applying any protocols stored in
`ObjCTypeParamType`. It caused inconsistencies in comparing types before
and after desugaring.
Re-applying after fixing intermittent test failures.
rdar://problem/54329242
Reviewed By: erik.pilkington
Differential Revision: https://reviews.llvm.org/D72872
This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
This patch includes:
- Assembly support for AArch64 only (no SVE or Neon)
- Intrinsics Support for AArch64 Armv8.6a Matrix Multiplication Instructions (No bfloat16 matrix multiplication)
No IR types or C Types are needed for this extension.
This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)
Based on work by:
- Luke Geeson
- Oliver Stannard
- Luke Cheeseman
Reviewers: ostannard, t.p.northover, rengolin, kmclaughlin
Reviewed By: kmclaughlin
Subscribers: kmclaughlin, kristof.beyls, hiraditya, danielkiss,
cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77871
The IsReverseCompare flag tells CGBuiltin to swap the operands,
so that a LT/LE intrinsics can be expressed in terms of GE/GT
intrinsics.
This patch also adds builtins for the wide-variants of the compares.
Reviewers: SjoerdMeijer, efriedma, ctetreau
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78747
This patch also adds the enum `sv_prfop` for the prefetch operation specifier
and checks to ensure the passed enum values are valid.
Reviewers: SjoerdMeijer, efriedma, ctetreau
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78674
Currently, both `warn_impcast_integer_float_precision_constant` and
`warn_impcast_integer_float_precision` are covered by
-Wimplicit-int-float-conversion, but only the ..._constant warning is on
by default.
`warn_impcast_integer_float_precision_constant` likely flags real problems
while `warn_impcast_integer_float_precision` may flag legitimate use
cases (for example, `int` used with limited range supported by `float`).
If -Wno-implicit-int-float-conversion is used, currently there is no way
to restore the ..._constant warning. This patch adds
-Wimplicit-const-int-float-conversion to address the issue. (Similar to
the reasoning in https://reviews.llvm.org/D64666#1598194)
Adapted from a patch by Brooks Moses.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D78661
Summary:
When using -ftrivial-auto-var-init=* options to initiate automatic
variables in a file, to disable initialization on some variables,
currently we have to manually annotate the variables with uninitialized
attribute, such as
int dont_initialize_me __attribute((uninitialized));
Making pragma clang attribute to support this attribute would make
annotating variables much easier, and could be particular useful for
bisection efforts, e.g.
void use(void*);
void buggy() {
int arr[256];
int boom;
float bam;
struct { int oops; } oops;
union { int oof; float aaaaa; } oof;
use(&arr);
use(&boom);
use(&bam);
use(&oops);
use(&oof);
}
Reviewers: jfb, rjmccall, aaron.ballman
Reviewed By: jfb, aaron.ballman
Subscribers: aaron.ballman, george.burgess.iv, dexonsmith, MaskRay, phosek, hubert.reinterpretcast, gbiv, manojgupta, llozano, srhines, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78693
This is a code clean up of the PropertyAttributeKind and
ObjCPropertyAttributeKind enums in ObjCPropertyDecl and ObjCDeclSpec that are
exactly identical. This non-functional change consolidates these enums
into one. The changes are to many files across clang (and comments in LLVM) so
that everything refers to the new consolidated enum in DeclObjCCommon.h.
2nd Landing Attempt...
Differential Revision: https://reviews.llvm.org/D77233
This patch changes the codegen of the builtins for contiguous loads
to map onto the SVE specific IR intrinsics llvm.aarch64.sve.ld1/st1.
Reviewers: SjoerdMeijer, efriedma, kmclaughlin, rengolin
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78673
This patch changes the FP conversion intrinsics to take a predicate
that matches the number of lanes for the vector with the widest element
type as opposed to using <vscale x 16 x i1>.
For example:
```<vscale x 4 x float> @llvm.aarch64.sve.fcvt.f32f16(<vscale x 4 x float>, <vscale x 4 x i1>, <vscale x 8 x half>)```
now uses <vscale x 4 x i1> instead of <vscale x 16 x i1>
And similar for:
```<vscale x 4 x float> @llvm.aarch64.sve.fcvt.f32f64(<vscale x 4 x float>, <vscale x 2 x i1>, <vscale x 2 x double>)```
where the predicate now matches the wider type, so <vscale x 2 x i1>.
Reviewers: efriedma, SjoerdMeijer, paulwalker-arm, rengolin
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78402
This adds the flag IsOverloadCvt which tells CGBulitin to use
the result type and the type of the last operand as the
overloaded types for the LLVM IR intrinsic.
This also adds the flag IsFPConvert, which is needed to avoid
converting the predicate of the operation from svbool_t to
a predicate with fewer lanes, as the LLVM IR intrinsics use
the <vscale x 16 x i1> as the predicate.
Reviewers: SjoerdMeijer, efriedma
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78239
This is a code clean up of the PropertyAttributeKind and
ObjCPropertyAttributeKind enums in ObjCPropertyDecl and ObjCDeclSpec that are
exactly identical. This non-functional change consolidates these enums
into one. The changes are to many files across clang (and comments in LLVM) so
that everything refers to the new consolidated enum in DeclObjCCommon.h.
Differential Revision: https://reviews.llvm.org/D77233
This also adds the IsOverloadWhileRW flag which tells CGBuiltin to use
the result predicate type and the first pointer type as the
overloaded types for the LLVM IR intrinsic.
Reviewers: SjoerdMeijer, efriedma
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78238
This also adds the IsOverloadWhile flag which tells CGBuiltin to use
both the default type (predicate) and the type of the second operand
(scalar) as the overloaded types for the LLMV IR intrinsic.
Reviewers: SjoerdMeijer, efriedma, rovka
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77595
Summary:
We extend the behavior for local functions and methods of local classes
to lambdas in variable initializers. The initializer is not a separate
scope, but we treat it as such.
We also remove the (faulty) instantiation of default arguments in
TreeTransform::TransformLambdaExpr, because it doesn't do proper
initialization, and if it did, we would do it twice (and thus also emit
eventual errors twice).
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D76038
Add the IsOverloadNone flag to tell CGBuiltin that it does not have
an overloaded type. This is used for e.g. svpfalse which does
not take any arguments and always returns a svbool_t.
This patch also adds builtins for svcntb_pat, svcnth_pat, svcntw_pat
and svcntd_pat, as those don't require custom codegen.
Reviewers: SjoerdMeijer, efriedma, rovka
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77596
Summary:
Even when BreakBeforeBinaryOperators is set, AlignOperands kept
aligning the beginning of the line, even when it could align the
actual operands (e.g. after an assignment).
With this patch, there is an option to actually align the operands, so
that the operator gets right-aligned with the equal sign or return
operator:
int aaaaa = bbbbbb
+ cccccc;
return aaaaaaa
&& bbbbbbb;
This not happen in parentheses, to avoid 'breaking' the indentation:
if (aaaaa
&& bbbbb)
return;
Reviewers: krasimir, djasper
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D32478
The ACLE has builtins that take a scalar value that is to be expanded
into a vector by the operation. While the ISA may have an instruction
that takes an immediate or a scalar to represent this, the LLVM IR
intrinsic may not, so Clang will have to splat the scalar value.
This patch also adds the _n forms for svabd, svadd, svdiv, svdivr,
svmax, svmin, svmul, svmulh, svub and svsubr.
Reviewers: SjoerdMeijer, efriedma, rovka
Reviewed By: SjoerdMeijer
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77594
Summary:
Change the default ABI to be compatible with GCC. For 32-bit ELF
targets other than Linux, Clang now returns small structs in registers
r3/r4. This affects FreeBSD, NetBSD, OpenBSD. There is no change for
32-bit Linux, where Clang continues to return all structs in memory.
Add clang options -maix-struct-return (to return structs in memory) and
-msvr4-struct-return (to return structs in registers) to be compatible
with gcc. These options are only for PPC32; reject them on PPC64 and
other targets. The options are like -fpcc-struct-return and
-freg-struct-return for X86_32, and use similar code.
To actually return a struct in registers, coerce it to an integer of the
same size. LLVM may optimize the code to remove unnecessary accesses to
memory, and will return i32 in r3 or i64 in r3:r4.
Fixes PR#40736
Patch by George Koehler!
Reviewed By: jhibbits, nemanjai
Differential Revision: https://reviews.llvm.org/D73290
Summary:
This is achieved by calculating newly added includes and implicitly
parsing them as if they were part of the main file.
This also gets rid of the need for consistent preamble reads.
Reviewers: sammccall
Subscribers: ilya-biryukov, javed.absar, MaskRay, jkorous, mgrang, arphaman, jfb, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77392
This fixes a common mistake (the 3 should be @3): NSNumber *n = 3. This extends
an existing check for NSString. Also, this only errs if the initializer isn't a
null pointer constant, so NSNumber *n = 0; continues to work. rdar://47029572
Differential revision: https://reviews.llvm.org/D78066
This implements zeroing of false lanes for binary operations,
where instead of merging into the first operand vector (_m)
a `select` is placed on the first input vector. This approach
easily translates to the use of the `zeroing movprfx` instruction.
This patch also adds builtins for svabd, svadd, svdiv, svdivr,
svmax, svmin, svmul, svmulh, svub and svsubr.
Reviewers: SjoerdMeijer, efriedma, rovka
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77593
Builtins that have the merge type MergeAnyExp or MergeZeroExp,
merge into a 'undef' or 'zero' vector respectively, which enables the
_x and _z behaviour for unary operations.
This patch also adds builtins for svabs and svneg.
Reviewers: SjoerdMeijer, efriedma, rovka
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77591
Adds another bunch of of intrinsics that take immediates with
varying ranges based, some being a complex rotation immediate
which are a set of allowed immediates rather than a range.
svmla_lane: lane immediate ranging 0..(128/(1*sizeinbits(elt)) - 1)
svcmla_lane: lane immediate ranging 0..(128/(2*sizeinbits(elt)) - 1)
svdot_lane: lane immediate ranging 0..(128/(4*sizeinbits(elt)) - 1)
svcadd: complex rotate immediate [90, 270]
svcmla:
svcmla_lane: complex rotate immediate [0, 90, 180, 270]
Reviewers: efriedma, SjoerdMeijer, rovka
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76680
This patch adds a number of intrinsics that take immediates with
varying ranges based on the element size one of the operands.
svext: immediate ranging 0 to (2048/sizeinbits(elt) - 1)
svasrd: immediate ranging 1..sizeinbits(elt)
svqshlu: immediate ranging 1..sizeinbits(elt)/2
ftmad: immediate ranging 0..(sizeinbits(elt) - 1)
Reviewers: efriedma, SjoerdMeijer, rovka, rengolin
Reviewed By: SjoerdMeijer
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76679
modules too.
This more accurately reflects the semantics of this flag, as distinct
from "IsAvailable", which (in an explicit modules world) only describes
whether a module is buildable, not whether it's importable.
Summary:
This flag has been deprecated, with an on-by-default warning encouraging
users to explicitly specify whether they mean "all" or ubsan for 5 years
(released in Clang 3.7). Change it to mean what we wanted and
undeprecate it.
Also make the argument to -fsanitize-trap optional, and likewise default
it to 'all', and express the aliases for these flags in the .td file
rather than in code. (Plus documentation updates for the above.)
Reviewers: kcc
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77753
This reverts commit 61ba1481e2.
I'm reverting this because it breaks the lldb build with
incomplete switch coverage warnings. I would fix it forward,
but am not familiar enough with lldb to determine the correct
fix.
lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:3958:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]
switch (qual_type->getTypeClass()) {
^
lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:4633:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]
switch (qual_type->getTypeClass()) {
^
lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:4889:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]
switch (qual_type->getTypeClass()) {
Introduction/Motivation:
LLVM-IR supports integers of non-power-of-2 bitwidth, in the iN syntax.
Integers of non-power-of-two aren't particularly interesting or useful
on most hardware, so much so that no language in Clang has been
motivated to expose it before.
However, in the case of FPGA hardware normal integer types where the
full bitwidth isn't used, is extremely wasteful and has severe
performance/space concerns. Because of this, Intel has introduced this
functionality in the High Level Synthesis compiler[0]
under the name "Arbitrary Precision Integer" (ap_int for short). This
has been extremely useful and effective for our users, permitting them
to optimize their storage and operation space on an architecture where
both can be extremely expensive.
We are proposing upstreaming a more palatable version of this to the
community, in the form of this proposal and accompanying patch. We are
proposing the syntax _ExtInt(N). We intend to propose this to the WG14
committee[1], and the underscore-capital seems like the active direction
for a WG14 paper's acceptance. An alternative that Richard Smith
suggested on the initial review was __int(N), however we believe that
is much less acceptable by WG14. We considered _Int, however _Int is
used as an identifier in libstdc++ and there is no good way to fall
back to an identifier (since _Int(5) is indistinguishable from an
unnamed initializer of a template type named _Int).
[0]https://www.intel.com/content/www/us/en/software/programmable/quartus-prime/hls-compiler.html)
[1]http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2472.pdf
Differential Revision: https://reviews.llvm.org/D73967
Summary:
Clang uses 32-bit integers for storing bit offsets from the beginning of
the file that results in 512M limit on AST file. This diff replaces
absolute offsets with relative offsets from the beginning of
corresponding data structure when it is possible. And uses 64-bit
offsets for DeclOffests and TypeOffssts because these coder AST
section may easily exceeds 512M alone.
This diff breaks AST file format compatibility so VERSION_MAJOR bumped.
Test Plan:
Existing clang AST serialization tests
Tested on clangd with ~700M and ~900M preamble files
check-clang with ubsan
Reviewers: rsmith, dexonsmith
Subscribers: ilya-biryukov, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76594
Summary:
Clang uses 32-bit integers for storing bit offsets from the beginning of
the file that results in 512M limit on AST file. This diff replaces
absolute offsets with relative offsets from the beginning of
corresponding data structure when it is possible. And uses 64-bit
offsets for DeclOffests and TypeOffssts because these coder AST
section may easily exceeds 512M alone.
This diff breaks AST file format compatibility so VERSION_MAJOR bumped.
Test Plan:
Existing clang AST serialization tests
Tested on clangd with ~700M and ~900M preamble files
Reviewers: rsmith, dexonsmith
Subscribers: ilya-biryukov, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76594
Summary:
We forgot to initialize the NumExpr member in one of the constructors,
which leads crashes in preamble serialization.
Reviewers: sammccall
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78284
This is needed to fix the reason
0a2be46cfd (Modules: Invalidate out-of-date PCMs as they're
discovered) and 5b44a4b07fc1d ([modules] Do not cache invalid state for
modules that we attempted to load.) were reverted.
These patches changed Clang to use `isVolatile` when loading modules.
This had the side effect of not using mmap when loading modules, and
thus greatly increased memory usage.
The reason it wasn't using mmap is because `MemoryBuffer` plays some
games with file size when you request null termination, and it has to
disable these when `isVolatile` is set as the size may change by the
time it's mmapped. Clang by default passes
`RequiresNullTerminator = true`, and `shouldUseMmap` ignored if
`RequiresNullTerminator` was even requested.
This patch adds `RequiresNullTerminator` to the `FileManager` interface
so Clang can use it when loading modules, and changes `shouldUseMmap` to
only take volatility into account if `RequiresNullTerminator` is true.
This is fine as both `mmap` and a `read` loop are vulnerable to
modifying the file while reading, but are immune to the rename Clang
does when replacing a module file.
Differential Revision: https://reviews.llvm.org/D77772
MSVC appears to instantiate the virtual members of FoldingSet when
instantiating the class definition, thereby requiring the element type
to be defined so that its hash function is known.
This is intended to be a temporary fix; ideally, FoldingSet should not
require this.
Summary:
Previously, we treated CXXUuidofExpr as quite a special case: it was the
only kind of expression that could be a canonical template argument, it
could be a constant lvalue base object, and so on. In addition, we
represented the UUID value as a string, whose source form we did not
preserve faithfully, and that we partially parsed in multiple different
places.
With this patch, we create an MSGuidDecl object to represent the
implicit object of type 'struct _GUID' created by a UuidAttr. Each
UuidAttr holds a pointer to its 'struct _GUID' and its original
(as-written) UUID string. A non-value-dependent CXXUuidofExpr behaves
like a DeclRefExpr denoting that MSGuidDecl object. We cache an APValue
representation of the GUID on the MSGuidDecl and use it from constant
evaluation where needed.
This allows removing a lot of the special-case logic to handle these
expressions. Unfortunately, many parts of Clang assume there are only
a couple of interesting kinds of ValueDecl, so the total amount of
special-case logic is not really reduced very much.
This fixes a few bugs and issues:
* PR38490: we now support reading from GUID objects returned from
__uuidof during constant evaluation.
* Our Itanium mangling for a non-instantiation-dependent template
argument involving __uuidof no longer depends on which CXXUuidofExpr
template argument we happened to see first.
* We now predeclare ::_GUID, and permit use of __uuidof without
any header inclusion, better matching MSVC's behavior. We do not
predefine ::__s_GUID, though; that seems like a step too far.
* Our IR representation for GUID constants now uses the correct IR type
wherever possible. We will still fall back to using the
{i32, i16, i16, [8 x i8]}
layout if a definition of struct _GUID is not available. This is not
ideal: in principle the two layouts could have different padding.
Reviewers: rnk, jdoerfert
Subscribers: arphaman, cfe-commits, aeubanks
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78171
Summary:
This patch adds support for importing fixed point literals, following
up to https://reviews.llvm.org/D46915 specifically for importing AST.
Reviewers: martong, leonardchan, ebevhan, a.sidorin, shafik
Reviewed By: martong
Subscribers: balazske, rnkovacs, teemperor, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77721
Summary:
This patch adds the EXPR_FIXEDPOINT_LITERAL AST
code to serialize FixedPointLiterals. They were previously
being serialized with the code for integer literals, which
doesn't work properly.
Reviewers: leonardchan, rjmccall
Reviewed By: leonardchan, rjmccall
Subscribers: vabridgers, JesperAntonsson, bjope, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D57226
Summary:
This patch adds a mechanism to easily add range checks for a builtin's
immediate operands. This patch is tested with the qdech intrinsic, which takes
both an enum for the predicate pattern, as well as an immediate for the
multiplier.
Reviewers: efriedma, SjoerdMeijer, rovka
Reviewed By: efriedma, SjoerdMeijer
Subscribers: mgorny, tschuett, mgrang, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76678
Summary:
This issue was introduced when reworking D75861. The bug isn't
actually hit with current unit tests because the contiguous loads/stores
infer the EltType and the MemEltType from the pointer and result, rather
than using the flags. But it will be needed for other intrinsics, such as
gather/scatter.
Reviewers: SjoerdMeijer, Andrzej
Reviewed By: SjoerdMeijer
Subscribers: andwar, tschuett, cfe-commits, llvm-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76617
This adds builtins for all contiguous loads/stores, including
non-temporal, first-faulting and non-faulting.
Reviewers: efriedma, SjoerdMeijer
Reviewed By: SjoerdMeijer
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76238
It can be used to avoid passing the begin and end of a range.
This makes the code shorter and it is consistent with another
wrappers we already have.
Differential revision: https://reviews.llvm.org/D78016
I didn't realize HIP was a distinct offloading kind, so the subtarget
was looking for -march, which isn't correct for HIP. We also have the
possibility of different denormal defaults in the case of multiple
offload targets, so we need to thread the JobAction through the target
hook.
Summary:
Use spaces instead of tabs for alignment with UT_ForContinuationAndIndentation to make the code aligned for any tab/indent width.
Fixes https://bugs.llvm.org/show_bug.cgi?id=38381
Reviewed By: MyDeveloperDay
Patch By: fickert
Tags: #clang-format
Differential Revision: https://reviews.llvm.org/D75034
Summary:
This commit adds two command-line options to clang.
These options let the user decide which functions will receive SanitizerCoverage instrumentation.
This is most useful in the libFuzzer use case, where it enables targeted coverage-guided fuzzing.
Patch by Yannis Juglaret of DGA-MI, Rennes, France
libFuzzer tests its target against an evolving corpus, and relies on SanitizerCoverage instrumentation to collect the code coverage information that drives corpus evolution. Currently, libFuzzer collects such information for all functions of the target under test, and adds to the corpus every mutated sample that finds a new code coverage path in any function of the target. We propose instead to let the user specify which functions' code coverage information is relevant for building the upcoming fuzzing campaign's corpus. To this end, we add two new command line options for clang, enabling targeted coverage-guided fuzzing with libFuzzer. We see targeted coverage guided fuzzing as a simple way to leverage libFuzzer for big targets with thousands of functions or multiple dependencies. We publish this patch as work from DGA-MI of Rennes, France, with proper authorization from the hierarchy.
Targeted coverage-guided fuzzing can accelerate bug finding for two reasons. First, the compiler will avoid costly instrumentation for non-relevant functions, accelerating fuzzer execution for each call to any of these functions. Second, the built fuzzer will produce and use a more accurate corpus, because it will not keep the samples that find new coverage paths in non-relevant functions.
The two new command line options are `-fsanitize-coverage-whitelist` and `-fsanitize-coverage-blacklist`. They accept files in the same format as the existing `-fsanitize-blacklist` option <https://clang.llvm.org/docs/SanitizerSpecialCaseList.html#format>. The new options influence SanitizerCoverage so that it will only instrument a subset of the functions in the target. We explain these options in detail in `clang/docs/SanitizerCoverage.rst`.
Consider now the woff2 fuzzing example from the libFuzzer tutorial <https://github.com/google/fuzzer-test-suite/blob/master/tutorial/libFuzzerTutorial.md>. We are aware that we cannot conclude much from this example because mutating compressed data is generally a bad idea, but let us use it anyway as an illustration for its simplicity. Let us use an empty blacklist together with one of the three following whitelists:
```
# (a)
src:*
fun:*
# (b)
src:SRC/*
fun:*
# (c)
src:SRC/src/woff2_dec.cc
fun:*
```
Running the built fuzzers shows how many instrumentation points the compiler adds, the fuzzer will output //XXX PCs//. Whitelist (a) is the instrument-everything whitelist, it produces 11912 instrumentation points. Whitelist (b) focuses coverage to instrument woff2 source code only, ignoring the dependency code for brotli (de)compression; it produces 3984 instrumented instrumentation points. Whitelist (c) focuses coverage to only instrument functions in the main file that deals with WOFF2 to TTF conversion, resulting in 1056 instrumentation points.
For experimentation purposes, we ran each fuzzer approximately 100 times, single process, with the initial corpus provided in the tutorial. We let the fuzzer run until it either found the heap buffer overflow or went out of memory. On this simple example, whitelists (b) and (c) found the heap buffer overflow more reliably and 5x faster than whitelist (a). The average execution times when finding the heap buffer overflow were as follows: (a) 904 s, (b) 156 s, and (c) 176 s.
We explain these results by the fact that WOFF2 to TTF conversion calls the brotli decompression algorithm's functions, which are mostly irrelevant for finding bugs in WOFF2 font reconstruction but nevertheless instrumented and used by whitelist (a) to guide fuzzing. This results in longer execution time for these functions and a partially irrelevant corpus. Contrary to whitelist (a), whitelists (b) and (c) will execute brotli-related functions without instrumentation overhead, and ignore new code paths found in them. This results in faster bug finding for WOFF2 font reconstruction.
The results for whitelist (b) are similar to the ones for whitelist (c). Indeed, WOFF2 to TTF conversion calls functions that are mostly located in SRC/src/woff2_dec.cc. The 2892 extra instrumentation points allowed by whitelist (b) do not tamper with bug finding, even though they are mostly irrelevant, simply because most of these functions do not get called. We get a slightly faster average time for bug finding with whitelist (b), which might indicate that some of the extra instrumentation points are actually relevant, or might just be random noise.
Reviewers: kcc, morehouse, vitalybuka
Reviewed By: morehouse, vitalybuka
Subscribers: pratyai, vitalybuka, eternalsakura, xwlin222, dende, srhines, kubamracek, #sanitizers, lebedev.ri, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D63616
Currently the library is separately linked, but this isn't correct to
implement fast math flags correctly. Each module should get the
version of the library appropriate for its combination of fast math
and related flags, with the attributes propagated into its functions
and internalized.
HIP already maintains the list of libraries, but this is not used for
OpenCL. Unfortunately, HIP uses a separate --hip-device-lib argument,
despite both languages using the same bitcode library. Eventually
these two searches need to be merged.
An additional problem is there are 3 different locations the libraries
are installed, depending on which build is used. This also needs to be
consolidated (or at least the search logic needs to deal with this
unnecessary complexity).
This improves the message by adding the missing 'x' prefix in register
names, such that the messages say for example 'Reserve the x10 register',
instead of 'Reserve the 10 register'.
This replaces the ChildrenGetter inside the DominatorTree with
GraphTraits over a GraphDiff object, an object which encapsulated the
view of the previous CFG.
This also simplifies the extentions in clang which use DominatorTree, as
GraphDiff also filters nullptrs.
Re-land a90374988e after moving CFGDiff.h
to Support.
Differential Revision: https://reviews.llvm.org/D77341
This reverts commit a90374988e and 5da1671bf8.
A new dependency is introduced here from Support to IR which seems like
a layering violation. It also breaks the MLIR build at the moment.
Summary:
This replaces the ChildrenGetter inside the DominatorTree with
GraphTraits over a GraphDiff object, an object which encapsulated the
view of the previous CFG.
This also simplifies the extentions in clang which use DominatorTree, as
GraphDiff also filters nullptrs.
Reviewers: kuhar, dblaikie, NutshellySima
Subscribers: hiraditya, cfe-commits, llvm-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77341
In the MS C++ ABI, the complete destructor variant for a class with
virtual bases is emitted whereever it is needed, instead of directly
alongside the base destructor variant. The complete destructor calls the
base destructor of the current class and the base destructors of each
virtual base. In order for this to work reliably, translation units that
use the destructor of a class also need to mark destructors of virtual
bases of that class used.
Fixes PR38521
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D77081
This adds support for enabling experimental/unratified RISC-V ISA
extensions in the -march string in the case where an explicit version
number has been declared, and the -menable-experimental-extensions flag
has been provided.
This follows the design as discussed on the mailing lists in the
following RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-January/138364.html
Since the RISC-V toolchain definition currently rejects any extension
with an explicit version number, the parsing logic has been tweaked to
support this, and to allow standard extensions to have their versions
checked in future patches.
The bitmanip 'b' extension has been added as a first use of this support,
it should easily extend to other as yet unratified extensions (such as
the vector 'v' extension).
Differential Revision: https://reviews.llvm.org/D73891
Exactly what it says on the tin! The included testfile demonstrates why this is
important -- for C++ dynamic memory operators, we don't always recognize custom,
or even standard-specified new/delete operators as CXXAllocatorCall or
CXXDeallocatorCall.
Differential Revision: https://reviews.llvm.org/D77391
Exactly what it says on the tin! There is no reason I think not to have this.
Also, I added test files for checkers that emit warning under the wrong name.
Differential Revision: https://reviews.llvm.org/D76605
Summary: Requires hasCastKind arguments to have `CK_` prefixed to bring it in line with the documentation and other matchers that take enumerations.
Reviewers: klimek, aaron.ballman
Reviewed By: aaron.ballman
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77503
Summary:
The option `-mpad-max-prefix-size` performs some checking and delegate to MC option `-x86-pad-max-prefix-size`. This option is designed for eliminate NOPs when we need to align something by adding redundant prefixes to instructions, e.g. it can be used along with `-malign-branch`, `-malign-branch-boundary` to prefix padding branch.
It has similar (but slightly different) effect as GAS's option `-malign-branch-prefix-size`, e.g. `-mpad-max-prefix-size` can also elminate NOPs emitted by align directive, so we use a different name here. I remove the option `-malign-branch-prefix-size` since is unimplemented and not needed. If we need to be compatible with GAS, we can make `-malign-branch-prefix-size` an alias for this option later.
Reviewers: jyknight, reames, MaskRay, craig.topper, LuoYuanke
Reviewed By: MaskRay, LuoYuanke
Subscribers: annita.zhang, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77628
Now compiler defines 5 sets of constants to represent rounding mode.
These are:
1. `llvm::APFloatBase::roundingMode`. It specifies all 5 rounding modes
defined by IEEE-754 and is used in `APFloat` implementation.
2. `clang::LangOptions::FPRoundingModeKind`. It specifies 4 of 5 IEEE-754
rounding modes and a special value for dynamic rounding mode. It is used
in clang frontend.
3. `llvm::fp::RoundingMode`. Defines the same values as
`clang::LangOptions::FPRoundingModeKind` but in different order. It is
used to specify rounding mode in in IR and functions that operate IR.
4. Rounding mode representation used by `FLT_ROUNDS` (C11, 5.2.4.2.2p7).
Besides constants for rounding mode it also uses a special value to
indicate error. It is convenient to use in intrinsic functions, as it
represents platform-independent representation for rounding mode. In this
role it is used in some pending patches.
5. Values like `FE_DOWNWARD` and other, which specify rounding mode in
library calls `fesetround` and `fegetround`. Often they represent bits
of some control register, so they are target-dependent. The same names
(not values) and a special name `FE_DYNAMIC` are used in
`#pragma STDC FENV_ROUND`.
The first 4 sets of constants are target independent and could have the
same numerical representation. It would simplify conversion between the
representations. Also now `clang::LangOptions::FPRoundingModeKind` and
`llvm::fp::RoundingMode` do not contain the value for IEEE-754 rounding
direction `roundTiesToAway`, although it is supported natively on
some targets.
This change defines all the rounding mode type via one `llvm::RoundingMode`,
which also contains rounding mode for IEEE rounding direction `roundTiesToAway`.
Differential Revision: https://reviews.llvm.org/D77379
Generate PTX using newer versions of PTX and allow using sm_80 with CUDA-11.
None of the new features of CUDA-10.2+ have been implemented yet, so using these
versions will still produce a warning.
Differential Revision: https://reviews.llvm.org/D77670
Instead of hardcoding individual GPU mappings in multiple functions, keep them
all in one table and use it to look up the mappings.
We also don't care about 'virtual' architecture much, so the API is trimmed down
down to a simpler GPU->Virtual arch name lookup.
Differential Revision: https://reviews.llvm.org/D77665
Summary:
Previously, clang emitted a less-usefull diagnostic and didnt recover
well when the keywords is used as identifier in function paramter.
```
void foo(int case, int x); // previously we drop all parameters after
`int case`.
```
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77633
Summary:
This revision simplifies the representation of edits in rewrite rules. The
simplified form is more general, allowing the user more flexibility in building
custom edit specifications.
The changes extend the API, without changing the signature of existing
functions. So this only risks breaking users that directly accessed the
`RewriteRule` struct.
Reviewers: gribozavr2
Subscribers: jfb, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77419
By default, all traits in the OpenMP context selector have to match for
it to be acceptable. Though, we sometimes want a single property out of
multiple to match (=any) or no match at all (=none). We offer these
choices as extensions via
`implementation={extension(match_{all,any,none})}`
to the user. The choice will affect the entire context selector not only
the traits following the match property.
The first user will be D75788. There we can replace
```
#pragma omp begin declare variant match(device={arch(nvptx64)})
#define __CUDA__
#include <__clang_cuda_cmath.h>
// TODO: Hack until we support an extension to the match clause that allows "or".
#undef __CLANG_CUDA_CMATH_H__
#undef __CUDA__
#pragma omp end declare variant
#pragma omp begin declare variant match(device={arch(nvptx)})
#define __CUDA__
#include <__clang_cuda_cmath.h>
#undef __CUDA__
#pragma omp end declare variant
```
with the much simpler
```
#pragma omp begin declare variant match(device={arch(nvptx, nvptx64)}, implementation={extension(match_any)})
#define __CUDA__
#include <__clang_cuda_cmath.h>
#undef __CUDA__
#pragma omp end declare variant
```
Reviewed By: mikerice
Differential Revision: https://reviews.llvm.org/D77414
If we have a function definition in `omp begin/end declare variant` it
is a specialization of a base function with the same name and
"compatible" type. Before, we just created a declaration for the base.
With this patch we try to find an existing declaration first and only
create a new one if we did not find any with a compatible type. This is
preferable as we can tolerate slight mismatches, especially if the
specialized version is "more constrained", e.g., constexpr.
Reviewed By: mikerice
Differential Revision: https://reviews.llvm.org/D77252
Implemented codegen for the iterator expression in the depend clauses.
Iterator construct is emitted the following way:
iterator(cnt1, cnt2, ...), in : <dep>
<TotalNumDeps> = <cnt1_size> * <cnt2_size> * ...;
kmp_depend_t deps[<TotalNumDeps>];
deps_counter = 0;
for (cnt1) {
for (cnt2) {
...
deps[deps_counter].base_addr = &<dep>;
deps[deps_counter].size = sizeof(<dep>);
deps[deps_counter].flags = in;
deps_counter += 1;
...
}
}
For depobj construct the codegen is very similar, but the memory is
allocated dynamically and added extra first item reserved for internal use.
WG14 has adopted N2480 (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2480.pdf)
into C2x at the meetings last week, allowing parameter names of a function
definition to be elided. This patch relaxes the error so that C++ and C2x do not
diagnose this situation, and modes before C2x will allow it as an extension.
This also adds the same feature to ObjC blocks under the assumption that ObjC
wishes to follow the C standard in this regard.
Summary:
- Use `device_builtin_surface` and `device_builtin_texture` for
surface/texture reference support. So far, both the host and device
use the same reference type, which could be revised later when
interface/implementation is stablized.
Reviewers: yaxunl
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77583
Summary:
Same restrictions apply as in the other direction: macro arguments are
not supported yet, only full macro expansions can be mapped.
Taking over from https://reviews.llvm.org/D72581.
Reviewers: gribozavr2, sammccall
Reviewed By: gribozavr2
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77209
constructor with default arguments.
We used to try to rebuild the call as a call to the faked-up inherited
constructor, which is only a placeholder and lacks (for example) default
arguments. Instead, build the call by reference to the original
constructor.
In passing, add a note to say where a call that recursively uses a
default argument from within itself occurs. This is usually pretty
obvious, but still at least somewhat useful, and would have saved
significant debugging time for this particular bug.
Zero sized bit-fields aren't included in the CGRecordLayout, so we shouldn't be
calling EmitLValueForField for them. rdar://60695105
Differential revision: https://reviews.llvm.org/D76782
Summary:
This adds support for giving hints when using dynamic matchers with enum args. Take this query, I couldn't figure out why the matcher wasn't working:
(Turns out it needed to be "IntegralToBoolean", but thats another bug itself)
```
clang-query> match implicitCastExpr(hasCastKind("CK_IntegralToBoolean"))
1:1: Error parsing argument 1 for matcher implicitCastExpr.
1:18: Error building matcher hasCastKind.
1:30: Incorrect type for arg 1. (Expected = string) != (Actual = String)
```
With this patch the new behaviour looks like this:
```
clang-query> match implicitCastExpr(hasCastKind("CK_IntegralToBoolean"))
1:1: Error parsing argument 1 for matcher implicitCastExpr.
1:18: Error building matcher hasCastKind.
1:30: Unknown value 'CK_IntegralToBoolean' for arg 1; did you mean 'IntegralToBoolean'
```
There are no test cases for this yet as there simply isn't any infrastructure for testing errors reported when parsing args that I can see.
Reviewers: klimek, jdoerfert, aaron.ballman
Reviewed By: aaron.ballman
Subscribers: aaron.ballman, dexonsmith, mgorny, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77499
Detect script locations in a more straightforward way: we don't need to
search for them because we know exactly where they are anyway.
Fix a file path escaping issue in exploded-graph-rewriter with Windows
backslashes in the path.
'REQUIRES: shell' remains in scan-build tests for now, so that to
observe the buildbot reaction on removing it in a cleaner experiment.
Patch by Denys Petrov!
Differential Revision: https://reviews.llvm.org/D76768
Saves only 36 includes of ASTContext.h and related headers.
There are two deps on ASTContext.h:
- C++ method overrides iterator types (TinyPtrVector)
- getting LangOptions
For #1, duplicate the iterator type, which is
TinyPtrVector<>::const_iterator.
For #2, add an out-of-line accessor to get the language options. Getting
the ASTContext from a Decl is already an out of line method that loops
over the parent DeclContexts, so if it is ever performance critical, the
proper fix is to pass the context (or LangOpts) into the predicate in
question.
Other changes are just header fixups.
Move function emitDeferredDiags from Sema to DeferredDiagsEmitter since it
is only used by DeferredDiagsEmitter.
Also skip visited functions to avoid exponential compile time.
Differential Revision: https://reviews.llvm.org/D77028
in the token stream.
Previously we deleted all template-id annotations at the end of each
top-level declaration. That doesn't work: we can do some lookahead and
form a template-id annotation, and then roll back that lookahead, parse,
and decide that we're missing a semicolon at the end of a top-level
declaration, before we reach the annotation token. In that situation,
we'd end up parsing the annotation token after deleting its associated
data, leading to various forms of badness.
We now only delete template-id annotations if the preprocessor can
assure us that there are no annotation tokens left in the token stream
(or if we're already at EOF). This lets us delete the annotation tokens
earlier in a lot of cases; we now clean them up at the end of each
statement and class member, not just after each top-level declaration.
This also permitted some simplification of the delay-parsed templates
cleanup code.
Move the listing of allowed clauses per OpenMP directive to the new
macro file in `llvm/Frontend/OpenMP`. Also, use a single generic macro
that specifies the directive and one allowed clause explicitly instead
of a dedicated macro per directive.
We save 800 loc and boilerplate for all new directives/clauses with no
functional change. We also need to include the macro file only once and
not once per directive.
Depends on D77112.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D77113
This is a cleanup and normalization patch that also enables reuse with
Flang later on. A follow up will clean up and move the directive ->
clauses mapping.
Reviewed By: fghanim
Differential Revision: https://reviews.llvm.org/D77112
as invalid.
We create those when forming trivial type source information with no
associated location, which, unfortunately, we do create in some cases
(when a TreeTransform with no base location is used to transform a
QualType).
This would previously lead to rejects-valid bugs when we misinterpreted
these constructs as having no nested-name-specifier.
memchr consistent and comprehensible, and document them.
We previously allowed evaluation of memcmp on arrays of integers of any
size, so long as the call evaluated to 0, and allowed evaluation of
memchr on any array of integral type of size 1 (including enums). The
purpose of constant-evaluating these builtins is only to support
constexpr std::char_traits, so we now consistently allow them on arrays
of (possibly signed or unsigned) char only.
When a category/extension doesn't repeat a type bound, corresponding
type parameter is substituted with `id` when used as a type argument. As
a result, in the added test case it was causing errors like
> type argument 'T' (aka 'id') does not satisfy the bound ('id<NSCopying>') of type parameter 'T'
We are already checking that type parameters should be consistent
everywhere (see `checkTypeParamListConsistency`) and update
`ObjCTypeParamDecl` to have correct underlying type. And when we use the
type parameter as a method return type or a method parameter type, it is
substituted to the bounded type. But when we use the type parameter as a
type argument, we check `ObjCTypeParamType` that wasn't updated and
remains `id`.
Fix by updating not only `ObjCTypeParamDecl` UnderlyingType but also
TypeForDecl as we use the underlying type to create a canonical type for
`ObjCTypeParamType` (see `ASTContext::getObjCTypeParamType`).
This is a different approach to fixing the issue. The previous one was
02c2ab3d88 which was reverted in
4c539e8da1. The problem with the previous
approach was that `ObjCTypeParamType::desugar` was returning underlying
type for `ObjCTypeParamDecl` without applying any protocols stored in
`ObjCTypeParamType`. It caused inconsistencies in comparing types before
and after desugaring.
rdar://problem/54329242
Reviewed By: erik.pilkington
Differential Revision: https://reviews.llvm.org/D72872
See rational here: https://reviews.llvm.org/D76173#1922916
Time to compile Attr.h in isolation goes from 2.6s to 1.8s.
Original patch by Johannes, plus some additions from Reid to fix some
clang tooling targets.
Effect on transitive includes is marginal, though:
$ diff -u <(sort thedeps-before.txt) <(sort thedeps-after.txt) \
| grep '^[-+] ' | sort | uniq -c | sort -nr
104 - /usr/local/google/home/rnk/llvm-project/clang/include/clang/AST/OpenMPClause.h
87 - /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/Frontend/OpenMP/OMPContext.h
19 - /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/ADT/SmallSet.h
19 - /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/ADT/SetVector.h
14 - /usr/include/c++/9/set
...
Differential Revision: https://reviews.llvm.org/D76184
Adds a new data structure, ImmutableGraph, and uses RDF to find LVI gadgets and add them to a MachineGadgetGraph.
More specifically, a new X86 machine pass finds Load Value Injection (LVI) gadgets consisting of a load from memory (i.e., SOURCE), and any operation that may transmit the value loaded from memory over a covert channel, or use the value loaded from memory to determine a branch/call target (i.e., SINK).
Also adds a new target feature to X86: +lvi-load-hardening
The feature can be added via the clang CLI using -mlvi-hardening.
Differential Revision: https://reviews.llvm.org/D75936
Summary:
- `RegisterVar` has `void` return type and `size_t` in its variable size
parameter in HIP or CUDA 9.0+.
Reviewers: tra, yaxunl
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77398
Summary:
This matches llvm::VectorType.
It moves the size from the type bitfield into VectorType, increasing size by 8
bytes (including padding of 4). This is OK as we don't expect to create terribly
many of these types.
c.f. D77313 which enables large power-of-two sizes without growing VectorType.
Reviewers: efriedma, hokein
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77335
This pass replaces each indirect call/jump with a direct call to a thunk that looks like:
lfence
jmpq *%r11
This ensures that if the value in register %r11 was loaded from memory, then
the value in %r11 is (architecturally) correct prior to the jump.
Also adds a new target feature to X86: +lvi-cfi
("cfi" meaning control-flow integrity)
The feature can be added via clang CLI using -mlvi-cfi.
This is an alternate implementation to https://reviews.llvm.org/D75934 That merges the thunk insertion functionality with the existing X86 retpoline code.
Differential Revision: https://reviews.llvm.org/D76812
Summary:
Previously we induced a state split if there were multiple argument
constraints given for a function. This was because we called
`addTransition` inside the for loop.
The fix is to is to store the state and apply the next argument
constraint on that. And once the loop is finished we call `addTransition`.
Reviewers: NoQ, Szelethus, baloghadamsoftware
Subscribers: whisperity, xazax.hun, szepet, rnkovacs, a.sidorin, mikhail.ramalho, donat.nagy, dkrupp, gamesh411, C
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76790
Summary:
As defined by Arm C Language Extensions (ACLE) these macro defines
should be set to specific values depending on -mbranch-protection.
Reviewers: chill
Reviewed By: chill
Subscribers: danielkiss, cfe-commits, kristof.beyls
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77134
This is a cleanup and normalization patch that also enables reuse with
Flang later on. A follow up will clean up and move the directive ->
clauses mapping.
Differential Revision: https://reviews.llvm.org/D77112
This API is used by LLDB to attach owning module information to
Declarations deserialized from DWARF.
Differential Revision: https://reviews.llvm.org/D75561
A question came up from a glibc maintainer as to whether it was permissible to
free a pointer marked [[clang::noescape]], and after investigation, I
determined that this is allowed. This updates the documentation in case others
have the same question.
Summary:
Swift would like to use clang's apis to emit protocol declarations.
This commits adds the public API:
```
emitObjCProtocolObject(CodeGenModule &CGM, const ObjCProtocolDecl *p);
```
rdar://60888524
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77077
Prior to this change the clang interface stubs format resembled
something ending with a symbol list like this:
Symbols:
a: { Type: Func }
This was problematic because we didn't actually want a map format and
also because we didn't like that an empty symbol list required
"Symbols: {}". That is to say without the empty {} llvm-ifs would crash
on an empty list.
With this new format it is much more clear which field is the symbol
name, and instead the [] that is used to express an empty symbol vector
is optional, ie:
Symbols:
- { Name: a, Type: Func }
or
Symbols: []
or
Symbols:
This further diverges the format from existing llvm-elftapi. This is a
good thing because although the format originally came from the same
place, they are not the same in any way.
Differential Revision: https://reviews.llvm.org/D76979
The driver enables -fdiagnostics-show-option by default, so flip the CC1
default to reduce the lengths of common CC1 command lines.
This change also makes ParseDiagnosticArgs() consistently enable
-fdiagnostics-show-option by default.
adb290d974 added a new case to
err_atomic_specifier_bad_type. The diagnostic has two %select's
controlled by the same argument, but only the first was updated to have
the new case. Add the extra case for the second %select and add a
test case that exercises the last case.
Per the documentation, this class is supposed to forward every virtual
method, but we had missed on (shouldEraseOutputFiles). This fixes using
a wrapped frontend action over the PCH generator when using
-fallow-pch-with-compiler-errors. I do not think any upstream wrapper
actions can test this.
Differential Revision: https://reviews.llvm.org/D77180
rdar://61110294
Since GlobalISel is maturing and is already on at -O0 for AArch64, it's not
completely "experimental". Create a more appropriate driver flag and make
the older option an alias for it.
Differential Revision: https://reviews.llvm.org/D77103
Summary:
CGProfilePass is run by default in certain new pass manager optimization pipeline. Assemblers other than llvm as (such as gnu as) cannot recognize the .cgprofile entries generated and emitted from this pass, causing build time error.
This patch adds new options in clang CodeGenOpts and PassBuilder options so that we can turn cgprofile off when not using integrated assembler.
Reviewers: Bigcheese, xur, george.burgess.iv, chandlerc, manojgupta
Reviewed By: manojgupta
Subscribers: manojgupta, void, hiraditya, dexonsmith, llvm-commits, tcwang, llozano
Tags: #llvm, #clang
Differential Revision: https://reviews.llvm.org/D62627
Summary:
The basic idea is to walk through the concept definition, looking for
t.foo() where t has the constrained type.
In this patch:
- nested types are recognized and offered after ::
- variable/function members are recognized and offered after the correct
dot/arrow/colon trigger
- member functions are recognized (anything directly called). parameter
types are presumed to be the argument types. parameters are unnamed.
- result types are available when a requirement has a type constraint.
These are printed as constraints, except same_as<T> which prints as T.
Not in this patch:
- support for merging/overloading when two locations describe the same member.
The last one wins, for any given name. This is probably important...
- support for nested template members (T::x<int>)
- support for completing members of (instantiations of) template template parameters
Reviewers: nridge, saar.raz
Subscribers: mgrang, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73649
Summary:
Added basic representation and parsing/sema handling of array-shaping
operations. Array shaping expression is an expression of form ([s0]..[sn])base,
where s0, ..., sn must be a positive integer, base - a pointer. This
expression is a kind of cast operation that converts pointer expression
into an array-like kind of expression.
Reviewers: rjmccall, rsmith, jdoerfert
Subscribers: guansong, arphaman, cfe-commits, caomhin, kkwli0
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74144
Summary: We mark these decls as invalid.
Reviewers: sammccall
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77037
Iterator checkers (and planned container checkers) need the option
aggressive-binary-operation-simplification to be enabled. Without this
option they may cause assertions. To prevent such misuse, this patch adds
a preventive check which issues a warning and denies the registartion of
the checker if this option is disabled.
Differential Revision: https://reviews.llvm.org/D75171
The __builtin_stdarg_start is the legacy spelling of __builtin_va_start.
It should behave exactly the same, but for the last 9 years it would
behave subtly different for diagnostics. Follow the change from
29ad95b232 to require custom type checking.
The main purpose of introducing these builtins is to add a range
metadata [1, 1025) on the work group size loaded from dispatch
ptr, which cannot be done by source code.
Differential Revision: https://reviews.llvm.org/D76772
scope.
There are a few contexts in which we assume a name is a template name;
if such a context is one where we should perform an unqualified lookup,
and lookup finds nothing, we would form a dependent template name even
if the name is not dependent. This happens in particular for the lookup
of a pseudo-destructor.
In passing, rename ActOnDependentTemplateName to just ActOnTemplateName
given that we apply it for non-dependent template names too.
Instead of bailing out of parsing when we encounter an invalid
template-name or template arguments in a template-id, produce an
annotation token describing the invalid construct.
This avoids duplicate errors and generally allows us to recover better.
In principle we should be able to extend this to store some kinds of
invalid template-id in the AST for tooling use, but that isn't handled
as part of this change.
Summary:
This makes it easier/safer to add bits (error) to other node types without
worrying about bit layout all the time.
For now, just use to implement the ad-hoc conversion functions.
Next: remove these functions and use this directly.
Reviewers: hokein
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76939
Some checkers may not only depend on language options but also analyzer options.
To make this possible this patch changes the parameter of the shouldRegister*
function to CheckerManager to be able to query the analyzer options when
deciding whether the checker should be registered.
Differential Revision: https://reviews.llvm.org/D75271
This is the second part loosely extracted from D71179 and cleaned up.
This patch provides semantic analysis support for `omp begin/end declare
variant`, mostly as defined in OpenMP technical report 8 (TR8) [0].
The sema handling makes code generation obsolete as we generate "the
right" calls that can just be handled as usual. This handling also
applies to the existing, albeit problematic, `omp declare variant
support`. As a consequence a lot of unneeded code generation and
complexity is removed.
A major purpose of this patch is to provide proper `math.h`/`cmath`
support for OpenMP target offloading. See PR42061, PR42798, PR42799. The
current code was developed with this feature in mind, see [1].
The logic is as follows:
If we have seen a `#pragma omp begin declare variant match(<SELECTOR>)`
but not the corresponding `end declare variant`, and we find a function
definition we will:
1) Create a function declaration for the definition we were about to generate.
2) Create a function definition but with a mangled name (according to
`<SELECTOR>`).
3) Annotate the declaration with the `OMPDeclareVariantAttr`, the same
one used already for `omp declare variant`, using and the mangled
function definition as specialization for the context defined by
`<SELECTOR>`.
When a call is created we inspect it. If the target has an
`OMPDeclareVariantAttr` attribute we try to specialize the call. To this
end, all variants are checked, the best applicable one is picked and a
new call to the specialization is created. The new call is used instead
of the original one to the base function. To keep the AST printing and
tooling possible we utilize the PseudoObjectExpr. The original call is
the syntactic expression, the specialized call is the semantic
expression.
[0] https://www.openmp.org/wp-content/uploads/openmp-TR8.pdf
[1] https://reviews.llvm.org/D61399#change-496lQkg0mhRN
Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim, aaron.ballman
Subscribers: bollu, guansong, openmp-commits, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75779
This is the first part extracted from D71179 and cleaned up.
This patch provides parsing support for `omp begin/end declare variant`,
as defined in OpenMP technical report 8 (TR8) [0].
A major purpose of this patch is to provide proper math.h/cmath support
for OpenMP target offloading. See PR42061, PR42798, PR42799. The current
code was developed with this feature in mind, see [1].
[0] https://www.openmp.org/wp-content/uploads/openmp-TR8.pdf
[1] https://reviews.llvm.org/D61399#change-496lQkg0mhRN
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D74941
Summary:
- Even though the bindless surface/texture interfaces are promoted,
there are still code using surface/texture references. For example,
[PR#26400](https://bugs.llvm.org/show_bug.cgi?id=26400) reports the
compilation issue for code using `tex2D` with texture references. For
better compatibility, this patch proposes the support of
surface/texture references.
- Due to the absent documentation and magic headers, it's believed that
`nvcc` does use builtins for texture support. From the limited NVVM
documentation[^nvvm] and NVPTX backend texture/surface related
tests[^test], it's believed that surface/texture references are
supported by replacing their reference types, which are annotated with
`device_builtin_surface_type`/`device_builtin_texture_type`, with the
corresponding handle-like object types, `cudaSurfaceObject_t` or
`cudaTextureObject_t`, in the device-side compilation. On the host
side, that global handle variables are registered and will be
established and updated later when corresponding binding/unbinding
APIs are called[^bind]. Surface/texture references are most like
device global variables but represented in different types on the host
and device sides.
- In this patch, the following changes are proposed to support that
behavior:
+ Refine `device_builtin_surface_type` and
`device_builtin_texture_type` attributes to be applied on `Type`
decl only to check whether a variable is of the surface/texture
reference type.
+ Add hooks in code generation to replace that reference types with
the correponding object types as well as all accesses to them. In
particular, `nvvm.texsurf.handle.internal` should be used to load
object handles from global reference variables[^texsurf] as well as
metadata annotations.
+ Generate host-side registration with proper template argument
parsing.
---
[^nvvm]: https://docs.nvidia.com/cuda/pdf/NVVM_IR_Specification.pdf
[^test]: https://raw.githubusercontent.com/llvm/llvm-project/master/llvm/test/CodeGen/NVPTX/tex-read-cuda.ll
[^bind]: See section 3.2.11.1.2 ``Texture reference API` in [CUDA C Programming Guide](https://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf).
[^texsurf]: According to NVVM IR, `nvvm.texsurf.handle` should be used. But, the current backend doesn't have that supported. We may revise that later.
Reviewers: tra, rjmccall, yaxunl, a.sidorin
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76365
This reverts commit 0788acbccb.
This reverts commit c2d7a1f79cedfc9fcb518596aa839da4de0adb69: Revert "[clangd] Add test for FindTarget+RecoveryExpr (which already works). NFC"
It causes a crash on invalid code:
class X {
decltype(unresolved()) foo;
};
constexpr int s = sizeof(X);
Originally commited in rG57b8a407493c34c3680e7e1e4cb82e097f43744a, but
it broke the modules bot. This is solved by putting the contructors of
the CheckerManager class to the Frontend library.
Differential Revision: https://reviews.llvm.org/D75360
Summary:
This patch implements the following CDE intrinsics:
T __arm_vcx1q_m(int coproc, T inactive, uint32_t imm, mve_pred_t p);
T __arm_vcx2q_m(int coproc, T inactive, U n, uint32_t imm, mve_pred_t p);
T __arm_vcx3q_m(int coproc, T inactive, U n, V m, uint32_t imm, mve_pred_t p);
T __arm_vcx1qa_m(int coproc, T acc, uint32_t imm, mve_pred_t p);
T __arm_vcx2qa_m(int coproc, T acc, U n, uint32_t imm, mve_pred_t p);
T __arm_vcx3qa_m(int coproc, T acc, U n, V m, uint32_t imm, mve_pred_t p);
The intrinsics are not part of the released ACLE spec, but internally at
Arm we have reached consensus to add them to the next ACLE release.
Reviewers: simon_tatham, MarkMurrayARM, ostannard, dmgreen
Reviewed By: simon_tatham
Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76610
Summary:
This patch adds a virtual method `getCPUCacheLineSize()` to `TargetInfo`. Currently, I've only implemented the method in `X86TargetInfo`. It's extremely important that each CPU's cache line size correct (e.g., we can't just define it as `64` across the board) so, it has been a little slow getting to this point.
I'll work on the ARM CPUs next, but that will probably come later in a different patch.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74918
to reduce spurios changes in patches after clang-formatting them. In
particular, these files contain long enums that clang-format reformats
in their entirety if e.g. an element is added.
Reviews having this problem include https://reviews.llvm.org/D76342 and
https://reviews.llvm.org/D71447.
In order to support non-user-named kernels, SYCL needs some way in the
integration headers to name the kernel object themselves. Initially, the
design considered just RTTI naming of the lambdas, this results in a
quite unstable situation in light of some device/host macros.
Additionally, this ends up needing to use RTTI, which is a burden on the
implementation and typically unsupported.
Instead, we've introduced a builtin, __builtin_unique_stable_name, which
takes a type or expression, and results in a constexpr constant
character array that uniquely represents the type (or type of the
expression) being passed to it.
The implementation accomplishes that simply by using a slightly modified
version of the Itanium Mangling. The one exception is when mangling
lambdas, instead of appending the index of the lambda in the function,
it appends the macro-expansion back-trace of the lambda itself in the
form LINE->COL[~LINE->COL...].
Differential Revision: https://reviews.llvm.org/D76620
The argument after -Xarch_device will be added to the arguments for CUDA/HIP
device compilation and will be removed for host compilation.
The argument after -Xarch_host will be added to the arguments for CUDA/HIP
host compilation and will be removed for device compilation.
Differential Revision: https://reviews.llvm.org/D76520
Summary:
This clears the way for adding an Error dependence bit to Type and having it
mostly-automatically propagated.
Reviewers: hokein
Subscribers: jfb, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76424
Normally clang avoids creating expressions when it encounters semantic
errors, even if the parser knows which expression to produce.
This works well for the compiler. However, this is not ideal for
source-level tools that have to deal with broken code, e.g. clangd is
not able to provide navigation features even for names that compiler
knows how to resolve.
The new RecoveryExpr aims to capture the minimal set of information
useful for the tools that need to deal with incorrect code:
source range of the expression being dropped,
subexpressions of the expression.
We aim to make constructing RecoveryExprs as simple as possible to
ensure writing code to avoid dropping expressions is easy.
Producing RecoveryExprs can result in new code paths being taken in the
frontend. In particular, clang can produce some new diagnostics now and
we aim to suppress bogus ones based on Expr::containsErrors.
We deliberately produce RecoveryExprs only in the parser for now to
minimize the code affected by this patch. Producing RecoveryExprs in
Sema potentially allows to preserve more information (e.g. type of an
expression), but also results in more code being affected. E.g.
SFINAE checks will have to take presence of RecoveryExprs into account.
Initial implementation only works in C++ mode, as it relies on compiler
postponing diagnostics on dependent expressions. C and ObjC often do not
do this, so they require more work to make sure we do not produce too
many bogus diagnostics on the new expressions.
See documentation of RecoveryExpr for more details.
original patch from Ilya
This change is based on https://reviews.llvm.org/D61722
Reviewers: sammccall, rsmith
Reviewed By: sammccall, rsmith
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69330
TableGen and .def files (which are meant to be used with the preprocessor) come
with obvious downsides. One of those issues is that generated switch-case
branches have to be identical. This pushes corner cases either to an outer code
block, or into the generated code.
Inspect the removed code in AnalysisConsumer::DigestAnalyzerOptions. You can see
how corner cases like a not existing output file, the analysis output type being
set to PD_NONE, or whether to complement the output with additional diagnostics
on stderr lay around the preprocessor generated code. This is a bit problematic,
as to how to deal with such errors is not in the hands of the users of this
interface (those implementing output types, like PlistDiagnostics etc).
This patch changes this by moving these corner cases into the generated code,
more specifically, into the called functions. In addition, I introduced a new
output type for convenience purposes, PD_TEXT_MINIMAL, which always existed
conceptually, but never in the actual Analyses.def file. This refactoring
allowed me to move TextDiagnostics (renamed from ClangDiagPathDiagConsumer) to
its own file, which it really deserved.
Also, those that had the misfortune to gaze upon Analyses.def will probably
enjoy the sight that a clang-format did on it.
Differential Revision: https://reviews.llvm.org/D76509
Its been a while since my CheckerRegistry related patches landed, allow me to
refresh your memory:
During compilation, TblGen turns
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td into
(build directory)/tools/clang/include/clang/StaticAnalyzer/Checkers/Checkers.inc.
This is a file that contains the full name of the checkers, their options, etc.
The class that is responsible for parsing this file is CheckerRegistry. The job
of this class is to establish what checkers are available for the analyzer (even
from plugins and statically linked but non-tblgen generated files!), and
calculate which ones should be turned on according to the analyzer's invocation.
CheckerManager is the class that is responsible for the construction and storage
of checkers. This process works by first creating a CheckerRegistry object, and
passing itself to CheckerRegistry::initializeManager(CheckerManager&), which
will call the checker registry functions (for example registerMallocChecker) on
it.
The big problem here is that these two classes lie in two different libraries,
so their interaction is pretty awkward. This used to be far worse, but I
refactored much of it, which made things better but nowhere near perfect.
---
This patch changes how the above mentioned two classes interact. CheckerRegistry
is mainly used by CheckerManager, and they are so intertwined, it makes a lot of
sense to turn in into a field, instead of a one-time local variable. This has
additional benefits: much of the information that CheckerRegistry conveniently
holds is no longer thrown away right after the analyzer's initialization, and
opens the possibility to pass CheckerManager in the shouldRegister* function
rather then LangOptions (D75271).
There are a few problems with this. CheckerManager isn't the only user, when we
honor help flags like -analyzer-checker-help, we only have access to a
CompilerInstance class, that is before the point of parsing the AST.
CheckerManager makes little sense without ASTContext, so I made some changes and
added new constructors to make it constructible for the use of help flags.
Differential Revision: https://reviews.llvm.org/D75360
Summary:
Copy of https://reviews.llvm.org/D72446, submitting with Ilya's permission.
Only used to assign roles to child nodes for now. This is more efficient
than doing range-based queries.
In the future, will be exposed in the public API of syntax trees.
Reviewers: gribozavr2
Reviewed By: gribozavr2
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76355
This makes it possible for plugin attributes to actually do something, and also
removes a lot of boilerplate for simple attributes in SemaDeclAttr.cpp.
Differential Revision: https://reviews.llvm.org/D31342
This is failing on several buildbots with some inexplicable (to me,
right now) crashes. Let's see if this change is adequate to unblock the
buildbots & further understanding can be gained later.
This fix doesn't seem to be right (function_ref can/should be passed by
value) so I'm reverted it to see if the buildbots decide to explain
what's wrong.
This reverts commit 857bf5da35.
Extract common code to a function. To prepare for
adding an option for CUDA/HIP host and device only
option.
Differential Revision: https://reviews.llvm.org/D76455
There are a few places with unexpected indents that trip over sphinx and
other syntax errors.
Also, the C++ syntax highlighting does not work for
class [[gsl::Owner(int)]] IntOwner {
Use a regular code:: block instead.
There are a few other warnings errors remaining, of the form
'Duplicate explicit target name: "cmdoption-clang--prefix"'. They seem
to be caused by the following
.. option:: -B<dir>, --prefix <arg>, --prefix=<arg>
I am no Restructured Text expert, but it seems like sphinx 1.8.5
tries to generate the same target for the --prefix <arg> and
--prefix=<arg>. This pops up in a lot of places and I am not sure how to
best resolve it
Reviewers: jfb, Bigcheese, dexonsmith, rjmccall
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D76534
Summary:
Since the conditional operator cannot be used with vector conditions
in C, we need a builtin to be able to express this operation in C
source.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, sunfish, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76538
accept as an extension.
This attempts to accept the same cases a GCC, plus cases where a
comparison is rewritten to an operator== with an integral but non-bool
return type; this is sufficient to avoid most problems with various
major open-source projects (such as ICU) and appears to fix all but one
of the comparison-related C++20 build breaks in LLVM.
This approach is being pursued for standardization.
region.
According to OpenMP 5.0, exactly one scan directive must appear in the loop body of an enclosing worksharing-loop, worksharing-loop SIMD, or simd construct on which a reduction clause with the inscan modifier is present.
Discovered by a downstream user, we found that the CallGraph ignores
callees unless they are defined. This seems foolish, and prevents
combining the report with other reports to create unified reports.
Additionally, declarations contain information that is likely useful to
consumers of the CallGraph.
This patch implements this by splitting the includeInGraph function into
two versions, the current one plus one that is for callees only. The
only difference currently is that includeInGraph checks for a body, then
calls includeCalleeInGraph.
Differential Revision: https://reviews.llvm.org/D76435
Summary:
I've implemented them as target-specific IR intrinsics rather than
using `@llvm.experimental.vector.reduce.add`, on the grounds that the
'experimental' intrinsic doesn't currently have much code generation
benefit, and my replacements encapsulate the sign- or zero-extension
so that you don't expose the illegal MVE vector type (`<4 x i64>`) in
IR.
The machine instructions come in two versions: with and without an
input accumulator. My new IR intrinsics, like the 'experimental' one,
don't take an accumulator parameter: we represent that by just adding
on the input value using an ordinary i32 or i64 add. So if you write
the `vaddvaq` C-language intrinsic with an input accumulator of zero,
it can be optimised to VADDV, and conversely, if you write something
like `x += vaddvq(y)` then that can be combined into VADDVA.
Most of this is achieved in isel lowering, by converting these IR
intrinsics into the existing `ARMISD::VADDV` family of custom SDNode
types. For the difficult case (64-bit accumulators), isel lowering
already implements the optimization of folding an addition into a
VADDLV to make a VADDLVA; so once we've made a VADDLV, our job is
already done, except that I had to introduce a parallel set of ARMISD
nodes for the //predicated// forms of VADDLV.
For the simpler VADDV, we handle the predicated form by just leaving
the IR intrinsic alone and matching it in an ordinary dag pattern.
Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76491
Summary:
I've implemented these as target-specific IR intrinsics, because
they're not //quite// enough like @llvm.experimental.vector.reduce.min
(which doesn't take the extra scalar parameter). Also this keeps the
predicated and unpredicated versions looking similar, and the
floating-point minnm/maxnm versions fold into the same schema.
We had a couple of min/max reductions already implemented, from the
initial pathfinding exercise in D67158. Those were done by having
separate IR intrinsic names for the signed and unsigned integer
versions; as part of this commit, I've changed them to use a flag
parameter indicating signedness, which is how we ended up deciding
that the rest of the MVE intrinsics family ought to work. So now
hopefully the ewhole lot is consistent.
In the new llc test, the output code from the `v8f16` test functions
looks quite unpleasant, but most of it is PCS lowering (you can't pass
a `half` directly in or out of a function). In other circumstances,
where you do something else with your `half` in the same function, it
doesn't look nearly as nasty.
Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: MarkMurrayARM
Subscribers: kristof.beyls, hiraditya, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76490
Summary:
This patch implements the following CDE intrinsics:
int8x16_t __arm_vreinterpretq_s8_u8 (uint8x16_t in);
uint16x8_t __arm_vreinterpretq_u16_u8 (uint8x16_t in);
int16x8_t __arm_vreinterpretq_s16_u8 (uint8x16_t in);
uint32x4_t __arm_vreinterpretq_u32_u8 (uint8x16_t in);
int32x4_t __arm_vreinterpretq_s32_u8 (uint8x16_t in);
uint64x2_t __arm_vreinterpretq_u64_u8 (uint8x16_t in);
int64x2_t __arm_vreinterpretq_s64_u8 (uint8x16_t in);
float16x8_t __arm_vreinterpretq_f16_u8 (uint8x16_t in);
float32x4_t __arm_vreinterpretq_f32_u8 (uint8x16_t in);
These intrinsics are header-only because they reuse the existing
MVE vreinterpret clang built-ins.
This set is slightly different from the published specification
(see https://static.docs.arm.com/101028/0010/ACLE_2019Q4_release-0010.pdf):
it includes
int8x16_t __arm_vreinterpretq_s8_u8 (uint8x16_t in);
which was unintentionally ommitted from the spec, and
does not include
float64x2_t __arm_vreinterpretq_f64_u8 (uint8x16_t in);
The float64x2_t type requires additional implementation
effort, and we are not including it yet.
Reviewers: simon_tatham, MarkMurrayARM, dmgreen, ostannard
Reviewed By: MarkMurrayARM
Subscribers: kristof.beyls, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76300
Summary:
This patch implements the following intrinsics:
uint8x16_t __arm_vcx1q_u8 (int coproc, uint32_t imm);
T __arm_vcx1qa(int coproc, T acc, uint32_t imm);
T __arm_vcx2q(int coproc, T n, uint32_t imm);
uint8x16_t __arm_vcx2q_u8(int coproc, T n, uint32_t imm);
T __arm_vcx2qa(int coproc, T acc, U n, uint32_t imm);
T __arm_vcx3q(int coproc, T n, U m, uint32_t imm);
uint8x16_t __arm_vcx3q_u8(int coproc, T n, U m, uint32_t imm);
T __arm_vcx3qa(int coproc, T acc, U n, V m, uint32_t imm);
Most of them are polymorphic. Furthermore, some intrinsics are
polymorphic by 2 or 3 parameter types, such polymorphism is not
supported by the existing MVE/CDE tablegen backends, also we don't
really want to have a combinatorial explosion caused by 1000 different
combinations of 3 vector types. Because of this some intrinsics are
implemented as macros involving a cast of the polymorphic arguments to
uint8x16_t.
The IR intrinsics are even more restricted in terms of types: all MVE
vectors are cast to v16i8.
Reviewers: simon_tatham, MarkMurrayARM, dmgreen, ostannard
Reviewed By: MarkMurrayARM
Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76299