Generate btf_tag annotations for record fields. The annotations
are represented as an DINodeArray in DebugInfo.
Differential Revision: https://reviews.llvm.org/D106616
Partially reverts 85157c0079, which had removed these builtins and intrinsics
in favor of normal codegen patterns. It turns out that it is possible for the
patterns to be split over multiple basic blocks, however, which means that DAG
ISel is not able to select them to the pmin/pmax instructions. To make sure the
SIMD intrinsics generate the correct instructions in these cases, reintroduce
the clang builtins and corresponding LLVM intrinsics, but also keep the normal
pattern matching as well.
Differential Revision: https://reviews.llvm.org/D108387
On some platforms, negative shift values mean to shift in the opposite
direction, but this is not true with WebAssembly. To avoid confusion, make the
shift values in the shift intrinsics unsigned.
Differential Revision: https://reviews.llvm.org/D108415
For each SIMD intrinsic function that takes or returns a scalar signed integer
value, ensure there is a corresponding intrinsic that returns or an
unsigned value. This is a convenience for users who use -Wsign-conversion so
they don't have to insert explicit casts, especially when the intrinsic
arguments are integer literals that fit into the unsigned integer type but not
the signed type.
Differential Revision: https://reviews.llvm.org/D108412
Remove redundant fields and replace pointer with virtual function
Of fourteen fields, three are dead and four can be computed from the
remainder. This leaves a couple of currently dead fields in place as
they are expected to be used from the deviceRTL shortly. Two of the
fields that can be computed are only used from codegen and require a
log2() implementation so are inlined into codegen instead.
This change leaves the new methods in the same location in the struct
as the previous fields for convenience at review.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D108380
This implements P2362, which has not yet been approved by the
C++ committee, but because wide-multi character literals are
implementation defined, clang might not have to wait for WG21.
This change is also being applied in C mode as the behavior is
implementation-defined in C as well and there's no benefit to
having different rules between the languages.
The other part of P2362, making non-representable character
literals ill-formed, is already implemented by clang
When calculating the name to display for inline namespaces, we have
custom logic to try to hide redundant inline namespaces from the
diagnostic. Calculating these redundancies requires performing a lookup
in the parent declaration context, but that lookup should not try to
look through transparent declaration contexts, like linkage
specifications. Instead, loop up the declaration context chain until we
find a non-transparent context and use that instead.
This fixes PR49954.
The purpose of __attribute__((disable_sanitizer_instrumentation)) is to
prevent all kinds of sanitizer instrumentation applied to a certain
function, Objective-C method, or global variable.
The no_sanitize(...) attribute drops instrumentation checks, but may
still insert code preventing false positive reports. In some cases
though (e.g. when building Linux kernel with -fsanitize=kernel-memory
or -fsanitize=thread) the users may want to avoid any kind of
instrumentation.
Differential Revision: https://reviews.llvm.org/D108029
C++ for OpenCL version 2021 and later are expected to consist of a
major version number only. Therefore, a different constructor for
`VersionTuple` needs to be called when reporting language version.
Differential Revision: https://reviews.llvm.org/D108379
This patch allows target specific addr space in target builtins for HIP. It inserts implicit addr
space cast for non-generic pointer to generic pointer in general, and inserts implicit addr
space cast for generic to non-generic for target builtin arguments only.
It is NFC for non-HIP languages.
Differential Revision: https://reviews.llvm.org/D102405
Produce remarks when atomic instructions are expanded into hardware instructions
in SIISelLowering.cpp. Currently, these remarks are only emitted for atomic fadd
instructions.
Differential Revision: https://reviews.llvm.org/D108150
This patch implements the builtins for cmplxl by utilising
__builtin_complex. This builtin is implemented to match XL
functionality.
Differential revision: https://reviews.llvm.org/D107138
Clang patch D106614 added attribute btf_tag support. This patch
generates btf_tag annotations for DIComposite types.
Each btf_tag annotation is represented as a 2D array of
meta strings. Each record may have more than one
btf_tag annotations.
Differential Revision: https://reviews.llvm.org/D106615
Since they are bitmasks, it will be more common for them to be used and
potentially extended to 64-bit integers as unsigned values rather than signed
values.
Differential Revision: https://reviews.llvm.org/D108401
A new rule is added in 5.0:
If a list item appears in a reduction, lastprivate or linear clause
on a combined target construct then it is treated as if it also appears
in a map clause with a map-type of tofrom.
Currently map clauses for all capture variables are added implicitly.
But missing for list item of expression for array elements or array
sections.
The change is to add implicit map clause for array of elements used in
reduction clause. Skip adding map clause if the expression is not
mappable.
Noted: For linear and lastprivate, since only variable name is
accepted, the map has been added though capture variables.
To do so:
During the mappable checking, if error, ignore diagnose and skip
adding implicit map clause.
The changes:
1> Add code to generate implicit map in ActOnOpenMPExecutableDirective,
for omp 5.0 and up.
2> Add extra default parameter NoDiagnose in ActOnOpenMPMapClause:
Use that to skip error as well as skip adding implicit map during the
mappable checking.
Note: there are only tow places need to be check for NoDiagnose. Rest
of them either the check is for < omp 5.0 or the error already generated for
reduction clause.
Differential Revision: https://reviews.llvm.org/D108132
Android enables zero initialisation globally by default, but also allows
subprojects to override with different option. Clang complains the above
flag being unused in this case.
Instead of adding a 75 char long -no-* flag, don't warn unused argument
for this flag.
Differential Revision: https://reviews.llvm.org/D108278
[nfc] Replaces enum indices into an array with a struct. Named the
fields to match the enum, leaves memory layout and initialization unchanged.
Motivation is to later safely remove dead fields and replace redundant ones
with (compile time) computation. It should also be possible to factor some
common fields into a base and introduce a gfx10 amdgpu instance with less
duplication than the arrays of integers require.
Reviewed By: ronlieb
Differential Revision: https://reviews.llvm.org/D108339
Completion now looks more like function/member completion:
used
alias(Aliasee)
abi_tag(Tags...)
Differential Revision: https://reviews.llvm.org/D108109
With -fpreserve-vec3-type enabled, a cast was not created when
converting from a vec3 type to a non-vec3 type, even though a
conversion to vec4 was performed. This resulted in creation of
invalid store instructions.
Differential Revision: https://reviews.llvm.org/D107963
Refactored implementation of AddressSanitizerPass and
HWAddressSanitizerPass to use pass options similar to passes like
MemorySanitizerPass. This makes sure that there is a single mapping
from class name to pass name (needed by D108298), and options like
-debug-only and -print-after makes a bit more sense when (despite
that it is the unparameterized pass name that should be used in those
options).
A result of the above is that some pass names are removed in favor
of the parameterized versions:
- "khwasan" is now "hwasan<kernel;recover>"
- "kasan" is now "asan<kernel>"
- "kmsan" is now "msan<kernel>"
Differential Revision: https://reviews.llvm.org/D105007
This patch implements Flow Sensitive Sample FDO (FSAFDO) profile
loader. We have two profile loaders for FS profile,
one before RegAlloc and one before BlockPlacement.
To enable it, when -fprofile-sample-use=<profile> is specified,
add "-enable-fs-discriminator=true \
-disable-ra-fsprofile-loader=false \
-disable-layout-fsprofile-loader=false"
to turn on the FS profile loaders.
Differential Revision: https://reviews.llvm.org/D107878
Fixes miscompile of calls into ocml. Bug 51445.
The stack variable `double __tmp` is moved to dynamically allocated shared
memory by CGOpenMPRuntimeGPU. This is usually fine, but when the variable
is passed to a function that is explicitly annotated address_space(5) then
allocating the variable off-stack leads to a miscompile in the back end,
which cannot decide to move the variable back to the stack from shared.
This could be fixed by removing the AS(5) annotation from the math library
or by explicitly marking the variables as thread_mem_alloc. The cast to
AS(5) is still a no-op once IR is reached.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D107971
Clean up the detection of parameter declarations in K&R C function
definitions. Also make it more precise by requiring the second
token after the r_paren to be either a star or keyword/identifier.
Differential Revision: https://reviews.llvm.org/D108094
Use uint64_t for lanemask on all GPU architectures at the interface
with clang. Updates tests. The deviceRTL is always linked as IR so the zext
and trunc introduced for wave32 architectures will fold after inlining.
Simplification partly motivated by amdgpu gfx10 which will be wave32 and
is awkward to express in the current arch-dependant typedef interface.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D108317
This change-set puts 93d08acaac functionality
under -add-omp-offload-notes switch that is OFF by default.
CUDA toolchain is not able to handle ELF images with LLVMOMPOFFLOAD
notes for unknown reason (see https://reviews.llvm.org/D99551#2950272).
I disable the ELF notes embedding until the CUDA issue is triaged and resolved.
Differential Revision: https://reviews.llvm.org/D108246
The interesting bit about that triple isn't the architecture, it's the
fact that ps4 implies C99 as the standard rather than a newer C mode.
Specify the language standard rather than the triple so the test is a
bit more general.
This adds the Unicode 13 data for XID_Start and XID_Continue.
The definition of valid identifier is changed in all C++ modes
as P1949 (https://wg21.link/p1949) was accepted by WG21 as a defect
report.
Target is only ever non-null when we find an existing type, so move its declaration inside that case, and remove the dead code where Target was always null.