The getConstraintRegister method is used by semantic checking of inline
assembly statements in order to diagnose conflicts between clobber list
and input/output lists. By overriding getConstraintRegister we get those
diagnostics and we match RISC-V GCC's behavior. The implementation is
trivial due to the lack of single-register RISC-V-specific constraints.
Differential Revision: https://reviews.llvm.org/D108624
Generate btf_tag annotations for DISubprograms. The annotations
are represented as an DINodeArray in DebugInfo.
Differential Revision: https://reviews.llvm.org/D106618
In LLVM IR, `AlignmentBitfieldElementT` is 5-bit wide
But that means that the maximal alignment exponent is `(1<<5)-2`,
which is `30`, not `29`. And indeed, alignment of `1073741824`
roundtrips IR serialization-deserialization.
While this doesn't seem all that important, this doubles
the maximal supported alignment from 512MiB to 1GiB,
and there's actually one noticeable use-case for that;
On X86, the huge pages can have sizes of 2MiB and 1GiB (!).
So while this doesn't add support for truly huge alignments,
which i think we can easily-ish do if wanted, i think this adds
zero-cost support for a not-trivially-dismissable case.
I don't believe we need any upgrade infrastructure,
and since we don't explicitly record the IR version,
we don't need to bump one either.
As @craig.topper speculates in D108661#2963519,
this might be an artificial limit imposed by the original implementation
of the `getAlignment()` functions.
Differential Revision: https://reviews.llvm.org/D108661
The existing code attempting to bitcast from a value in the default globals AS
to i8 addrspace(0)* was triggering an assertion failure in our downstream fork.
I found this while compiling poppler for CHERI-RISC-V (we use AS200 for all
globals). The test case uses AMDGPU since that is one of the in-tree targets
with a non-zero default globals address space.
The new test previously triggered a "Invalid constantexpr bitcast!" assertion
and now correctly generates code with addrspace(1) pointers.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D105972
There are a number of language and preprocessor options that are reset in the `CompilerInvocation` that describes the build of an implicit module. This patch uses the logic for explicit modules as well.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D108710
This change would treat the token `or` in system headers as an
identifier, and elsewhere as an operator. As reported in
llvm.org/pr42427, many users classify their third party library headers
as "system" headers to suppress warnings. There's no clean way to
separate Windows SDK headers from user headers.
Clang is still able to parse old Windows SDK headers if C++ operator
names are disabled. Traditionally this was controlled by
`-fno-operator-names`, but is now also enabled with `/permissive` since
D103773. This change will prevent `clang-cl` from parsing <query.h> from
the Windows SDK out of the box, but there are multiple ways to work
around that:
- Pass `/clang:-fno-operator-names`
- Pass `/permissive`
- Pass `-DQUERY_H_RESTRICTION_PERMISSIVE`
In all of these modes, the operator names will consistently be available
or not available, instead of depending on whether the code is in a
system header.
I added a release note for this, since it may break straightforward
users of the Windows SDK.
Fixes PR42427
Differential Revision: https://reviews.llvm.org/D108720
In -P mode, PrintPPOutputPPCallbacks::MoveToLine started at least one
newline if current and target line number mismatched. The method is also
called when entering a new file, be it the main file or an include file.
In this situation line numbers always almost mismatch, resulting in a
newline for each occurance even if no tokens have been printed
in-between.
Empty lines at the beginning of the output must be trimmed because it
may be parsed by scripts expecting the result to appear on the first
output line, as done by LibreOffice's configure script.
Fix by only emitting a newline if tokens have been printed so far using
the EmittedTokensOnThisLine flag. Also adding a test case of FileChanged
callbacks occuring with empty include files.
This fixes llvm.org/PR51616
Add support for the GNU C style __attribute__((error(""))) and
__attribute__((warning(""))). These attributes are meant to be put on
declarations of functions whom should not be called.
They are frequently used to provide compile time diagnostics similar to
_Static_assert, but which may rely on non-ICE conditions (ie. relying on
compiler optimizations). This is also similar to diagnose_if function
attribute, but can diagnose after optimizations have been run.
While users may instead simply call undefined functions in such cases to
get a linkage failure from the linker, these provide a much more
ergonomic and actionable diagnostic to users and do so at compile time
rather than at link time. Users instead may be able use inline asm .err
directives.
These are used throughout the Linux kernel in its implementation of
BUILD_BUG and BUILD_BUG_ON macros. These macros generally cannot be
converted to use _Static_assert because many of the parameters are not
ICEs. The Linux kernel still needs to be modified to make use of these
when building with Clang; I have a patch that does so I will send once
this feature is landed.
To do so, we create a new IR level Function attribute, "dontcall" (both
error and warning boil down to one IR Fn Attr). Then, similar to calls
to inline asm, we attach a !srcloc Metadata node to call sites of such
attributed callees.
The backend diagnoses these during instruction selection, while we still
know that a call is a call (vs say a JMP that's a tail call) in an arch
agnostic manner.
The frontend then reconstructs the SourceLocation from that Metadata,
and determines whether to emit an error or warning based on the callee's
attribute.
Link: https://bugs.llvm.org/show_bug.cgi?id=16428
Link: https://github.com/ClangBuiltLinux/linux/issues/1173
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D106030
pointers to structs
clang was just being conservative and trying to prevent users from
messing up the qualifier on the inner pointer type. Lifting this
restriction enables using some of the libc++ templates with ObjC pointer
arguments, which clang currently rejects.
rdar://79018677
Differential Revision: https://reviews.llvm.org/D107021
This reverts commit df1f4e0cc6.
Now the test case explicitly specifies the target triple.
I decided to use x86_64 for that matter, to have a fixed
bitwidth for `size_t`.
Aside from that, relanding the original changes of:
https://reviews.llvm.org/D105184
Currently only `ConstantArrayType` is considered for flexible array
members (FAMs) in `getStaticSize()`.
However, `IncompleteArrayType` also shows up in practice as FAMs.
This patch will ignore the `IncompleteArrayType` and return Unknown
for that case as well. This way it will be at least consistent with
the current behavior until we start modeling them accurately.
I'm expecting that this will resolve a bunch of false-positives
internally, caused by the `ArrayBoundV2`.
Reviewed By: ASDenysPetrov
Differential Revision: https://reviews.llvm.org/D105184
Translation units with multiple direct modular dependencies trigger a non-deterministic ordering in `clang-scan-deps`. This boils down to usage of `std::unordered_map`, which gets replaced by `std::map` in this patch.
Depends on D103526.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D103807
The `ASTReader` populates `Module::PresumedModuleMapFile` only for top-level modules, not submodules. To avoid generating empty `-fmodule-map-file=` arguments, make discovered modules depend on top-level precompiled modules. The granularity of submodules is not important here.
The documentation of `Module::PresumedModuleMapFile` says this field is non-empty only when building from preprocessed source. This means there can still be cases where the dependency scanner generates empty `-fmodule-map-file=` arguments. That's being addressed in separate patch: D108544.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D108647
In this patch, the dependency scanner starts collecting precompiled dependencies from all encountered submodules, not only from top-level modules.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D108540
NVPTX does not allow dots in the identifier, so ptxas errors out with
fatal : Parsing error near '.static': syntax error
because it parses .static as a directive. Avoid this problem by using
two underscores, similar to what OpenMP does for outlined functions.
Differential Revision: https://reviews.llvm.org/D108456
8ace121305 introduced a regression for code that explicitly ignores the
-Wframe-larger-than= warning. Make sure we don't generate the
warn-stack-size attribute for that case.
Differential Revision: https://reviews.llvm.org/D108686
Previously when emitting a C++ guarded initializer, we tried to work out what
the enclosing function would be used for and added it to the COMDAT containing
the variable if we thought that doing so would be correct. But this was done
from a context in which we didn't -- and realistically couldn't -- correctly
infer how the enclosing function would be used.
Instead, add the initialization function to a COMDAT from the code that
creates it, in the case where it makes sense to do so: when we know that
the one and only reference to the initialization function is in
@llvm.global.ctors and that reference is in the same COMDAT.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D108680
This adds support for Wasm SjLj in clang. Also this sets the new
`-mllvm -wasm-enable-eh` option for Wasm EH.
Note there is a little unfortunate inconsistency there: Wasm EH is
enabled by a clang option `-fwasm-exceptions`, which sets
`-mllvm -wasm-enable-eh` in the backend options. It also sets
`-exception-model=wasm` but this is done in the common code.
Wasm SjLj doesn't have a clang-level option like `-fwasm-exceptions`.
`-fwasm-exceptions` was added because each exception model has its
corresponding `-f***-exceptions`, but I'm not sure if adding a new
option like `-fwasm-sjlj` or something is a good idea.
So the current plan is Emscripten sets `-mllvm -wasm-enable-sjlj` if
Wasm SJLj is enabled in its settings.js, as it does for Emscripten
EH/SjLj (it sets `-mllvm -enable-emscripten-cxx-exceptions` for
Emscripten EH and `-mllvm -enable-emscripten-sjlj` for Emscripten SjLj).
And setting this enables the exception handling feature, and also sets
`-exception-model=wasm`, but this time this is not done in the common
code so we do it ourselves.
Also note that other exception models have 1-to-1 correspondance with
their `-f***-exceptions` flag and their `-exception-model=***` flag, but
because we use `-exception-model=wasm` also for Wasm SjLj while
`-fwasm-exceptions` still means Wasm EH, there is also a little
inconsistency there, but I think it is manageable.
Also this adds various error checking and tests.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D108582
-fstack-clash-protection was added in Clang commit e67cbac812 but was
enabled only on Linux. Allow it on FreeBSD as well, as it works fine.
Reviewed By: serge-sans-paille
Differential Revision: https://reviews.llvm.org/D108571
This CL is small, but the description can be a little long because I'm
trying to sum up the status quo for Emscripten/Wasm EH/SjLj options.
First, this CL adds an option for Wasm SjLj (`-wasm-enable-sjlj`), which
handles SjLj using Wasm EH. The implementation for this will be added as
a followup CL, but this adds the option first to do error checking.
This also adds an option for Wasm EH (`-wasm-enable-eh`), which has been
already implemented. Before we used `-exception-model=wasm` as the same
meaning as enabling Wasm EH, but after we add Wasm SjLj, it will be
possible to use Wasm EH instructions for Wasm SjLj while not enabling
EH, so going forward, to use Wasm EH, `opt` and `llc` will need this
option. This only affects `opt` and `llc` command lines and does not
affect Emscripten user interface.
Now we have two modes of EH (Emscripten/Wasm) and also two modes of SjLj
(also Emscripten/Wasm). The options corresponding to each of are:
- Emscripten EH: `-enable-emscripten-cxx-exceptions`
- Emscripten SjLj: `-enable-emscripten-sjlj`
- Wasm EH: `-wasm-enable-eh -exception-model=wasm`
`-mattr=+exception-handling`
- Wasm SjLj: `-wasm-enable-sjlj -exception-model=wasm`
`-mattr=+exception-handling`
The reason Wasm EH/SjLj's options are a little complicated are
`-exception-model` and `-mattr` are common LLVM options ane not under
our control. (`-mattr` can be omitted if it is embedded within the
bitcode file.)
And we have the following rules of the option composition:
- Emscripten EH and Wasm EH cannot be turned on at the same itme
- Emscripten SjLj and Wasm SjLj cannot be turned on at the same time
- Wasm SjLj should be used with Wasm EH
Which means we now allow these combinations:
- Emscripten EH + Emscripten SjLj: the current default in `emcc`
- Wasm EH + Emscripten SjLj:
This is allowed, but only as an interim step in which we are testing
Wasm EH but not yet have a working implementation of Wasm SjLj. This
will error out (D107687) in compile time if `setjmp` is called in a
function in which Wasm exception is used.
- Wasm EH + Wasm SjLj:
This will be the default mode later when using Wasm EH. Currently Wasm
SjLj implementation doesn't exist, so it doesn't work.
- Emscripten EH + Wasm SjLj will not work.
This CL moves these error checking routines to
`WebAssemblyPassConfig::addIRPasses`. Not sure if this is an ideal place
to do this, but I couldn't find elsewhere. Currently some checking is
done within LowerEmscriptenEHSjLj, but these checks only run if
LowerEmscriptenEHSjLj runs so it may not run when Wasm EH is used. This
moves that to `addIRPasses` and adds some more checks.
Currently LowerEmscriptenEHSjLj pass is responsible for Emscripten EH
and Emscripten SjLj. Wasm EH transformations are done in multiple
places, including WasmEHPrepare, LateEHPrepare, and CFGStackify. But in
the followup CL, LowerEmscriptenEHSjLj pass will be also responsible for
a part of Wasm SjLj transformation, because WasmSjLj will also be using
several Emscripten library functions, and we will be sharing more than
half of the transformation to do that between Emscripten SjLj and Wasm
SjLj.
Currently we have `-enable-emscripten-cxx-exceptions` and
`-enable-emscripten-sjlj` but these only work for `llc`, because for
`llc` we feed these options to the pass but when we run the pass using
`opt` the pass will be created with no options and the default options
will be used, which turns both Emscripten EH and Emscripten SjLj on.
Now we have one more SjLj option to care for, LowerEmscriptenEHSjLj pass
needs a finer way to control these options. This CL removes those
default parameters and make LowerEmscriptenEHSjLj pass read directly
from command line options specified. So if we only run
`opt -wasm-lower-em-ehsjlj`, currently both Emscripten EH and Emscripten
SjLj will run, but with this CL, none will run unless we additionally
pass `-enable-emscripten-cxx-exceptions` or `-enable-emscripten-sjlj`,
or both. This does not affect users; this only affects our `opt` tests
because `emcc` will not call either `opt` or `llc`. As a result of this,
our existing Emscripten EH/SjLj tests gained one or both of those
options in their `RUN` lines.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D107685
CodeGenAction::ExecuteAction creates a BackendConsumer for the
purpose of handling diagnostics. The BackendConsumer's
DiagnosticHandlerImpl method expects CurLinkModule to be set,
but this did not happen on the code path that goes through
ExecuteAction. This change makes it so that the BackendConsumer
constructor used by ExecuteAction requires the Module to be
specified and passes the appropriate module in ExecuteAction.
The change also adds a test that fails without this change
and passes with it. To make the test work, the FIXME in the
handling of DK_Linker diagnostics was addressed so that warnings
and notes are no longer silently discarded. Since this introduces
a new warning diagnostic, a flag to control it (-Wlinker-warnings)
has also been added.
Reviewed By: xur
Differential Revision: https://reviews.llvm.org/D108603
Clang currently picks the second tentative definition when
VarDecl::getActingDefinition is called.
This can lead to attributes being dropped if they are attached to
tentative definitions that appear after the second one. This is
because VarDecl::getActingDefinition loops through VarDecl::redecls
assuming that the last tentative definition is the last element in the
iterator. However, it is the second element that would be the last
tentative definition.
This changeset modifies getActingDefinition to iterate through the
declaration chain in reverse, so that it can immediately return when
it encounters a tentative definition.
Differential Revision: https://reviews.llvm.org/D99732
Always use cuda.h to detect CUDA version. It's a more universal approach
compared to version.txt which is no longer present in recent CUDA versions.
Split the 'unknown CUDA version' warning in two:
* when detected CUDA version is partially supported by clang. It's expected to
work in general, at the feature parity with the latest supported CUDA
version. and may be missing support for the new features/instructions/GPU
variants. Clang will issue a warning.
* when detected version is new. Recent CUDA versions have been working with
clang reasonably well, and will likely to work similarly to the partially
supported ones above. Or it may not work at all. Clang will issue a warning and
proceed as if the latest known CUDA version was detected.
Differential Revision: https://reviews.llvm.org/D108247
This patch adds `#pragma clang restrict_expansion ` to enable flagging
macros as unsafe for header use. This is to allow macros that may have
ABI implications to be avoided in headers that have ABI stability
promises.
Using macros in headers (particularly public headers) can cause a
variety of issues relating to ABI and modules. This new pragma logs
warnings when using annotated macros outside the main source file.
This warning is added under a new diagnostics group -Wpedantic-macros
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D107095
Mapping expressions that have `this` as their base expression aren't
considered a valid base variable and the rest of the runtime expects
this. However, if we have an expression with no value declaration we can
try to extract it manually to provide more helpful debuggin information.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D108483
Generate btf_tag annotations for record fields. The annotations
are represented as an DINodeArray in DebugInfo.
Differential Revision: https://reviews.llvm.org/D106616
Partially reverts 85157c0079, which had removed these builtins and intrinsics
in favor of normal codegen patterns. It turns out that it is possible for the
patterns to be split over multiple basic blocks, however, which means that DAG
ISel is not able to select them to the pmin/pmax instructions. To make sure the
SIMD intrinsics generate the correct instructions in these cases, reintroduce
the clang builtins and corresponding LLVM intrinsics, but also keep the normal
pattern matching as well.
Differential Revision: https://reviews.llvm.org/D108387
On some platforms, negative shift values mean to shift in the opposite
direction, but this is not true with WebAssembly. To avoid confusion, make the
shift values in the shift intrinsics unsigned.
Differential Revision: https://reviews.llvm.org/D108415
For each SIMD intrinsic function that takes or returns a scalar signed integer
value, ensure there is a corresponding intrinsic that returns or an
unsigned value. This is a convenience for users who use -Wsign-conversion so
they don't have to insert explicit casts, especially when the intrinsic
arguments are integer literals that fit into the unsigned integer type but not
the signed type.
Differential Revision: https://reviews.llvm.org/D108412
This implements P2362, which has not yet been approved by the
C++ committee, but because wide-multi character literals are
implementation defined, clang might not have to wait for WG21.
This change is also being applied in C mode as the behavior is
implementation-defined in C as well and there's no benefit to
having different rules between the languages.
The other part of P2362, making non-representable character
literals ill-formed, is already implemented by clang
When calculating the name to display for inline namespaces, we have
custom logic to try to hide redundant inline namespaces from the
diagnostic. Calculating these redundancies requires performing a lookup
in the parent declaration context, but that lookup should not try to
look through transparent declaration contexts, like linkage
specifications. Instead, loop up the declaration context chain until we
find a non-transparent context and use that instead.
This fixes PR49954.
The purpose of __attribute__((disable_sanitizer_instrumentation)) is to
prevent all kinds of sanitizer instrumentation applied to a certain
function, Objective-C method, or global variable.
The no_sanitize(...) attribute drops instrumentation checks, but may
still insert code preventing false positive reports. In some cases
though (e.g. when building Linux kernel with -fsanitize=kernel-memory
or -fsanitize=thread) the users may want to avoid any kind of
instrumentation.
Differential Revision: https://reviews.llvm.org/D108029
This patch allows target specific addr space in target builtins for HIP. It inserts implicit addr
space cast for non-generic pointer to generic pointer in general, and inserts implicit addr
space cast for generic to non-generic for target builtin arguments only.
It is NFC for non-HIP languages.
Differential Revision: https://reviews.llvm.org/D102405
Produce remarks when atomic instructions are expanded into hardware instructions
in SIISelLowering.cpp. Currently, these remarks are only emitted for atomic fadd
instructions.
Differential Revision: https://reviews.llvm.org/D108150
This patch implements the builtins for cmplxl by utilising
__builtin_complex. This builtin is implemented to match XL
functionality.
Differential revision: https://reviews.llvm.org/D107138
Clang patch D106614 added attribute btf_tag support. This patch
generates btf_tag annotations for DIComposite types.
Each btf_tag annotation is represented as a 2D array of
meta strings. Each record may have more than one
btf_tag annotations.
Differential Revision: https://reviews.llvm.org/D106615
Since they are bitmasks, it will be more common for them to be used and
potentially extended to 64-bit integers as unsigned values rather than signed
values.
Differential Revision: https://reviews.llvm.org/D108401
A new rule is added in 5.0:
If a list item appears in a reduction, lastprivate or linear clause
on a combined target construct then it is treated as if it also appears
in a map clause with a map-type of tofrom.
Currently map clauses for all capture variables are added implicitly.
But missing for list item of expression for array elements or array
sections.
The change is to add implicit map clause for array of elements used in
reduction clause. Skip adding map clause if the expression is not
mappable.
Noted: For linear and lastprivate, since only variable name is
accepted, the map has been added though capture variables.
To do so:
During the mappable checking, if error, ignore diagnose and skip
adding implicit map clause.
The changes:
1> Add code to generate implicit map in ActOnOpenMPExecutableDirective,
for omp 5.0 and up.
2> Add extra default parameter NoDiagnose in ActOnOpenMPMapClause:
Use that to skip error as well as skip adding implicit map during the
mappable checking.
Note: there are only tow places need to be check for NoDiagnose. Rest
of them either the check is for < omp 5.0 or the error already generated for
reduction clause.
Differential Revision: https://reviews.llvm.org/D108132
Completion now looks more like function/member completion:
used
alias(Aliasee)
abi_tag(Tags...)
Differential Revision: https://reviews.llvm.org/D108109
With -fpreserve-vec3-type enabled, a cast was not created when
converting from a vec3 type to a non-vec3 type, even though a
conversion to vec4 was performed. This resulted in creation of
invalid store instructions.
Differential Revision: https://reviews.llvm.org/D107963
Fixes miscompile of calls into ocml. Bug 51445.
The stack variable `double __tmp` is moved to dynamically allocated shared
memory by CGOpenMPRuntimeGPU. This is usually fine, but when the variable
is passed to a function that is explicitly annotated address_space(5) then
allocating the variable off-stack leads to a miscompile in the back end,
which cannot decide to move the variable back to the stack from shared.
This could be fixed by removing the AS(5) annotation from the math library
or by explicitly marking the variables as thread_mem_alloc. The cast to
AS(5) is still a no-op once IR is reached.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D107971
Use uint64_t for lanemask on all GPU architectures at the interface
with clang. Updates tests. The deviceRTL is always linked as IR so the zext
and trunc introduced for wave32 architectures will fold after inlining.
Simplification partly motivated by amdgpu gfx10 which will be wave32 and
is awkward to express in the current arch-dependant typedef interface.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D108317
This change-set puts 93d08acaac functionality
under -add-omp-offload-notes switch that is OFF by default.
CUDA toolchain is not able to handle ELF images with LLVMOMPOFFLOAD
notes for unknown reason (see https://reviews.llvm.org/D99551#2950272).
I disable the ELF notes embedding until the CUDA issue is triaged and resolved.
Differential Revision: https://reviews.llvm.org/D108246
The interesting bit about that triple isn't the architecture, it's the
fact that ps4 implies C99 as the standard rather than a newer C mode.
Specify the language standard rather than the triple so the test is a
bit more general.
This adds the Unicode 13 data for XID_Start and XID_Continue.
The definition of valid identifier is changed in all C++ modes
as P1949 (https://wg21.link/p1949) was accepted by WG21 as a defect
report.
Three tests fail when building and testing LLVM from the Visual C++ environment
since they use the repo version of lit.py that do not have local customization
builtin_parameters = { 'build_mode' : 'Release' }
https://bugs.llvm.org/show_bug.cgi?id=51072
Reviewed By: dyung
Differential Revision: https://reviews.llvm.org/D108085
Reading modules first reads each control block in the chain and then all
AST blocks.
The first phase is intended to find recoverable errors, eg. an out of
date or missing module. If any error occurs during this phase, it is
safe to remove all modules in the chain as no references to them will
exist.
While reading the AST blocks, however, various fields in ASTReader are
updated with references to the module. Removing modules at this point
can cause dangling pointers which can be accessed later. These would be
otherwise harmless, eg. a binary search over `GlobalSLocEntryMap` may
access a failed module that could error, but shouldn't crash. Do not
remove modules in this phase, regardless of failures.
Since this is the case, it also doesn't make sense to return OutOfDate
during this phase, so remove the two cases where this happens.
When they were originally added these checks would return a failure when
the serialized and current path didn't match up. That was updated to an
OutOfDate as it was found to be hit when using VFS and overriding the
umbrella. Later on the path was changed to instead be the name as
written in the module file, resolved using the serialized base
directory. At this point the check is really only comparing the name of
the umbrella and only works for frameworks since those don't include
`Headers/` in the name (which means the resolved path will never exist)
Given all that, it seems safe to ignore this case entirely for now.
This makes the handling of an umbrella header/directory the same as
regular headers, which also don't check for differences in the path
caused by VFS.
Resolves rdar://79329355
Differential Revision: https://reviews.llvm.org/D107690
Removed AArch64 usage of the getMaxVScale interface, replacing it with
the vscale_range(min, max) IR Attribute.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D106277
Search avr-libc path according to avr-gcc installation at first,
then other possible installed pathes.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D107682
The Linux kernel has a macro called IS_ENABLED(), which evaluates to a
constant 1 or 0 based on Kconfig selections, allowing C code to be
unconditionally enabled or disabled at build time. For example:
int foo(struct *a, int b) {
switch (b) {
case 1:
if (a->flag || !IS_ENABLED(CONFIG_64BIT))
return 1;
__attribute__((fallthrough));
case 2:
return 2;
default:
return 3;
}
}
There is an unreachable warning about the fallthrough annotation in the
first case because !IS_ENABLED(CONFIG_64BIT) can be evaluated to 1,
which looks like
return 1;
__attribute__((fallthrough));
to clang.
This type of warning is pointless for the Linux kernel because it does
this trick all over the place due to the sheer number of configuration
options that it has.
Add -Wunreachable-code-fallthrough, enabled under -Wunreachable-code, so
that projects that want to warn on unreachable code get this warning but
projects that do not care about unreachable code can still use
-Wimplicit-fallthrough without having to make changes to their code
base.
Fixes PR51094.
Reviewed By: aaron.ballman, nickdesaulniers
Differential Revision: https://reviews.llvm.org/D107933
@arichardson pointed out in post-commit review for
https://reviews.llvm.org/D95583 (b714f73def) that `-verify` has an
optional argument that works a lot like `FileCheck`'s `-check-prefix`.
Use it to simplify the test for `-fno-implicit-modules-use-lock`!
The patch adds ELF notes into SHT_NOTE sections of ELF offload images
passed to clang-offload-wrapper.
The new notes use a null-terminated "LLVMOMPOFFLOAD" note name.
There are currently three types of notes:
VERSION: a string (not null-terminated) representing the ELF offload
image structure. The current version '1.0' does not put any restrictions
on the structure of the image. If we ever need to come up with a common
structure for ELF offload images (e.g. to be able to analyze the images
in libomptarget in some standard way), then we will introduce new versions.
PRODUCER: a vendor specific name of the producing toolchain.
Upstream LLVM uses "LLVM" (not null-terminated).
PRODUCER_VERSION: a vendor specific version of the producing toolchain.
Upstream LLVM uses LLVM_VERSION_STRING with optional <space> LLVM_REVISION.
All three notes are not mandatory currently.
Differential Revision: https://reviews.llvm.org/D99551
LoopLoadElimination, LoopVersioning and LoopVectorize currently
fetch MemorySSA when construction LoopAccessAnalysis. However,
LoopAccessAnalysis does not actually use MemorySSA and we can pass
nullptr instead.
This saves one MemorySSA calculation in the default pipeline, and
thus improves compile-time.
Differential Revision: https://reviews.llvm.org/D108074
Two standalone LoopRotate passes scheduled using
createFunctionToLoopPassAdaptor() currently enable MemorySSA.
However, while LoopRotate can preserve MemorySSA, it does not use
it, so requiring MemorySSA is unnecessary.
This change doesn't have a practical compile-time impact by itself,
because subsequent passes still request MemorySSA.
Differential Revision: https://reviews.llvm.org/D108073
This is a rather common feedback we get from out leak checkers: bug reports are
really short, and are contain barely any usable information on what the analyzer
did to conclude that a leak actually happened.
This happens because of our bug report minimizing effort. We construct bug
reports by inspecting the ExplodedNodes that lead to the error from the bottom
up (from the error node all the way to the root of the exploded graph), and mark
entities that were the cause of a bug, or have interacted with it as
interesting. In order to make the bug report a bit less verbose, whenever we
find an entire function call (from CallEnter to CallExitEnd) that didn't talk
about any interesting entity, we prune it (click here for more info on bug
report generation). Even if the event to highlight is exactly this lack of
interaction with interesting entities.
D105553 generalized the visitor that creates notes for these cases. This patch
adds a new kind of NoStateChangeVisitor that leaves notes in functions that
took a piece of dynamically allocated memory that later leaked as parameter,
and didn't change its ownership status.
Differential Revision: https://reviews.llvm.org/D105553
This covers the SSE and AVX/AVX2 headers. AVX512 has a lot more macros
due to rounding mode.
Fixes part of PR51324.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D107843
Some Clang diagnostics could only report OpenCL C version. Because
C++ for OpenCL can be used as an alternative to OpenCL C, the text
for diagnostics should reflect that.
Desrciptions modified for these diagnostics:
`err_opencl_unknown_type_specifier`
`warn_option_invalid_ocl_version`
`err_attribute_requires_opencl_version`
`warn_opencl_attr_deprecated_ignored`
`ext_opencl_ext_vector_type_rgba_selector`
Differential Revision: https://reviews.llvm.org/D107648
This fixes the 'unused linker option: -lm' warning when compiling
program with -c.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D107952
'armv7-pc-win32-macho'
It is incorrect to select the hardware floating point ABI on Mach-O
platforms using the Windows triple if the ABI is "apcs-gnu".
rdar://81810554
Differential Revision: https://reviews.llvm.org/D107939
A new attribute btf_tag is added. The syntax looks like
__attribute__((btf_tag(<string>)))
Users may tag a particular structure/member/function/func_parameter/variable
declaration with an arbitrary string and the intention is
that this string is passed to dwarf so it is available for
post-compilation analysis. The string will be also passed
to .BTF section if the target is BPF. For each permitted
declaration, multiple btf_tag's are allowed.
For detailed use cases, please see
https://lists.llvm.org/pipermail/llvm-dev/2021-June/151009.html
In case that there exist redeclarations, the btf_tag attributes
will be accumulated along with different declarations, and the
last declaration will contain all attributes.
Differential Revision: https://reviews.llvm.org/D106614
Add -cc1 flags `-fmodules-uses-lock` and `-fno-modules-uses-lock` to
allow the lock manager to be turned off when building implicit modules.
Add `-Rmodule-lock` so that we can see when it's being used.
Differential Revision: https://reviews.llvm.org/D95583
Only the bare name is completed, with no args.
For args to be useful we need arg names. These *are* in the tablegen but
not currently emitted in usable form, so left this as future work.
C++11, C2x, GNU, declspec, MS syntax is supported, with the appropriate
spellings of attributes suggested.
`#pragma clang attribute` is supported but not terribly useful as we
only reach completion if parens are balanced (i.e. the line is not truncated)
There's no filtering of which attributes might make sense in this
grammatical context (e.g. attached to a function). In code-completion context
this is hard to do, and will only work in few cases :-(
There's also no filtering by langopts: this is because currently the
only way of checking is to try to produce diagnostics, which requires a
valid ParsedAttr which is hard to get.
This should be fairly simple to fix but requires some tablegen changes
to expose the logic without the side-effect.
Differential Revision: https://reviews.llvm.org/D107696
Add builtin and intrinsic for `__addex`.
This patch is part of a series of patches to provide builtins for
compatibility with the XL compiler.
Reviewed By: stefanp, nemanjai, NeHuang
Differential Revision: https://reviews.llvm.org/D107002
Clang test CodeGen/thinlto-clang-diagnostic-handler-in-be.c fails on
some non x86 targets, e.g. hexagon. Since the test already requires x86
to be available as a target this commit forces the target to x86_64.
Reviewed By: steven_wu
Differential Revision: https://reviews.llvm.org/D107667
This reverts the revert 28c04794df.
The failing MLIR test that caused the revert should be fixed in this
version.
Also includes a PPC test fix previously in 1f87c7c478.
Previoulsy debug-info-for-profiling and pseudo-probe-for-profiling are mutual exclusive because they compete the dwarf discrimnator for callsites on the IR. This changes allows to use the two switches together. The side effect is that callsite discriminators will be taken by pseudo probe, while discriminators for other instructions are still available for AutoFDO use. This is less than ideal, however, it still allows us a chance to smoothly transition from AutoFDO to CSSPGO, by collecting both profiles from a CSSPGO binary.
Reviewed By: wenlei, wmi
Differential Revision: https://reviews.llvm.org/D107876
The existing logic for per-target libc++ include directories only
seem to exist for the Gnu and Fuchsia drivers, added in
ea12d779bc / D89013.
This is less generic than the corresponding case in the Gnu driver,
but matches the existing level of genericity in the MinGW driver
(and others too).
Differential Revision: https://reviews.llvm.org/D107893
This patch adjusts the intrinsics definition of
llvm.matrix.column.major.load and llvm.matrix.column.major.store to
allow overloading the type of the stride. The bitwidth of the stride is
used to perform the offset computation.
This fixes a crash when using __builtin_matrix_column_major_load or
__builtin_matrix_column_major_store on 32 bit platforms. The stride argument
of the builtins are defined as `size_t`, which is 32 bits wide on 32 bit
platforms.
Note that we still perform offset computations with 64 bit width on 32
bit platforms for accesses that do not take a user-specified stride.
This can be fixed separately.
Fixes PR51304.
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D107349
We do not want to define __PRIVILEGED__. There is no use case for the
definition and gcc does not define it. This patch removes that definition.
Reviewed By: lei, NeHuang
Differential Revision: https://reviews.llvm.org/D107461
When none of the translation units in the binary have been instrumented
we shouldn't need to link the profile runtime. However, because we pass
-u__llvm_profile_runtime on Linux and Fuchsia, the runtime would still
be pulled in and incur some overhead. On Fuchsia which uses runtime
counter relocation, it also means that we cannot reference the bias
variable unconditionally.
This change modifies the InstrProfiling pass to pull in the profile
runtime only when needed by declaring the __llvm_profile_runtime symbol
in the translation unit only when needed. For now we restrict this only
for Fuchsia, but this can be later expanded to other platforms. This
approach was already used prior to 9a041a7522, but we changed it
to always generate the __llvm_profile_runtime due to a TAPI limitation,
but that limitation may no longer apply, and it certainly doesn't apply
on platforms like Fuchsia.
Differential Revision: https://reviews.llvm.org/D98061
This change follows up on a FIXME submitted with D105974. This change simply let's the reference case fall through to return a concrete 'true'
instead of a nonloc pointer of appropriate length set to NULL.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D107720
Clang diagnostics refer to identifier names in quotes.
This patch makes inline remarks conform to the convention.
New behavior:
```
% clang -O2 -Rpass=inline -Rpass-missed=inline -S a.c
a.c:4:25: remark: 'foo' inlined into 'bar' with (cost=-30, threshold=337) at callsite bar:0:25; [-Rpass=inline]
int bar(int a) { return foo(a); }
^
```
Reviewed By: hoy
Differential Revision: https://reviews.llvm.org/D107791
%%%
This patch defines the macro __HOS_AIX__ when the target is AIX and without any dependency on the host. The macro indicates that the host is AIX. Defining the macro will help minimize porting pain for existing code compiled with xlc/xlC. xlC never shipped cross-compiling support, so the difference is not observable anyway.
%%%
This is a follow up to the discussion in https://reviews.llvm.org/D107242.
Reviewed By: cebowleratibm, joerg
Differential Revision: https://reviews.llvm.org/D107825
Summary: Move the test case to existing test file. Remove test file as duplicated. The file was mistakenly added due to concerns of a hidden bug (see https://reviews.llvm.org/D104381). After it turned out, that the bug was already fixed with another revision (https://reviews.llvm.org/D85817) and corresponding test was added as well, we can remove this file.
Differential Revision: https://reviews.llvm.org/D106152
This patch implements paper P0692R1 from the C++20 standard. Disable usual access checking rules to template argument names in a declaration of partial specializations, explicit instantiation or explicit specialization (C++20 13.7.5/10, 13.9.1/6).
Fixes: https://llvm.org/PR37424
This patch also implements option *A* from this paper P0692R1 from the C++20 standard.
This patch follows the @rsmith suggestion from D78404.
Reviewed By: krisb
Differential Revision: https://reviews.llvm.org/D92024
Explicitely set x86_64-linux-gnu as a target for asan-use-callbacks
clang test since some target do not support -fsanitize=address (e.g.
i386-pc-openbsd). Also remove redundant -fsanitize=address and move
-emit-llvm right after -S.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D107633
Before this patch, CXXCtorInitializers that don't typecheck get discarded in
most cases. In particular:
- typos that can't be corrected don't turn into RecoveryExpr. The full expr
disappears instead, and without an init expr we discard the node.
- initializers that fail initialization (e.g. constructor overload resolution)
are discarded too.
This patch addresses both these issues (a bit clunkily and repetitively, for
member/base/delegating initializers)
It does not preserve any AST nodes when the member/base can't be resolved or
other problems of that nature. That breaks invariants of CXXCtorInitializer
itself, and we don't have a "weak" RecoveryCtorInitializer like we do for Expr.
I believe the changes to diagnostics in existing tests are improvements.
(We're able to do some analysis on the non-broken parts of the initializer)
Differential Revision: https://reviews.llvm.org/D101641
Code added by D106688 has a problem. It passes the option -bxyz to the system linker as -b xyz xyz (duplication of the string 'xyz' is incorrect). This patch fixes that oversight.
Reviewed by: hubert.reinterpretcast, jsji
Differential Revision: https://reviews.llvm.org/D107786
This patch allows target specific addr space in target builtins for HIP. It inserts implicit addr
space cast for non-generic pointer to generic pointer in general, and inserts implicit addr
space cast for generic to non-generic for target builtin arguments only.
It is NFC for non-HIP languages.
Differential Revision: https://reviews.llvm.org/D102405
- They need to be preserved even if there's no reference within the
device code as the host code may need to initialize them based on the
application logic.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D107718
We were using an OpaqueValueExpr allocated on the stack to store
the size of a VLA. Because the VLASizeMap in CodegenFunction
uses the address of the expression to avoid recomputing VLAs,
we were accidentally reusing an earlier llvm::Value. This led to
invalid LLVM IR.
This is a temporary solution until VLASizeMap can be pushed and popped
based on the context.
Differential Revision: https://reviews.llvm.org/D107666
After D94315 we add the `NoInline` attribute to the outlined function to handle
data environments in the OpenMP if clause. This conflicted with the `AlwaysInline`
attribute added to the outlined function. for better performance in D106799.
The data environments should ideally not require NoInline, but for now this
fixes PR51349.
Reviewed By: mikerice
Differential Revision: https://reviews.llvm.org/D107649
The diagnostic texts for warning on attributes that don't appear on the
initial declaration is generally useful. We'd like to re-use it in
D106030, but first let's combine two that already are very similar so we
may re-use it a third time in that commit.
Also, fix a few places that were using notePreviousDefinition to point
to declarations, to instead use diag::note_previous_declaration.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D107613
This reverts commit 5181be344a.
Break libcxx type_traits header which uses aligned storage with
alignments greater than 4096. Reverting untill we can fix the header.
%%%
This patch defines __HOS_AIX__ macro for AIX in case of a cross compiler implementation.
%%%
Tested with SPEC.
Reviewed By: cebowleratibm
Differential Revision: https://reviews.llvm.org/D107242
%%%
This patch defines the macro __THW_PPC__ for AIX.
%%%
Tested with SPEC.
Reviewed By: cebowleratibm
Differential Revision: https://reviews.llvm.org/D107243
%%%
This patch defines the macro __THW_BIG_ENDIAN__ for AIX.
%%%
Tested with SPEC.
Reviewed By: cebowleratibm
Differential Revision: https://reviews.llvm.org/D107241
D31709 added an assertion was added to `FullSourceLoc::hasManager()` that ensured a valid `SourceLocation` is always paired with a `SourceManager`, and missing `SourceManager` is always paired with an invalid `SourceLocation`.
This appears to be incorrect, since clients never cared about constructing `FullSourceLoc` to uphold that invariant, or always checking `isValid()` before calling `hasManager()`.
The assertion started failing when serializing diagnostics pointing into an explicit module. Explicit modules don't have valid `SourceLocation` for the `import` statement, since they are "imported" from the command-line argument `-fmodule-name=x.pcm`.
This patch removes the assertion, since `FullSourceLoc` was never intended to uphold any kind of invariants between the validity of `SourceLocation` and presence of `SourceManager`.
Reviewed By: arphaman
Differential Revision: https://reviews.llvm.org/D106862
This change provides a way to conveniently declare types that have
address space qualifiers removed.
Since OpenCL adds address spaces implicitly even when they are not
specified in source, it is useful to allow deriving address space
unqualified types.
Fixes llvm.org/PR45326
Differential Revision: https://reviews.llvm.org/D106785
This is recommit of the patch 16ff91ebcc,
reverted in 0c28a7c990 because it had
an error in call of getFastMathFlags (base type should be FPMathOperator
but not Instruction). The original commit message is duplicated below:
Clang has builtin function '__builtin_isnan', which implements C
library function 'isnan'. This function now is implemented entirely in
clang codegen, which expands the function into set of IR operations.
There are three mechanisms by which the expansion can be made.
* The most common mechanism is using an unordered comparison made by
instruction 'fcmp uno'. This simple solution is target-independent
and works well in most cases. It however is not suitable if floating
point exceptions are tracked. Corresponding IEEE 754 operation and C
function must never raise FP exception, even if the argument is a
signaling NaN. Compare instructions usually does not have such
property, they raise 'invalid' exception in such case. So this
mechanism is unsuitable when exception behavior is strict. In
particular it could result in unexpected trapping if argument is SNaN.
* Another solution was implemented in https://reviews.llvm.org/D95948.
It is used in the cases when raising FP exceptions by 'isnan' is not
allowed. This solution implements 'isnan' using integer operations.
It solves the problem of exceptions, but offers one solution for all
targets, however some can do the check in more efficient way.
* Solution implemented by https://reviews.llvm.org/D96568 introduced a
hook 'clang::TargetCodeGenInfo::testFPKind', which injects target
specific code into IR. Now only SystemZ implements this hook and it
generates a call to target specific intrinsic function.
Although these mechanisms allow to implement 'isnan' with enough
efficiency, expanding 'isnan' in clang has drawbacks:
* The operation 'isnan' is hidden behind generic integer operations or
target-specific intrinsics. It complicates analysis and can prevent
some optimizations.
* IR can be created by tools other than clang, in this case treatment
of 'isnan' has to be duplicated in that tool.
Another issue with the current implementation of 'isnan' comes from the
use of options '-ffast-math' or '-fno-honor-nans'. If such option is
specified, 'fcmp uno' may be optimized to 'false'. It is valid
optimization in general, but it results in 'isnan' always returning
'false'. For example, in some libc++ implementations the following code
returns 'false':
std::isnan(std::numeric_limits<float>::quiet_NaN())
The options '-ffast-math' and '-fno-honor-nans' imply that FP operation
operands are never NaNs. This assumption however should not be applied
to the functions that check FP number properties, including 'isnan'. If
such function returns expected result instead of actually making
checks, it becomes useless in many cases. The option '-ffast-math' is
often used for performance critical code, as it can speed up execution
by the expense of manual treatment of corner cases. If 'isnan' returns
assumed result, a user cannot use it in the manual treatment of NaNs
and has to invent replacements, like making the check using integer
operations. There is a discussion in https://reviews.llvm.org/D18513#387418,
which also expresses the opinion, that limitations imposed by
'-ffast-math' should be applied only to 'math' functions but not to
'tests'.
To overcome these drawbacks, this change introduces a new IR intrinsic
function 'llvm.isnan', which realizes the check as specified by IEEE-754
and C standards in target-agnostic way. During IR transformations it
does not undergo undesirable optimizations. It reaches instruction
selection, where is lowered in target-dependent way. The lowering can
vary depending on options like '-ffast-math' or '-ffp-model' so the
resulting code satisfies requested semantics.
Differential Revision: https://reviews.llvm.org/D104854
On AVR, '.ctors' is used, not '.init_array'. Make this the default
unless specifically overridden by driver argument.
This matches gcc, and it matches the behavior in (e.g.) the NetBSD
driver (for certain OS variants).
Reviewed by: MaskRay
Differential Revision: https://reviews.llvm.org/D107610
`__alignof__(x)` always returns `ABIAlign` if the "x" is marked `__attribute__((aligned()))`. However, the "aligned" attribute should only increase the alignment of a struct, or struct member, unless it's used together with the "packed" attribute, or used as a part of a typedef, in which case, the "aligned" attribute can both increase and decrease alignment.
Reviewed By: sfertile
Differential Revision: https://reviews.llvm.org/D107598
GCC supports multiple forms of -falign-loops=.
-falign-loops= is currently ignored in Clang.
This patch implements the simplest but the most useful form where N is a
power of 2.
The underlying implementation uses a `llvm::TargetOptions` option for now.
Bitcode generation ignores this option.
Differential Revision: https://reviews.llvm.org/D106701
The root problem is a null pointer is accessed during the call to
checkOpenMPLoop, because loop up bound expr is an error expression
due to error diagnostic was emit early.
To fix this, in setLCDeclAndLB, setUB and setStep instead return false,
return true when LB, UB or Step contains Error, so that the checking is
stopped in checkOpenMPLoop.
Differential Revision: https://reviews.llvm.org/D107385
On AIX an aligned attribute cannot decrease the alignment of a variable
when placed on a variable declaration of vector type.
Differential Revision: https://reviews.llvm.org/D107522
Limit the maximum alignment for attribute aligned to 4096 to match
the limit of the .align pseudo op in the system assembler.
Differential Revision: https://reviews.llvm.org/D107497
Clang diagnostics should not start with a capital letter or use
trailing punctuation (https://clang.llvm.org/docs/InternalsManual.html#the-format-string),
but quite a few driver diagnostics were not following this advice. This
corrects the grammar and punctuation to improve consistency, but does
not change the circumstances under which the diagnostics are produced.
Implement target builtins for gfx90a including fadd64, fadd32, add2h,
max and min on various global, flat and ds address spaces for which
intrinsics are implemented.
Differential Revision: https://reviews.llvm.org/D106909
This matches the behavior of GCC.
Patch does not change remapping logic itself, so adding one simple smoke test should be enough.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D107393
For fixed SVE types, predicates are represented using vectors of i8,
where as for scalable types they are represented using vectors of i1. We
can avoid going through memory for casts between these by bitcasting the
i1 scalable vectors to/from a scalable i8 vector of matching size, which
can then use the existing vector insert/extract logic.
Differential Revision: https://reviews.llvm.org/D106860
Zero-width bitfields on AIX pad out to the natral alignment boundary but
do not change the containing records alignment.
Differential Revision: https://reviews.llvm.org/D106900
The lit tests for `clang-scan-deps` invoke the tool without going through the substitution system. While the test runner correctly picks up the `clang-scan-deps` binary from the build directory, it doesn't print its absolute path. When copying the invocations when reproducing test failures, this can result in `command not found: clang-scan-deps` errors or worse yet: pick up the system `clang-scan-deps`. This patch adds new local `%clang-scan-deps` substitution.
Reviewed By: lxfind, dblaikie
Differential Revision: https://reviews.llvm.org/D107155
For some use-cases, it might be useful to be able to turn off modules for C++ in `-cc1`. (The feature is implied by `-std=C++20`.)
This patch exposes the `-fno-cxx-modules` option in `-cc1`.
Reviewed By: arphaman
Differential Revision: https://reviews.llvm.org/D106864
Clang has builtin function '__builtin_isnan', which implements C
library function 'isnan'. This function now is implemented entirely in
clang codegen, which expands the function into set of IR operations.
There are three mechanisms by which the expansion can be made.
* The most common mechanism is using an unordered comparison made by
instruction 'fcmp uno'. This simple solution is target-independent
and works well in most cases. It however is not suitable if floating
point exceptions are tracked. Corresponding IEEE 754 operation and C
function must never raise FP exception, even if the argument is a
signaling NaN. Compare instructions usually does not have such
property, they raise 'invalid' exception in such case. So this
mechanism is unsuitable when exception behavior is strict. In
particular it could result in unexpected trapping if argument is SNaN.
* Another solution was implemented in https://reviews.llvm.org/D95948.
It is used in the cases when raising FP exceptions by 'isnan' is not
allowed. This solution implements 'isnan' using integer operations.
It solves the problem of exceptions, but offers one solution for all
targets, however some can do the check in more efficient way.
* Solution implemented by https://reviews.llvm.org/D96568 introduced a
hook 'clang::TargetCodeGenInfo::testFPKind', which injects target
specific code into IR. Now only SystemZ implements this hook and it
generates a call to target specific intrinsic function.
Although these mechanisms allow to implement 'isnan' with enough
efficiency, expanding 'isnan' in clang has drawbacks:
* The operation 'isnan' is hidden behind generic integer operations or
target-specific intrinsics. It complicates analysis and can prevent
some optimizations.
* IR can be created by tools other than clang, in this case treatment
of 'isnan' has to be duplicated in that tool.
Another issue with the current implementation of 'isnan' comes from the
use of options '-ffast-math' or '-fno-honor-nans'. If such option is
specified, 'fcmp uno' may be optimized to 'false'. It is valid
optimization in general, but it results in 'isnan' always returning
'false'. For example, in some libc++ implementations the following code
returns 'false':
std::isnan(std::numeric_limits<float>::quiet_NaN())
The options '-ffast-math' and '-fno-honor-nans' imply that FP operation
operands are never NaNs. This assumption however should not be applied
to the functions that check FP number properties, including 'isnan'. If
such function returns expected result instead of actually making
checks, it becomes useless in many cases. The option '-ffast-math' is
often used for performance critical code, as it can speed up execution
by the expense of manual treatment of corner cases. If 'isnan' returns
assumed result, a user cannot use it in the manual treatment of NaNs
and has to invent replacements, like making the check using integer
operations. There is a discussion in https://reviews.llvm.org/D18513#387418,
which also expresses the opinion, that limitations imposed by
'-ffast-math' should be applied only to 'math' functions but not to
'tests'.
To overcome these drawbacks, this change introduces a new IR intrinsic
function 'llvm.isnan', which realizes the check as specified by IEEE-754
and C standards in target-agnostic way. During IR transformations it
does not undergo undesirable optimizations. It reaches instruction
selection, where is lowered in target-dependent way. The lowering can
vary depending on options like '-ffast-math' or '-ffp-model' so the
resulting code satisfies requested semantics.
Differential Revision: https://reviews.llvm.org/D104854
See PR48656.
The implementation of the template instantiation of requires expressions
was incorrectly trying to get the expression from an 'ExprRequirement'
before checking if it was an error state.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D107399
See PR47174.
When canonicalizing nested name specifiers of the type kind,
the prefix for 'DependentTemplateSpecialization' types was being
dropped, leading to malformed types which would cause failures
when rebuilding template names.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D107311
where should not.
Currently we are using QTy->isIncompleteType(&ND) to check incomplete
type. But before doing that, need to instantiate for a class template
specialization or a class member of a class template specialization,
or an array with known size of such..., so that we know it is really
incomplete type.
To fix this using RequireCompleteType instead.
The new test is added into "test/OpenMP/target_update_messages.cpp"
The different of using RequireCompleteType is when emit incomplete type,
an additional note is also emitted to point to where incomplete type
is declared. Because this change, many tests are needed to be fixed
by adding additional note.
This is to fix https://bugs.llvm.org/show_bug.cgi?id=50508
Differential Revision: https://reviews.llvm.org/D107200
This patch implements P2092
Simple requirements in requirement body shall not start with requires.
A warning was already in place so we just turn this warning into an error.
In addition, we add tests to make sure typename is optional in
requirement-parameter-list as per the same paper.
Previously we would show an error, but keep the member, and also the
CXXRrecordDecl, valid. This could lead to crashes when attempting to
access the record layout or size.
Differential Revision: https://reviews.llvm.org/D105478
The declaration for the global new function in C++ is generated in the compiler front-end. When examining exception propagation, we found that this is the largest root throw site propagator requiring unwind code to be generated for callers up the stack. Allowing this to be handled immediately with termination stops upward propagation and leads to significantly less landing pads generated. This in turns leads to a performance and .text size win.
With `-fnew-infallible` this annotates the declaration with `throw()` and `__attribute__((returns_nonnull))`. `throw()` allows the compiler to assume exceptions do not propagate out of new and eliminate it as a root throw site. Note that the definition of global new is user-replaceable so users should ensure that the one used follows these semantics.
Measuring internally, we're seeing at 0.5% CPU win in one of our large internal FB workload. Measuring on clang self-build (cd0a1226b5) we get:
thinlto/
"dwarfehprepare.NumCleanupLandingPadsRemaining": 153494,
"dwarfehprepare.NumNoUnwind": 26309,
thinlto_newinfallible/
"dwarfehprepare.NumCleanupLandingPadsRemaining": 143660,
"dwarfehprepare.NumNoUnwind": 28744,
a 1-143660/153494 = 6.4% reduction in landing pads and a 28744/26309 = 9.3% increase in the number of nounwind functions.
Testing:
ninja check-all
new test case to make sure these attributes are added correctly to global new.
Reviewed By: urnathan
Differential Revision: https://reviews.llvm.org/D105225
The new -mtargetos= option is a replacement for the existing, OS-specific options
like -miphoneos-version-min=. This allows us to introduce support for new darwin OSes
easier as they won't require the use of a new option. The older options will be
deprecated and the use of the new option will be encouraged instead.
Differential Revision: https://reviews.llvm.org/D106316
In some cases, when the execution path of the diagnostic
goes back and forth, arrows can overlap and create a mess.
Dimming arrows that are not relevant at the moment, solves this issue.
They are still visible, but don't draw too much attention.
Differential Revision: https://reviews.llvm.org/D92928
This commit adds a very first version of this feature.
It is off by default and has to be turned on by checking the
corresponding box. For this reason, HTML reports still keep
control notes (aka grey bubbles).
Further on, we plan on attaching arrows to events and having all arrows
not related to a currently selected event barely visible. This will
help with reports where control flow goes back and forth (eg in loops).
Right now, it can get pretty crammed with all the arrows.
Differential Revision: https://reviews.llvm.org/D92639
With this patch, OpenMP on AMDGCN will use the math functions
provided by ROCm ocml library. Linking device code to the ocml will be
done in the next patch.
Reviewed By: JonChesterfield, jdoerfert, scchan
Differential Revision: https://reviews.llvm.org/D104904
Definition of `__cpp_threadsafe_static_init` macro is controlled by
language option Opts.ThreadsafeStatics. This patch sets language
option to false by default in OpenCL mode, resulting in macro
`__cpp_threadsafe_static_init` being undefined. Default value can be
overridden using command line option -fthreadsafe-statics.
Change is supposed to address portability because not all OpenCL
vendors support thread safe implementation of static initialization.
Fixes llvm.org/PR48012
Differential Revision: https://reviews.llvm.org/D107163
The -fms-extensions converts __pragma (and _Pragma) into a #pragma that
has to occur at the beginning of a line and end with a newline. This
patch ensures that the newline after the #pragma is added even if
Token::isAtStartOfLine() indicated that we should not start a newline.
Committing relying post-commit review since the change is small, some
downstream uses might be blocked without this fix, and to make clear the
decision of the new -fminimize-whitespace feature (fix on main, revert
on clang-13.x branch) suggested by @aaron.ballman in D104601.
Differential Revision: https://reviews.llvm.org/D107183
Currently, the default alignment is much larger than the actual size of
the vector in memory. Fix this to use a sane default.
For SVE, temporarily remove lowering of load/store operations for
predicates with less than 16 elements. The layout the backend was
assuming for SVE predicates with less than 16 elements doesn't agree
with the frontend. More work probably needs to be done here.
This change is, strictly speaking, not backwards-compatible at the
bitcode level. But probably nobody is actually depending on that; i1
vectors in memory are rare, and the code that does use them probably
ends up forcing the alignment to something sane anyway. If we think
this is a concern, I can restrict this to scalable vectors for now
(where it's actually causing issues for me at the moment).
Differential Revision: https://reviews.llvm.org/D88994
Target-dependent constant folding will fold these down to simple
constants (or at least, expressions that don't involve a GEP). We don't
need heroics to try to optimize the form of the expression before that
happens.
Fixes https://bugs.llvm.org/show_bug.cgi?id=51232 .
Differential Revision: https://reviews.llvm.org/D107116
In LLVM IR terms the ACLE type 'data512_t' is essentially an aggregate
type { [8 x i64] }. When emitting code for inline assembly operands,
clang tries to scalarize aggregate types to an integer of the equivalent
length, otherwise it passes them by-reference. This patch adds a target
hook to tell whether a given inline assembly operand is scalarizable
so that clang can emit code to pass/return it by-value.
Differential Revision: https://reviews.llvm.org/D94098
The builtins vec_xl_len_r and vec_xst_len_r actually use the
wrong side of the vector on big endian Power9 systems. We never
spotted this before because there was no such thing as a big
endian distro that supported Power9. Now we have AIX and the
elements are in the wrong part of the vector. This just fixes
it so the elements are loaded to and stored from the right
side of the vector.
Change `CountersPtr` in `__profd_` to a label difference, which is a link-time
constant. On ELF, when linking a shared object, this requires that `__profc_` is
either private or linkonce/linkonce_odr hidden. On COFF, we need D104564 so that
`.quad a-b` (64-bit label difference) can lower to a 32-bit PC-relative relocation.
```
# ELF: R_X86_64_PC64 (PC-relative)
.quad .L__profc_foo-.L__profd_foo
# Mach-O: a pair of 8-byte X86_64_RELOC_UNSIGNED and X86_64_RELOC_SUBTRACTOR
.quad l___profc_foo-l___profd_foo
# COFF: we actually use IMAGE_REL_AMD64_REL32/IMAGE_REL_ARM64_REL32 so
# the high 32-bit value is zero even if .L__profc_foo < .L__profd_foo
# As compensation, we truncate CountersDelta in the header so that
# __llvm_profile_merge_from_buffer and llvm-profdata reader keep working.
.quad .L__profc_foo-.L__profd_foo
```
(Note: link.exe sorts `.lprfc` before `.lprfd` even if the object writer
has `.lprfd` before `.lprfc`, so we cannot work around by reordering
`.lprfc` and `.lprfd`.)
With this change, a stage 2 (`-DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_BUILD_INSTRUMENTED=IR`)
`ld -pie` linked clang is 1.74% smaller due to fewer R_X86_64_RELATIVE relocations.
```
% readelf -r pie | awk '$3~/R.*/{s[$3]++} END {for (k in s) print k, s[k]}'
R_X86_64_JUMP_SLO 331
R_X86_64_TPOFF64 2
R_X86_64_RELATIVE 476059 # was: 607712
R_X86_64_64 2616
R_X86_64_GLOB_DAT 31
```
The absolute function address (used by llvm-profdata to collect indirect call
targets) can be converted to relative as well, but is not done in this patch.
Differential Revision: https://reviews.llvm.org/D104556
Don't try to run the non-integrated assembler; just verify that the
invocations look like what we expect. Do verify that the integrated
assembler handles warnings as expected.
This patch will re-enable the patch posted under https://reviews.llvm.org/D106688 originally which was reverted due to buildbreak that was caused by mismatched diagnostic message arguments.
Reviewed By: Zarko Todorovski
Differential Revision: https://reviews.llvm.org/D107105
'pipe' keyword is introduced in OpenCL C 2.0: so do checks for OpenCL C version while
parsing and then later on check for language options to construct actual pipe. This feature
requires support of __opencl_c_generic_address_space, so diagnostics for that is provided as well.
This is the same patch as in D106748 but with a tiny fix in checking of diagnostic messages.
Also added tests when program scope global variables are not supported.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D107154
Commit 83df122 (r368334) added 'REQUIRES: linux' to this test, but
because triples are not respected by REQUIRES, that meant it was
invariably Unsupported. The correct keyword would be 'system-linux'
(checking the host rather than the target).
Because the test was always skipped, commit 0cfd9e5 (r375439) did not
notice that the test modification was incorrect.
This patch corrects the REQUIRES clause and fixes the incorrect
previous patch.
Found after implementing https://reviews.llvm.org/D107162
With this patch, OpenMP on AMDGCN will use the math functions
provided by ROCm ocml library. Linking device code to the ocml will be
done in the next patch.
Reviewed By: JonChesterfield, jdoerfert, scchan
Differential Revision: https://reviews.llvm.org/D104904
Under the -faltivec-src-compat=gcc option, AltiVec vector initialization should
be treated as if they were compiled with gcc - which is, to emit an error when
the vectors are initialized in the parenthesized or non-parenthesized manner.
This patch implements this behaviour.
Differential Revision: https://reviews.llvm.org/D106410
In a post-commit message to https://reviews.llvm.org/D102343
@MaskRay pointed out syntax errors in one of the test cases. This
patch fixes those problems, I had forgotten the colon after the CHECK- strings.
Math libraries are linked only when -lm is specified. This is because
host system could be missing rocm-device-libs.
Reviewed By: JonChesterfield, yaxunl
Differential Revision: https://reviews.llvm.org/D105981
Renamed language standard from openclcpp to openclcpp10.
Added new std values i.e. '-cl-std=clc++1.0' and
'-cl-std=CLC++1.0'.
Patch by Topotuna (Justas Janickas)!
Differential Revision: https://reviews.llvm.org/D106266
'pipe' keyword is introduced in OpenCL C 2.0: so do checks for OpenCL C version while
parsing and then later on check for language options to construct actual pipe. This feature
requires support of __opencl_c_generic_address_space, so diagnostics for that is provided as well.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D106748
This feature requires support of __opencl_c_images, so diagnostics for that is provided as well.
Also, ensure that cl_khr_3d_image_writes feature macro is set to the same value.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D106260
Parse the -b option in the driver and pass it to the linker if the target OS is AIX. This will establish compatibility with the other AIX compilers.
Reviewed By: Zarko Todorovski
Differential Revision: https://reviews.llvm.org/D106688
This patch adds `#pragma clang deprecated` to enable deprecation of
preprocessor macros.
The macro must be defined before `#pragma clang deprecated`. When
deprecating a macro a custom message may be optionally provided.
Warnings are emitted at the use site of a deprecated macro, and can be
controlled via the `-Wdeprecated` warning group.
This patch takes some rough inspiration and a few lines of code from
https://reviews.llvm.org/D67935.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D106732
@kpn pointed out that the global variable initialization functions didn't
have the "strictfp" metadata set correctly, and @rjmccall said that there
was buggy code in SetFPModel and StartFunction, this patch is to solve
those problems. When Sema creates a FunctionDecl, it sets the
FunctionDeclBits.UsesFPIntrin to "true" if the lexical FP settings
(i.e. a combination of command line options and #pragma float_control
settings) correspond to ConstrainedFP mode. That bit is used when CodeGen
starts codegen for a llvm function, and it translates into the
"strictfp" function attribute. See bugs.llvm.org/show_bug.cgi?id=44571
Reviewed By: Aaron Ballman
Differential Revision: https://reviews.llvm.org/D102343
Summary:
Set the TargetCPUName for AIX to default to pwr7, removing the setting
of it based on the major/minor of the OS version, which previously
set it to pwr4 for AIX 7.1 and earlier. The old code would also set it to
pwr4 when the OS version was not specified and with the change, it will
default it to pwr7 in all cases.
Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By:hubert.reinterpretcast (Hubert Tong)
Differential Revision: https://reviews.llvm.org/D107063
The implementation of -fminimize-whitespace (D104601) revised the logic
when to emit newlines. There was no case to handle when more than
8 lines were skippped in -P (DisableLineMarkers) mode and instead fell
through the case intended for -fminimize-whitespace, i.e. emit nothing.
This patch will emit one newline in this case.
The newline logic is slightly reorganized. The `-P -fminimize-whitespace`
case is handled explicitly and emitting at least one newline is the new
fallback case. The choice between emitting a line marker or up to
7 empty lines is now a choice only with enabled line markers. The up to
8 newlines likely are fewer characters than a line directive, but
in -P mode this had the paradoxic effect that it would print up to
7 empty lines, but none at all if more than 8 lines had to be skipped.
Now with DisableLineMarkers, we don't consider printing empty lines
(just start a new line) which matches gcc's behavior.
The line-directive-output-mincol.c test is replaced with a more
comprehensive test skip-empty-lines.c also testing the more than
8 skipped lines behaviour with all flag combinations.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D106924
When substitution failed on the first constrained template argument (but
only the first), we would assert / crash. Checking for failure was only
being performed from the second constraint on.
This changes it so the checking is performed in that case,
and the code is also now simplified a little bit to hopefully
avoid this confusion.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D106907
On ELF, an SHT_INIT_ARRAY outside a section group is a GC root. The current
codegen abuses SHT_INIT_ARRAY in a section group to mean a GC root.
On PE/COFF, the dynamic initialization for `__declspec(selectany)` in a comdat
can be garbage collected by `-opt:ref`.
Call `addUsedGlobal` for the two cases to fix the abuse/bug.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D106925
ClassTemplateSpecializationDecl not within a ClassTemplateDecl
represents an explicit instatiation of a template and so should be
handled as if it were a normal CXXRecordDecl. Unfortunately, having an
equivalent for FunctionTemplateDecl remains a TODO in ASTDumper's
VisitFunctionTemplateDecl, with all the explicit instantiations just
being emitted inside the FunctionTemplateDecl along with all the other
specializations, meaning we can't easily support explicit function
instantiations in update_cc_test_checks.
Reviewed By: arichardson
Differential Revision: https://reviews.llvm.org/D106243
The Intel compiler ICC supports the option "-fp-model=(source|double|extended)"
which causes the compiler to use a wider type for intermediate floating point
calculations. Also supported is a way to embed this effect in the source
program with #pragma float_control(source|double|extended).
This patch extends pragma float_control syntax, and also adds support
for a new floating point option "-ffp-eval-method=(source|double|extended)".
source: intermediate results use source precision
double: intermediate results use double precision
extended: intermediate results use extended precision
Reviewed By: Aaron Ballman
Differential Revision: https://reviews.llvm.org/D93769
Currently, we prohibit this pragma from appearing within a language
linkage specification, but this is useful functionality that is
supported by MSVC (which is where we inherited this feature from).
This patch allows you to use the pragma within an extern "C" {} (etc)
block.
The device runtime contains several calls to __kmpc_get_hardware_num_threads_in_block
and __kmpc_get_hardware_num_blocks. If the thread_limit and the num_teams are constant,
these calls can be folded to the constant value.
In commit D106033 we have the optimization phase. This commit adds the attributes to
the outlined function for the grid size. the two attributes are `omp_target_num_teams` and
`omp_target_thread_limit`. These values are added as long as they are constant.
Two functions are created `getNumThreadsExprForTargetDirective` and
`getNumTeamsExprForTargetDirective`. The original functions `emitNumTeamsForTargetDirective`
and `emitNumThreadsForTargetDirective` identify the expresion and emit the code.
However, for the Device version of the outlined function, we cannot emit anything.
Therefore, this is a first attempt to separate emision of code from deduction of the
values.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D106298
With the old PM, the stub for __hwasan_generate_tag is still generated
in the IR, but never called.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D106858
Change the ffp-model=precise to enables -ffp-contract=on (previously
-ffp-model=precise enabled -ffp-contract=fast). This is a follow-up
to Andy Kaylor's comments in the llvm-dev discussion "Floating Point
semantic modes". From the same email thread, I put Andy's distillation
of floating point options and floating point modes into UsersManual.rst
Also fixes bugs.llvm.org/show_bug.cgi?id=50222
I had to revert this a few times because of failures on the x86-64
buildbot but I think we finally have that fixed by LNT/79f2b03c51.
Reviewed By: rjmccall, andrew.kaylor
Differential Revision: https://reviews.llvm.org/D74436
Replace the clang builtins and LLVM intrinsics for the SIMD extmul instructions
with normal codegen patterns.
Differential Revision: https://reviews.llvm.org/D106724
Redefines NULL as nullptr instead of ((void*)0)
in C++ for OpenCL.
Such internal representation of NULL provides
compatibility with C++11 and later language
standards.
Patch by Topotuna (Justas Janickas)!
Differential Revision: https://reviews.llvm.org/D105987
> `#pragma clang include_instead(<header>)` is a pragma that can be used
> by system headers (and only system headers) to indicate to a tool that
> the file containing said pragma is an implementation-detail header and
> should not be directly included by user code.
>
> The library alternative is very messy code that can be seen in the first
> diff of D106124, and we'd rather avoid that with something more
> universal.
>
> This patch takes the first step by warning a user when they include a
> detail header in their code, and suggests alternative headers that the
> user should include instead. Future work will involve adding a fixit to
> automate the process, as well as cleaning up modules diagnostics to not
> suggest said detail headers. Other tools, such as clangd can also take
> advantage of this pragma to add the correct user headers.
>
> Differential Revision: https://reviews.llvm.org/D106394
This caused compiler crashes in Chromium builds involving PCH and an include
directive with macro expansion, when Token::getLiteralData() returned null. See
the code review for details.
This reverts commit e8a64e5491.
I don't know how well this works with clang-cl, but people want to try
it out, and I think we want to make it work, so exposing the flags seems
reasonable.
Differential revision: https://reviews.llvm.org/D106791
When `-fno-integrated-as` is passed to the Clang driver (or set by default by a specific toolchain), it will construct an assembler job in addition to the cc1 job. Similarly, the `-fembed-bitcode` driver flag will create additional cc1 job that reads LLVM IR file.
The Clang tooling library only cares about the job that reads a source file. Instead of relying on the fact that the client injected `-fsyntax-only` to the driver invocation to get a single `-cc1` invocation that reads the source file, this patch filters out such jobs from `Compilation` automatically and ignores the rest.
This fixes a test failure in `ClangScanDeps/headerwithname.cpp` and `ClangScanDeps/headerwithnamefollowedbyinclude.cpp` on AIX reported here: https://reviews.llvm.org/D103461#2841918 and `clang-scan-deps` failures with `-fembed-bitcode`.
Depends on D106788.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D105695
Constructor homing reduces the amount of class type info that is emitted
by emitting conmplete type info for a class only when a constructor for
that class is emitted.
This will mainly reduce the amount of duplicate debug info in object
files. In Chrome enabling ctor homing decreased total build directory sizes
by about 30%.
It's also expected that some class types (such as unused classes)
will no longer be emitted in the debug info. This is fine, since we wouldn't
expect to need these types when debugging.
In some cases (e.g. libc++, https://reviews.llvm.org/D98750), classes
are used without calling the constructor. Since this is technically
undefined behavior, enabling constructor homing should be fine.
However Clang now has an attribute
`__attribute__((standalone_debug))` that can be used on classes to
ignore ctor homing.
Bug: https://bugs.llvm.org/show_bug.cgi?id=46537
Differential Revision: https://reviews.llvm.org/D106084
To match xlc behaviour and definition in the PowerPC ISA3.1,
it is a better idea to have ibm-clang produce an error when a
0 is passed to the builtin, which will match xlc's behaviour.
This patch changes the accepted range from 0 to 31 to 1 to 31.
Differential revision: https://reviews.llvm.org/D106817
Allegedly the DWARF backend ignores this field of DIEnumerator, but we
set it nonetheless in case we decide to use it in the future.
Alternatively, we could remove it, but it is simpler to pass down the
signed bit as it is in the AST for now.
Implemented to address comments on D106585
This patch adds the always inline attribute to the outlined functions generated
by OpenMP regions. Because there is only a single instance of this function and
it always has internal linkage it is safe to inline in every instance it is
created. This could potentially lead to performance degredation due to
inflated register counts in the parallel region.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D106799
This patch adds a driver flag `-fopenmp-target-new-runtime` to optionally enable the new device runtime
bitcode library. This allows users to enable the new experimental runtime
before it becomes the default in the future.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D106793
This patch replaces the workaround for simpler implicit moves
implemented in D105518.
The Microsoft STL currently has some issues with P2266.
Where before, with -fms-compatibility, we would disable simpler
implicit moves globally, with this change, we disable it only
when the returned expression is in a context contained by
std namespace and is located within a system header.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: aaron.ballman, mibintc
Differential Revision: https://reviews.llvm.org/D105951
DIEnumerator stores an APInt as of April 2020, so now we don't need to
truncate the enumerator value to 64 bits. Fixes assertions during IRGen.
Split from D105320, thanks to Matheus Izvekov for the test case and
report.
Differential Revision: https://reviews.llvm.org/D106585
XL provides functions __vec_ldrmb/__vec_strmb for loading/storing a
sequence of 1 to 16 bytes in big endian order, right justified in the
vector register (regardless of target endianness).
This is equivalent to vec_xl_len_r/vec_xst_len_r which are only
available on Power9.
This patch simply uses the Power9 functions when compiled for Power9,
but provides a more general implementation for Power8.
Differential revision: https://reviews.llvm.org/D106757
This patch changes the index argument of lvxl?/lve[bhw]x and
stvxl?/stve[bhw]x builtins from int to long. Because on 64-bit
subtargets, an extra extsw will always been generated, which is
incorrect.
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D106530
`#pragma clang include_instead(<header>)` is a pragma that can be used
by system headers (and only system headers) to indicate to a tool that
the file containing said pragma is an implementation-detail header and
should not be directly included by user code.
The library alternative is very messy code that can be seen in the first
diff of D106124, and we'd rather avoid that with something more
universal.
This patch takes the first step by warning a user when they include a
detail header in their code, and suggests alternative headers that the
user should include instead. Future work will involve adding a fixit to
automate the process, as well as cleaning up modules diagnostics to not
suggest said detail headers. Other tools, such as clangd can also take
advantage of this pragma to add the correct user headers.
Differential Revision: https://reviews.llvm.org/D106394
In OpenMP 5.1:
> If the `write` or `update` clause is specifieded, the atomic operation is not an atomic conditional update for which the comparison fails, and the effective memory ordering is `release`, `acq_rel`, or `seq_cst`, the strong flush on entry to the atomic operation is also a release flush. If the `read` or `update` clause is specified and the effective memory ordering is `acquire`, `acq_rel`, or `seq_cst` then the strong flush on exit from the atomic operation is also an acquire flush.
In OpenMP 5.0:
> If the `write`, `update`, or **`capture`** clause is specified and the `release`, `acq_rel`, or `seq_cst` clause is specified then the strong flush on entry to the atomic operation is also a release flush. If the `read` or `capture` clause is specified and the `acquire`, `acq_rel`, or `seq_cst` clause is specified then the strong flush on exit from the atomic operation is also an acquire flush.
From my understanding, in OpenMP 5.1, `capture` is removed from the requirement for flush, therefore we don't have to enforce it.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D100768
This patch adds support for the next-generation arch14
CPU architecture to the SystemZ backend.
This includes:
- Basic support for the new processor and its features.
- Detection of arch14 as host processor.
- Assembler/disassembler support for new instructions.
- New LLVM intrinsics for certain new instructions.
- Support for low-level builtins mapped to new LLVM intrinsics.
- New high-level intrinsics in vecintrin.h.
- Indicate support by defining __VEC__ == 10304.
Note: No currently available Z system supports the arch14
architecture. Once new systems become available, the
official system name will be added as supported -march name.
Set default version for OpenCL C to 1.2. This means that the
absence of any standard flag will be equivalent to passing
'-cl-std=CL1.2'.
Note that this patch also fixes incorrect version check for
the pointer to pointer kernel arguments diagnostic and
atomic test.
Differential Revision: https://reviews.llvm.org/D106504
This patch adds the -fminimize-whitespace with the following effects:
* If combined with -E, remove as much non-line-breaking whitespace as
possible.
* If combined with -E -P, removes as much whitespace as possible,
including line-breaks.
The motivation is to reduce the amount of insignificant changes in the
preprocessed output with source files where only whitespace has been
changed (add/remove comments, clang-format, etc.) which is in particular
useful with ccache.
A patch for ccache for using this flag has been proposed to ccache as well:
https://github.com/ccache/ccache/pull/815, which will use
-fnormalize-whitespace when clang-13 has been detected, and additionally
uses -P in "unify_mode". ccache already had a unify_mode in an older
version which was removed because of problems that using the
preprocessor itself does not have (such that the custom tokenizer did
not recognize C++11 raw strings).
This patch slightly reorganizes which part is responsible for adding
newlines that are required for semantics. It is now either
startNewLineIfNeeded() or MoveToLine() but never both; this avoids the
ShouldUpdateCurrentLine workaround and avoids redundant lines being
inserted in some cases. It also fixes a mandatory newline not inserted
after a _Pragma("...") that is expanded into a #pragma.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D104601
Replace the clang builtins and LLVM intrinsics for {f32x4,f64x2}.{pmin,pmax}
with standard codegen patterns. Since wasm_simd128.h uses an integer vector as
the standard single vector type, the IR for the pmin and pmax intrinsic
functions contains bitcasts that would not be there otherwise. Add extra codegen
patterns that can still select the pmin and pmax instructions in the presence of
these bitcasts.
Differential Revision: https://reviews.llvm.org/D106612
Address sanitizer passes may generate call of ASAN bitcode library
functions after bitcode linking in lld, therefore lld cannot add
those symbols since it does not know they will be used later.
To solve this issue, clang emits a reference to a bicode library
function which calls all ASAN functions which need to be
preserved. This basically force all ASAN functions to be
linked in.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D106315
In -fgpu-rdc case, fat binary is embedded as global variable __hip_fatbin.
It needs to have protected visibility to avoid conflict between shared
libraries.
Reviewed by: Siu Chi Chan
Differential Revision: https://reviews.llvm.org/D106571
Fixes: SWDEV-292290
https://bugs.llvm.org/show_bug.cgi?id=51109
When we merged two classes, `*this` became an obsolete representation of
the new `State`. This is b/c the member relations had changed during the
previous merge of another member of the same class in a way that `*this`
had no longer any members. (`mergeImpl` might keep the member relations
to `Other` and could dissolve `*this`.)
Differential Revision: https://reviews.llvm.org/D106285
NULL was undefined in OpenCL prior to version 2.0. However, the
language specification states that "macro names defined by the C99
specification but not currently supported by OpenCL are reserved
for future use". Therefore, application developers cannot redefine
NULL.
The change is supposed to resolve inconsistency between language
versions. Currently there is no apparent reason why NULL should
be kept undefined.
Patch by Topotuna (Justas Janickas)!
Differential Revision: https://reviews.llvm.org/D105988
Add the builtins defined by Section 42 "Integer dot product" in
the OpenCL Extension Specification.
Differential Revision: https://reviews.llvm.org/D106434
Commit db7efcab7d changed the implementations of the wasm_*_extract_lane and
wasm_*_replace_lane intrinsics from using builtin functions to using the
standard vector extensions. This did not change the resulting IR, but it changes
how update_cc_test_checks.py labels values in the IR. This commit simply updates
those labels.
Differential Revision: https://reviews.llvm.org/D106611
An empty enum is used to implement C++'s new-ish "byte" type (to make
sure it's a separate type for overloading, etc - compared to a typedef)
- without any enumerators. Some clang warnings don't make sense in this
sort of situation, so let's skip them for empty enums.
It's arguable that possibly some situations of enumerations without
enumerators might want the previous-to-this-patch behavior (if the enum
is autogenerated and in some cases comes up empty, then maybe a default
in an empty switch would still be considered problematic - so that when
you add the first enumeration you do get a -Wswitch warning). But I
think that's niche enough & this std::byte case is mainstream enough
that we should prioritize the latter over the former.
If someone's got a middle ground proposal to account for both of those
situations, I'm open to patches/suggestions/etc.
Reland of 31859f896.
This change implements new DAG notes GLOBAL_GET/GLOBAL_SET, and
lowering methods for load and stores of reference types from IR
globals. Once the lowering creates the new nodes, tablegen pattern
matches those and converts them to Wasm global.get/set.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D104797
This patch defines the macro __LONGDOUBLE64 for AIX when long double is 8 bytes.
Reviewed By: cebowleratibm
Differential Revision: https://reviews.llvm.org/D105477
This patch makes the changes in the driver that converts the medium code
model to large.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D106371
Emit the unsupported option error until the Clang's library integration support for 128-bit long double is available for AIX.
Reviewed By: Whitney, cebowleratibm
Differential Revision: https://reviews.llvm.org/D106074
We caught the cases where the user would explicitly use the & operator,
but we were missing implicit conversions such as array decay.
Fixes PR26336. Thanks to Samuel Neves for inspiration for the patch.
This commit adds driver support for the Mac Catalyst target,
as supported by the Apple clang compile
Differential Revision: https://reviews.llvm.org/D105960
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch adds the builtin and intrinsic for "__stbcx".
Reviewed By: nemanjai, #powerpc
Differential revision: https://reviews.llvm.org/D106484
Fixed test to use predefined version marco instead
of passing extra macro in the command line.
Patch by Topotuna (Justas Janickas)!
Differential Revision: https://reviews.llvm.org/D106254
Clang implemented the _ExtInt datatype as a bit-precise integer type,
which was then proposed to WG14. WG14 has accepted the proposal
(http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2709.pdf), but Clang
requires some additional work as a result.
In the original Clang implementation, we elected to disallow implicit
conversions involving these types until after WG14 finalized the rules.
This patch implements the rules decided by WG14: no integer promotion
for bit-precise types, conversions prefer the larger of the two types
and in the event of a tie (say _ExtInt(32) and a 32-bit int), the
standard type wins.
There are more changes still needed to conform to N2709, but those will
be handled in follow-up patches.
Change the ffp-model=precise to enables -ffp-contract=on (previously
-ffp-model=precise enabled -ffp-contract=fast). This is a follow-up
to Andy Kaylor's comments in the llvm-dev discussion "Floating Point
semantic modes". From the same email thread, I put Andy's distillation
of floating point options and floating point modes into UsersManual.rst
Also fixes bugs.llvm.org/show_bug.cgi?id=50222
Reviewed By: rjmccall, andrew.kaylor
Differential Revision: https://reviews.llvm.org/D74436
According to https://godbolt.org/z/q5rME1naY and acle, we found that
there are different SVE conversion behaviours between clang and gcc. It turns
out that llvm does not handle SVE predicates width properly.
This patch 1) checks SVE predicates width rightly with svbool_t type.
2) removes warning on svbool_t VLST <-> VLAT/GNUT conversion.
3) disables VLST <-> VLAT/GNUT conversion between SVE vectors and predicates
due to different width.
Differential Revision: https://reviews.llvm.org/D106333
This patch changes `__kmpc_free_shared` to take an additional argument
corresponding to the associated allocation's size. This makes it easier to
implement the allocator in the runtime.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D106496
These builtins were added to capture the fact that the underlying Wasm
instructions return i32s and implicitly sign or zero extend the extracted lanes
in the case of the i8x16 and i16x8 variants. But we do sufficient optimizations
during code gen that these low-level details do not need to be exposed to users.
This commit replaces the use of the builtins in wasm_simd128.h with normal
target-independent vector code. As a result, we can switch the relevant
intrinsics to use functions rather than macros and can use more user-friendly
return types rather than trying to precisely expose the underlying Wasm types.
Note, however, that the generated LLVM IR is no different after this change.
Differential Revision: https://reviews.llvm.org/D106500
Taking the address of a reference parameter might be valid, and without
CFA, false positives are going to be more trouble than they're worth.
Differential Revision: https://reviews.llvm.org/D102728
This commit adds supports for clang to remap macOS availability attributes that have introduced,
deprecated or obsoleted versions to appropriate Mac Catalyst availability attributes. This
mapping is done using the version mapping provided in the macOS SDK, in the SDKSettings.json file.
The mappings in the SDKSettings json file will also be used in the clang driver for the driver
Mac Catalyst patch, and they could also be used in the future for other platforms as well.
Differential Revision: https://reviews.llvm.org/D105257
Replace the experimental clang builtins and LLVM intrinsics for these
instructions with normal instruction selection patterns. The wasm_simd128.h
intrinsics header was already using portable code for the corresponding
intrinsics, so now it produces the correct instructions.
Differential Revision: https://reviews.llvm.org/D106400
With this patch, OpenMP on AMDGCN will use the math functions
provided by ROCm ocml library. Linking device code to the ocml will be
done in the next patch.
Reviewed By: JonChesterfield, jdoerfert, scchan
Differential Revision: https://reviews.llvm.org/D104904
This patch is in a series of patches to provide
builtins for compatibility with the XL compiler.
This patch adds builtins related to floating point
operations
Reviewed By: #powerpc, nemanjai, amyk, NeHuang
Differential Revision: https://reviews.llvm.org/D103986
This patch:
- Fixes how the std-namespace test is written in SmartPtrModelling
(now accounts for functions with no Decl available)
- Adds the smart pointer checker flag check where it was missing
Differential Revision: https://reviews.llvm.org/D106296
Add the builtins defined by Section 40 "Extended Bit Operations" in
the OpenCL Extension Specification.
Differential Revision: https://reviews.llvm.org/D106267
The checker warns if a stream is read that is already in end-of-file
(EOF) state.
The commit adds indication of the last location where the EOF flag is set
on the stream.
Reviewed By: Szelethus
Differential Revision: https://reviews.llvm.org/D104925
I missed to add half-precision FP types for vle16/vse16 in the previous
patches. Added them in this patch.
Differential Revision: https://reviews.llvm.org/D106340
Implemented builtins for mtmsr, mfspr, mtspr on PowerPC;
the patch is intended for XL Compatibility.
Differential revision: https://reviews.llvm.org/D106130
When disabling simpler implicit moves in MSVC compatibility mode as
a workaround in D105518, we forgot to make the opposite change and
enable regular (P1825) implicit moves in the same mode.
As a result, we were not doing any implicit moves at all. OOPS!
This fixes it and adds test for this.
This is a fix to a temporary workaround, there is ongoing
work to replace this, applying the workaround only to
system headers and the ::stl namespace.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D106303
This patch implements store, load, move from and to registers related
builtins, as well as the builtin for stfiw. The patch aims to provide
feature parady with xlC on AIX.
Differential revision: https://reviews.llvm.org/D105946
The Intel compiler ICC supports the option "-fp-model=(source|double|extended)"
which causes the compiler to use a wider type for intermediate floating point
calculations. Also supported is a way to embed this effect in the source
program with #pragma float_control(source|double|extended).
This patch extends pragma float_control syntax, and also adds support
for a new floating point option "-ffp-eval-method=(source|double|extended)".
source: intermediate results use source precision
double: intermediate results use double precision
extended: intermediate results use extended precision
Reviewed By: Aaron Ballman
Differential Revision: https://reviews.llvm.org/D93769
This commit adds support for Mac Catalyst availability attribute, as
supported by the Apple clang compiler. A follow-up commit will provide
additional support for inferring Mac Catalyst availability from macOS
availability using the mapping in the SDKSettings.json.
Differential Revision: https://reviews.llvm.org/D105052
Whenever -fmodule-name=top_level_module name is parsed, and clang actually tries to
import top_level_module, the headers are imported textually and the module isn't actually
built. However, the dependency scanner could still record it as a potential dependency
if the module was reimported and thus recorded by the preprocessor callbacks.
This change avoids collecting this kind of module as a dependency by verifying that we don't
collect top level modules without actual PCM files.
Differential Revision: https://reviews.llvm.org/D106100
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch add the builtin and emit target independent
code for __cmpb.
Reviewed By: nemanjai, #powerpc
Differential revision: https://reviews.llvm.org/D105194
This patch is in a series of patches to provide builtins for compatibility with the XL compiler.
This patch adds semachecking for an already implemented builtin, `__icbt`. `__icbt` is only
valid for Power8 and up.
Reviewed By: #powerpc, nemanjai
Differential Revision: https://reviews.llvm.org/D105834
For example, in OpenMP offload codegen tests, global variables like
`.offload_maptypes*` are much easier to read in hex.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D104743
`--check-globals` activates checks for all global values, and
`--global-value-regex` filters them. For example, I'd like to use it
in OpenMP offload codegen tests to check only global variables like
`.offload_maptypes*`.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D104742
This patch fixes `__builtin_ppc_recipdivf`, `__builtin_ppc_recipdivd`,
`__builtin_ppc_rsqrtf`, and `__builtin_ppc_rsqrtd`. FastMathFlags are
set to fast immediately before emitting these builtins. Now the flags
are restored to their previous values after the builtins are emitted.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D105984
Summary:
Test and produce warning for subtracting a pointer from null or subtracting
null from a pointer.
This reland adds the functionality that the warning is no longer reusing an
existing warning, it has different wording for C vs C++ to refect the fact
that nullptr-nullptr has defined behaviour in C++, it is suppressed
when the warning is triggered by a system header and adds
-Wnull-pointer-subtraction to allow the warning to be controlled. -Wextra
implies -Wnull-pointer-subtraction.
Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: efriedma (Eli Friedman), nickdesaulniers (Nick Desaulniers)
Differential Revision: https://reviews.llvm.org/D98798
Added a number of different builtins that exist in the XL compiler. Most of
these builtins already exist in clang under a different name.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D104386
This patch avoid minimizing input files that contributed to a PCH or its modules. This prevents the implicit modular build to fail on unexpected file size. Depends on D106146.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D104536
This patch is in a series of patches to provide builtins for
compatibility with the XL compiler. This patch adds software divide
builtins with no checking. These builtins are each emitted as a fast
fdiv.
Reviewed By: #powerpc, nemanjai
Differential Revision: https://reviews.llvm.org/D106150
This diff changes llvm-ifs to use unified IFS file format
and perform other renaming changes in preparation for the
merging between elfabi/ifs.
Differential Revision: https://reviews.llvm.org/D99810
This patch adds the `-faltivec-src-compat=mixed` option to the
`builtins-ppc-altivec.c` test.
Currently, the default for `-faltivec-src-compat` is `mixed`. The reason we
explicitly specify `mixed` to the RUN lines of this test is because eventually,
the default will set to `xl`.
Having the default as `xl` changes the CHECKs of this test slightly, as it
reorders some of the `vector bool` and `vector pixel` CHECKs (since under the
`xl` option, `vector bool` and `vector pixel` are treated in the same way as
other vector scalars). Explicitly specifying `mixed` ensures that we are testing
pre-existing Clang behaviour.
Differential Revision: https://reviews.llvm.org/D106282
Use _Float16 as the half-precision floating point type. Define a new
type specifier 'x' for the _Float16 type.
Differential Revision: https://reviews.llvm.org/D105001
This patch implements the initialization of vectors under the
-faltivec-src-compat=xl option introduced in https://reviews.llvm.org/D103615.
Under this option, the initialization of scalar vectors, vector bool, and vector
pixel are treated the same, where the initialization value is splatted across
the whole vector.
This patch does not change the behaviour of the -faltivec-src-compat=mixed option,
which is the current default for Clang.
Differential Revision: https://reviews.llvm.org/D106120
Summary:
The AIX linker will produce errors on unresolved weak symbols. Change the
generated code to not check for the initialization function but just call
it and ensure that it always exists. Also, the AIX atexit routine has a
different name (and signature) so call it correctly. Update the lit tests
to test on AIX appropriately.
Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: hubert.reinterpretcast (Hubert Tong)
Differential Revision: https://reviews.llvm.org/D104420
This patch handles the `std::swap` function specialization
for `std::unique_ptr`. Implemented to be very similar to
how `swap` method is handled
Differential Revision: https://reviews.llvm.org/D104300
It's noteworthy that GCC has the same bug here, which is a bit
surprising. Both Clang and GCC's bug is only for function template
arguments that are themselves templates with default template arguments
(f1<t1<int[, missing_default_here]>>). Probably because function name
matching isn't generally necessary - whereas type matching is necessary
for DWARF consumers to associate declarations and definitions across
translation units, so the bug's been addressed there already - but
continued to exist for function templates since it's fairly benign
there.
I came across this while working on a change that could reconstitute
these pretty printed names based on the rest of the DWARF, reducing the
size of the DWARF by not having to encode all the template parameters in
the name string. That reconstitution code can't tell the difference
between a defaulted argument or not, so couldn't create the current
buggy-ish output.
Making the names more consistent between direct and indirect references,
and between function and class templates seems all to the good.
(I fixed the function template version of this a few years back in
9fdd09a4cc - clearly I should've looked
more closely and generalized the code better so it only had to be fixed
once - well, doing that here now)
Use the elementtype attribute introduced in D105407 for the
llvm.preserve.array/struct.index intrinsics. It carries the
element type of the GEP these intrinsics effectively encode.
This patch:
* Adds a verifier check that the attribute is required.
* Adds it in the IRBuilder methods for these intrinsics.
* Autoupgrades old bitcode without the attribute.
* Updates the lowering code to use the attribute rather than
the pointer element type.
* Updates lots of tests to specify the attribute.
* Adds -force-opaque-pointers to the intrinsic-array.ll test
to demonstrate they work now.
https://reviews.llvm.org/D106184
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D102107
Turning on -funique-internal-linkage-names when -fpseudo-probe-for-profiling is on, unless -fno-unique-internal-linkage-names is specified.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D106193
This provides intrinsics for emitting instructions that set the FPSCR (`mtfsf/mtfsfi`).
The patch also conservatively marks the rounding mode as an implicit def for both since they both may set the rounding mode depending on the operands.
Reviewed By: #powerpc, qiucf
Differential Revision: https://reviews.llvm.org/D105957
Implement a subset of builtins required for compatiblilty with AIX XL compiler.
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D105930
This patch adds unique idenfitiers to the existing OpenMP remarks. This makes
it easier to identify the corresponding documentation for each remark that will
be hosted in the OpenMP webpage.
Depends on D105898
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D105939
This patch rewrites and reworks a few of the existing remarks to make the mmore
concise and consistent prior to writing the documentation for them.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D105898
On Power PC some legacy compilers included a number of builtins in a
builtins.h header file. While this header file is not required to hold
builtins for clang some legacy code does try to include this file and so
this patch provides an empty version of that file.
Differential Revision: https://reviews.llvm.org/D106065
This change addresses this assertion that occurs in a downstream
compiler with a custom target.
```APInt.h:1151: bool llvm::APInt::operator==(const llvm::APInt &) const: Assertion `BitWidth == RHS.BitWidth && "Comparison requires equal bit widths"'```
No covering test case is susbmitted with this change since this crash
cannot be reproduced using any upstream supported target. The test case
that exposes this issue is as simple as:
```lang=c++
void test(int * p) {
int * q = p-1;
if (q) {}
if (q) {} // crash
(void)q;
}
```
The custom target that exposes this problem supports two address spaces,
16-bit `char`s, and a `_Bool` type that maps to 16-bits. There are no upstream
supported targets with similar attributes.
The assertion appears to be happening as a result of evaluating the
`SymIntExpr` `(reg_$0<int * p>) != 0U` in `VisitSymIntExpr` located in
`SimpleSValBuilder.cpp`. The `LHS` is evaluated to `32b` and the `RHS` is
evaluated to `16b`. This eventually leads to the assertion in `APInt.h`.
While this change addresses the crash and passes LITs, two follow-ups
are required:
1) The remainder of `getZeroWithPtrWidth()` and `getIntWithPtrWidth()`
should be cleaned up following this model to prevent future
confusion.
2) We're not sure why references are found along with the modified
code path, that should not be the case. A more principled
fix may be found after some further comprehension of why this
is the case.
Acks: Thanks to @steakhal and @martong for the discussions leading to this
fix.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D105974
This patch handles the `<<` operator defined for `std::unique_ptr` in
the std namespace (ignores custom overloads of the operator).
Differential Revision: https://reviews.llvm.org/D105421
This patch handles all the comparision methods (defined via overloaded
operators) on std::unique_ptr. These operators compare the underlying
pointers, which is modelled by comparing the corresponding inner-pointer
SVal. There is also a special case for comparing the same pointer.
Differential Revision: https://reviews.llvm.org/D104616
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch adds the builtins and instrisics for population
count, reversed load and store related operations.
Reviewed By: nemanjai, #powerpc
Differential revision: https://reviews.llvm.org/D106021
x86_64-linux-gnu and x86_64-linux-gnux32 use different ABIs and objects
built for one cannot be used for the other. In order to build and use
compiler-rt for x32, we need to treat x32 as a new arch there. This
updates the driver to search using the new arch name.
Reviewed By: glaubitz
Differential Revision: https://reviews.llvm.org/D100148
This patch implements the `__popcntb` XL compatibility builtin for 32bit in the frontend and backend. This patch also updates tests for `__popcntb` and other XL Compat sync related builtins.
Reviewed By: #powerpc, nemanjai, amyk
Differential Revision: https://reviews.llvm.org/D105360
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch adds the builtins and emit target independent
code for rotate related operations.
Reviewed By: nemanjai, #powerpc
Differential revision: https://reviews.llvm.org/D104744
LLVM changed to not emit L... labels for things marked "do_not_dead_strip"
because the linker can sometimes drop the flag if there's no proper symbol.
This Clang test checked for the old behaviour, but doesn't actually care about
that bit.
If the instantiation of a member variable makes it possible to
compute a previously undeduced type, we should use that piece of
information.
Fix bug#50590
Differential Revision: https://reviews.llvm.org/D103849
This patch make coroutine passes run by default in LLVM pipeline. Now
the clang and opt could handle IR inputs containing coroutine intrinsics
without special options.
It should be fine. On the one hand, the coroutine passes seems to be stable
since there are already many projects using coroutine feature.
On the other hand, the coroutine passes should do nothing for IR who doesn't
contain coroutine intrinsic.
Test Plan: check-llvm
Reviewed by: lxfind, aeubanks
Differential Revision: https://reviews.llvm.org/D105877
Replace the experimental clang builtins and LLVM intrinsics for these
instructions with normal codegen patterns. Resolves PR50435.
Differential Revision: https://reviews.llvm.org/D106019
Summary This option can be used to reduce the size of the
binary. The trade-off in this case would be the run-time
performance.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D105726
Replace the experimental clang builtin and LLVM intrinsics for these
instructions with normal codegen patterns. Resolves PR50433.
Differential Revision: https://reviews.llvm.org/D105950
The anonymous and non-anonymous bit-field diagnostics are easily
combined into one diagnostic. However, the diagnostic was missing a
"the" that is present in the almost-identically worded
warn_bitfield_width_exceeds_type_width diagnostic, hence the changes to
test cases.
`-u` is a linker option used to pretend a symbol is undefined,
this option are common used for forcing archive member extraction.
This option should pass to `ld`, and many other toolchain in Clang
like `tools::gnutools` has pass that too.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D105091
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch adds the builtins and instrisics for compare
and multiply related operations.
Reviewed By: nemanjai, #powerpc
Differential revision: https://reviews.llvm.org/D102875
[NFC] This patch adds features for pwr7, pwr8, and pwr9 that can be
used for semachecking builtin functions that are only valid for certain
versions of ppc.
Reviewed By: nemanjai, #powerpc
Authored By: Quinn Pham <Quinn.Pham@ibm.com>
Differential revision: https://reviews.llvm.org/D105501
Otherwise, if someone specifies a valid AMD arch, we may end up triggering an
assertion on unexpected arch later on.
Differential Revision: https://reviews.llvm.org/D105295
This patch simplifies the way we deal with (dis)equalities.
Due to the symmetry between constraint handler and range inferrer,
we can have very similar implementations of logic handling
questions about (dis)equality and assumptions involving (dis)equality.
It also helps us to remove one more visitor, and removes uncertainty
that we got all the right places to put `trackNE` and `trackEQ`.
Differential Revision: https://reviews.llvm.org/D105693
After taking C++98 implicit moves out in D104500,
we put it back in, but now in a new form which preserves
compatibility with pure C++98 programs, while at the same time
giving almost all the goodies from P1825.
* We use the exact same rules as C++20 with regards to which
id-expressions are move eligible. The previous
incarnation would only benefit from the proper subset which is
copy ellidable. This means we can implicit move, in addition:
* Parameters.
* RValue references.
* Exception variables.
* Variables with higher-than-natural required alignment.
* Objects with different type from the function return type.
* We preserve the two-overload resolution, with one small tweak to the
first one: If we either pick a (possibly converting) constructor which
does not take an rvalue reference, or a user conversion operator which
is not ref-qualified, we abort into the second overload resolution.
This gives C++98 almost all the implicit move patterns which we had created test
cases for, while at the same time preserving the meaning of these
three patterns, which are found in pure C++98 programs:
* Classes with both const and non-const copy constructors, but no move
constructors, continue to have their non-const copy constructor
selected.
* We continue to reject as ambiguous the following pattern:
```
struct A { A(B &); };
struct B { operator A(); };
A foo(B x) { return x; }
```
* We continue to pick the copy constructor in the following pattern:
```
class AutoPtrRef { };
struct AutoPtr {
AutoPtr(AutoPtr &);
AutoPtr();
AutoPtr(AutoPtrRef);
operator AutoPtrRef();
};
AutoPtr test_auto_ptr() {
AutoPtr p;
return p;
}
```
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: Quuxplusone
Differential Revision: https://reviews.llvm.org/D105756
Similar to D46745, "S" represents an absolute symbolic operand, which
can be used to specify the access models, e.g.
extern int var;
void *addr_via_asm() {
void *ret;
asm("lui %0, %%hi(%1)\naddi %0,%0,%%lo(%1)" : "=r"(ret) : "S"(&var));
return ret;
}
'S' is documented in trunk GCC: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101275
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D105254
LDARX and LWARX sometimes gets optimized out by the compiler
when it is critical to the correctness of the code. This inline asm generation
ensures that it preserved.
Differential Revision: https://reviews.llvm.org/D105754
Properties that were declared `@property(copy, nonatomic) id foo` make an
unnecessary call to objc_get_property(). This call can be replaced with a
direct access to the backing variable identical to how a `@property(nonatomic)
id foo` would do it.
This reduces codegen by 4 bytes (x86_64/arm64) and removes a cross linkage unit
function call per property declared as copy/nonatomic.
Differential Revision: https://reviews.llvm.org/D105311
This feature requires support of __opencl_c_images, so diagnostics for that is provided as well
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D104915
Summary: This patch is a part of an attempt to obtain more
timer data from the analyzer. In this patch, we try to use
LLVM::TimeRecord to save time before starting the analysis
and to print the time that a specific function takes while
getting analyzed.
The timer data is printed along with the
-analyzer-display-progress outputs.
ANALYZE (Syntax): test.c functionName : 0.4 ms
ANALYZE (Path, Inline_Regular): test.c functionName : 2.6 ms
Authored By: RithikSharma
Reviewer: NoQ, xazax.hun, teemperor, vsavchenko
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D105565
While GNU as only allows the directory form of the .file directive for DWARF v5,
the integrated assembler prefers the directory form on all DWARF versions
(-fdwarf-directory-asm).
We currently set CC1 -fno-dwarf-directory-asm for -fno-integrated-as -gdwarf-5
which may cause the directory entry 0 and the filename entry 0 to be incorrect
(see D105662 and the example below). This patch makes -fno-integrated-as -gdwarf-5 use
-fdwarf-directory-asm as well.
```
cd /tmp/c
before
% clang -g -gdwarf-5 -fno-integrated-as e/a.c -S -o - | grep '\.file.*0'
.file 0 "/tmp/c/e/a.c" md5 0x97e31cee64b4e58a4af8787512d735b6
% clang -g -gdwarf-5 -fno-integrated-as e/a.c -c
% llvm-dwarfdump a.o | grep include_directories
include_directories[ 0] = "/tmp/c/e"
after
% clang -g -gdwarf-5 -fno-integrated-as e/a.c -S -o - | grep '\.file.*0'
.file 0 "/tmp/c" "e/a.c" md5 0x97e31cee64b4e58a4af8787512d735b6
% clang -g -gdwarf-5 -fno-integrated-as e/a.c -c
% llvm-dwarfdump a.o | grep include_directories
include_directories[ 0] = "/tmp/c"
```
Reviewed By: #debug-info, dblaikie, osandov
Differential Revision: https://reviews.llvm.org/D105835
On AIX when there is a pragma pack, or pragma align in effect then zero-width bitfields should pad out to the end of the bitfield container but not increase the alignment requirements of the struct greater then the max field align.
Reviewed By: ZarkoCA
Differential Revision: https://reviews.llvm.org/D105635
Replace the clang builtin function and LLVM intrinsic for
f32x4.demote_zero_f64x2 with combines from normal SDNodes. Also add missing
combines for i32x4.trunc_sat_zero_f64x2_{s,u}, which share the same pattern.
Differential Revision: https://reviews.llvm.org/D105755
This patch implements trap and FP to and from double conversions. The builtins
generate code that mirror what is generated from the XL compiler. Intrinsics
are named conventionally with builtin_ppc, but are aliased to provide the same
builtin names as the XL compiler.
Differential Revision: https://reviews.llvm.org/D103668
We are currently being inconsistent in using signed vs unsigned comparisons for
vec_all_* and vec_any_* interfaces that use vector bool types. For example we
use signed comparison for vec_all_ge(vector signed char, vector bool char) but
unsigned comparison for when the arguments are swapped. GCC and XL use signed
comparison instead. This patch makes clang consistent with itself and with XL
and GCC.
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D105666
Support Narrowing conversions to bool in if constexpr condition
under C++23 language mode.
Only if constexpr is implemented as the behavior of static_assert
is already conforming. Still need to work on explicit(bool) to
complete support.
The function is supposed to be the equivalent of rint() (as in
round to nearest, ties to even) rather than round() (round to
nearest, ties away from zero). In fact, the instruction we emit
without VSX is vrfin which is correct. However, with VSX we emit
xvrspi which is the equivalent of round() and therefore incorrect.
Since there is no equivalent VSX instruction, simply use vrfin
regardless of availability of VSX.
OpenMP 5.1 added support for writing OpenMP directives using [[]]
syntax in addition to using #pragma and this introduces support for the
new syntax.
In OpenMP, the attributes take one of two forms:
[[omp::directive(...)]] or [[omp::sequence(...)]]. A directive
attribute contains an OpenMP directive clause that is identical to the
analogous #pragma syntax. A sequence attribute can contain either
sequence or directive arguments and is used to ensure that the
attributes are processed sequentially for situations where the order of
the attributes matter (remember:
https://eel.is/c++draft/dcl.attr.grammar#4.sentence-4).
The approach taken here is somewhat novel and deserves mention. We
could refactor much of the OpenMP parsing logic to work for either
pragma annotation tokens or for attribute clauses. It would be a fair
amount of effort to share the logic for both, but it's certainly
doable. However, the semantic attribute system is not designed to
handle the arbitrarily complex arguments that OpenMP directives
contain. Adding support to thread the novel parsed information until we
can produce a semantic attribute would be considerably more effort.
What's more, existing OpenMP constructs are not (often) represented as
semantic attributes. So doing this through Attr.td would be a massive
undertaking that would likely only benefit OpenMP and comes with
additional risks. Rather than walk down that path, I am taking
advantage of the fact that the syntax of the directives within the
directive clause is identical to that of the #pragma form. Once the
parser recognizes that we're processing an OpenMP attribute, it caches
all of the directive argument tokens and then replays them as though
the user wrote a pragma. This reuses the same OpenMP parsing and
semantic logic directly, but does come with a risk if the OpenMP
committee decides to purposefully diverge their pragma and attribute
syntaxes. So, despite this being a novel approach that does token
replay, I think it's actually a better approach than trying to do this
through the declarative syntax in Attr.td.
A number of functions in the header have guards for 64-bit only
that were presumably added as some of the functions in the blocks
use vector __int128 which is only available in 64-bit mode.
A more appropriate guard (__SIZEOF_INT128__) has been added for
those functions since, making the 64-bit guards redundant.
This patch removes those guards as they inadvertently guard code
that uses vector long long which does not actually require 64-bit
mode.
The `-analyzer-display-progress` displayed the function name of the
currently analyzed function. It differs in C and C++. In C++, it
prints the argument types as well in a comma-separated list.
While in C, only the function name is displayed, without the brackets.
E.g.:
C++: foo(), foo(int, float)
C: foo
In crash traces, the analyzer dumps the location contexts, but the
string is not enough for `-analyze-function` in C++ mode.
This patch addresses the issue by dumping the proper function names
even in stack traces.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D105708
In the spirit of TRegions [0], this patch analyzes a kernel and tracks
if it can be executed in SPMD-mode. If so, we flip the arguments of
the __kmpc_target_init and deinit call to enable the mode. We also
update the `<kernel>_exec_mode` flag to indicate to the runtime we
changed the mode to SPMD.
The code analysis is done interprocedurally by extending the
AAKernelInfo abstract attribute to track SPMD compatibility as well.
[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11
Differential Revision: https://reviews.llvm.org/D102307
In the spirit of TRegions [0], this patch provides a simpler and uniform
interface for a kernel to set up the device runtime. The OMPIRBuilder is
used for reuse in Flang. A custom state machine will be generated in the
follow up patch.
The "surplus" threads of the "master warp" will not exit early anymore
so we need to use non-aligned barriers. The new runtime will not have an
extra warp but also require these non-aligned barriers.
[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11
This was in parts extracted from D59319.
Reviewed By: ABataev, JonChesterfield
Differential Revision: https://reviews.llvm.org/D101976
This reverts commit 3ec88ca60b which reverted e386871e1d due to a asan build
failure.
This patch removes the new lines in the test case which seem to introduce the
failure.
Differential revision: https://reviews.llvm.org/D104898
Replace the clang builtin function and LLVM intrinsic previously used to select
the f64x2.promote_low_f32x4 instruction with custom combines from standard
SelectionDAG nodes. Implement the new combines to share code with the similar
combines for f64x2.convert_low_i32x4_{s,u}. Resolves PR50232.
Differential Revision: https://reviews.llvm.org/D105675
Set the device malloc and free functions as weak,
and move the std headers after device malloc/free
to avoid issues with std malloc/free.
Fixes: SWDEV-293590
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D105707
If the base is used in a map clause and later we have a memberexpr with
this base, and the member is a pointer, and this pointer is dereferenced
anyhow (subscript, array section, dereference, etc.), such components
should be considered as overlapped, otherwise it may lead to incorrect
size computations, since we try to map a pointee as a part of the whole
struct, which is not true for the pointer members.
Differential Revision: https://reviews.llvm.org/D105562
Reapply with fixes for clang tests.
-----
This is a simple enum attribute. Test changes are because enum
attributes are sorted before type attributes, so mustprogress is
now in a different position.
This change is intended as initial setup. The plan is to add
more semantic checks later. I plan to update the documentation
as more semantic checks are added (instead of documenting the
details up front). Most of the code closely mirrors that for
the Swift calling convention. Three places are marked as
[FIXME: swiftasynccc]; those will be addressed once the
corresponding convention is introduced in LLVM.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D95561
This reverts commit 52aeacfbf5.
There isn't full agreement on a path forward yet, but there is agreement that
this shouldn't land as-is. See discussion on https://reviews.llvm.org/D105338
Also reverts unreviewed "[clang] Improve `-Wnull-dereference` diag to be more in-line with reality"
This reverts commit f4877c78c0.
And all the related changes to tests:
This reverts commit 9a0152799f.
This reverts commit 3f7c9cc274.
This reverts commit 329f8197ef.
This reverts commit aa9f58cc2c.
This reverts commit 2df37d5ddd.
This reverts commit a72a441812.