This matches the behavior of GCC.
Patch does not change remapping logic itself, so adding one simple smoke test should be enough.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D107393
For fixed SVE types, predicates are represented using vectors of i8,
where as for scalable types they are represented using vectors of i1. We
can avoid going through memory for casts between these by bitcasting the
i1 scalable vectors to/from a scalable i8 vector of matching size, which
can then use the existing vector insert/extract logic.
Differential Revision: https://reviews.llvm.org/D106860
Zero-width bitfields on AIX pad out to the natral alignment boundary but
do not change the containing records alignment.
Differential Revision: https://reviews.llvm.org/D106900
For some use-cases, it might be useful to be able to turn off modules for C++ in `-cc1`. (The feature is implied by `-std=C++20`.)
This patch exposes the `-fno-cxx-modules` option in `-cc1`.
Reviewed By: arphaman
Differential Revision: https://reviews.llvm.org/D106864
Clang has builtin function '__builtin_isnan', which implements C
library function 'isnan'. This function now is implemented entirely in
clang codegen, which expands the function into set of IR operations.
There are three mechanisms by which the expansion can be made.
* The most common mechanism is using an unordered comparison made by
instruction 'fcmp uno'. This simple solution is target-independent
and works well in most cases. It however is not suitable if floating
point exceptions are tracked. Corresponding IEEE 754 operation and C
function must never raise FP exception, even if the argument is a
signaling NaN. Compare instructions usually does not have such
property, they raise 'invalid' exception in such case. So this
mechanism is unsuitable when exception behavior is strict. In
particular it could result in unexpected trapping if argument is SNaN.
* Another solution was implemented in https://reviews.llvm.org/D95948.
It is used in the cases when raising FP exceptions by 'isnan' is not
allowed. This solution implements 'isnan' using integer operations.
It solves the problem of exceptions, but offers one solution for all
targets, however some can do the check in more efficient way.
* Solution implemented by https://reviews.llvm.org/D96568 introduced a
hook 'clang::TargetCodeGenInfo::testFPKind', which injects target
specific code into IR. Now only SystemZ implements this hook and it
generates a call to target specific intrinsic function.
Although these mechanisms allow to implement 'isnan' with enough
efficiency, expanding 'isnan' in clang has drawbacks:
* The operation 'isnan' is hidden behind generic integer operations or
target-specific intrinsics. It complicates analysis and can prevent
some optimizations.
* IR can be created by tools other than clang, in this case treatment
of 'isnan' has to be duplicated in that tool.
Another issue with the current implementation of 'isnan' comes from the
use of options '-ffast-math' or '-fno-honor-nans'. If such option is
specified, 'fcmp uno' may be optimized to 'false'. It is valid
optimization in general, but it results in 'isnan' always returning
'false'. For example, in some libc++ implementations the following code
returns 'false':
std::isnan(std::numeric_limits<float>::quiet_NaN())
The options '-ffast-math' and '-fno-honor-nans' imply that FP operation
operands are never NaNs. This assumption however should not be applied
to the functions that check FP number properties, including 'isnan'. If
such function returns expected result instead of actually making
checks, it becomes useless in many cases. The option '-ffast-math' is
often used for performance critical code, as it can speed up execution
by the expense of manual treatment of corner cases. If 'isnan' returns
assumed result, a user cannot use it in the manual treatment of NaNs
and has to invent replacements, like making the check using integer
operations. There is a discussion in https://reviews.llvm.org/D18513#387418,
which also expresses the opinion, that limitations imposed by
'-ffast-math' should be applied only to 'math' functions but not to
'tests'.
To overcome these drawbacks, this change introduces a new IR intrinsic
function 'llvm.isnan', which realizes the check as specified by IEEE-754
and C standards in target-agnostic way. During IR transformations it
does not undergo undesirable optimizations. It reaches instruction
selection, where is lowered in target-dependent way. The lowering can
vary depending on options like '-ffast-math' or '-ffp-model' so the
resulting code satisfies requested semantics.
Differential Revision: https://reviews.llvm.org/D104854
See PR48656.
The implementation of the template instantiation of requires expressions
was incorrectly trying to get the expression from an 'ExprRequirement'
before checking if it was an error state.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D107399
See PR47174.
When canonicalizing nested name specifiers of the type kind,
the prefix for 'DependentTemplateSpecialization' types was being
dropped, leading to malformed types which would cause failures
when rebuilding template names.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D107311
where should not.
Currently we are using QTy->isIncompleteType(&ND) to check incomplete
type. But before doing that, need to instantiate for a class template
specialization or a class member of a class template specialization,
or an array with known size of such..., so that we know it is really
incomplete type.
To fix this using RequireCompleteType instead.
The new test is added into "test/OpenMP/target_update_messages.cpp"
The different of using RequireCompleteType is when emit incomplete type,
an additional note is also emitted to point to where incomplete type
is declared. Because this change, many tests are needed to be fixed
by adding additional note.
This is to fix https://bugs.llvm.org/show_bug.cgi?id=50508
Differential Revision: https://reviews.llvm.org/D107200
https://reviews.llvm.org/D105964 updated the detection of function
definitions. It had the unfortunate effect to start marking object
definitions with attribute-like macros as function definitions.
This addresses this issue.
Reviewed By: owenpan
Differential Revision: https://reviews.llvm.org/D107269
This patch implements P2092
Simple requirements in requirement body shall not start with requires.
A warning was already in place so we just turn this warning into an error.
In addition, we add tests to make sure typename is optional in
requirement-parameter-list as per the same paper.
Previously we would show an error, but keep the member, and also the
CXXRrecordDecl, valid. This could lead to crashes when attempting to
access the record layout or size.
Differential Revision: https://reviews.llvm.org/D105478
The patch https://reviews.llvm.org/D105964 (58494c856a)
updated detection of function declaration names. It had the unfortunate
consequence that it started breaking between `function` and the function
name in some cases in JavaScript code.
This patch addresses this.
Reviewed By: MyDeveloperDay, owenpan
Differential Revision: https://reviews.llvm.org/D107267
The declaration for the global new function in C++ is generated in the compiler front-end. When examining exception propagation, we found that this is the largest root throw site propagator requiring unwind code to be generated for callers up the stack. Allowing this to be handled immediately with termination stops upward propagation and leads to significantly less landing pads generated. This in turns leads to a performance and .text size win.
With `-fnew-infallible` this annotates the declaration with `throw()` and `__attribute__((returns_nonnull))`. `throw()` allows the compiler to assume exceptions do not propagate out of new and eliminate it as a root throw site. Note that the definition of global new is user-replaceable so users should ensure that the one used follows these semantics.
Measuring internally, we're seeing at 0.5% CPU win in one of our large internal FB workload. Measuring on clang self-build (cd0a1226b5) we get:
thinlto/
"dwarfehprepare.NumCleanupLandingPadsRemaining": 153494,
"dwarfehprepare.NumNoUnwind": 26309,
thinlto_newinfallible/
"dwarfehprepare.NumCleanupLandingPadsRemaining": 143660,
"dwarfehprepare.NumNoUnwind": 28744,
a 1-143660/153494 = 6.4% reduction in landing pads and a 28744/26309 = 9.3% increase in the number of nounwind functions.
Testing:
ninja check-all
new test case to make sure these attributes are added correctly to global new.
Reviewed By: urnathan
Differential Revision: https://reviews.llvm.org/D105225
The new -mtargetos= option is a replacement for the existing, OS-specific options
like -miphoneos-version-min=. This allows us to introduce support for new darwin OSes
easier as they won't require the use of a new option. The older options will be
deprecated and the use of the new option will be encouraged instead.
Differential Revision: https://reviews.llvm.org/D106316
In some cases, when the execution path of the diagnostic
goes back and forth, arrows can overlap and create a mess.
Dimming arrows that are not relevant at the moment, solves this issue.
They are still visible, but don't draw too much attention.
Differential Revision: https://reviews.llvm.org/D92928
This commit adds a very first version of this feature.
It is off by default and has to be turned on by checking the
corresponding box. For this reason, HTML reports still keep
control notes (aka grey bubbles).
Further on, we plan on attaching arrows to events and having all arrows
not related to a currently selected event barely visible. This will
help with reports where control flow goes back and forth (eg in loops).
Right now, it can get pretty crammed with all the arrows.
Differential Revision: https://reviews.llvm.org/D92639
With this patch, OpenMP on AMDGCN will use the math functions
provided by ROCm ocml library. Linking device code to the ocml will be
done in the next patch.
Reviewed By: JonChesterfield, jdoerfert, scchan
Differential Revision: https://reviews.llvm.org/D104904
For some reason, Microsoft declares _m_prefetch to take a const void*,
but _m_prefetchw to take a /volatile/ const void*.
Do the same for compatibility.
Differential revision: https://reviews.llvm.org/D106790
Definition of `__cpp_threadsafe_static_init` macro is controlled by
language option Opts.ThreadsafeStatics. This patch sets language
option to false by default in OpenCL mode, resulting in macro
`__cpp_threadsafe_static_init` being undefined. Default value can be
overridden using command line option -fthreadsafe-statics.
Change is supposed to address portability because not all OpenCL
vendors support thread safe implementation of static initialization.
Fixes llvm.org/PR48012
Differential Revision: https://reviews.llvm.org/D107163
The -fms-extensions converts __pragma (and _Pragma) into a #pragma that
has to occur at the beginning of a line and end with a newline. This
patch ensures that the newline after the #pragma is added even if
Token::isAtStartOfLine() indicated that we should not start a newline.
Committing relying post-commit review since the change is small, some
downstream uses might be blocked without this fix, and to make clear the
decision of the new -fminimize-whitespace feature (fix on main, revert
on clang-13.x branch) suggested by @aaron.ballman in D104601.
Differential Revision: https://reviews.llvm.org/D107183
In LLVM IR terms the ACLE type 'data512_t' is essentially an aggregate
type { [8 x i64] }. When emitting code for inline assembly operands,
clang tries to scalarize aggregate types to an integer of the equivalent
length, otherwise it passes them by-reference. This patch adds a target
hook to tell whether a given inline assembly operand is scalarizable
so that clang can emit code to pass/return it by-value.
Differential Revision: https://reviews.llvm.org/D94098
Rename the current -E option to "-E -Xflang -fno-reformat".
Add a new Parsing::EmitPreprocessedSource() routine to convert the
cooked character stream output of the prescanner back to something
more closely resembling output from a traditional preprocessor;
call this new routine when -E appears.
The new -E output is suitable for use as fixed form Fortran source to
compilation by (one hopes) any Fortran compiler. If the original
top-level source file had been free form source, the output will be
suitable for use as free form source as well; otherwise there may be
diagnostics about missing spaces if they were indeed absent in the
original fixed form source.
Unless the -P option appears, #line directives are interspersed
with the output (but be advised, f18 will ignore these if presented
with them in a later compilation).
An effort has been made to preserve original alphabetic character case
and source indentation.
Add -P and -fno-reformat to the new drivers.
Tweak test options to avoid confusion with prior -E output; use
-fno-reformat where needed, but prefer to keep -E, sometimes
in concert with -P, on most, updating expected results accordingly.
Differential Revision: https://reviews.llvm.org/D106727
The builtins vec_xl_len_r and vec_xst_len_r actually use the
wrong side of the vector on big endian Power9 systems. We never
spotted this before because there was no such thing as a big
endian distro that supported Power9. Now we have AIX and the
elements are in the wrong part of the vector. This just fixes
it so the elements are loaded to and stored from the right
side of the vector.
This patch will re-enable the patch posted under https://reviews.llvm.org/D106688 originally which was reverted due to buildbreak that was caused by mismatched diagnostic message arguments.
Reviewed By: Zarko Todorovski
Differential Revision: https://reviews.llvm.org/D107105
'pipe' keyword is introduced in OpenCL C 2.0: so do checks for OpenCL C version while
parsing and then later on check for language options to construct actual pipe. This feature
requires support of __opencl_c_generic_address_space, so diagnostics for that is provided as well.
This is the same patch as in D106748 but with a tiny fix in checking of diagnostic messages.
Also added tests when program scope global variables are not supported.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D107154
With this patch, OpenMP on AMDGCN will use the math functions
provided by ROCm ocml library. Linking device code to the ocml will be
done in the next patch.
Reviewed By: JonChesterfield, jdoerfert, scchan
Differential Revision: https://reviews.llvm.org/D104904
Under the -faltivec-src-compat=gcc option, AltiVec vector initialization should
be treated as if they were compiled with gcc - which is, to emit an error when
the vectors are initialized in the parenthesized or non-parenthesized manner.
This patch implements this behaviour.
Differential Revision: https://reviews.llvm.org/D106410
Math libraries are linked only when -lm is specified. This is because
host system could be missing rocm-device-libs.
Reviewed By: JonChesterfield, yaxunl
Differential Revision: https://reviews.llvm.org/D105981
Renamed language standard from openclcpp to openclcpp10.
Added new std values i.e. '-cl-std=clc++1.0' and
'-cl-std=CLC++1.0'.
Patch by Topotuna (Justas Janickas)!
Differential Revision: https://reviews.llvm.org/D106266
Pulled out the OptimizationLevel class from PassBuilder in order to be able to access it from within the PassManager and avoid include conflicts.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D107025
CL 2.0 introduced atomics and generic address space so there were
only one set of APIs for doing atomics, however since CL 3.0
makes generic address space optional, there has to be new sets
of atomic interfaces to handle that cases.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D106778
'pipe' keyword is introduced in OpenCL C 2.0: so do checks for OpenCL C version while
parsing and then later on check for language options to construct actual pipe. This feature
requires support of __opencl_c_generic_address_space, so diagnostics for that is provided as well.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D106748
This feature requires support of __opencl_c_images, so diagnostics for that is provided as well.
Also, ensure that cl_khr_3d_image_writes feature macro is set to the same value.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D106260
Parse the -b option in the driver and pass it to the linker if the target OS is AIX. This will establish compatibility with the other AIX compilers.
Reviewed By: Zarko Todorovski
Differential Revision: https://reviews.llvm.org/D106688
This patch adds `#pragma clang deprecated` to enable deprecation of
preprocessor macros.
The macro must be defined before `#pragma clang deprecated`. When
deprecating a macro a custom message may be optionally provided.
Warnings are emitted at the use site of a deprecated macro, and can be
controlled via the `-Wdeprecated` warning group.
This patch takes some rough inspiration and a few lines of code from
https://reviews.llvm.org/D67935.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D106732
@kpn pointed out that the global variable initialization functions didn't
have the "strictfp" metadata set correctly, and @rjmccall said that there
was buggy code in SetFPModel and StartFunction, this patch is to solve
those problems. When Sema creates a FunctionDecl, it sets the
FunctionDeclBits.UsesFPIntrin to "true" if the lexical FP settings
(i.e. a combination of command line options and #pragma float_control
settings) correspond to ConstrainedFP mode. That bit is used when CodeGen
starts codegen for a llvm function, and it translates into the
"strictfp" function attribute. See bugs.llvm.org/show_bug.cgi?id=44571
Reviewed By: Aaron Ballman
Differential Revision: https://reviews.llvm.org/D102343