This reverts commit a544710cd4.
See discussion in D120540.
This breaks C++ Clang modules on Darwin and also more than a dozen
tests in the LLDB testsuite. I think we need to be more careful to
separate out the enabling of Clang C++ modules and C++20
modules. Either by having -fmodules-ts control the HaveModules flag,
or by adding a way to explicitly turn them off.
This adds tests checking the behavior of const variables declared with
weak attribute.
Both checking that they can not be used in places where a constant
expression is required and that a dynamic initializer is emitted when
used as an initializer expression.
Differential Revision: https://reviews.llvm.org/D126578
`isExternCContext()` is returning false for functions in C files
Reviewed By: rnk, aaron.ballman
Differential Revision: https://reviews.llvm.org/D126559
"std::has_unique_object_representations<_BitInt(N)>" was always true,
even if the type has padding bits (since the trait assumes all integer
types have no padding bits). The standard has an explicit note that
this should not hold for types with padding bits.
Differential Revision: https://reviews.llvm.org/D125802
Compiler-rt doesn't provide support file for cfi on s390x ad ppc64le (at least).
When trying to use the flag, we get a file error.
This is an attempt at making the error more explicit.
Differential Revision: https://reviews.llvm.org/D120484
Make the SimpleSValBuilder to be able to look up and use a constraint
for an operand of a SymbolCast, when the operand is constrained to a
const value.
This part of the SValBuilder is responsible for constant folding. We
need this constant folding, so the engine can work with less symbols,
this way it can be more efficient. Whenever a symbol is constrained with
a constant then we substitute the symbol with the corresponding integer.
If a symbol is constrained with a range, then the symbol is kept and we
fall-back to use the range based constraint manager, which is not that
efficient. This patch is the natural extension of the existing constant
folding machinery with the support of SymbolCast symbols.
Differential Revision: https://reviews.llvm.org/D126481
For some reason, we implemented the xx_stxvp intrinsics
to require a const pointer. This absolutely doesn't make
sense for a store. Remove the const from the definition.
clang by default assumes static library name to be xxx.lib
when -lxxx is specified on Windows with MSVC environment,
instead of libxxx.a.
This patch fixes static device library unbundling for that.
It falls back to libxxx.a if xxx.lib is not found.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D126681
Create dxc_D as alias to option D which Define <macro> to <value> (or 1 if <value> omitted).
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D125338
Vector types in hlsl is using clang ext_vector_type.
Declaration of vector types is in builtin header hlsl.h.
hlsl.h will be included by default for hlsl shader.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D125052
This is to avoid err_target_unknown_abi which is caused by use default TargetTriple instead of shader model target triple.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D125585
-Wgnu-statement-expression currently warns for both direct source uses of statement expressions but also macro expansions; since they may be used by macros to avoid multiple evaluation of macro arguments, engineers might want to suppress warnings when statement expressions are expanded from macros but see them if introduced directly in source code.
Differential Revision: https://reviews.llvm.org/D126522
`-gen-reproducer` causes crash reproduction to be emitted
even when clang didn't crash, and now can optionally take an
argument of never, on-crash (default), on-error and always.
Differential revision: https://reviews.llvm.org/D120201
The new tests in clang/test/Driver/modules.cpp added by D120540 will fail if the
toolchain getting tested doesn't support linking. This reduces the utility of
the test since we would like a failure of this test to reflect a problem with
modules. We should already have other tests that validate linking support.
Reviewed By: ChuanqiXu
Differential Revision: https://reviews.llvm.org/D126669
Vector types in hlsl is using clang ext_vector_type.
Declaration of vector types is in builtin header hlsl.h.
hlsl.h will be included by default for hlsl shader.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D125052
Without this patch, arguments to the
`llvm::OpenMPIRBuilder::AtomicOpValue` initializer are reversed.
Reviewed By: ABataev, tianshilei1992
Differential Revision: https://reviews.llvm.org/D126619
This patch allows user to use C++20 module by -fcxx-modules. Previously,
we could only use it under -std=c++20. Given that user could use C++20
coroutine standalonel by -fcoroutines-ts. It makes sense to offer an
option to use C++20 modules without enabling C++20.
Reviewed By: iains, MaskRay
Differential Revision: https://reviews.llvm.org/D120540
This patch allows user to use C++20 module by -fcxx-modules. Previously,
we could only use it under -std=c++20. Given that user could use C++20
coroutine standalonel by -fcoroutines-ts. It makes sense to offer an
option to use C++20 modules without enabling C++20.
Reviewed By: iains, MaskRay
Differential Revision: https://reviews.llvm.org/D120540
Before this patch, there was re-declaration error if error was encountered in
the same line. The recovery support acted only if this type of error was
encountered in the first line of the program and not in subsequent lines.
For example:
```
clang-repl> int i=9;
clang-repl> int j=9; err;
input_line_3:1:5: error: redefinition of 'j'
int j = 9;
```
Differential revision: https://reviews.llvm.org/D123674
8b4fa2c98e added this to remove left-over files on January.
Now we could assume they're cleaned up and this `rm` script is no longer necessary as FIXME states.
Differential Revision: https://reviews.llvm.org/D126597
Update the diagnostic in D81404: the convention is to use
err_drv_unsupported_option_argument instead of adding a new diagnostic for every
option.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D126511
Currently if `--sysroot /` is passed to the Clang driver, the include paths generated by the Clang driver will start with a double slash: `//usr/include/...`.
If VFS is used to inject files into the include paths (for example, the Swift compiler does this), VFS will get confused and the injected files won't be visible.
This change makes sure that the include paths start with a single slash.
Fixes#28283.
Differential Revision: https://reviews.llvm.org/D126289
-gen-reproducer causes crash reproduction to be emitted even
when clang didn't crash, and now can optionally take an argument
of never, on-crash (default), on-error and always.
Differential revision: https://reviews.llvm.org/D120201
This is a commit with the following changes:
* Remove `ExcludedPreprocessorDirectiveSkipMapping` and related functionality
Removes `ExcludedPreprocessorDirectiveSkipMapping`; its intended benefit for fast skipping of excluded directived blocks
will be superseded by a follow-up patch in the series that will use dependency scanning lexing for the same purpose.
* Refactor dependency scanning to produce pre-lexed preprocessor directive tokens, instead of minimized sources
Replaces the "source minimization" mechanism with a mechanism that produces lexed dependency directives tokens.
* Make the special lexing for dependency scanning a first-class feature of the `Preprocessor` and `Lexer`
This is bringing the following benefits:
* Full access to the preprocessor state during dependency scanning. E.g. a component can see what includes were taken and where they were located in the actual sources.
* Improved performance for dependency scanning. Measurements with a release+thin-LTO build shows ~ -11% reduction in wall time.
* Opportunity to use dependency scanning lexing to speed-up skipping of excluded conditional blocks during normal preprocessing (as follow-up, not part of this patch).
For normal preprocessing measurements show differences are below the noise level.
Since, after this change, we don't minimize sources and pass them in place of the real sources, `DependencyScanningFilesystem` is not technically necessary, but it has valuable performance benefits for caching file `stat`s along with the results of scanning the sources. So the setup of using the `DependencyScanningFilesystem` during a dependency scan remains.
Differential Revision: https://reviews.llvm.org/D125486
Differential Revision: https://reviews.llvm.org/D125487
Differential Revision: https://reviews.llvm.org/D125488
This is first of a series of patches for making the special lexing for dependency scanning a first-class feature of the `Preprocessor` and `Lexer`.
This patch only includes NFC renaming changes to make reviewing of the functionality changing parts easier.
Differential Revision: https://reviews.llvm.org/D125484
OpenMP 5.2, sec. 10.2 "teams Construct", p. 232, L9-12 restricts what
regions can be strictly nested within a `teams` construct. This patch
relaxes Clang's enforcement of this restriction in the case of nested
`atomic` constructs unless `-fno-openmp-extensions` is specified.
Cases like the following then seem to work fine with no additional
implementation changes:
```
#pragma omp target teams map(tofrom:x)
#pragma omp atomic update
x++;
```
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D126323
e5ccd66801 and
5029dce492 added deprecation warnings to
the <stdbool.h> and <stdnoreturn.h> headers, respectively, because the
headers are deprecated in C2x.
However, there are system headers that include these headers
unconditionally, and #warning diagnostics within system headers are
shown to users instead of suppressed, which means these deprecation
warnings are being triggered in circumstances that users have no
control over except to disable all the warnings through the
_CLANG_DISABLE_CRT_DEPRECATION_WARNINGS macro or other means.
This removes the problematic #warning uses until we find a more
palatable solution.
Refactor the code that handles the align clause of 'omp allocate' so
it can be used with globals as well as local variables.
Differential Revision: https://reviews.llvm.org/D126426
CoreturnStmt needs to keep the operand value distinct from its use in
any return_value call, so that instantiation may rebuild the latter.
But it also needs to keep the operand value separate in the case of
calling return_void. Code generation checks the operand value form to
determine whether it is a distincte entity to the promise call. This
adds the same logic to CFG generation.
Reviewed By: bruno
Differential Revision: https://reviews.llvm.org/D126399
CUDA requires that static variables be visible to the host when
offloading. However, The standard semantics of a stiatc variable dictate
that it should not be visible outside of the current file. In order to
access it from the host we need to perform "externalization" on the
static variable on the device. This requires generating a semi-unique
name that can be affixed to the variable as to not cause linker errors.
This is currently done using the CUID functionality, an MD5 hash value
set up by the clang driver. This allows us to achieve is mostly unique
ID that is unique even between multiple compilations of the same file.
However, this is not always availible. Instead, this patch uses the
unique ID from the file to generate a unique symbol name. This will
create a unique name that is consistent between the host and device side
compilations without requiring the CUID to be entered by the driver. The
one downside to this is that we are no longer stable under multiple
compilations of the same file. However, this is a very niche use-case
and is not supported by Nvidia's CUDA compiler so it likely to be good
enough.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D125904
Post-commit feedback on https://reviews.llvm.org/D122895 pointed out
that the diagnostic wording for some code was using "declaration" in a
confusing way, such as:
int foo(); // warning: a function declaration without a prototype is deprecated in all versions of C and is not supported in C2x
int foo(int arg) { // warning: a function declaration without a prototype is deprecated in all versions of C and is not supported in C2x
return 5;
}
And that we had other minor issues with the diagnostics being somewhat
confusing.
This patch addresses the confusion by reworking the implementation to
be a bit more simple and a bit less chatty. Specifically, it changes
the warning and note diagnostics to be able to specify "declaration" or
"definition" as appropriate, and it changes the function merging logic
so that the function without a prototype is always what gets warned on,
and the function with a prototype is sometimes what gets noted.
Additionally, when diagnosing a K&R C definition that is preceded by a
function without a prototype, we don't note the prior declaration, we
warn on it because it will also be changing behavior in C2x.
Differential Revision: https://reviews.llvm.org/D125814
Dependent patch adds UnarySymExpr, now I'd like to handle that for SMT
conversions like refutation.
Differential Revision: https://reviews.llvm.org/D125547
This patch adds a new descendant to the SymExpr hierarchy. This way, now
we can assign constraints to symbolic unary expressions. Only the unary
minus and bitwise negation are handled.
Differential Revision: https://reviews.llvm.org/D125318
Depends on D124758. That patch introduced serious regression in the run-time in
some special cases. This fixes that.
Differential Revision: https://reviews.llvm.org/D126406
This ensures that a deduced type like __auto_type matches the correct
association instead of matching all associations.
This addresses a regression from e4a42c5b64Fixes#55702
The new driver uses an augmented linker wrapper to perform the device
linking phase, but to the user looks like a regular linker invocation.
Contrary to the old driver, the new driver contains all the information
necessary to produce a linked device image in the host object itself.
Currently, we infer the usage of the device linker by the user
specifying an offloading toolchain, e.g. (--offload-arch=...) or
(-fopenmp-targets=...), but this shouldn't be strictly necessary.
This patch introduces a new option `--offload-link` to tell
the driver to use the offloading linker instead. So a compilation flow
can now look like this,
```
clang foo.cu --offload-new-driver -fgpu-rdc --offload-arch=sm_70 -c
clang foo.o --offload-link -lcudart
```
I was considering if this could be merged into the `-fuse-ld` option,
but because the device linker wraps over the users linker it would
conflict with that. In the future it's possible to merge this into `lld`
completely or `gold` via a plugin and we would use this option to
enable the device linking feature. Let me know what you think for this.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D126398
The concept specialization expression should start at the location of
the nested qualifiers when it has nested qualifiers.
This ensures that libclang reports correct source ranges that include
all subexpressions when visiting the expression.
Differential Revision: https://reviews.llvm.org/D126332
Const class members may be initialized with a defaulted default
constructor under the same conditions it would be allowed for a const
object elsewhere.
Differential Revision: https://reviews.llvm.org/D126170
These symbols are understood to not be used for client API consumption
by convention so they should not appear in the generated symbol graph.
Differential Revision: https://reviews.llvm.org/D125678
Warns when end-of-file is reached without seeing all matching
'omp end declare target' directives. The diagnostic shows the
location of the related begin directive.
Differential Revision: https://reviews.llvm.org/D126331
This would allow more AST nodes being preserved for broken code, and
have a more consistent valid bit for ref-type var decl (currently, a
ref-type var decl with a broken initializer is valid).
Per https://reviews.llvm.org/D76831#1973053, the initializer of a variable
should play no part in its "invalid" bit.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D122935
Following the new flow for external object code emission,
provide flags to switch between integrated and external
backend similar to the integrated assembler options.
SPIR-V target is the only user of this functionality at
this point.
This patch also updated SPIR-V documentation to clarify
that integrated object code emission for SPIR-V is an
experimental feature.
Differential Revision: https://reviews.llvm.org/D125679
Fixes https://github.com/llvm/llvm-project/issues/55546
The assertion mentioned in the issue is triggered because an
inconsistency is formed in the Sym->Class and Class->Sym relations. A
simpler but similar inconsistency is demonstrated here:
https://reviews.llvm.org/D114887 .
Previously in `removeMember`, we didn't remove the old symbol's
Sym->Class relation. Back then, we explained it with the following two
bullet points:
> 1) This way constraints for the old symbol can still be found via it's
> equivalence class that it used to be the member of.
> 2) Performance and resource reasons. We can spare one removal and thus one
> additional tree in the forest of `ClassMap`.
This patch do remove the old symbol's Sym->Class relation in order to
keep the Sym->Class relation consistent with the Class->Sym relations.
Point 2) above has negligible performance impact, empirical measurements
do not show any noticeable difference in the run-time. Point 1) above
seems to be a not well justified statement. This is because we cannot
create a new symbol that would be equal to the old symbol after the
simplification had happened. The reason for this is that the SValBuilder
uses the available constant constraints for each sub-symbol.
Differential Revision: https://reviews.llvm.org/D126281
This is a support for " #pragma omp atomic compare fail ". It has Parser & AST support for now.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D123235
Since this didn't make it into the v14 release - anyone requesting the
v14 ABI shouldn't get this GCC-compatible change that isn't backwards
compatible with v14 Clang.
Differential Revision: https://reviews.llvm.org/D126334
This creates an entry with address=nullptr and flag=0x80.
When an 'omp_all_memory' entry is specified any other 'out' or
'inout' entries are not needed and are not passed to the runtime.
Differential Revision: https://reviews.llvm.org/D126321
These tests are failing on the PPC64 AIX CI bot, but it's unclear why,
as they pass on other CI jobs.
I marked them as unsupported on AIX for now while investigating the failure.
Previous changes for the BTF attributes introduced a new sub-tree
visitation. That uncovered that when accessing the typespec location we
would assert that the type specification is either a type declaration or
`typename`. However, `typename` was explicitly permitted. This change
predates the introduction of newer deduced type representations such as
`__underlying_type` from C++ and the addition of the GNU `__typeof__`
expression.
Thanks to aaron.ballman for the valuable discussion and pointer to
`isTypeRep`.
Differential Revision: https://reviews.llvm.org/D126093
Reviewed By: aaron.ballman, yonghong-song
Adds support for the reserved locator 'omp_all_memory' for use
in depend clauses with 'out' or 'inout' dependence-types.
Differential Revision: https://reviews.llvm.org/D125828
This commit builds upon recently added indexing support for C++ concepts
from https://reviews.llvm.org/D124441 by extending libclang to
support indexing and visiting concepts, constraints and requires
expressions as well.
Differential Revision: https://reviews.llvm.org/D126031
Clang has recently started diagnosing prototype redeclaration errors like [rG385e7df33046](https://reviews.llvm.org/rG385e7df33046d7292612ee1e3ac00a59d8bc0441)
This flagged legitimate issues in a codebase but was confusing to resolve because it actually conflicted with a function declaration from a system header and not from the one emitted with "note: ".
This patch updates the error handling to use the canonical declaration's source location instead to avoid misleading errors like the one described.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D126258
We use the clang-linker-wrapper to perform device linking of embedded
offloading object files. This is done by generating those jobs inside of
the linker-wrapper itself. This patch adds an argument in Clang and the
linker-wrapper that allows users to forward input to the device linking
phase. This can either be done for every device linker, or for a
specific target triple. We use the `-Xoffload-linker <arg>` and the
`-Xoffload-linker-<triple> <arg>` syntax to accomplish this.
Reviewed By: markdewing, tra
Differential Revision: https://reviews.llvm.org/D126226
For generic targets such as SPIR-V clang sets all OpenCL
extensions/features as supported by default. However
concrete targets are unlikely to support all extensions
features, which creates a problem when such generic SPIR-V
binary is compiled for a specific target later on.
To allow compile time diagnostics for unsupported features
this flag is now being exposed in the clang driver.
Differential Revision: https://reviews.llvm.org/D125243
The new test is a copy of the corresponding PS4 test, with the triple
etc updated, because there's currently no good way to make one lit test
"iterate" with multiple targets.
https://docs.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics?view=msvc-170
unsigned char __readx18byte(unsigned long)
unsigned short __readx18word(unsigned long)
unsigned long __readx18dword(unsigned long)
unsigned __int64 __readx18qword(unsigned long)
Given the lack of documentation of the intrinsics, we chose to align the offset with just
`CharUnits::One()` when calling `IRBuilderBase::CreateAlignedLoad()`
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D126024
https://docs.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics?view=msvc-170
void __writex18byte(unsigned long, unsigned char)
void __writex18word(unsigned long, unsigned short)
void __writex18dword(unsigned long, unsigned long)
void __writex18qword(unsigned long, unsigned __int64)
Given the lack of documentation of the intrinsics, we chose to align the offset with just
`CharUnits::One()` when calling `IRBuilderBase::CreateAlignedStore()`.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D126023
The arch or cpu has its default fpu features and versions such as fpuv2_sf/fpuv3_sf.
And there is also -mfpu option to specify and override fpu version and features.
For example, C860 has fpuv3_sf/fpuv3_df feature as default, when
-mfpu=fpv2 is given, fpuv3_sf/fpuv3_df is replaced with fpuv2_sf/fpuv2_df.
This starts to fill out the C DR status page with information
determined from tests. It also starts to add some test coverage for the
DRs we can add tests for (some are difficult as not all C DRs involve
questions about code and some DRs are about the behavior of linking
multiple TUs together).
Note: there is currently no automation for filling out the HTML page
from test coverage like there is for the C++ DRs, but this commit
attempts to use a similar comment style in case we want to add such a
script in the future.
Even though the comment description is ".unroll_inner.iv < NumIterations", the code emitted a BO_LE ('<=') operator for the inner loop that is to be unrolled. This lead to one additional copy of the body code in a partially unrolled. It only manifests when the unrolled loop is consumed by another loop-associated construct. Fix by using the BO_LT operator instead.
The condition for the outer loop and the corresponding code for tiling correctly used BO_LT already.
Fixes#55236
Support for `__attribute__((no_builtin("foo")))` was added in https://reviews.llvm.org/D68028,
but builtins were still being used even when the attribute was placed on a function.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D124701
32 Bit windows includes attribute 'thiscall' on member functions as a
default calling convention. This test was not written in a way that
works with that, so added wildcards so it is tolerant of it.
Allows emitting define amdgpu_kernel void @func() IR from C or C++.
This replaces the current workflow which is to write a stub in opencl that
calls an external C function implemented in C++ combined through llvm-link.
Calling the resulting function still requires a manual implementation of the
ABI from the host side. The primary application is for more rapid debugging
of the amdgpu backend by permuting a C or C++ test file instead of manually
updating an IR file.
Implementation closely follows D54425. Non-amd reviewers from there.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D125970
WG14 DR423 (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2148.htm#dr_423),
resolved during the C11 time frame, changed the way qualifiers are
handled on function return types and in cast expressions after it was
noted that these types are now directly observable via generic
selection expressions. In C, the function declarator is adjusted to
ignore all qualifiers (including _Atomic qualifiers).
Clang already handles the cast expression case correctly (by performing
the lvalue conversion, which drops the qualifiers as well), but with
these changes it will now also handle function declarations
appropriately.
Fixes#39595
Differential Revision: https://reviews.llvm.org/D125919
It doesn't matter which value we use for dead args, so let's switch
to poison, so we can eventually kill undef.
Reviewed By: aeubanks, fhahn
Differential Revision: https://reviews.llvm.org/D125983
This patch implements the following floating point negative absolute value
builtins that required for compatibility with the XL compiler:
```
double __fnabs(double);
float __fnabss(float);
```
These builtins will emit :
- fnabs on PWR6 and below, or if VSX is disabled.
- xsnabsdp on PWR7 and above, if VSX is enabled.
Differential Revision: https://reviews.llvm.org/D125506
Emit predefined macros for GPU family. e.g.
for GPU gfx9xx emit __GFX9__, etc.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D125909
Fix __has_builtin to return 1 only if the requested target features
of a builtin are enabled by refactoring the code for checking
required target features of a builtin and use it in evaluation
of __has_builtin.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D125829
An upcoming patch will extend llvm-symbolizer to provide the source line
information for global variables. The goal is to move AddressSanitizer
off of internal debug info for symbolization onto the DWARF standard
(and doing a clean-up in the process). Currently, ASan reports the line
information for constant strings if a memory safety bug happens around
them. We want to keep this behaviour, so we need to emit debuginfo for
these variables as well.
Reviewed By: dblaikie, rnk, aprantl
Differential Revision: https://reviews.llvm.org/D123534
New diagnostics were added for unreachable generic selection expression
associations in ca75ac5f04, but it did
not account for a difference in behavior between C and C++ regarding
lvalue to rvalue conversions. So we would issue diagnostics about a
selection being unreachable and then reach it. This corrects the
diagnostic behavior in that case.
Differential Revision: https://reviews.llvm.org/D125882
Since we already have a dedicated file for testing the /Zc: flags,
let's be consistent about putting /Zc: tests there.
No behavior change.
Differential Revision: https://reviews.llvm.org/D125889
This new CTU implementation is the natural extension of the normal single TU
analysis. The approach consists of two analysis phases. During the first phase,
we do a normal single TU analysis. During this phase, if we find a foreign
function (that could be inlined from another TU) then we don’t inline that
immediately, we rather mark that to be analysed later.
When the first phase is finished then we start the second phase, the CTU phase.
In this phase, we continue the analysis from that point (exploded node)
which had been enqueued during the first phase. We gradually extend the
exploded graph of the single TU analysis with the new node that was
created by the inlining of the foreign function.
We count the number of analysis steps of the first phase and we limit the
second (ctu) phase with this number.
This new implementation makes it convenient for the users to run the
single-TU and the CTU analysis in one go, they don't need to run the two
analysis separately. Thus, we name this new implementation as "onego" CTU.
Discussion:
https://discourse.llvm.org/t/rfc-much-faster-cross-translation-unit-ctu-analysis-implementation/61728
Differential Revision: https://reviews.llvm.org/D123773
x86-target-features.c can spuriously fail when checking for absence of
the string "lvi" in the compiler output due to the temporary path used
for the output file. For example:
"-o" "/tmp/lit-tmp-981j7lvi/x86-target-features-670b86.o"
will make the test fail. This commit checks specifically for lvi as a
target feature, in a similar way to the positive CHECK directive just
above.
Test Plan: fails when using -mlvi-hardening and pass otherwise
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D125084
CSKY is always in 4-byte align, no matter it's long long type.
For global aggregate variable, it's 4-byte align if its size is bigger than or equal to 4 bytes.
Differential Revision: https://reviews.llvm.org/D124977
Similar to D123386, this adds D-Movs to the AArch64 perfect shuffle
tables, slightly lowering the costs a little more. This is a rough
improvement in general, especially if you ignore mov v0.16b, v2.16b type
moves that are often artefacts of the calling convention.
The D register movs are encoded as (0x4 | LaneIdx), and to generate a D
register move we are required to bitcast into a higher type, but it is
otherwise very similar to the S-lane mov's already supported.
Differential Revision: https://reviews.llvm.org/D125477
The standard says:
The optional requires-clause ([temp.pre]) in an init-declarator or
member-declarator shall be present only if the declarator declares a
templated function ([dcl.fct]).
This implements that limitation, and updates the tests to the best of my
ability to capture the intent of the original checks.
Differential Revision: https://reviews.llvm.org/D125711
Previously the Expr returned by getOperand() was actually the
subexpression common to the "ready", "suspend", and "resume"
expressions, which often isn't just the operand but e.g.
await_transform() called on the operand.
It's important for the AST to expose the operand as written
in the source for traversals and tools like clangd to work
correctly.
Fixes https://github.com/clangd/clangd/issues/939
Differential Revision: https://reviews.llvm.org/D115187
The vload*_half* and vstore*_half* builtins do not require the
cl_khr_fp16 extension: pointers to `half` can be declared without the
extension and the _half variants of vload and vstore should be
available without the extension.
This aligns the guards for these builtins for
`-fdeclare-opencl-builtins` with `opencl-c.h`.
Fixes https://github.com/llvm/llvm-project/issues/55275
Differential Revision: https://reviews.llvm.org/D125401
function in promise_type
According to https://cplusplus.github.io/CWG/issues/2585.html, this
fixes https://github.com/llvm/llvm-project/issues/54881
Simply, the clang tried to found (do lookup and overload resolution. Is
there any better word to use than found?) allocation function in
promise_type and global scope. However, this is not consistent with the
standard. The standard behavior would be that the compiler shouldn't
lookup in global scope in case we lookup the allocation function name in
promise_type. In other words, the program is ill-formed if there is
incompatible allocation function in promise type.
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D125517
An upcoming patch will extend llvm-symbolizer to provide the source line
information for global variables. The goal is to move AddressSanitizer
off of internal debug info for symbolization onto the DWARF standard
(and doing a clean-up in the process). Currently, ASan reports the line
information for constant strings if a memory safety bug happens around
them. We want to keep this behaviour, so we need to emit debuginfo for
these variables as well.
Reviewed By: dblaikie, rnk, aprantl
Differential Revision: https://reviews.llvm.org/D123534
The Clang driver additional stages to build a complete offloading
program for applications using CUDA or OpenMP offloading. This normally
requires either a source file input or a valid object file to be
handled. This would cause problems when trying to compile an assembly or
LLVM IR file through clang with flags that would enable offloading. This
patch simply adds a check to prevent the offloading toolchain from being
used if we don't have a valid source file.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D125705
I noticed these two tests emit a warning about a missing
unhandled_exception. That's irrelevant to what is being tested, but
is unnecessary noise.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D125535
When a non-const compound statement is used to initialize a constexpr pointer,
the pointed value is not const itself and cannot be folded at codegen time.
This matches GCC behavior for compound literal expr arrays.
Fix issue #39324.
Differential Revision: https://reviews.llvm.org/D124038
Summary:
Normally we parse through the CUDA installation to disover the needed
features. However, we may want to build libraries on targets that do not
currently have CUDA installed but still need to know which features to
make use of when creating the PTX or bitcode. This flag is a simple way
to specify this so we can compile certain codes withotu a valid CUDA
installation.
Ideally this could be done via an -Xarch or simimlar flag but currently
they cannot handle this. We would need to support using an -Xarch flag
that takes multiple arguments that then pass them to the -Xclang
functionality.
The previous patches allowed us to create a static library containing
all the device code. This patch uses that library to perform the device
runtime linking late when performing LTO. This in addition to
simplifying the libraries, allows us to transparently handle the runtime
library as-needed without needing Clang to manually pass the necessary
library in the linker wrapper job.
Depends on D125315
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D125333
We use globals to configure debugging at compile-time for the device
runtime. Because these are only used by the OpenMP runtime we shouldn't
define them if we aren't using the device runtime. When a user passes in
'-nogpulib' this indicates that we are not using the device runtime, so
we should check for the precense of this flag and not emit these globals
if used.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D125314
Currently we define the `__CUDA_ARCH__` macro only in CUDA mode. This
patch allows us to use this macro in OpenMP-offloading mode when
targeting NVPTX.
Reviewed By: tra, tianshilei1992
Differential Revision: https://reviews.llvm.org/D125256
Return the Decl when parsing the template member declaration so the
'omp declare simd' pragma can be applied to it. Previously a nullptr
was returned causing an error applying the pragma.
Fixes#52700.
Differential Revision: https://reviews.llvm.org/D125493
In some rare cases the type of an SVal might be interesting.
This introspection function exposes this information in tests.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D125532
When a preprocessor directive is unknown outside of a skipped
conditional block, we give an error diagnostic because we don't know
how to proceed with preprocessing. But when the directive is in a
skipped conditional block, we would not diagnose it on the theory that
the directive may be known to an implementation other than Clang.
Now, for unknown directives inside a skipped conditional block, we
diagnose the unknown directive as a warning if it is sufficiently
similar to a directive specific to preprocessor conditional blocks. For
example, we'll warn about `#esle` and suggest `#else` but we won't warn
about `#progma` because it's not a directive specific to preprocessor
conditional blocks.
Fixes#51598
Differential Revision: https://reviews.llvm.org/D124726
Before issuing the warning about use of a strict prototype, check if
the declarator is required to have a prototype through some other means
determined at parse time.
This silences false positives in OpenCL code (where the functions are
forced to have a prototype) and block literal expressions.
That's required to support `\n`, but can also be used for other commands.
We already had the infrastructure in place to parse a varying number of
arguments, we simply needed to generalize it so that it would work not
only for block commands.
This should fix#55319.
Reviewed By: gribozavr2
Differential Revision: https://reviews.llvm.org/D125429
The command traits have a member NumArgs for which all the parsing
infrastructure is in place, but no command was setting it to a value
other than 0. By doing so we get warnings when passing an empty
paragraph to \retval (the first argument is the return value, then comes
the description). We also take \xrefitem along for the ride, although as
the documentation states it's unlikely to be used directly.
Reviewed By: gribozavr2
Differential Revision: https://reviews.llvm.org/D125422
This adds a late Machine Pass to work around a Cortex CPU Erratum
affecting Cortex-A57 and Cortex-A72:
- Cortex-A57 Erratum 1742098
- Cortex-A72 Erratum 1655431
The pass inserts instructions to make the inputs to the fused AES
instruction pairs no longer trigger the erratum. Here the pass errs on
the side of caution, inserting the instructions wherever we cannot prove
that the inputs came from a safe instruction.
The pass is used:
- for Cortex-A57 and Cortex-A72,
- for "generic" cores (which are used when using `-march=`),
- when the user specifies `-mfix-cortex-a57-aes-1742098` or
`mfix-cortex-a72-aes-1655431` in the command-line arguments to clang.
Reviewed By: dmgreen, simon_tatham
Differential Revision: https://reviews.llvm.org/D119720
The goal is support tail and mask policy in RVV builtins.
We focus on IR part first.
If the passthru operand is undef, we use tail agnostic, otherwise
use tail undisturbed.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D125323
BoolAssignment checker is now taint-aware and warns if a tainted value is
assigned.
Original author: steakhal
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D125360
This allows using any recognized kind of string for any
__attribute__((format)) archetype. Before this change, for instance,
the printf archetype would only accept char pointer types and the
NSString archetype would only accept NSString pointers. This is
more restrictive than necessary as there exist functions to
convert between string types that can be annotated with
__attribute__((format_arg)) to transfer format information.
Reviewed By: ahatanak
Differential Revision: https://reviews.llvm.org/D125254
rdar://89060618
MSVC expects wchar_t to be defined in stddef.h if /Zc:wchar_t- is specified
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D124026
The re-apply includes fixes to clang tests that were missed in
the original commit.
Original message:
Prior to this patch we would only set to undef the unused arguments of the
external functions. The rationale was that unused arguments of internal
functions wouldn't need to be turned into undef arguments because they
should have been simply eliminated by the time we reach that code.
This is actually not true because there are plenty of cases where we can't
remove unused arguments. For instance, if the internal function is used in
an indirect call, it may not be possible to change the function signature.
Yet, for statically known call-sites we would still like to mark the unused
arguments as undef.
This patch enables the "set undef arguments" optimization on internal
functions when we encounter cases where internal functions cannot be
optimized. I.e., whenever an internal function is marked "live".
Differential Revision: https://reviews.llvm.org/D124699
This diff changes the serialization of the `ORIGINAL_PCH_DIR`
entry in module files to be serialized relative to the module's
`BaseDirectory`. This will allow for the module to be relocatable
across machines.
The path is restored relative to the module's BaseDirectory on
deserialization.
Reviewed By: urnathan
Differential Revision: https://reviews.llvm.org/D124946
This diff changes the serialization of the `SUBMODULE_TOPHEADER`
entry in module files to be serialized relative to the module's
`BaseDirectory`. This matches the behavior of the
`SUBMODULE_HEADER` entry and will allow for the module to be
relocatable across machines.
The path is restored relative to the module's `BaseDirectory` on
deserialization.
Reviewed By: urnathan
Differential Revision: https://reviews.llvm.org/D124938
This diff adds a new frontend flag `-fmodule-file-home-is-cwd`.
The behavior of this flag is similar to
`-fmodule-map-file-home-is-cwd` but does not require the module
map files to be modified to have inputs relative to the cwd.
Instead the output modules will have their `BaseDirectory` set
to the cwd and will try and resolve paths relative to that.
The motiviation for this change is to support relocatable pcm
files that are built on different machines with different paths
without having to alter module map files, which is sometimes not
possible as they are provided by 3rd parties.
Reviewed By: urnathan
Differential Revision: https://reviews.llvm.org/D124874
This PR changes the `SymIntExpr` so the expression that uses a
negative value as `RHS`, for example: `x +/- (-N)`, is modeled as
`x -/+ N` instead.
This avoids producing a very large `RHS` when the symbol is cased to
an unsigned number, and as consequence makes the value more robust in
presence of casts.
Note that this change is not applied if `N` is the lowest negative
value for which negation would not be representable.
Reviewed By: steakhal
Patch By: tomasz-kaminski-sonarsource!
Differential Revision: https://reviews.llvm.org/D124658
This adds an extension warning when using the preprocessor conditionals
in a language mode they're not officially supported in, and an opt-in
warning for compatibility with previous standards.
Fixes#55306
Differential Revision: https://reviews.llvm.org/D125178
It turns out that the llvm buildbots run the test with
-DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-scei-ps4, which would cause this
test to fail as the test assumed that the default target is Windows. To
fix this, we explicitly set -target for the Windows testcases.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D125425
When Clang generates the path prefix (i.e. the path of the directory
where the file is) when generating FILE, __builtin_FILE(), and
std::source_location, Clang uses the platform-specific path separator
character of the build environment where Clang _itself_ is built. This
leads to inconsistencies in Chrome builds where Clang running on
non-Windows environments uses the forward slash (/) path separator
while Clang running on Windows builds uses the backslash (\) path
separator. To fix this, we add a flag -ffile-reproducible (and its
inverse, -fno-file-reproducible) to have Clang use the target's
platform-specific file separator character.
Additionally, the existing flags -fmacro-prefix-map and
-ffile-prefix-map now both imply -ffile-reproducible. This can be
overriden by setting -fno-file-reproducible.
[0]: https://crbug.com/1310767
Differential revision: https://reviews.llvm.org/D122766
Adds an intrinsic/builtin that can be used to fine tune scheduler behavior. If
there is a need to have highly optimized codegen and kernel developers have
knowledge of inter-wave runtime behavior which is unknown to the compiler this
builtin can be used to tune scheduling.
This intrinsic creates a barrier between scheduling regions. The immediate
parameter is a mask to determine the types of instructions that should be
prevented from crossing the sched_barrier. In this initial patch, there are only
two variations. A mask of 0 means that no instructions may be scheduled across
the sched_barrier. A mask of 1 means that non-memory, non-side-effect inducing
instructions may cross the sched_barrier.
Note that this intrinsic is only meant to work with the scheduling passes. Any
other transformations that may move code will not be impacted in the ways
described above.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D124700
Create dxc_D as alias to option D which Define <macro> to <value> (or 1 if <value> omitted).
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D125338
D87451 added -mignore-xcoff-visibility for AIX targets and made it the default (which mimicked the behaviour of the XL 16.1 compiler on AIX).
However, ignoring hidden visibility has unwanted side effects and some libraries depend on visibility to hide non-ABI facing entities from user headers and
reserve the right to change these implementation details based on this (https://libcxx.llvm.org/DesignDocs/VisibilityMacros.html). This forces us to use
internal linkage fallbacks for these cases on AIX and creates an unwanted divergence in implementations on the plaform.
For these reasons, it's preferable to not add -mignore-xcoff-visibility by default, which is what this patch does.
Reviewed By: DiggerLin
Differential Revision: https://reviews.llvm.org/D125141
In order to do offloading compilation we need to embed files into the
host and create fatbainaries. Clang uses a special binary format to
bundle several files along with their metadata into a single binary
image. This is currently performed using the `-fembed-offload-binary`
option. However this is not very extensibile since it requires changing
the command flag every time we want to add something and makes optional
arguments difficult. This patch introduces a new tool called
`clang-offload-packager` that behaves similarly to CUDA's `fatbinary`.
This tool takes several input files with metadata and embeds it into a
single image that can then be embedded in the host.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D125165
This patch adds the necessary code generation to create the wrapper code
that registers all the globals in CUDA. We create the necessary
functions and iterate through the list of
`__start_cuda_offloading_entries` to find which globals must be
registered. This is very similar to the code generation done currently
in Clang for non-rdc builds, but here we are registering a fully linked
fatbinary and finding the globals via the above sections.
With this we should be able to fully support basic RDC / LTO building of CUDA
code.
It's also worth noting that this does not include the necessary PTX to JIT the
image, so to use this support the offloading architecture must match the
system's architecture.
Depends on D123810
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D123812
The changes made in D123460 generalized the code generation for OpenMP's
offloading entries. We can use the same scheme to register globals for
CUDA code. This patch adds the code generation to create these
offloading entries when compiling using the new offloading driver mode.
The offloading entries are simple structs that contain the information
necessary to register the global. The struct used is as follows:
```
Type struct __tgt_offload_entry {
void *addr; // Pointer to the offload entry info.
// (function or global)
char *name; // Name of the function or global.
size_t size; // Size of the entry info (0 if it a function).
int32_t flags;
int32_t reserved;
};
```
Currently CUDA handles RDC code generation by deferring the registration
of globals in the current TU to a callback function containing the
modules ID. Later all the module IDs will be used to register all of the
globals at once. Rather than mimic this, offloading entries allow us to
mimic the way OpenMP registers globals. That is, we create a simple
global struct for each device global to be registered. These are placed
at a special section `cuda_offloading_entires`. Because this section is
a valid C-identifier, the linker will profide a `__start` and `__stop`
pointer that we can use to iterate and register all globals at runtime.
the registration requires a flag variable to indicate which registration
function to use. I have assigned the flags somewhat arbitrarily, but
these use the following values.
Kernel: 0
Variable: 0
Managed: 1
Surface: 2
Texture: 3
Depends on D120272
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D123471
This adds the -Wgnu-line-marker diagnostic flag, grouped under -Wgnu,
to warn about use of the GNU linemarker preprocessor extension.
Fixes#55067
Differential Revision: https://reviews.llvm.org/D124534
This adds support for variable stride with the val, uval, and ref linear
modifiers. Previously only the no modifer type ls<argno> was supported.
val -> Ls<argno>
uval -> Us<argno>
ref -> Rs<argno>
Differential Revision: https://reviews.llvm.org/D125330
CUDA/HIP programs use __noinline__ like a keyword e.g.
__noinline__ void foo() {} since __noinline__ is defined
as a macro __attribute__((noinline)) in CUDA/HIP runtime
header files.
However, gcc and clang supports __attribute__((__noinline__))
the same as __attribute__((noinline)). Some C++ libraries
use __attribute__((__noinline__)) in their header files.
When CUDA/HIP programs include such header files,
clang will emit error about invalid attributes.
This patch fixes this issue by supporting __noinline__ as
a keyword, so that CUDA/HIP runtime could remove
the macro definition.
Reviewed by: Aaron Ballman, Artem Belevich
Differential Revision: https://reviews.llvm.org/D124866
Specifically for: !tbaa, !tbaa.struct, !annotation, !srcloc, !nosanitize.
The goal is to avoid test brittleness caused by hardcoded values.
Differential Revision: https://reviews.llvm.org/D123273
Add mangling for linear parameters specified with ref, uval, and val
for 'omp declare simd' vector functions.
Add missing stride for linear this parameters.
Differential Revision: https://reviews.llvm.org/D125269
The controlling expression of a _Generic selection expression undergoes
lvalue conversion, array conversion, and function conversion before
picking the association. This means that array types, function types,
and qualified types are all unreachable code if they're used as an
association. I've been caught by this twice in the past few months and
I figure that if a WG14 member can't seem to remember this rule, users
are also likely to struggle with it. So this adds an on-by-default
unreachable code diagnostic for generic selection expression
associations.
Note, we don't have to worry about function types as those are already
a constraint violation which generates an error.
Differential Revision: https://reviews.llvm.org/D125259
In case of placement new, if we do not know the alignment of the
operand, we can't assume it has the preferred alignment. It might be
e.g. a pointer to a struct member which follows ABI alignment rules.
This makes UBSAN no longer report "constructor call on misaligned
address" when constructing a double into a struct field of type double
on i686. The psABI specifies an alignment of 4 bytes, but the preferred
alignment used by Clang is 8 bytes.
We now use ABI alignment for allocating new as well, as the preferred
alignment should be used for over-aligning e.g. local variables, which
isn't relevant for ABI code dealing with operator new. AFAICT there
wouldn't be problems either way though.
Fixes#54845.
Differential Revision: https://reviews.llvm.org/D124736
Currently for SVE2 ACLE builtins, single tests are used to verify both
clang code generation (when the feature is available) and semantic
error/warning messages (when the feature is unavailable). This
patch moves the semantic testing for the target feature flag into
dedicated Sema tests.
Differential Revision: https://reviews.llvm.org/D124850
Currently for SVE ACLE builtins, single tests are used to verify both
clang code generation (when the feature is available) and semantic
error/warning messages (when the feature is unavailable). This
patch moves the semantic testing into dedicated Sema tests.
Differential Revision: https://reviews.llvm.org/D124924
Summary:
By evaluating both children states, now we are capable of discovering
infeasible parent states. In this patch, `assume` is implemented in the terms
of `assumeDuali`. This might be suboptimal (e.g. where there are adjacent
assume(true) and assume(false) calls, next patches addresses that). This patch
fixes a real CRASH.
Fixes https://github.com/llvm/llvm-project/issues/54272
Differential Revision:
https://reviews.llvm.org/D124758
In some cases a parent State is already infeasible, but we recognize
this only if an additonal constraint is added. This patch is the first
of a series to address this issue. In this patch `assumeDual` is changed
to clone the parent State but with an `Infeasible` flag set, and this
infeasible-parent is returned both for the true and false case. Then
when we add a new transition in the exploded graph and the destination
is marked as infeasible, the node will be a sink node.
Related bug:
https://github.com/llvm/llvm-project/issues/50883
Actually, this patch does not solve that bug in the solver, rather with
this patch we can handle the general parent-infeasible cases.
Next step would be to change the State API and require all checkers to
use the `assume*Dual` API and deprecate the simple `assume` calls.
Hopefully, the next patch will introduce `assumeInBoundDual` and will
solve the CRASH we have here:
https://github.com/llvm/llvm-project/issues/54272
Differential Revision: https://reviews.llvm.org/D124674
It should be useful clang-fuzzer itself, though my own motivation is
to use this in fuzzing clang-pseudo. (clang-tools-extra/pseudo/fuzzer).
Differential Revision: https://reviews.llvm.org/D125166
Ensures an -Wenum-conversion warning happens when one of the enums is
signed and the other is unsigned. Also adds a test file to verify these
warnings.
This warning would not happen since the -Wsign-conversion would make a
diagnostic then return, never allowing the -Wenum-conversion checks.
For example:
C
enum PE { P = -1 };
enum NE { N };
enum NE conv(enum PE E) { return E; }
Before this would only create a diagnostic with -Wsign-conversion and
never on -Wenum-conversion. Now it will create a diagnostic for both
-Wsign-conversion and -Wenum-conversion.
I could change it to just warn on -Wenum-conversion as that was what I
initially did. Seeing PR35200 (or GitHub Issue 316268), I let both
diagnostics check so that the sign conversion could generate a warning.
This patch restores the symmetry between how operator new and operator delete
are handled by also inlining the content of operator delete when possible.
Patch by Fred Tingaud.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D124845
Like regular assignment, compound assignment operators can be assumed to
write to their left-hand side operand. So we strengthen the requirements
there. (Previously only the default read access had been required.)
Just like operator->, operator->* can also be assumed to dereference the
left-hand side argument, so we require read access to the pointee. This
will generate new warnings if the left-hand side has a pt_guarded_by
attribute. This overload is rarely used, but it was trivial to add, so
why not. (Supporting the builtin operator requires changes to the TIL.)
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D124966
This includes a fix for the libc++ issue I ran across with friend
declarations not properly being identified as overloads.
This reverts commit 45c07db31c.