In some cases, an error constructing a compiler or assembler job could
leave the Inputs in a state that the code for constructing the linker
job was not ready for.
The linker wrapper is used to perform linking and wrapping of embedded
device object files. Currently its internals are not able to be tested
easily. This patch adds the `--dry-run` and `--print-wrapped-module`
options to investigate the link jobs that will be run along with the
wrapped code that will be created to register the binaries.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D124039
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.
New checks rely on 2 invariants:
1) For frontend instrumentation, MD_prof branch_weights will always be
populated before llvm.expect intrinsics are lowered.
2) for IR and sample profiling, llvm.expect intrinsics will always be
lowered before branch_weights are populated from the IR profiles.
These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.
Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D115907
When doing overload resolution, we have to check that candidates' parameter types are equal before trying to find a better candidate through checking which candidate is more constrained.
This revision adds this missing check and makes us diagnose those cases as ambiguous calls when the types are not equal.
Fixes GitHub issue https://github.com/llvm/llvm-project/issues/53640
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D123182
This reverts commit af0285122f.
The test "libomp::loop_dispatch.c" on builder
openmp-gcc-x86_64-linux-debian fails from time-to-time.
See #54969. This patch is unrelated.
Previously an opt-in flag `-fopenmp-new-driver` was used to enable the
new offloading driver. After passing tests for a few months it should be
sufficiently mature to flip the switch and make it the default. The new
offloading driver is now enabled if there is OpenMP and OpenMP
offloading present and the new `-fno-openmp-new-driver` is not present.
The new offloading driver has three main benefits over the old method:
- Static library support
- Device-side LTO
- Unified clang driver stages
Depends on D122683
Differential Revision: https://reviews.llvm.org/D122831
The OMPScheduleType enum stores the constants from libomp's internal sched_type in kmp.h and are used by several kmp API functions. The enum values have an internal structure, namely each scheduling algorithm (e.g.) exists in four variants: unordered, orderend, normerge unordered, and nomerge ordered.
This patch (basically a followup to D114940) splits the "ordered" and "nomerge" bits into separate flags, as was already done for the "monotonic" and "nonmonotonic", so we can apply bit flags operations on them. It also now contains all possible combinations according to kmp's sched_type. Deriving of the OMPScheduleType enum from clause parameters has been moved form MLIR's OpenMPToLLVMIRTranslation.cpp to OpenMPIRBuilder to make available for clang as well. Since the primary purpose of the flag is the binary interface to libomp, it has been made more private to LLVMFrontend. The primary interface for generating worksharing-loop using OpenMPIRBuilder code becomes `applyWorkshareLoop` which derives the OMPScheduleType automatically and calls the appropriate emitter function.
While this is mostly a NFC refactor, it still applies the following functional changes:
* The logic from OpenMPToLLVMIRTranslation to derive the OMPScheduleType also applies to clang. Most notably, it now applies the nonmonotonic flag for non-static schedules by default.
* In OpenMPToLLVMIRTranslation, the nonmonotonic default flag was previously not applied if the simd modifier was used. I assume this was a bug, since the effect was due to `loop.schedule_modifier()` returning `mlir::omp::ScheduleModifier::none` instead of `llvm::Optional::None`.
* In OpenMPToLLVMIRTranslation, the nonmonotonic default flag was set even if ordered was specified, in breach to what the comment before citing the OpenMP specification says. I assume this was an oversight.
The ordered flag with parameter was not considered in this patch. Changes will need to be made (e.g. adding/modifying function parameters) when support for it is added. The lengthy names of the enum values can be discussed, for the moment this is avoiding reusing previously existing enum value names such as `StaticChunked` to avoid confusion.
Reviewed By: peixin
Differential Revision: https://reviews.llvm.org/D123403
Partially implement the proposed resolution to CWG2569.
D119136 broke some libstdc++ code, as P2036R3, implemented as a DR to
C++11 made ill-formed some previously valid and innocuous code.
We resolve this issue to allow decltype(x) - but not decltype((x)
to appear in the parameter list of a lambda that capture x by copy.
Unlike CWG2569, we do not extend that special treatment to
sizeof/noexcept yet, as the resolution has not been approved yet
and keeping the review small allows a quicker fix of impacted code.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D123909
In the past, `clang --target=riscv64-unknown-linux-gnu -mno-relax -c hello.s` will assemble hello.s without relaxation, but `clang --target=riscv64-unknown-linux-gnu -mno-relax -fno-integrated-as -c hello.s` doesn't pass the `-mno-relax` option to assembler, and assemble with relaxation
This patch pass the -mno-relax option to assembler when -fno-integrated-as is specified.
Differential Revision: https://reviews.llvm.org/D120639
This is extended to all `std::` functions that take a reference to a
value and return a reference (or pointer) to that same value: `move`,
`forward`, `move_if_noexcept`, `as_const`, `addressof`, and the
libstdc++-specific function `__addressof`.
We still require these functions to be declared before they can be used,
but don't instantiate their definitions unless their addresses are
taken. Instead, code generation, constant evaluation, and static
analysis are given direct knowledge of their effect.
This change aims to reduce various costs associated with these functions
-- per-instantiation memory costs, compile time and memory costs due to
creating out-of-line copies and inlining them, code size at -O0, and so
on -- so that they are not substantially more expensive than a cast.
Most of these improvements are very small, but I measured a 3% decrease
in -O0 object file size for a simple C++ source file using the standard
library after this change.
We now automatically infer the `const` and `nothrow` attributes on these
now-builtin functions, in particular meaning that we get a warning for
an unused call to one of these functions.
In C++20 onwards, we disallow taking the addresses of these functions,
per the C++20 "addressable function" rule. In earlier language modes, a
compatibility warning is produced but the address can still be taken.
The same infrastructure is extended to the existing MSVC builtin
`__GetExceptionInfo`, which is now only recognized in namespace `std`
like it always should have been.
This is a re-commit of
fc30901096,
a571f82a50, and
64c045e25b
which were reverted in
e75d8b7037
due to a crasher bug where CodeGen would emit a builtin glvalue as an
rvalue if it constant-folds.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D123345
A randomized structure needs to use a designated or default initializer.
Using a non-designated initializer will result in values being assigned
to the wrong fields.
Differential Revision: https://reviews.llvm.org/D123763
The previous patch introduced the offloading binary format so we can
store some metada along with the binary image. This patch introduces
using this inside the linker wrapper and Clang instead of the previous
method that embedded the metadata in the section name.
Differential Revision: https://reviews.llvm.org/D122683
std::addressof, plus the libstdc++-specific std::__addressof.
This brings us to parity with the corresponding GCC behavior.
Remove STDBUILTIN macro that ended up not being used.
We still require these functions to be declared before they can be used,
but don't instantiate their definitions unless their addresses are
taken. Instead, code generation, constant evaluation, and static
analysis are given direct knowledge of their effect.
This change aims to reduce various costs associated with these functions
-- per-instantiation memory costs, compile time and memory costs due to
creating out-of-line copies and inlining them, code size at -O0, and so
on -- so that they are not substantially more expensive than a cast.
Most of these improvements are very small, but I measured a 3% decrease
in -O0 object file size for a simple C++ source file using the standard
library after this change.
We now automatically infer the `const` and `nothrow` attributes on these
now-builtin functions, in particular meaning that we get a warning for
an unused call to one of these functions.
In C++20 onwards, we disallow taking the addresses of these functions,
per the C++20 "addressable function" rule. In earlier language modes, a
compatibility warning is produced but the address can still be taken.
The same infrastructure is extended to the existing MSVC builtin
`__GetExceptionInfo`, which is now only recognized in namespace `std`
like it always should have been.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D123345
A randomized structure needs to use a designated or default initializer.
Using a non-designated initializer will result in values being assigned
to the wrong fields.
Differential Revision: https://reviews.llvm.org/D123763
The target profile option(/T) decide the shader model when compile hlsl.
The format is shaderKind_major_minor like ps_6_1.
The shader model is saved as llvm::Triple is clang/llvm like
dxil-unknown-shadermodel6.1-hull.
The main job to support the option is translating ps_6_1 into
shadermodel6.1-pixel.
That is done inside tryParseProfile at HLSL.cpp.
To integrate the option into clang Driver, a new DriverMode DxcMode is
created. When DxcMode is enabled, OSType for TargetTriple will be
forced into Triple::ShaderModel. And new ToolChain HLSLToolChain will
be created when OSType is Triple::ShaderModel.
In HLSLToolChain, ComputeEffectiveClangTriple is overridden to call
tryParseProfile when targetProfile option is set.
To make test work, Fo option is added and .hlsl is added for active
-xhlsl.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D122865
Patch by: Xiang Li <python3kgae@outlook.com>
In D123649, I got the formula for getFlexibleArrayInitChars slightly
wrong: the flexible array elements can be contained in the tail padding
of the struct. Fix the formula to account for that.
With the fixed formula, we run into another issue: in some cases, we
were emitting extra padding for flexible arrray initializers. Fix
CGExprConstant so it uses a packed struct when necessary, to avoid this
extra padding.
Differential Revision: https://reviews.llvm.org/D123826
Given the declaration:
typedef void func_t(unsigned);
__attribute__((noreturn)) func_t func;
we would incorrectly determine that `func` had no prototype because the
`noreturn` attribute would convert the underlying type directly into a
FunctionProtoType, but the declarator for `func` itself was not one for
a function with a prototype. This adds an additional check for when the
declarator is a type representation for a function with a prototype.
When emitting a "conflicting types" warning for a function declaration,
it's more clear to diagnose the previous declaration specifically as
being a builtin if it one.
Implement P2036R3.
Captured variables by copy (explicitely or not), are deduced
correctly at the point we know whether the lambda is mutable,
and ill-formed before that.
Up until now, the entire lambda declaration up to the start of the body would be parsed in the parent scope, such that capture would not be available to look up.
The scoping is changed to have an outer lambda scope, followed by the lambda prototype and body.
The lambda scope is necessary because there may be a template scope between the start of the lambda (to which we want to attach the captured variable) and the prototype scope.
We also need to introduce a declaration context to attach the captured variable to (and several parts of clang assume captures are handled from the call operator context), before we know the type of the call operator.
The order of operations is as follow:
* Parse the init capture in the lambda's parent scope
* Introduce a lambda scope
* Create the lambda class and call operator
* Add the init captures to the call operator context and the lambda scope. But the variables are not capured yet (because we don't know their type).
Instead, explicit captures are stored in a temporary map that conserves the order of capture (for the purpose of having a stable order in the ast dumps).
* A flag is set on LambdaScopeInfo to indicate that we have not yet injected the captures.
* The parameters are parsed (in the parent context, as lambda mangling recurses in the parent context, we couldn't mangle a lambda that is attached to the context of a lambda whose type is not yet known).
* The lambda qualifiers are parsed, at this point We can switch (for the second time) inside the lambda context, unset the flag indicating that we have not parsed the lambda qualifiers,
record the lambda is mutable and capture the explicit variables.
* We can parse the rest of the lambda type, transform the lambda and call operator's types and also transform the call operator to a template function decl where necessary.
At this point, both captures and parameters can be injected in the body's scope. When trying to capture an implicit variable, if we are before the qualifiers of a lambda, we need to remember that the variables are still in the parent's context (rather than in the call operator's).
Reviewed By: aaron.ballman, #clang-language-wg, ChuanqiXu
Differential Revision: https://reviews.llvm.org/D119136
Clang should no longer incorrectly diagnose a variable declaration inside of a
lambda expression that shares the name of a variable in a containing
if/while/for/switch init statement as a redeclaration.
After this patch, clang is supposed to accept code below:
void foo() {
for (int x = [] { int x = 0; return x; }(); ;) ;
}
Fixes https://github.com/llvm/llvm-project/issues/54913
Differential Revision: https://reviews.llvm.org/D123840
This catches places where a function without a prototype is
accidentally used, potentially passing an incorrect number of
arguments, and is a follow-up to the work done in
https://reviews.llvm.org/D122895 and described in the RFC
(https://discourse.llvm.org/t/rfc-enabling-wstrict-prototypes-by-default-in-c).
The diagnostic is grouped under the new -Wdeprecated-non-prototypes
warning group and is enabled by default.
The diagnostic is disabled if the function being called was implicitly
declared (the user already gets an on-by-default warning about the
creation of the implicit function declaration, so no need to warn them
twice on the same line). Additionally, the diagnostic is disabled if
the declaration of the function without a prototype was in a location
where the user explicitly disabled deprecation warnings for functions
without prototypes (this allows the provider of the API a way to
disable the diagnostic at call sites because the lack of prototype is
intentional).
The HIP headers want to use this to swap the implementation of the
function, rather than relying on backend expansion of the generic
atomic instruction.
Fixes: SWDEV-332998
This adds a PS5-specific ToolChain subclass, which defines some basic
PS5 driver behavior. Future patches will add more target-specific
driver behavior.
Flexible array initialization is a C/C++ extension implemented in many
compilers to allow initializing the flexible array tail of a struct type
that contains a flexible array. In clang, this is currently restricted
to C. But this construct is used in the Microsoft SDK headers, so I'd
like to extend it to C++.
For now, this doesn't handle dynamic initialization; probably not hard
to implement, but it's extra code, and I don't think it's necessary for
the expected uses. And we explicitly fail out of constant evaluation.
I've added some additional code to assert that initializers have the
correct size, with or without flexible array init. This might catch
issues unrelated to flexible array init.
Differential Revision: https://reviews.llvm.org/D123649
HLSL does not support pointers or references. This change generates
errors in sema for generating pointer, and reference types as well as
common operators (address-of, dereference, arrow), which are used with
pointers and are unsupported in HLSL.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D123167
Remove the checking of the generated asm, as that's already tested
elsewhere, and adjust some tests that were expecting the wrong
intrinsic to be generated.
Differential Revision: https://reviews.llvm.org/D118259
HLSL has a language feature called Semantics which get attached to
declarations like attributes and are used in a variety of ways.
One example of semantic use is here with the `SV_GroupIndex` semantic
which, when applied to an input for a compute shader is pre-populated
by the driver with a flattened thread index.
Differential Revision: https://reviews.llvm.org/D122699
# Conflicts:
# clang/include/clang/Basic/Attr.td
# clang/include/clang/Basic/AttrDocs.td
This patch enables shift operators on SVE vector types, as well as
supporting vector-scalar shift operations.
Shifts by a scalar that is wider than the contained type in the
vector are permitted but as in the C standard if the value is larger
than the width of the type the behavior is undefined.
Differential Revision: https://reviews.llvm.org/D123303
Undefined behaviour is just passed on to extract_element when the
index is out of bounds. Subscript on svbool_t is not allowed as
this doesn't really have meaningful semantics.
Differential Revision: https://reviews.llvm.org/D122732
This is the template version of https://reviews.llvm.org/D114251.
This patch introduces a new template name kind (UsingTemplateName). The
UsingTemplateName stores the found using-shadow decl (and underlying
template can be retrieved from the using-shadow decl). With the new
template name, we can be able to find the using decl that a template
typeloc (e.g. TemplateSpecializationTypeLoc) found its underlying template,
which is useful for tooling use cases (include cleaner etc).
This patch merely focuses on adding the node to the AST.
Next steps:
- support using-decl in qualified template name;
- update the clangd and other tools to use this new node;
- add ast matchers for matching different kinds of template names;
Differential Revision: https://reviews.llvm.org/D123127