The code has a few sequence that looked like:
Ops.push_back(Ops[0]);
Ops.erase(Ops.begin());
And are equivalent to:
std::rotate(Ops.begin(), Ops.begin() + 1, Ops.end());
The latter has the advantage of never reallocating the vector, which
would be a bug in the original code as push_back would read from the
memory it deallocated.
Since RPATH initialization was disabled for the runtime libraries to
avoid overwriting RPATH unconditionally we need to explicity set up it
for the Win to Arm Linux cross builds.
See some details here: https://reviews.llvm.org/D91099
Make it required. Since it's a module pass, optnone won't test it, so
extend the clang test to also use opt-bisect now that it's supported.
14/16 check-dfsan tests failed with NPM enabled, now all pass.
Reviewed By: leonardchan
Differential Revision: https://reviews.llvm.org/D91385
Clean up the logic for `err_fe_{pch,module,ast}_file_modified` to use a
`select` like other ASTReader diagnostics. There should be no
functionality change here, just a cleanup.
Differential Revision: https://reviews.llvm.org/D91367
This RUN line was added as a temporary measure to undo the damage done
by D79655. Everyone's build system should be fine by now.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D91448
All three diagnostics have a select between "PCH", "module", and "AST"
in the text. The most generic of these is "AST", so rename them from
`err_module_...` to `err_ast_...`.
Differential Revision: https://reviews.llvm.org/D91436
This logic seems easier to follow without the `Error()` helper, and
checking `DiagnosticsEngine::isDiagnosticInFlight` just once up front.
Differential Revision: https://reviews.llvm.org/D91366
Just skip (non-bitfield) zero-sized fields, like we do with empty bases.
The class->struct conversion in the test is because -std=c++20 else deletes some default methods
due to non-accessible base dtors otherwise.
As a side-effect of writing the test, I discovered that D76801 did an ABI breaking change of sorts
for Objective-C's @encode. But it's been in for a while, so I'm not sure if we want to row back on
that or now.
Fixes PR48048.
Differential Revision: https://reviews.llvm.org/D90622
`TD->getTemplatedDecl()` might not be a DeclContext variant, which can
trigger an assertion inside `isa<>`.
Differential Revision: https://reviews.llvm.org/D91380
Merge existing marhsalling info kinds and add some primitives to
express flag options that contribute to a bitfield.
Depends on D82574
Original patch by Daniel Grumberg.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D82860
The new option `-fproc-stat-info=<file>` can be used to generate report
about used memory and execution tile of each stage of compilation.
Documentation for this option can be found in `UserManual.rst`. The
option can be used in parallel builds.
Differential Revision: https://reviews.llvm.org/D78903
- If an aggregate argument is indirectly accessed within kernels, direct
passing results in unpromotable `alloca`, which degrade performance
significantly. InferAddrSpace pass is enhanced in
[D91121](https://reviews.llvm.org/D91121) to take the assumption that
generic pointers loaded from the constant memory could be regarded
global ones. The need for the coercion on aggregate arguments is
mitigated.
Differential Revision: https://reviews.llvm.org/D89980
This was motivated by changes to llvm's `not --crash` disabling symbolization
but I ended up removing `not` from the script entirely because it
returns differently depending on whether clang "crashes" or exits for some
other reason. The script had to choose between calling `not` and `not --crash`
and sometimes it was wrong.
The script also now disables symbolization when we don't read the stack
trace because symbolizing is kind of slow.
Differential Revision: https://reviews.llvm.org/D91372
VE needs to support integrated assembler and "nas". This "nas"
doesn't recognize ".sigaddr" pseudo mnemonics, so need to disable
it. This patch disable it on VE by default. Also add a regression
test for that.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D91350
This removes lots of duplicated code which was necessary before
https://reviews.llvm.org/D89158.
Now we can use PassBuilder::runRegisteredEPCallbacks().
This is mostly sanitizers.
There is likely more that can be done to simplify, but let's start with this.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D90870
Need to check if there are map types for the components before trying to
access them when trying to modify type mappings for combined partial
mappings.
Differential Revision: https://reviews.llvm.org/D91370
to synchronize with tools/clang-format/git-clang-format
tra: Keeping them in sync does have a minor benefit of not raising a question why the two maps are different.
Differential Revision: https://reviews.llvm.org/D91034
There are two factory functions used to create a semantic attribute,
Create() and CreateImplicit(). CreateImplicit() does not need to
specify the source range of the attribute since it's an implicitly-
generated attribute. The same logic does not apply to Create(), so
this removes the default argument from those declarations to avoid
accidentally creating a semantic attribute without source location
information.
As of D80952 we are disabling strict floating point on all hosts except
those that are explicitly listed as supported. Use of strict floating point
on other hosts requires use of the -fexperimental-strict-floating-point
flag. This is to avoid bugs like "https://bugs.llvm.org/show_bug.cgi?id=45329"
(which has an incorrect link in the previous review).
In the review for D80952 I was asked to mark the -fexperimental option as
a MarshallingInfoFlag. This patch does exactly that.
Differential Revision: https://reviews.llvm.org/D88987
We want to allow using MMA on P10 CPU only. This patch prevents the use of MMA
with the -mmma option on P9 CPUs and earlier.
Differential Revision: https://reviews.llvm.org/D91200
For dllexported default constructors with default arguments, we export
default constructor closures which pass in the default args. (See D8331
for a good explanation.)
For templates, that means those default args must be instantiated even
if the function isn't called. That is done by the
InstantiateDefaultCtorDefaultArgs() function, but it wasn't done for
explicit specializations, causing asserts (see bug).
Differential revision: https://reviews.llvm.org/D91089
We have option -mabi=ieeelongdouble to set current long double to
IEEEquad semantics. Like what GCC does, we need to define
__LONG_DOUBLE_IEEE128__ macro in this case, and __LONG_DOUBLE_IBM128__
if using PPCDoubleDouble.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D90208
Non-mechanical changes:
- Added FIXME to StringLiteral to cover multi-token string literals.
- LiteralExpression::getLiteralToken() is gone. (It was never called)
This is because we don't codegen methods in Alternatives
It's conceptually suspect if we consider multi-token string literals, though.
Differential Revision: https://reviews.llvm.org/D91277
Some targets may add required passes via
TargetMachine::registerPassBuilderCallbacks(). We need to run those even
under -O0. As an example, BPFTargetMachine adds
BPFAbstractMemberAccessPass, a required pass.
This also allows us to clean up BackendUtil.cpp (and out-of-tree Rust
usage of the NPM) by allowing us to share added passes like coroutines
and sanitizers between -O0 and other optimization levels.
Since callbacks may end up not adding passes, we need to check if the
pass managers are empty before adding them, so PassManager now has an
isEmpty() function. For example, polly adds callbacks but doesn't always
add passes in those callbacks, so this is necessary to keep
-debug-pass-manager tests' output from changing depending on if polly is
enabled or not.
Tests are a continuation of those added in
https://reviews.llvm.org/D89083.
Reviewed By: asbirlea, Meinersbur
Differential Revision: https://reviews.llvm.org/D89158
except where they are necessary to disambiguate the target.
This substantially improves diagnostics from the standard library,
which are otherwise full of `::__1::` noise.
arguments of types by default.
This somewhat improves the worst-case printing of types like
std::string, std::vector, etc., where many irrelevant default arguments
can be included in the type as printed if we've lost the type sugar.
Avoid requiring an actual MemoryBuffer in ComputePreambleBounds, when
a MemoryBufferRef will do just fine.
Differential Revision: https://reviews.llvm.org/D90890
The Itanium CXX ABI grammer has been extended to support parameterized
vendor extended types [1].
This patch updates Clang's mangling for matrix types to use the new
extension.
[1] b359d28971
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D91253
Support a vector register constraint in inline asm of clang.
Add a regression test also.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D91251
This reverts commit 09248a5d25.
Some builds are broken. I suspect a `static constexpr` in a class missing a
definition out of class (required pre-c++17).
Noticed while fixing unused prefix warnings - there isn't actually any diff in the loop unrolled ir between old/new pass managers any more, so the broken checks were superfluous
Similar to the previous patch, this doesn't convert *all* the classes that
could be converted. It also doesn't enforce any new invariants etc.
It *does* include some data we don't use yet: specific token types that are
allowed and optional/required status of sequence items. (Similar to Dmitri's
prototype). I think these are easier to add as we go than later, and serve
a useful documentation purpose.
Differential Revision: https://reviews.llvm.org/D90659
`-###` has always been supported in the new flang driver. This patch
merely makes sure that it's included when printing the help screen (i.e.
`flang-new -help`).
Merge existing marhsalling info kinds and add some primitives to
express flag options that contribute to a bitfield.
Depends on D82574
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D82860
This defines two node archetypes with trivial class definitions:
- Alternatives: the generated abstract classes are trivial as all
functionality is via LLVM RTTI
- Unconstrained: this is a placeholder, I think all of these are going to be
Lists but today they have no special accessors etc, so we just say
"could contain anything", and migrate them one-by-one to Sequence later.
Compared to Dmitri's prototype, Nodes.td looks more like a class hierarchy and
less like a grammar. (E.g. variants list the Alternatives parent rather than
vice versa).
The main reasons for this:
- the hierarchy is an important part of the API we want direct control over.
- e.g. we may introduce abstract bases like "loop" that the grammar doesn't
care about in order to model is-a concepts that might make refactorings
more expressive. This is less natural in a grammar-like idiom.
- e.g. we're likely to have to model some alternatives as variants and others
as class hierarchies, the choice will probably be based on natural is-a
relationships.
- it reduces the cognitive load of switching from editing *.td to working with
code that uses the generated classes
Differential Revision: https://reviews.llvm.org/D90543
In JavaScript breaking before a `@tag` in a comment puts it on a new line, and
machinery that parses these comments will fail to understand such comments.
This adapts clang-format to not break before `@`. Similar functionality exists
for not breaking before `{`.
Reviewed By: mprobst
Differential Revision: https://reviews.llvm.org/D91078
Since these are scoped enumerators, they have to be prefixed by DeclaratorContext, so lets remove Context from the name, and return some characters to the multiverse.
Patch was reviewed here: https://reviews.llvm.org/D91011
Thank you to aaron, bruno, wyatt and barry for indulging me.
This header has long lacked a standard multiple inclusion guard
like other headers have, for no apparent reason. The GCC header
of the same name likewise lacks one up through release 10.1, but
trunk GCC (release 11, and perhaps future 10.x) has fixed it
(see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96238).
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D91226
This enables a method sending an autorelease message to an object and
returning the object in MRR to avoid adding the object to an autorelease
pool if a call to objc_retainAutoreleasedReturnValue in the caller
function accepts the hand off of the retain count.
rdar://problem/50678052
Differential Revision: https://reviews.llvm.org/D91111
The original bug was discovered in T75057860. Clang front-end emits an AST that looks like this for an co_await expression:
|- ExprWithCleanups
|- -CoawaitExpr
|- -MaterializeTemporaryExpr ... Awaiter
...
|- -CXXMemberCallExpr ... .await_ready
...
|- -CallExpr ... __builtin_coro_resume
...
|- -CXXMemberCallExpr ... .await_resume
...
ExprWithCleanups is responsible for cleaning up (including calling dtors) for the temporaries generated in the wrapping expression).
In the above structure, the __builtin_coro_resume part (which corresponds to the code for the suspend case in the co_await with symmetric transfer), the pseudocode looks like this:
__builtin_coro_resume(
awaiter.await_suspend(
from_address(
__builtin_coro_frame())).address());
One of the temporaries that's generated as part of this code is the coroutine handle returned from awaiter.await_suspend() call. The call returns a handle which is a prvalue (since it's a returned value on the fly). In order to call the address() method on it, it needs to be converted into an xvalue. Hence a materialized temp is created to hold it. This temp will need to be cleaned up eventually. Now, since all cleanups happen at the end of the entire co_await expression, which is after the <coro.suspend> suspension point, the compiler will think that such a temp needs to live across suspensions, and need to be put on the coroutine frame, even though it's only used temporarily just to call address() method.
Such a phenomena not only unnecessarily increases the frame size, but can lead to ASAN failures, if the coroutine was already destroyed as part of the await_suspend() call. This is because if the coroutine was already destroyed, the frame no longer exists, and one can not store anything into it. But if the temporary object is considered to need to live on the frame, it will be stored into the frame after await_suspend() returns.
A fix attempt was done in https://reviews.llvm.org/D87470. Unfortunately it is incorrect. The reason is that cleanups in Clang works more like linearly than nested. There is one current state indicating whether it needs cleanup, and an ExprWithCleanups resets that state. This means that an ExprWithCleanups must be capable of cleaning up all temporaries created in the wrapping expression, otherwise there will be dangling temporaries cleaned up at the wrong place.
I eventually found a walk-around (https://reviews.llvm.org/D89066) that doesn't break any existing tests while fixing the issue. But it targets the final co_await only. If we ever have a co_await that's not on the final awaiter and the frame gets destroyed after suspend, we are in trouble. Hence we need a proper fix.
This patch is the proper fix. It does the folllowing things to fully resolve the issue:
1. The AST has to be generated in the order according to their nesting relationship. We should not generate AST out of order because then the code generator would incorrectly track the state of temporaries and when a cleanup is needed. So the code in buildCoawaitCalls is reorganized so that we will be generating the AST for each coawait member call in order along with their child AST.
2. await_ready() call is wrapped with an ExprWithCleanups so that temporaries in it gets cleaned up as early as possible to avoid living across suspension.
3. await_suspend() call is wrapped with an ExprWithCleanups if it's not a symmetric transfer. In the case of a symmetric transfer, in order to maintain the musttail call contract, the ExprWithCleanups is wraaped before the resume call.
4. In the end, we mark again that it needs a cleanup, so that the entire CoawaitExpr will be wrapped with a ExprWithCleanups which will clean up the Awaiter object associated with the await expression.
Differential Revision: https://reviews.llvm.org/D90990
In C++11 standard, to become implicitly movable, the expression in return
statement should be a non-volatile automatic object. CWG1579 changed the rule
to require that the expression only needs to be an automatic object. C++14
standard and C++17 standard kept this rule unchanged. C++20 standard changed
the rule back to require the expression be a non-volatile automatic object.
This should be a typo in standards, and VD should be non-volatile.
Differential Revision: https://reviews.llvm.org/D88295
The value_type is a const pointer, which makes the iteator non-copyable.
Before the patch, the normal usage like below was illegal:
```
auto It = lookupresult.begin();
...
It = lookupresult.end(); // the copy is not allowed.
```
Differential Revision: https://reviews.llvm.org/D91158
mangling support for non-type template parameters of class type and
template parameter objects.
The Itanium side of this follows the approach I proposed in
https://github.com/itanium-cxx-abi/cxx-abi/issues/47 on 2020-09-06.
The MSVC side of this was determined empirically by observing MSVC's
output.
Differential Revision: https://reviews.llvm.org/D89998
Excluded folders in scan build is turned to absolute path before
comapre to 'file' in cdb. 'file' in cdb might be a path relative
to 'directory', so we need to turn it to absolute path before
comparison.
Patch by Yu Shan
Differential Revision: https://reviews.llvm.org/D90362
This patch adds three intrinsics compatible to x86's SSE 4.1 on PowerPC
target, with tests:
- _mm_insert_epi8
- _mm_insert_epi32
- _mm_insert_epi64
The intrinsics implementation is contributed by Paul Clarke.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D89242
For consistency with the IRBuilder, OpenMPIRBuilder has method names starting with 'Create'. However, the LLVM coding style has methods names starting with lower case letters, as all other OpenMPIRBuilder already methods do. The clang-tidy configuration used by Phabricator also warns about the naming violation, adding noise to the reviews.
This patch renames all `OpenMPIRBuilder::CreateXYZ` methods to `OpenMPIRBuilder::createXYZ`, and updates all in-tree callers.
I tested check-llvm, check-clang, check-mlir and check-flang to ensure that I did not miss a caller.
Reviewed By: mehdi_amini, fghanim
Differential Revision: https://reviews.llvm.org/D91109
This ports a number of OpenCL and fast-math flags for floating point
over to the new marshalling infrastructure.
As part of this, `Opt{In,Out}FFlag` were enhanced to allow other flags to
imply them, via `DefaultAnyOf<>`. For example:
```
defm signed_zeros : OptOutFFlag<"signed-zeros", ...,
"LangOpts->NoSignedZero",
DefaultAnyOf<[cl_no_signed_zeros, menable_unsafe_fp_math]>>;
```
defines `-fsigned-zeros` (`false`) and `-fno-signed-zeros` (`true`)
linked to the keypath `LangOpts->NoSignedZero`, defaulting to `false`,
but set to `true` implicitly if one of `-cl-no-signed-zeros` or
`-menable-unsafe-fp-math` is on.
Note that the initial patch was written Daniel Grumberg.
Differential Revision: https://reviews.llvm.org/D82756
So far, only used to generate Kind and implement classof().
My plan is to have this general-purpose Nodes.inc in the style of AST
DeclNodes.inc etc, and additionally a special-purpose backend generating
the actual class definitions. But baby steps...
Differential Revision: https://reviews.llvm.org/D90540
Follows through on c4cb3b10dc8c50e46c9fb1b7ae95e3c3c94975d3's FIXME
dating back to 2015. Anyone using this should migrate to
InMemoryFileSystem and/or ClangTool::mapVirtualFile.
Differential Revision: https://reviews.llvm.org/D90885
D86841 had an error where for statements with no conditional were
required to make progress. This is not true, this patch removes that
line, and adds regression tests.
Differential Revision: https://reviews.llvm.org/D91075
In C++ with -Werror=comment, multiline comments are not allowed.
clang-format could accidentally introduce multiline comments when reflowing.
This adapts clang-format to not introduce multiline comments by not allowing a
break after `\`. Note that this does not apply to comment lines that already are
multiline comments, such as comments in macros.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D90949
Add support for the Neoverse V1 CPU to the ARM and AArch64 backends.
This is based on patches from Mark Murray and Victor Campos.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D90765
This reverts commit b1878b4641. This does
fix the test but it means that ac73b73c16 is not implemented
correctly. Reverting for now, and will be reverting the commit that
causes this to fail.
From C11 and C++11 onwards, a forward-progress requirement has been
introduced for both languages. In the case of C, loops with non-constant
conditionals that do not have any observable side-effects (as defined by
6.8.5p6) can be assumed by the implementation to terminate, and in the
case of C++, this assumption extends to all functions. The clang
frontend will emit the `mustprogress` function attribute for C++
functions (D86233, D85393, D86841) and emit the loop metadata
`llvm.loop.mustprogress` for every loop in C11 or later that has a
non-constant conditional.
This patch modifies LoopDeletion so that only loops with
the `llvm.loop.mustprogress` metadata or loops contained in functions
that are required to make progress (`mustprogress` or `willreturn`) are
checked for observable side-effects. If these loops do not have an
observable side-effect, then we delete them.
Loops without observable side-effects that do not satisfy the above
conditions will not be deleted.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D86844
In order not to modify the `tgt_target_data_update` information but still be
able to pass the extra information for non-contiguous map item (offset,
count, and stride for each dimension), this patch overload `arg` when
the maptype is set as `OMP_MAP_DESCRIPTOR`. The origin `arg` is for
passing the pointer information, however, the overloaded `arg` is an
array of descriptor_dim:
struct descriptor_dim {
int64_t offset;
int64_t count;
int64_t stride
};
and the array size is the same as dimension size. In addition, since we
have count and stride information in descriptor_dim, we can replace/overload the
`arg_size` parameter by using dimension size.
For supporting `stride` in array section, we use a dummy dimension in
descriptor to store the unit size. The formula for counting the stride
in dimension D_n: `unit size * (D_0 * D_1 ... * D_n-1) * D_n.stride`.
Demonstrate how it works:
```
double arr[3][4][5];
D0: { offset = 0, count = 1, stride = 8 } // offset, count, dimension size always be 0, 1, 1 for this extra dimension, stride is the unit size
D1: { offset = 0, count = 2, stride = 8 * 1 * 2 = 16 } // stride = unit size * (product of dimension size of D0) * D1.stride = 4 * 1 * 2 = 8
D2: { offset = 2, count = 2, stride = 8 * (1 * 5) * 1 = 40 } // stride = unit size * (product of dimension size of D0, D1) * D2.stride = 4 * 5 * 1 = 20
D3: { offset = 0, count = 2, stride = 8 * (1 * 5 * 4) * 2 = 320 } // stride = unit size * (product of dimension size of D0, D1, D2) * D3.stride = 4 * 25 * 2 = 200
// X here means we need to offload this data, therefore, runtime will transfer
// data from offset 80, 96, 120, 136, 400, 416, 440, 456
// Runtime patch: https://reviews.llvm.org/D82245
// OOOOO OOOOO OOOOO
// OOOOO OOOOO OOOOO
// XOXOO OOOOO XOXOO
// XOXOO OOOOO XOXOO
```
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D84192
The strictfp metadata was added to the casting AST nodes in D85960, but
we aren't using that metadata yet. This patch adds that support.
In order to avoid lots of ad-hoc passing around of the strictfp bits I
updated the IRBuilder when moving from a function that has the Expr* to a
function that lacks it. I believe we should switch to this pattern to keep
the strictfp support from being overly invasive.
For the purpose of testing that we're picking up the right metadata, I
also made my tests use a pragma to make the AST's strictfp metadata not
match the global strictfp metadata. This exposes issues that we need to
deal with in subsequent patches, and I believe this is the right method
for most all of our clang strictfp tests.
Differential Revision: https://reviews.llvm.org/D88913
Previously we just shadowed the original implementation with a virtual
declaration - which is really bugprone in a long run.
This patch marks `CallEvent::getOriginExpr` virtual to let subclasses
override it's behavior.
At the same time, I checked all virtual functions of this class hierarchy
to make sure we don't suffer from this elsewhere.
Removes redundant declarations of `virtual` if `override` is already present.
In theory, this patch is a functional change, but no tests were broken.
I suspect that there were no meaningful changes in behavior in the
subclasses compared to the shadowed `CallEvent::getOriginExpr`.
That being said, I had a hard time coming up with unit-tests covering this.
Motivation: https://reviews.llvm.org/D74735#2370909
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D90754
Continue to dump and match on explicit template specializations, but
omit explicit instantiation declarations and definitions.
Differential Revision: https://reviews.llvm.org/D90763
This test was added in 7f38812d5b
and all the other tests make use of the COMMONIR check. So I think
this was left in by mistake for this particular test.
Reviewed By: kpn
Differential Revision: https://reviews.llvm.org/D90921
In JavaScript some @tags can be followed by `{`, and machinery that parses
these comments will fail to understand the comment if followed by a line break.
clang-format already handles this case by not breaking before `{` in comments.
However this was not working in cases when the column limit falls within `@tag`
or between `@tag` and `{`. This adapts clang-format for this case.
Reviewed By: mprobst
Differential Revision: https://reviews.llvm.org/D90908
Tremont microarchitecture only has GFNI(SSE) version, not AVX and
AVX512 version. This patch is to avoid compiling fail on Windows when
using -march=tremont to invoke one of GFNI(SSE) intrinsic.
Differential Revision: https://reviews.llvm.org/D90822
Disable the test on Windows, which should've been obvious as being
needed. The differences in diff implementations and line-endings make
this test difficult to execute on Windows.
The behavior is controlled by the `-fprebuilt-implicit-modules` option, and
allows searching for implicit modules in the prebuilt module cache paths.
The current command-line options for prebuilt modules do not allow to easily
maintain and use multiple versions of modules. Both the producer and users of
prebuilt modules are required to know the relationships between compilation
options and module file paths. Using a particular version of a prebuilt module
requires passing a particular option on the command line (e.g.
`-fmodule-file=[<name>=]<file>` or `-fprebuilt-module-path=<directory>`).
However the compiler already knows how to distinguish and automatically locate
implicit modules. Hence this proposal to introduce the
`-fprebuilt-implicit-modules` option. When set, it enables searching for
implicit modules in the prebuilt module paths (specified via
`-fprebuilt-module-path`). To not modify existing behavior, this search takes
place after the standard search for prebuilt modules. If not
Here is a workflow illustrating how both the producer and consumer of prebuilt
modules would need to know what versions of prebuilt modules are available and
where they are located.
clang -cc1 -x c modulemap -fmodules -emit-module -fmodule-name=foo -fmodules-cache-path=prebuilt_modules_v1 <config 1 options>
clang -cc1 -x c modulemap -fmodules -emit-module -fmodule-name=foo -fmodules-cache-path=prebuilt_modules_v2 <config 2 options>
clang -cc1 -x c modulemap -fmodules -emit-module -fmodule-name=foo -fmodules-cache-path=prebuilt_modules_v3 <config 3 options>
clang -cc1 -x c use.c -fmodules fmodule-map-file=modulemap -fprebuilt-module-path=prebuilt_modules_v1 <config 1 options>
clang -cc1 -x c use.c -fmodules fmodule-map-file=modulemap <non-prebuilt config options>
With prebuilt implicit modules, the producer can generate prebuilt modules as
usual, all in the same output directory. The same mechanisms as for implicit
modules take care of incorporating hashes in the path to distinguish between
module versions.
Note that we do not specify the output module filename, so `-o` implicit modules are generated in the cache path `prebuilt_modules`.
clang -cc1 -x c modulemap -fmodules -emit-module -fmodule-name=foo -fmodules-cache-path=prebuilt_modules <config 1 options>
clang -cc1 -x c modulemap -fmodules -emit-module -fmodule-name=foo -fmodules-cache-path=prebuilt_modules <config 2 options>
clang -cc1 -x c modulemap -fmodules -emit-module -fmodule-name=foo -fmodules-cache-path=prebuilt_modules <config 3 options>
The user can now simply enable prebuilt implicit modules and point to the
prebuilt modules cache. No need to "parse" command-line options to decide
what prebuilt modules (paths) to use.
clang -cc1 -x c use.c -fmodules fmodule-map-file=modulemap -fprebuilt-module-path=prebuilt_modules -fprebuilt-implicit-modules <config 1 options>
clang -cc1 -x c use.c -fmodules fmodule-map-file=modulemap -fprebuilt-module-path=prebuilt_modules -fprebuilt-implicit-modules <non-prebuilt config options>
This is for example particularly useful in a use-case where compilation is
expensive, and the configurations expected to be used are predictable, but not
controlled by the producer of prebuilt modules. Modules for the set of
predictable configurations can be prebuilt, and using them does not require
"parsing" the configuration (command-line options).
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D68997
For the language C++ the keyword __unaligned (a Microsoft extension) had no effect on pointers.
The reason, why there was a difference between C and C++ for the keyword __unaligned:
For C, the Method getAsCXXREcordDecl() returns nullptr. That guarantees that hasUnaligned() is called.
If the language is C++, it is not guaranteed, that hasUnaligend() is called and evaluated.
Here are some links:
The Bug: https://bugs.llvm.org/show_bug.cgi?id=47499
Thread on the cfe-dev mailing list: http://lists.llvm.org/pipermail/cfe-dev/2020-September/066783.html
Diff, that introduced the check hasUnaligned() in getNaturalTypeAlignment(): https://reviews.llvm.org/D30166
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D90630
This adds the skeleton of the YAML Compiler for APINotes. This change
only adds the YAML IO model for the API Notes along with a new testing
tool `apinotes-test` which can be used to verify that can round trip the
YAML content properly. It provides the basis for the future work which
will add a binary serialization and deserialization format to the data
model.
This is based on the code contributed by Apple at
https://github.com/llvm/llvm-project-staging/tree/staging/swift/apinotes.
Differential Revision: https://reviews.llvm.org/D88859
Reviewed By: Gabor Marton
As described here:
https://devblogs.microsoft.com/oldnewthing/20150220-00/?p=44623
In order to allow Lambdas to be used with traditional Win32 APIs, they
emit a conversion function for (what Raymond Chen claims is all) a
number of the calling conventions. Through experimentation, we
discovered that the list isn't quite 'all'.
This patch implements this by taking the list of conversions that MSVC
emits (across 'all' architectures, I don't see any CCs on ARM), then
emits them if they are supported by the current target.
However, we also add 3 other options (which may be duplicates):
free-function, member-function, and operator() calling conventions. We
do this because we have an extension where we generate both free and
member for these cases so th at people specifying a calling convention
on the lambda will have the expected behavior when specifying one of
those two.
MSVC doesn't seem to permit specifying calling-convention on lambdas,
but we do, so we need to make sure those are emitted as well. We do this
so that clang-only conventions are supported if the user specifies them.
Differential Revision: https://reviews.llvm.org/D90634
Clang offers a `-f[no]-show-column` flag for hiding the column numbers when
printing diagnostics but there is no option for doing the same with line
numbers.
In LLDB having this option would be useful, as LLDB sometimes only knows the
file name for a SourceLocation and just assigns it the dummy line/column `1:1`.
These fake line/column numbers are confusing to the user and LLDB should be able
to tell clang to hide *both* the column and the line number when rendering text
diagnostics.
This patch adds a flag for also hiding the line numbers. It's not exposed via
the command line flags as it's most likely not very useful for any user and can
lead to ambiguous output when the user decides to only hide either the line or
the column number (where `file:1: ...` could now refer to both line 1 or column
1 depending on the compiler flags). LLDB can just access the DiagnosticOptions
directly when constructing its internal Clang instance.
The effect doesn't apply to Vi/MSVC style diagnostics because it's not defined
how these diagnostic styles would show an omitted line number (MSVC doesn't have
such an option and Vi's line mode is theory only supporting line numbers if I
understand it correctly).
Reviewed By: thakis, MaskRay
Differential Revision: https://reviews.llvm.org/D83038
Rationale:
Children of a syntax tree had forward links only, because there was no
need for reverse links.
This need appeared when we started mutating the syntax tree.
On a forward list, to remove a target node in O(1) we need a pointer to the node before the target. If we don't have this "before" pointer, we have to find it, and that requires O(n).
So in order to remove a syntax node from a tree, we would similarly need to find the node before to then remove. This is both not ergonomic nor does it have a good complexity.
Differential Revision: https://reviews.llvm.org/D90240
Some targets may add required passes via
TargetMachine::registerPassBuilderCallbacks(). We need to run those even
under -O0. As an example, BPFTargetMachine adds
BPFAbstractMemberAccessPass, a required pass.
This also allows us to clean up BackendUtil.cpp (and out-of-tree Rust
usage of the NPM) by allowing us to share added passes like coroutines
and sanitizers between -O0 and other optimization levels.
Tests are a continuation of those added in
https://reviews.llvm.org/D89083.
In order to prevent TargetMachines from adding unnecessary optimization
passes at -O0, TargetMachine::registerPassBuilderCallbacks() will be
changed to take an OptimizationLevel, but that will be done separately.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D89158
Summary:
Object of type `Compilation` now can keep a callback that is called
after each execution of `Command`. This must simplify adaptation of
clang in custom distributions and allow facilities like collection of
execution statistics.
Reviewers: rsmith, rjmccall, Eugene.Zelenko
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78899
Since C++11, the C++ standard has a forward progress guarantee
[intro.progress], so all such functions must have the `mustprogress`
requirement. In addition, from C11 and onwards, loops without a non-zero
constant conditional or no conditional are also required to make
progress (C11 6.8.5p6). This patch implements these attribute deductions
so they can be used by the optimization passes.
Differential Revision: https://reviews.llvm.org/D86841
The use of the new types introduced for PowerPC MMA instructions needs to be restricted.
We add a PowerPC function checking that the given type is valid in a context in which we don't allow MMA types.
This function is called from various places in Sema where we want to prevent the use of these types.
Differential Revision: https://reviews.llvm.org/D82035
Change `Module::Umbrella` from a `const void *` to a `PointerUnion` of
`FileEntry` and `DirectoryEntry`. We can drop the `HasUmbrellaDir` bit
(since `PointerUnion` includes that).
This change makes it safer to update to `FileEntryRef` and
`DirectoryEntryRef` in a future patch.
Differential Revision: https://reviews.llvm.org/D90481
Move `DirectoryEntry` and `DirectoryEntryRef` into their own header,
similar to the creation of FileEntry.h. No functionality change here,
just preparing to include it in more places to allow wider adoption of
`DirectoryEntryRef` without requiring all of `FileManager.h`.
Differential Revision: https://reviews.llvm.org/D90478
Clang now asserts for the below case:
```
void clang::CodeGen::CGOpenMPRuntime::createOffloadEntriesAndInfoMetadata(): Assertion `std::get<0>(E) && "All ordered entries must exist!"' failed.
```
The reason why Clang hit the assert is because in
`emitTargetDataCalls`, both `BeginThenGen` and `BeginElseGen` call
`registerTargetRegionEntryInfo` and try to register the Entry in
OffloadEntriesTargetRegion with same key. If changing the expression in
if clause to any constant expression, then the assert disappear. (https://godbolt.org/z/TW7haj)
The assert itself is to avoid
user from accessing elements out of bound inside `OrderedEntries` in
`createOffloadEntriesAndInfoMetadata`.
In this patch, I add a check in `registerTargetRegionEntryInfo` to avoid
register the target region more than once.
A test case that triggers assert: https://godbolt.org/z/4cnGW8
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D90704
This patch enhances the clang driver to link the runtime profile
library on AIX when the -fprofile-generate option is used.
Reviewed By: phosek
Differentail Revision: https://reviews.llvm.org/D90641
Since glibc has supported math library functions conforming IEEE 128-bit
floating point types on some platform (like ppc64le), we can fix clang's
math builtins missing this type.
Reviewed By: bkramer
Differential Revision: https://reviews.llvm.org/D90593
Add MMA builtin decoding. These builtins use the new PowerPC-specific types __vector_pair and __vector_quad.
So to avoid pervasive changes, we use custom type descriptors and custom decoding for these builtins.
We also use custom code generation to expand builtin calls with pointers to simpler intrinsic calls with non-pointer types.
Differential Revision: https://reviews.llvm.org/D81748
415f7ee883 had a silly typo introduced when I inlined some
code into a loop from its own function.
Original commit message:
For PlayStation we offer source code compatibility with
Microsoft's dllimport/export annotations; however, our file
format is based on ELF.
To support this we translate from DLL storage class to ELF
visibility at the end of codegen in Clang.
Other toolchains have used similar strategies (e.g. see the
documentation for this ARM toolchain:
https://developer.arm.com/documentation/dui0530/i/migrating-from-rvct-v3-1-to-rvct-v4-0/changes-to-symbol-visibility-between-rvct-v3-1-and-rvct-v4-0)
This patch adds the ability to perform this translation. Options
are provided to support customizing the mapping behaviour.
Differential Revision: https://reviews.llvm.org/D89970
Similar to libcxx implementation of cmath function
overloads, use type promotion templates to determine
return types of multi-argument math functions.
Fixes: SWDEV-256825
Reviewed By: tra, yaxunl
Differential Revision: https://reviews.llvm.org/D90409
The path produced by getMainExecutable() may not be the right one when the files are installed in
a symlinked tree and when the real location of llvm-objdump is in a different directory.
Given that clang-offload-bundler is invoked by clang, the driver already does the job figuring out
the right path (e.g. it pays attention to -no-canonical-prefixes).
Offload bundler should use it, instead of trying to figure out the path on its
own.
Differential Revision: https://reviews.llvm.org/D90436
This differentiates the Ryzen 4000/4300/4500/4700 series APUs that were
previously included in gfx909.
Differential Revision: https://reviews.llvm.org/D90419
Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d
Made the isExpandedFromMacro matcher work on Stmt's, TypeLocs and Decls in line with the other macro expansion matchers.
Also tweaked it to take a `std::string` instead of a `StringRef`.
This prevents potential use-after-free bugs if the matcher is created with a string thats destroyed before the matcher finishes matching.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D90303
Currently for explicit template function instantiation in CUDA/HIP device
compilation clang emits instantiated kernel with external linkage
and instantiated device function with internal linkage.
This is fine for -fno-gpu-rdc since there is only one TU.
However this causes duplicate symbols for kernels for -fgpu-rdc if
the same instantiation happen in multiple TU. Or missing symbols
if a device function calls an explicitly instantiated template function
in a different TU.
To make explicit template function instantiation work for
-fgpu-rdc we need to follow the C++ linkage paradigm, i.e.
use weak_odr linkage.
Differential Revision: https://reviews.llvm.org/D90311
e00629f777 "[scan-build] Fix clang++ pathname" had
removed the -MAJOR.MINOR suffix, but since presumably LLVM 7 the suffix is only
-MAJOR, so ClangCXX (i.e., the CLANG_CXX environment variable passed to
clang/tools/scan-build/libexec/ccc-analyzer) now contained a non-existing
/path/to/clang-12++ (which apparently went largely unnoticed as
clang/tools/scan-build/libexec/ccc-analyzer falls back to just 'clang++' if the
executable denoted by CLANG_CXX does not exist).
For the new clang/test/Analysis/scan-build/cxx-name.test to be effective,
%scan-build must now take care to pass the clang executable's resolved pathname
(i.e., ending in .../clang-MAJOR rather than just .../clang) to --use-analyzer.
Differential Revision: https://reviews.llvm.org/D89481
This test was removed in 5963e028e7 because it failed on cores where
support of constrained intrinsics was limited. Now this test is enabled
only on x86.
The __isPlatformVersionAtLeast routine is an implementation of `if (@available)` check
that uses the _availability_version_check API on Darwin that's supported on
macOS 10.15, iOS 13, tvOS 13 and watchOS 6.
Differential Revision: https://reviews.llvm.org/D90367
415f7ee883 had LIT test failures on any build where the clang executable
was not called "clang". I have adjusted the LIT CHECKs to remove the
binary name to fix this.
Original commit message:
For PlayStation we offer source code compatibility with
Microsoft's dllimport/export annotations; however, our file
format is based on ELF.
To support this we translate from DLL storage class to ELF
visibility at the end of codegen in Clang.
Other toolchains have used similar strategies (e.g. see the
documentation for this ARM toolchain:
https://developer.arm.com/documentation/dui0530/i/migrating-from-rvct-v3-1-to-rvct-v4-0/changes-to-symbol-visibility-between-rvct-v3-1-and-rvct-v4-0)
This patch adds the ability to perform this translation. Options
are provided to support customizing the mapping behaviour.
Differential Revision: https://reviews.llvm.org/D89970
Summary:
IgnoreUnlessSpelledInSource mode should ignore these because they are
not written in the source. This matters for example when trying to
replace types or values which are templated. The new test in
TransformerTest.cpp in this commit demonstrates the problem.
In existing matcher code, users can write
`unless(isInTemplateInstantiation())` or `unless(isInstantiated())` (the
user must know which to use). The point of the
TK_IgnoreUnlessSpelledInSource mode is to allow the novice to avoid such
details. This patch changes the IgnoreUnlessSpelledInSource mode to
skip over implicit template instantiations.
This patch does not change the TK_AsIs mode.
Note: An obvious attempt at an alternative implementation would simply
change the shouldVisitTemplateInstantiations() in ASTMatchFinder.cpp to
return something conditional on the operational TraversalKind. That
does not work because shouldVisitTemplateInstantiations() is called
before a possible top-level traverse() matcher changes the operational
TraversalKind.
Reviewers: sammccall, aaron.ballman, gribozavr2, ymandel, klimek
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80961
Change `Module::ASTFile` and `ModuleFile::File` to use
`Optional<FileEntryRef>` instead of `const FileEntry *`. One of many
steps toward removing `FileEntry::getName`.
Differential Revision: https://reviews.llvm.org/D89836
TokenAnnotator::splitPenalty() was always returning 0 for opening parens if
AlignAfterOpenBracket was set to BAS_DontAlign, so the preferred point for
line breaking was always after the open paren (and was ignoring
PenaltyBreakBeforeFirstCallParameter). This change restricts the zero
penalty to the AllowAllArgumentsOnNextLine case. This results in improved
formatting for FreeBSD where we set AllowAllArgumentsOnNextLine: false
and a high value for PenaltyBreakBeforeFirstCallParameter to avoid breaking
after the open paren.
Before:
```
functionCall(
paramA, paramB, paramC);
void functionDecl(
int A, int B, int C)
```
After:
```
functionCall(paramA, paramB,
paramC);
void functionDecl(int A, int B,
int C)
```
Reviewed By: MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D90246
It's only used by Version.cpp, so limit the definition to just that one
file instead of making all of Clang recompile if you change CLANG_VENDOR.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D90516
For PS4 development we support dllimport/export annotations in
source code. This patch enables the dllimport/export attributes
on PS4 by adding a new function to query the triple for whether
dllimport/export are used and using that function to decide
whether these attributes are supported. This replaces the current
method of checking if the target is Windows.
This means we can drop the use of "TargetArch" in the .td file
(which is an improvement as dllimport/export support isn't really
a function of the architecture).
I have included a simple codgen test to show that the attributes
are accepted and have an effect on codegen for PS4. I have also
enabled the DLLExportStaticLocal and DLLImportStaticLocal
attributes, which we support downstream. However, I am unable to
write a test for these attributes until other patches for PS4
dllimport/export handling land upstream. Whilst writing this
patch I noticed that, as these attributes are internal, they do
not need to be target specific (when these attributes are added
internally in Clang the target specific checks have already been
run); however, I think leaving them target specific is fine
because it isn't harmful and they "really are" target specific
even if that has no functional impact.
Differential Revision: https://reviews.llvm.org/D90442
This patch implements the first frontend action for the Flang parser (i.e.
Fortran::parser). This action runs the preprocessor and is invoked with the
`-E` flag. (i.e. `flang-new -E <input-file>). The generated output is printed
to either stdout or the output file (specified with `-` or `-o <output-file>`).
Note that currently there is no mechanism to map options for the
frontend driver (i.e. Fortran::frontend::FrontendOptions) to options for
the parser (i.e. Fortran::parser::Options). Instead,
Frotran::parser::options are hard-coded to:
```
std::vector<std::string> searchDirectories{"."s};
searchDirectories = searchDirectories;
isFixedForm = false;
_encoding(Fortran::parser::Encoding::UTF_8);
```
These default settings are compatible with the current Flang driver. Further
work is required in order for CompilerInvocation to read and map
clang::driver::options to Fortran::parser::options.
Co-authored-by: Andrzej Warzynski <andrzej.warzynski@arm.com>
Differential Revision: https://reviews.llvm.org/D88381
Similar to -fprofile-generate=, add -fmemory-profile= which takes a
directory path. This is passed down to LLVM via a new module flag
metadata. LLVM in turn provides this name to the runtime via the new
__memprof_profile_filename variable.
Additionally, always pass a default filename (in $cwd if a directory
name is not specified vi the = form of the option). This is also
consistent with the behavior of the PGO instrumentation. Since the
memory profiles will generally be fairly large, it doesn't make sense to
dump them to stderr. Also, importantly, the memory profiles will
eventually be dumped in a compact binary format, which is another reason
why it does not make sense to send these to stderr by default.
Change the existing memprof tests to specify log_path=stderr when that
was being relied on.
Depends on D89086.
Differential Revision: https://reviews.llvm.org/D89087
Adds a diagnostic when the user annotates an `if constexpr` with a
likelihood attribute. The `if constexpr` statement is evaluated at compile
time so the attribute has no effect. Annotating the accompanied `else`
with a likelihood attribute has the same effect as annotating a generic
statement. Since the attribute there is most likely not intended, a
diagnostic will be issued. Since the attributes can't conflict, the
"conflict" won't be diagnosed for an `if constexpr`.
Differential Revision: https://reviews.llvm.org/D90336
The attribute has no effect on a do statement since the path of execution
will always include its substatement.
It adds a diagnostic when the attribute is used on an infinite while loop
since the codegen omits the branch here. Since the likelihood attributes
have no effect on a do statement no diagnostic will be issued for
do [[unlikely]] {...} while(0);
Differential Revision: https://reviews.llvm.org/D89899
Pragma 'clang fp' is extended to support a new option, 'exceptions'. It
allows to specify floating point exception behavior more flexibly.
Differential Revision: https://reviews.llvm.org/D89849
This patch mainly made the following changes:
1. Support AVX-VNNI instructions;
2. Introduce ExplicitVEXPrefix flag so that vpdpbusd/vpdpbusds/vpdpbusds/vpdpbusds instructions only use vex-encoding when user explicity add {vex} prefix.
Differential Revision: https://reviews.llvm.org/D89105
friends.
When determining whether a function has a template instantiation
pattern, look for other declarations of that function that were
instantiated from a friend function definition, rather than assuming
that checking for member specialization information on whichever
declaration name lookup found will be sufficient.
This is to enable --allow-unused-duplicates=false. This prefix appears
to be outdated and intentionally unused.
Differential Revision: https://reviews.llvm.org/D90430
I'm having trouble with bots today. Remove more cargo-cult from the
generic version of `OptionalStorage` that is failing on some (fewer)
bots (but not locally).
I expect this will fix:
```
FAILED: tools/clang/unittests/Basic/CMakeFiles/BasicTests.dir/FileEntryTest.cpp.o
/usr/bin/c++ -DGTEST_HAS_RTTI=0 -DGTEST_HAS_TR1_TUPLE=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools/clang/unittests/Basic -I/home/buildbots/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/clang/unittests/Basic -I/home/buildbots/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/clang/include -Itools/clang/include -Iinclude -I/home/buildbots/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/include -I/home/buildbots/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/utils/unittest/googletest/include -I/home/buildbots/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/utils/unittest/googlemock/include -fPIC -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-class-memaccess -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wno-comment -fdiagnostics-color -ffunction-sections -fdata-sections -fno-common -Woverloaded-virtual -fno-strict-aliasing -O3 -Wno-variadic-macros -fno-exceptions -fno-rtti -UNDEBUG -Wno-suggest-override -std=c++14 -MD -MT tools/clang/unittests/Basic/CMakeFiles/BasicTests.dir/FileEntryTest.cpp.o -MF tools/clang/unittests/Basic/CMakeFiles/BasicTests.dir/FileEntryTest.cpp.o.d -o tools/clang/unittests/Basic/CMakeFiles/BasicTests.dir/FileEntryTest.cpp.o -c /home/buildbots/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/clang/unittests/Basic/FileEntryTest.cpp
/home/buildbots/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/clang/unittests/Basic/FileEntryTest.cpp: In member function ‘virtual void {anonymous}::FileEntryTest_OptionalFileEntryRefDegradesToFileEntryPtr_Test::TestBody()’:
/home/buildbots/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/clang/unittests/Basic/FileEntryTest.cpp:48:46: error: use of deleted function ‘constexpr clang::OptionalFileEntryRefDegradesToFileEntryPtr::OptionalFileEntryRefDegradesToFileEntryPtr()’
OptionalFileEntryRefDegradesToFileEntryPtr M0;
^~
In file included from /home/buildbots/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/clang/unittests/Basic/FileEntryTest.cpp:9:
/home/buildbots/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/clang/include/clang/Basic/FileEntry.h:250:3: note: ‘constexpr clang::OptionalFileEntryRefDegradesToFileEntryPtr::OptionalFileEntryRefDegradesToFileEntryPtr() noexcept’ is implicitly deleted because its exception-specification does not match the implicit exception-specification ‘’
OptionalFileEntryRefDegradesToFileEntryPtr() noexcept = default;
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/buildbots/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/clang/unittests/Basic/FileEntryTest.cpp: In member function ‘virtual void {anonymous}::FileEntryTest_equals_Test::TestBody()’:
/home/buildbots/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/clang/unittests/Basic/FileEntryTest.cpp:82:46: error: use of deleted function ‘constexpr clang::OptionalFileEntryRefDegradesToFileEntryPtr::OptionalFileEntryRefDegradesToFileEntryPtr()’
OptionalFileEntryRefDegradesToFileEntryPtr M0;
^~
```
In a kernel (or in general in environments where bit 55 of the address
is set) the shadow base needs to point to the end of the shadow region,
not the beginning. Bit 55 needs to be sign extended into bits 52-63
of the shadow base offset, otherwise we end up loading from an invalid
address. We can do this by using SBFX instead of UBFX.
Using SBFX should have no effect in the userspace case where bit 55
of the address is clear so we do so unconditionally. I don't think
we need a ABI version bump for this (but one will come anyway when
we switch to x20 for the shadow base register).
Differential Revision: https://reviews.llvm.org/D90424
From a code size perspective it turns out to be better to use a
callee-saved register to pass the shadow base. For non-leaf functions
it avoids the need to reload the shadow base into x9 after each
function call, at the cost of an additional stack slot to save the
caller's x20. But with x9 there is also a stack size cost, either
as a result of copying x9 to a callee-saved register across calls or
by spilling it to stack, so for the non-leaf functions the change to
stack usage is largely neutral.
It is also code size (and stack size) neutral for many leaf functions.
Although they now need to save/restore x20 this can typically be
combined via LDP/STP into the x30 save/restore. In the case where
the function needs callee-saved registers or stack spills we end up
needing, on average, 8 more bytes of stack and 1 more instruction
but given the improvements to other functions this seems like the
right tradeoff.
Unfortunately we cannot change the register for the v1 (non short
granules) check because the runtime assumes that the shadow base
register is stored in x9, so the v1 check still uses x9.
Aside from that there is no change to the ABI because the choice
of shadow base register is a contract between the caller and the
outlined check function, both of which are compiler generated. We do
need to rename the v2 check functions though because the functions
are deduplicated based on their names, not on their contents, and we
need to make sure that when object files from old and new compilers
are linked together we don't end up with a function that uses x9
calling an outlined check that uses x20 or vice versa.
With this change code size of /system/lib64/*.so in an Android build
with HWASan goes from 200066976 bytes to 194085912 bytes, or a 3%
decrease.
Differential Revision: https://reviews.llvm.org/D90422