regions.
If the lastprivate conditional is passed as shared in inner region, we
shall check if it was ever changed and use this updated value after exit
from the inner region as an update value.
Summary:
This patch hooks the `Preprocessor` trough `BugReporter` to the
`CheckerContext` so the checkers could look for macro definitions.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D69731
Summary:
This patch uses the new `DynamicSize.cpp` to serve dynamic information.
Previously it was static and probably imprecise data.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D69599
Summary:
This patch introduces a placeholder for representing the dynamic size of
regions. It also moves the `getExtent()` method of `SubRegions` to the
`MemRegionManager` as `getStaticSize()`.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D69540
The constrained fcmp intrinsics don't allow the TRUE/FALSE predicates.
Using them will assert. To workaround this I'm emitting the old X86 specific intrinsics that were never removed from the backend when we switched to using fcmp in IR. We have no way to mark them as being strict, but that's true of all target specific intrinsics so doesn't seem like we need to solve that here.
I've also added support for selecting between signaling and quiet.
Still need to support SAE which will require using a target specific
intrinsic. Also need to fix masking to not use an AND instruction
after the compare.
Differential Revision: https://reviews.llvm.org/D72906
Iterator modeling depends on container modeling,
but not vice versa. This enables the possibility
to arrange these two modeling checkers into
separate layers.
There are several advantages for doing this: the
first one is that this way we can keep the
respective modeling checkers moderately simple
and small. Furthermore, this enables creation of
checkers on container operations which only
depend on the container modeling. Thus iterator
modeling can be disabled together with the
iterator checkers if they are not needed.
Since many container operations also affect
iterators, container modeling also uses the
iterator library: it creates iterator positions
upon calling the `begin()` or `end()` method of
a containter (but propagation of the abstract
position is left to the iterator modeling),
shifts or invalidates iterators according to the
rules upon calling a container modifier and
rebinds the iterator to a new container upon
`std::move()`.
Iterator modeling propagates the abstract
iterator position, handles the relations between
iterator positions and models iterator
operations such as increments and decrements.
Differential Revision: https://reviews.llvm.org/D73547
Summary:
Currently, sqdmulh_lane and friends from the ACLE (implemented in arm_neon.h),
are represented in LLVM IR as a (by vector) sqdmulh and a vector of (repeated)
indices, like so:
%shuffle = shufflevector <4 x i16> %v, <4 x i16> undef, <4 x i32> <i32 3, i32 3, i32 3, i32 3>
%vqdmulh2.i = tail call <4 x i16> @llvm.aarch64.neon.sqdmulh.v4i16(<4 x i16> %a, <4 x i16> %shuffle)
When %v's values are known, the shufflevector is optimized away and we are no
longer able to select the lane variant of sqdmulh in the backend.
This defeats a (hand-coded) optimization that packs several constants into a
single vector and uses the lane intrinsics to reduce register pressure and
trade-off materialising several constants for a single vector load from the
constant pool, like so:
int16x8_t v = {2,3,4,5,6,7,8,9};
a = vqdmulh_laneq_s16(a, v, 0);
b = vqdmulh_laneq_s16(b, v, 1);
c = vqdmulh_laneq_s16(c, v, 2);
d = vqdmulh_laneq_s16(d, v, 3);
[...]
In one microbenchmark from libjpeg-turbo this accounts for a 2.5% to 4%
performance difference.
We could teach the compiler to recover the lane variants, but this would likely
require its own pass. (Alternatively, "volatile" could be used on the constants
vector, but this is a bit ugly.)
This patch instead implements the following LLVM IR intrinsics for AArch64 to
maintain the original structure through IR optmization and into instruction
selection:
- sqdmulh_lane
- sqdmulh_laneq
- sqrdmulh_lane
- sqrdmulh_laneq.
These 'lane' variants need an additional register class. The second argument
must be in the lower half of the 64-bit NEON register file, but only when
operating on i16 elements.
Note that the existing patterns for shufflevector and sqdmulh into sqdmulh_lane
(etc.) remain, so code that does not rely on NEON intrinsics to generate these
instructions is not affected.
This patch also changes clang to emit these IR intrinsics for the corresponding
NEON intrinsics (AArch64 only).
Reviewers: SjoerdMeijer, dmgreen, t.p.northover, rovka, rengolin, efriedma
Reviewed By: efriedma
Subscribers: kristof.beyls, hiraditya, jdoerfert, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D71469
Summary:
This change adds an option to insert trailing commas into container
literals. For example, in JavaScript:
const x = [
a,
b,
^~~~~ inserted if missing.
]
This is implemented as a seperate post-processing pass after formatting
(because formatting might change whether the container literal does or
does not wrap). This keeps the code relatively simple and orthogonal,
though it has the notable drawback that the newly inserted comma is not
taken into account for formatting decisions (e.g. it might exceed the 80
char limit). To avoid exceeding the ColumnLimit, a comma is only
inserted if it fits into the limit.
Trailing comma insertion conceptually conflicts with argument
bin-packing: inserting a comma disables bin-packing, so we cannot do
both. clang-format rejects FormatStyle configurations that do both with
this change.
Reviewers: krasimir, MyDeveloperDay
Subscribers: cfe-commits
Tags: #clang
include Clang builtin headers even with -nostdinc
Some projects use -nostdinc, but need to access some intrinsics files when building specific files.
The new -ibuiltininc flag lets them use this flag when compiling these files to ensure they can
find Clang's builtin headers.
The use of -nobuiltininc after the -ibuiltininc flag does not add the builtin header
search path to the list of header search paths.
Differential Revision: https://reviews.llvm.org/D73500
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
When using -fno-builtin[-<name>], we don't attach the IR attributes to
function definitions with no Decl, like the ones created through
`CreateGlobalInitOrDestructFunction`.
This results in projects using -fno-builtin or -ffreestanding to start
seeing symbols like _memset_pattern16.
The fix changes the behavior to always add the attribute if LangOptions
requests it.
Differential Revision: https://reviews.llvm.org/D73495
This makes clang somewhat forward-compatible with new CUDA releases
without having to patch it for every minor release without adding
any new function.
If an unknown version is found, clang issues a warning (can be disabled
with -Wno-cuda-unknown-version) and assumes that it has detected
the latest known version. CUDA releases are usually supersets
of older ones feature-wise, so it should be sufficient to keep
released clang versions working with minor CUDA updates without
having to upgrade clang, too.
Differential Revision: https://reviews.llvm.org/D73231
Performing a cast where the result is ignored caused Clang to crash when
performing codegen for the conversion:
_Complex int a;
void fn1() { (_Complex double) a; }
This patch addresses the crash by not trying to emit the scalar conversions,
causing it to be a noop. Fixes PR44624.
This CL adds clang declarations of built-in functions for AMDGPU MFMA intrinsics and instructions.
OpenCL tests for new built-ins are included.
Differential Revision: https://reviews.llvm.org/D72723
Currently device lib path set by environment variable HIP_DEVICE_LIB_PATH
does not work due to extra "-L" added to each entry.
This patch fixes that by allowing argument name to be empty in addDirectoryList.
Differential Revision: https://reviews.llvm.org/D73299
Summary:
This addresses issues raised in https://bugs.llvm.org/show_bug.cgi?id=44454.
There are outstanding issues with multi-line verbatim strings in C# that will be addressed in a follow-up PR.
Reviewers: krasimir, MyDeveloperDay
Reviewed By: krasimir, MyDeveloperDay
Subscribers: MyDeveloperDay
Tags: #clang-format
Differential Revision: https://reviews.llvm.org/D73492
Summary:
The 'z' length modifier, signalling that an integer format specifier
takes a `size_t` sized integer, is only supported by the C library of
MSVC 2015 and later. Earlier versions don't recognize the 'z' at all,
and respond to `printf("%zu", x)` by just printing "zu".
So, if the MS compatibility version is set to a value earlier than
MSVC2015, it's useful to warn about 'z' modifiers in printf format
strings we check.
Reviewers: aaron.ballman, lebedev.ri, rnk, majnemer, zturner
Reviewed By: aaron.ballman
Subscribers: amccarth, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73457
This required some fixes to the generic code for two issues:
1. -fsanitize=safe-stack is default on x86_64-fuchsia and is *not* incompatible with -fsanitize=leak on Fuchisa
2. -fsanitize=leak and other static-only runtimes must not be omitted under -shared-libsan (which is the default on Fuchsia)
Patch By: mcgrathr
Differential Revision: https://reviews.llvm.org/D73397
With LLVM_APPEND_VC_REV=NO, Modules/merge-lifetime-extended-temporary.cpp
would fail if it ran before a0f50d7316 (which changed
the serialization format) and then after, for these reasons:
1. With LLVM_APPEND_VC_REV=NO, the module hash before and after the
change was the same.
2. Modules/merge-lifetime-extended-temporary.cpp is the only test
we have that uses -fmodule-cache-path=%t that
a) actually writes to the cache path
b) doesn't do `rm -rf %t` at the top of the test
So the old run would write a module file, and then the new run would
try to load it, but the serialized format changed.
Do several things to fix this:
1. Include clang::serialization::VERSION_MAJOR/VERSION_MINOR in
the module hash, so that when the AST format changes (...and
we remember to bump these), we use a different module cache dir.
2. Bump VERSION_MAJOR, since a0f50d7316 changed the
on-disk format in a way that a gch file written before that change
can't be read after that change.
3. Add `rm -rf %t` to all tests that pass -fmodule-cache-path=%t.
This is unnecessary from a correctness PoV after 1 and 2,
but makes it so that we don't amass many cache dirs over time.
(Arguably, it also makes it so that the test suite doesn't catch
when we change the serialization format but don't bump
clang::serialization::VERSION_MAJOR/VERSION_MINOR; oh well.)
Differential Revision: https://reviews.llvm.org/D73202
whether a call is to a builtin.
We already had a general mechanism to do this but for some reason
weren't using it. In passing, check for the other unary operators that
can intervene in a reasonably-direct function call (we already handled
'&' but missed '*' and '+').
These are mostly trivial additions as both of them are reusing existing
PThreadLockChecker logic. I only needed to add the list of functions to
check and do some plumbing to make sure that we display the right
checker name in the diagnostic.
Differential Revision: https://reviews.llvm.org/D73376
regions with reductions, lastprivates or linears clauses.
If the lastprivate conditional variable is updated in inner parallel
region with reduction, lastprivate or linear clause, the value must be
considred as a candidate for lastprivate conditional. Also, tracking in
inner parallel regions is not required.
Summary:
Instead of checking the range manually, changed the checker to use assumeInclusiveRangeDual instead.
This patch was part of D28955.
Reviewers: NoQ
Reviewed By: NoQ
Subscribers: ddcc, xazax.hun, baloghadamsoftware, szepet, a.sidorin, Szelethus, donat.nagy, dkrupp, Charusso, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73062
This restores 59733525d3 (D71913), along
with bot fix 19c76989bb.
The bot failure should be fixed by D73418, committed as
af954e441a.
I also added a fix for non-x86 bot failures by requiring x86 in new test
lld/test/ELF/lto/devirt_vcall_vis_public.ll.
Summary:
clang-format currently always wraps the body of non-empty arrow
functions:
const x = () => {
z();
};
This change implements support for the `AllowShortLambdasOnASingleLine`
style options, controlling the indent style for arrow function bodies
that have one or fewer statements. SLS_All puts all on a single line,
SLS_Inline only arrow functions used in an inline position.
const x = () => { z(); };
Multi-statement arrow functions continue to be wrapped. Function
expressions (`a = function() {}`) and function/method declarations are
unaffected as well.
Reviewers: krasimir
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73335
See
https://docs.google.com/document/d/1xMkTZMKx9llnMPgso0jrx3ankI4cv60xeZ0y4ksf4wc/preview
for background discussion.
This adds a warning, flags and pragmas to limit the number of
pre-processor tokens either at a certain point in a translation unit, or
overall.
The idea is that this would allow projects to limit the size of certain
widely included headers, or for translation units overall, as a way to
insert backstops for header bloat and prevent compile-time regressions.
Differential revision: https://reviews.llvm.org/D72703
Summary:
The MicrosoftCXXABI uses a separate mechanism for emitting vtable
type metadata, and thus didn't pick up the change from D71907
to emit the vcall_visibility metadata under -fwhole-program-vtables.
I believe this is the cause of a Windows bot failure when I committed
follow on change D71913 that required a revert. The failure occurred
in a CFI test that was expecting to not abort because it expected a
devirtualization to occur, and without the necessary vcall_visibility
metadata we would not get devirtualization.
Note in the equivalent code in CodeGenModule::EmitVTableTypeMetadata
(used by the ItaniumCXXABI), we also emit the vcall_visibility metadata
when Virtual Function Elimination is enabled. Since I am not as familiar
with the details of that optimization, I have marked that as a TODO and
am only inserting under -fwhole-program-vtables.
Reviewers: evgeny777
Subscribers: Prazek, ostannard, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73418
Children of InitListExpr are traversed twice by RAV, so this code
populates a vector to represent the possibly-multiple parents (in
reality in this situation the parent is the same and is therefore
de-duplicated).
The code for parsing of type-constraints in compound-requirements was not adapted for the new TryAnnotateTypeConstraint which
caused compound-requirements with scope specifiers to ignore them.
Also add regression tests for scope specifiers in type-constraints in more contexts.
Summary:
This allows ASTContext to store only one parent map, rather than storing
an entire parent map for each traversal mode used.
This is therefore a partial revert of commit 0a717d5b (Make it possible
control matcher traversal kind with ASTContext, 2019-12-06).
Reviewers: aaron.ballman, rsmith, rnk
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73388
This change added two new attributes, rounding mode and exception
behavior to the structure FPOptions. These attributes allow more
flexible treatment of specific floating point environment than it is
provided by #pragma STDC FENV_ACCESS.
Differential Revision: https://reviews.llvm.org/D65994
We would previously try to evaluate atomic constraints of non-template functions as-is,
and since they are now unevaluated at first, this would cause incorrect evaluation (bugs #44657, #44656).
Substitute into atomic constraints of non-template functions as we would atomic constraints
of template functions, in order to rebuild the expressions in a constant-evaluated context.
Implement a pessimistic evaluator of the minimal required size for a buffer
based on the format string, and couple that with the fortified version to emit a
warning when the buffer size is lower than the lower bound computed from the
format string.
Differential Revision: https://reviews.llvm.org/D71566
When used as qualified names, pseudo-destructors are always named as if
they were members of the type, never as members of the namespace
enclosing the type.
Reduces compile time of SemaDeclAttr.cpp down to 28s from 50s. The new
TU does a few RecursiveASTVisitor instantiations, so it takes 30s.
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D73385
Summary:
As discussed in http://lists.llvm.org/pipermail/cfe-dev/2019-October/063459.html
the overflow of the souce locations (limited to 2^31 chars) can generate all sorts of
weird things (bogus warnings, hangs, crashes, miscompilation and correct compilation).
In debug mode this assert would fail. So it might be a good start, as in PR42301,
to detect the failure and exit with a proper error message.
Reviewers: rsmith, thakis, miyuki
Reviewed By: miyuki
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D70183
Since 2009 (in r63846) we've been `#define`-ing OBJC_NEW_PROPERTIES all
the time on Darwin, but this macro only makes sense for `-x objective-c`
and `-x objective-c++`. Restrict it to those cases (for which there is
already separate logic).
https://reviews.llvm.org/D72970
rdar://problem/10050342
Summary:
This adds the reference types target feature. This does not enable any
more functionality in LLVM/clang for now, but this is necessary to embed
the info in the target features section, which is used by Binaryen and
Emscripten. It turned out that after D69832 `-fwasm-exceptions` crashed
because we didn't have the reference types target feature.
Reviewers: tlively
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D73320
The only part of ASTContext.h that requires most AST types to be
complete is the parent map. Nothing in Clang proper uses the ParentMap,
so split it out into its own class. Make ASTContext own the
ParentMapContext so there is still a one-to-one relationship.
After this change, 562 fewer files depend on ASTTypeTraits.h, and 66
fewer depend on TypeLoc.h:
$ diff -u deps-before.txt deps-after.txt | \
grep '^[-+] ' | sort | uniq -c | sort -nr | less
562 - ../clang/include/clang/AST/ASTTypeTraits.h
340 + ../clang/include/clang/AST/ParentMapContext.h
66 - ../clang/include/clang/AST/TypeLocNodes.def
66 - ../clang/include/clang/AST/TypeLoc.h
15 - ../clang/include/clang/AST/TemplateBase.h
...
I computed deps-before.txt and deps-after.txt with `ninja -t deps`.
This removes a common and key dependency on TemplateBase.h and
TypeLoc.h.
This also has the effect of breaking the ParentMap RecursiveASTVisitor
instantiation into its own file, which roughly halves the compilation
time of ASTContext.cpp (29.75s -> 17.66s). The new file takes 13.8s to
compile.
I left behind forwarding methods for getParents(), but clients will need
to include a new header to make them work:
#include "clang/AST/ParentMapContext.h"
I noticed that this parent map functionality is unfortunately duplicated
in ParentMap.h, which only works for Stmt nodes.
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D71313
This adds basic support for the Swift calling convention with WebAssembly
targets.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D71823
clang-armv7-linux-build-cache bot is complaining about undefined
references to these variables during linking, so by explicitly
placing them in some TU we should be able to fix that.
There is llvm::Value::MaximumAlignment, which is numerically
equivalent to these constants, but we can't use it directly
because we can't include llvm IR headers in clang Sema.
So instead, copy-paste the constant, and fixup the places to use it.
This was initially reviewed in https://reviews.llvm.org/D72998
Summary:
For `__builtin_assume_aligned()`, we do validate that the alignment
is not greater than `536870912` (D68824), but we don't do that for
`__attribute__((assume_aligned(N)))` attribute.
I suspect we should.
This was initially committed in a4cfb15d15
but reverted in 210f0882c9 due to
suspicious bot failures.
Reviewers: erichkeane, aaron.ballman, hfinkel, rsmith, jdoerfert
Reviewed By: erichkeane
Subscribers: cfe-commits, llvm-commits
Tags: #llvm, #clang
Differential Revision: https://reviews.llvm.org/D72994
Summary:
This is a follow up on https://reviews.llvm.org/D71473#inline-647262.
There's a caveat here that `Align(1)` relies on the compiler understanding of `Log2_64` implementation to produce good code. One could use `Align()` as a replacement but I believe it is less clear that the alignment is one in that case.
Reviewers: xbolva00, courbet, bollu
Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, Jim, kerbowa, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D73099
Summary:
`alloc_align` attribute takes parameter number, not the alignment itself,
so given **just** the attribute/function declaration we can't do any
sanity checking for said alignment.
However, at call site, given the actual `Expr` that is passed
into that parameter, we //might// be able to evaluate said `Expr`
as Integer Constant Expression, and perform the sanity checks.
But since there is no requirement for that argument to be an immediate,
we may fail, and that's okay.
However if we did evaluate, we should enforce the same constraints
as with `__builtin_assume_aligned()`/`__attribute__((assume_aligned(imm)))`:
said alignment is a power of two, and is not greater than our magic threshold
This was initially committed in c2a9061ac5
but reverted in 00756b1823 because of
suspicious bot failures.
Reviewers: erichkeane, aaron.ballman, hfinkel, rsmith, jdoerfert
Reviewed By: erichkeane
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72996
Summary:
An heuristic targetting `x && x->foo` was targed overly broadly and caused the
last T&& to be treated as a binary operator.
Reviewers: hokein
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73334
Summary:
This was reverted in e45fcfc3aa due to
libcxx build failure. This revision addresses that case.
Original commit message:
This patch will provide support for auto return type for the C++ member
functions.
This patch includes clang side implementation of this feature.
Patch by: Awanish Pandey <Awanish.Pandey@amd.com>
Reviewers: dblaikie, aprantl, shafik, alok, SouraVX, jini.susan.george
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D70524
If we do, then the property_list_t length is wrong
and class_getProperty gets very sad.
Signed-off-by: Pierre Habouzit <phabouzit@apple.com>
Radar-Id: rdar://problem/58804805
Differential Revision: https://reviews.llvm.org/D73219
Sending a message to `self` when it is const and within a class method
is safe because we know that `self` is the Class itself.
We can only relax this warning in ARC.
Signed-off-by: Pierre Habouzit <phabouzit@apple.com>
Radar-Id: rdar://problem/58581965
Differential Revision: https://reviews.llvm.org/D72747
Differential Revision: https://reviews.llvm.org/D70268
This is a recommit of f978ea4983 with a fix for the PowerPC failure.
The issue was that:
* `CompilerInstance::ExecuteAction` calls
`getTarget().adjust(getLangOpts());`.
* `PPCTargetInfo::adjust` changes `LangOptions::HasAltivec`.
* This happens after the first few calls to `getModuleHash`.
There’s even a FIXME saying:
```
// FIXME: We shouldn't need to do this, the target should be immutable once
// created. This complexity should be lifted elsewhere.
```
This only showed up on PowerPC because it's one of the few targets that
almost always changes a hashed langopt.
I looked into addressing the fixme, but that would be a much larger
change, and it's not the only thing that happens in `ExecuteAction` that
can change the module context hash. Instead I changed the code to not
call `getModuleHash` until after it has been modified in `ExecuteAction`.
As per P1980R0, constraint expressions are unevaluated operands, and their constituent atomic
constraints only become constant evaluated during satisfaction checking.
Change the evaluation context during parsing and instantiation of constraints to unevaluated.
Summary:
Third part in series to support Safe Whole Program Devirtualization
Enablement, see RFC here:
http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html
This patch adds type test metadata under -fwhole-program-vtables,
even for classes without hidden visibility. It then changes WPD to skip
devirtualization for a virtual function call when any of the compatible
vtables has public vcall visibility.
Additionally, internal LLVM options as well as lld and gold-plugin
options are added which enable upgrading all public vcall visibility
to linkage unit (hidden) visibility during LTO. This enables the more
aggressive WPD to kick in based on LTO time knowledge of the visibility
guarantees.
Support was added to all flavors of LTO WPD (regular, hybrid and
index-only), and to both the new and old LTO APIs.
Unfortunately it was not simple to split the first and second parts of
this part of the change (the unconditional emission of type tests and
the upgrading of the vcall visiblity) as I needed a way to upgrade the
public visibility on legacy WPD llvm assembly tests that don't include
linkage unit vcall visibility specifiers, to avoid a lot of test churn.
I also added a mechanism to LowerTypeTests that allows dropping type
test assume sequences we now aggressively insert when we invoke
distributed ThinLTO backends with null indexes, which is used in testing
mode, and which doesn't invoke the normal ThinLTO backend pipeline.
Depends on D71907 and D71911.
Reviewers: pcc, evgeny777, steven_wu, espindola
Subscribers: emaste, Prazek, inglorion, arichardson, hiraditya, MaskRay, dexonsmith, dang, davidxl, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D71913
MSVC 2013 would refuse to pass highly aligned things (typically vectors
and aggregates) by value. Users would receive this error:
t.cpp(11) : error C2719: 'w': formal parameter with __declspec(align('32')) won't be aligned
t.cpp(11) : error C2719: 'q': formal parameter with __declspec(align('32')) won't be aligned
However, in MSVC 2015, this behavior was changed, and highly aligned
things are now passed indirectly. To avoid breaking backwards
incompatibility, objects that do not have a *required* high alignment
(i.e. double) are still passed directly, even though they are not
naturally aligned. This change implements the new behavior of passing
things indirectly.
The new behavior is:
- up to three vector parameters can be passed in [XYZ]MM0-2
- remaining arguments with required alignment greater than 4 bytes are
passed indirectly
Previously, MSVC never passed things truly indirectly, meaning clang
would always apply the byval attribute to indirect arguments. We had to
go to the trouble of adding inalloca so that non-trivially copyable C++
types could be passed in place without copying the object
representation. When inalloca was added, we asserted that all arguments
passed indirectly must use byval. With this change, that assert no
longer holds, and I had to update inalloca to handle that case. The
implicit sret pointer parameter was already handled this way, and this
change generalizes some of that logic to arguments.
There are two cases that this change leaves unfixed:
1. objects that are non-trivially copyable *and* overaligned
2. vectorcall + inalloca + vectors
For case 1, I need to touch C++ ABI code in MicrosoftCXXABI.cpp, so I
want to do it in a follow-up.
For case 2, my fix is one line, but it will require updating IR tests to
use lots of inreg, so I wanted to separate it out.
Related to D71915 and D72110
Fixes most of PR44395
Reviewed By: rjmccall, craig.topper, erichkeane
Differential Revision: https://reviews.llvm.org/D72114
Now with concepts support merged and mostly complete, we do not need -fconcepts-ts
(which was also misleading as we were not implementing the TS) and can enable
concepts features under C++2a. A warning will be generated if users still attempt
to use -fconcepts-ts.
Proper ExpressionEvaluationContext were not being entered when instantiating constraint
expressions, which caused assertion failures in certain cases, including bug #44614.
Summary:
I kind-of understand why it is restricted to integer-typed arguments,
for general enum's the value passed is not nessesairly the alignment implied,
although one might say that user would know best.
But we clearly should whitelist `std::align_val_t`,
which is just a thin wrapper over `std::size_t`,
and is the C++ standard way of specifying alignment.
Reviewers: erichkeane, rsmith, aaron.ballman, jdoerfert
Reviewed By: erichkeane
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73019
Summary:
Much like with the previous patch (D73005) with `AssumeAlignedAttr`
handling, results in mildly more readable IR,
and will improve test coverage in upcoming patch.
Note that in `AllocAlignAttr`'s case, there is no requirement
for that alignment parameter to end up being an I-C-E.
Reviewers: erichkeane, jdoerfert, hfinkel, aaron.ballman, rsmith
Reviewed By: erichkeane
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73006
Summary:
This should be mostly NFC - we still lower the same alignment
knowledge to the IR. The main reasoning here is that
this somewhat improves readability of IR like this,
and will improve test coverage in upcoming patch.
Even though the alignment is guaranteed to always be an I-C-E,
we don't always materialize it as llvm's Alignment Attribute because:
1. There may be a non-zero offset
2. We may be sanitizing for alignment
Note that if there already was an IR alignment attribute
on return value, we union them, and thus the alignment
only ever rises.
Also, there is a second relevant clang attribute `AllocAlignAttr`,
so that is why `AbstractAssumeAlignedAttrEmitter` is templated.
Reviewers: erichkeane, jdoerfert, hfinkel, aaron.ballman, rsmith
Reviewed By: erichkeane
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73005
Summary:
I initially encountered those assertions when trying to create
this IR `alignment` attribute from clang's `__attribute__((assume_aligned(imm)))`,
because until D72994 there is no sanity checking for the value of `imm`.
But even then, we have `llvm::Value::MaximumAlignment` constant (which is `536870912`),
which is enforced for clang attributes, and then there are some other magical constant
(`0x40000000` i.e. `1073741824` i.e. `2 * 536870912`) in
`Attribute::getWithAlignment()`/`AttrBuilder::addAlignmentAttr()`.
I strongly suspect that `0x40000000` is incorrect,
and that also should be `llvm::Value::MaximumAlignment`.
Reviewers: erichkeane, hfinkel, jdoerfert, gchatelet, courbet
Reviewed By: erichkeane
Subscribers: hiraditya, cfe-commits, llvm-commits
Tags: #llvm, #clang
Differential Revision: https://reviews.llvm.org/D72998
Summary:
`alloc_align` attribute takes parameter number, not the alignment itself,
so given **just** the attribute/function declaration we can't do any
sanity checking for said alignment.
However, at call site, given the actual `Expr` that is passed
into that parameter, we //might// be able to evaluate said `Expr`
as Integer Constant Expression, and perform the sanity checks.
But since there is no requirement for that argument to be an immediate,
we may fail, and that's okay.
However if we did evaluate, we should enforce the same constraints
as with `__builtin_assume_aligned()`/`__attribute__((assume_aligned(imm)))`:
said alignment is a power of two, and is not greater than our magic threshold
Reviewers: erichkeane, aaron.ballman, hfinkel, rsmith, jdoerfert
Reviewed By: erichkeane
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72996
Summary:
For `__builtin_assume_aligned()`, we do validate that the alignment
is not greater than `536870912` (D68824), but we don't do that for
`__attribute__((assume_aligned(N)))` attribute.
I suspect we should.
Reviewers: erichkeane, aaron.ballman, hfinkel, rsmith, jdoerfert
Reviewed By: erichkeane
Subscribers: cfe-commits, llvm-commits
Tags: #llvm, #clang
Differential Revision: https://reviews.llvm.org/D72994
Summary:
First patch to support Safe Whole Program Devirtualization Enablement,
see RFC here: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html
Always emit !vcall_visibility metadata under -fwhole-program-vtables,
and not just for -fvirtual-function-elimination. The vcall visibility
metadata will (in a subsequent patch) be used to communicate to WPD
which vtables are safe to devirtualize, and we will optionally convert
the metadata to hidden visibility at link time. Subsequent follow on
patches will help enable this by adding vcall_visibility metadata to the
ThinLTO summaries, and always emit type test intrinsics under
-fwhole-program-vtables (and not just for vtables with hidden
visibility).
In order to do this safely with VFE, since for VFE all vtable loads must
be type checked loads which will no longer be the case, this patch adds
a new "Virtual Function Elim" module flag to communicate to GlobalDCE
whether to perform VFE using the vcall_visibility metadata.
One additional advantage of using the vcall_visibility metadata to drive
more WPD at LTO link time is that we can use the same mechanism to
enable more aggressive VFE at LTO link time as well. The link time
option proposed in the RFC will convert vcall_visibility metadata to
hidden (aka linkage unit visibility), which combined with
-fvirtual-function-elimination will allow it to be done more
aggressively at LTO link time under the same conditions.
Reviewers: pcc, ostannard, evgeny777, steven_wu
Subscribers: mehdi_amini, Prazek, hiraditya, dexonsmith, davidxl, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D71907
This patch implements P1141R2 "Yet another approach for constrained declarations".
General strategy for this patch was:
- Expand AutoType to include optional type-constraint, reflecting the wording and easing the integration of constraints.
- Replace autos in parameter type specifiers with invented parameters in GetTypeSpecTypeForDeclarator, using the same logic
previously used for generic lambdas, now unified with abbreviated templates, by:
- Tracking the template parameter lists in the Declarator object
- Tracking the template parameter depth before parsing function declarators (at which point we can match template
parameters against scope specifiers to know if we have an explicit template parameter list to append invented parameters
to or not).
- When encountering an AutoType in a parameter context we check a stack of InventedTemplateParameterInfo structures that
contain the info required to create and accumulate invented template parameters (fields that were already present in
LambdaScopeInfo, which now inherits from this class and is looked up when an auto is encountered in a lambda context).
Resubmit after fixing MSAN failures caused by incomplete initialization of AutoTypeLocs in TypeSpecLocFiller.
Differential Revision: https://reviews.llvm.org/D65042
If local allocator was declared and used in the allocate clause, it was
not captured in inner region. It leads to a compiler crash, need to
capture the allocator declarator.
Summary:
CodeCompletion was not being triggered after successfully parsed
initializer lists, e.g.
```cpp
void foo(int, bool);
void bar() {
foo({1}^, false);
}
```
CodeCompletion would suggest the function foo as an overload candidate up until
the point marked with `^` but after that point we do not trigger signature help
since parsing succeeds.
This patch handles that case by failing in parsing expression lists whenever we
see a codecompletion token, in addition to getting an invalid subexpression.
Reviewers: sammccall
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73177
Contributed by jbcoe!
Summary: Unless SpaceBeforeParensOptions is set to SBPO_Never, a space will be put between `using` and `(` in C# code.
Reviewers: klimek, MyDeveloperDay, krasimir
Reviewed By: krasimir
Subscribers: MyDeveloperDay, cfe-commits
Tags: #clang-format, #clang
Differential Revision: https://reviews.llvm.org/D72144
Summary:
Immediate vmvnq is code-generated as a simple vector constant in IR,
and left to the backend to recognize that it can be created with an
MVE VMVN instruction. The predicated version is represented as a
select between the input and the same constant, and I've added a
Tablegen isel rule to turn that into a predicated VMVN. (That should
be better than the previous VMVN + VPSEL: it's the same number of
instructions but now it can fold into an adjacent VPT block.)
The unpredicated forms of VBIC and VORR are done by enabling the same
isel lowering as for NEON, recognizing appropriate immediates and
rewriting them as ARMISD::VBICIMM / ARMISD::VORRIMM SDNodes, which I
then instruction-select into the right MVE instructions (now that I've
also reworked those instructions to use the same MC operand encoding).
In order to do that, I had to promote the Tablegen SDNode instance
`NEONvorrImm` to a general `ARMvorrImm` available in MVE as well, and
similarly for `NEONvbicImm`.
The predicated forms of VBIC and VORR are represented as a vector
select between the original input vector and the output of the
unpredicated operation. The main convenience of this is that it still
lets me use the existing isel lowering for VBICIMM/VORRIMM, and not
have to write another copy of the operand encoding translation code.
This intrinsic family is the first to use the `imm_simd` system I put
into the MveEmitter tablegen backend. So, naturally, it showed up a
bug or two (emitting bogus range checks and the like). Fixed those,
and added a full set of tests for the permissible immediates in the
existing Sema test.
Also adjusted the isel pattern for `vmovlb.u8`, which stopped matching
because lowering started turning its input into a VBICIMM. Now it
recognizes the VBICIMM instead.
Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D72934
Profile TypeConstraints in ProfileTemplateParameterList so we can distinguish
between partial specializations which differ in their TemplateParameterList
type constraints.
Recommit, now profiling the IDC so that we can deal with situations where the
TemplateArgsAsWritten are nullptr (happens when canonicalizing type constraints).
Profile TypeConstraints in ProfileTemplateParameterList so we can distinguish
between partial specializations which differ in their TemplateParameterList
type constraints
There's going to be a lot of common code between RecordDecl and
CXXRecordDecl, factor out some of the logic in preparation for
adding the RecordDecl side.
Summary:
We see a significant regression (~40% slower on large codebases) in expression evaluation after https://reviews.llvm.org/rL364771. A sampling profile shows the extra time is spent in SavedImportPathsTy::operator[] when called from ASTImporter::Import. I believe this is because ASTImporter::Import adds an element to the SavedImportPaths map for each decl unconditionally (see 7b81c3f879/clang/lib/AST/ASTImporter.cpp (L8256)).
To fix this, we call SavedImportPathsTy::erase on the declaration rather than clearing its value vector. That way we do not accidentally introduce new empty elements. (With this patch the performance is restored, and we do not see SavedImportPathsTy::operator[] in the profile anymore.)
Reviewers: martong, teemperor, a.sidorin, shafik
Reviewed By: martong
Subscribers: rnkovacs, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73166
This patch implements P1141R2 "Yet another approach for constrained declarations".
General strategy for this patch was:
- Expand AutoType to include optional type-constraint, reflecting the wording and easing the integration of constraints.
- Replace autos in parameter type specifiers with invented parameters in GetTypeSpecTypeForDeclarator, using the same logic
previously used for generic lambdas, now unified with abbreviated templates, by:
- Tracking the template parameter lists in the Declarator object
- Tracking the template parameter depth before parsing function declarators (at which point we can match template
parameters against scope specifiers to know if we have an explicit template parameter list to append invented parameters
to or not).
- When encountering an AutoType in a parameter context we check a stack of InventedTemplateParameterInfo structures that
contain the info required to create and accumulate invented template parameters (fields that were already present in
LambdaScopeInfo, which now inherits from this class and is looked up when an auto is encountered in a lambda context).
Resubmit after incorrect check in NonTypeTemplateParmDecl broke lldb.
Differential Revision: https://reviews.llvm.org/D65042
Add a simple cache for constraint satisfaction results. Whether or not this simple caching
would be permitted in final C++2a is currently being discussed but it is required for
acceptable performance so we use it in the meantime, with the possibility of adding some
cache invalidation mechanisms later.
Differential Revision: https://reviews.llvm.org/D72552
Do not export __llvm_profile_counter_bias when profiling is enabled
because this symbol is hidden and cannot be exported.
Should fix this bot error:
```
URL: http://green.lab.llvm.org/green/job/clang-stage1-RA/5678/consoleFull
Problem: Command Output (stdout):
--
ld: warning: cannot export hidden symbol ___llvm_profile_counter_bias
from
/Users/buildslave/jenkins/workspace/clang-stage1-RA/clang-build/lib/clang/11.0.0/lib/darwin/libclang_rt.profile_osx.a(InstrProfilingBiasVar.c.o)
ld: warning: cannot export hidden symbol ___llvm_profile_counter_bias
from
/Users/buildslave/jenkins/workspace/clang-stage1-RA/clang-build/lib/clang/11.0.0/lib/darwin/libclang_rt.profile_osx.a(InstrProfilingBiasVar.c.o)
```
This patch implements P1141R2 "Yet another approach for constrained declarations".
General strategy for this patch was:
- Expand AutoType to include optional type-constraint, reflecting the wording and easing the integration of constraints.
- Replace autos in parameter type specifiers with invented parameters in GetTypeSpecTypeForDeclarator, using the same logic
previously used for generic lambdas, now unified with abbreviated templates, by:
- Tracking the template parameter lists in the Declarator object
- Tracking the template parameter depth before parsing function declarators (at which point we can match template
parameters against scope specifiers to know if we have an explicit template parameter list to append invented parameters
to or not).
- When encountering an AutoType in a parameter context we check a stack of InventedTemplateParameterInfo structures that
contain the info required to create and accumulate invented template parameters (fields that were already present in
LambdaScopeInfo, which now inherits from this class and is looked up when an auto is encountered in a lambda context).
Differential Revision: https://reviews.llvm.org/D65042
Summary:
We previously listed first declared members, then implicit operator=,
then implicit operator==, then implicit destructors. Per discussion on
https://github.com/itanium-cxx-abi/cxx-abi/issues/88, put the implicit
equality comparison operators at the very end, after all special member
functions.
This reinstates add2b7e44a, reverted in
commit 89e43f04ba, with a fix for 32-bit
targets.
Reviewers: rjmccall
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72897
The issue was reported by @xazax.hun here: https://reviews.llvm.org/D69825#1827826
"This patch (D69825) breaks scan-build-py which parses the output of "-###" to get -cc1 command. There might be other tools with the same problems. Could we either remove (in-process) from CC1Command::Print or add a line break?
Having the last line as a valid invocation is valuable and there might be tools relying on that."
Differential Revision: https://reviews.llvm.org/D72982
When Wrange-loop-analysis issues a diagnostic on a dependent type in a
template the diagnostic may not be valid for all instantiations. Therefore
the diagnostic is suppressed during the instantiation. Non dependent types
still issue a diagnostic.
The same can happen when using macros. Therefore the diagnostic is
disabled for macros.
Fixes https://bugs.llvm.org/show_bug.cgi?id=44556
Differential Revision: https://reviews.llvm.org/D73007
Summary: Just an NFC code cleanup i stumbled upon when stumbling through clang alignment attribute handling.
Reviewers: erichkeane, gchatelet, courbet, jdoerfert
Reviewed By: gchatelet
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72993
Summary:
We shouldn't be just giving up if we find one of them
(like we currently do with `AssumeAlignedAttr`),
we should emit them all.
As the tests show, even if we materialized good knowledge
from `__attribute__((assume_aligned(32)`, it doesn't mean
`__attribute__((alloc_align([...])))` info won't be useful.
It might be, but that isn't given.
Reviewers: erichkeane, jdoerfert, aaron.ballman
Reviewed By: erichkeane
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72979
When constrained floating point is enabled the SystemZ-specific builtins
don't use constrained intrinsics in some cases. Fix that.
Differential Revision: https://reviews.llvm.org/D72722
This change replaces the manual building of executable paths
using llvm::sys::path::append with GetProgramPath.
This enables adding other paths in case executables reside
in different directories and makes the code easier to read.
Differential Revision: https://reviews.llvm.org/D72903
Summary:
This patch resumes the work of D16586.
According to the AAPCS, volatile bit-fields should
be accessed using containers of the widht of their
declarative type. In such case:
```
struct S1 {
short a : 1;
}
```
should be accessed using load and stores of the width
(sizeof(short)), where now the compiler does only load
the minimum required width (char in this case).
However, as discussed in D16586,
that could overwrite non-volatile bit-fields, which
conflicted with C and C++ object models by creating
data race conditions that are not part of the bit-field,
e.g.
```
struct S2 {
short a;
int b : 16;
}
```
Accessing `S2.b` would also access `S2.a`.
The AAPCS Release 2019Q1.1
(https://static.docs.arm.com/ihi0042/g/aapcs32.pdf)
section 8.1 Data Types, page 35, "Volatile bit-fields -
preserving number and width of container accesses" has been
updated to avoid conflict with the C++ Memory Model.
Now it reads in the note:
```
This ABI does not place any restrictions on the access widths
of bit-fields where the container overlaps with a non-bit-field member.
This is because the C/C++ memory model defines these as being separate
memory locations, which can be accessed by two threads
simultaneously. For this reason, compilers must be permitted to use a
narrower memory access width (including splitting the access
into multiple instructions) to avoid writing to a different memory location.
```
I've updated the patch D16586 to follow such behavior by verifying that we
only change volatile bit-field access when:
- it won't overlap with any other non-bit-field member
- we only access memory inside the bounds of the record
Regarding the number of memory accesses, that should be preserved, that will
be implemented by D67399.
Reviewers: rsmith, rjmccall, eli.friedman, ostannard
Subscribers: ostannard, kristof.beyls, cfe-commits, carwil, olista01
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72932
Target regions have implicit outer region which may erroneously capture
some globals when it should not. It may lead to a compiler crash at the
compile time.
Summary:
clang-format currently treats the nullish coalescing operator `??` like
the ternary operator. That causes multiple nullish terms to be each
indented relative to the last `??`, as they would in a ternary.
The `??` operator is often used in chains though, and as such more
similar to other binary operators, such as `||`. So to fix the indent,
set its token type to `||`, so it inherits the same treatment.
This opens up the question of operator precedence. However, `??` is
required to be parenthesized when mixed with `||` and `&&`, so this is
not a problem that can come up in syntactically legal code.
Reviewers: krasimir
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73026
Summary:
Printing policy was not propogated to functiondecls when creating a
completion string which resulted in canonical template parameters like
`foo<type-parameter-0-0>`. This patch propogates printing policy to those as
well.
Fixes https://github.com/clangd/clangd/issues/76
Reviewers: ilya-biryukov
Subscribers: jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72715
Summary:
We previously listed first declared members, then implicit operator=,
then implicit operator==, then implicit destructors. Per discussion on
https://github.com/itanium-cxx-abi/cxx-abi/issues/88, put the implicit
equality comparison operators at the very end, after all special member
functions.
Reviewers: rjmccall
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72897
a temporary.
We previously failed to materialize a temporary when performing an
implicit conversion to a reference type, resulting in our thinking the
argument was a value rather than a reference in some cases.
Summary: This diff expands the SpacesAroundConditions option added in D68346 to include adding spaces to catch statements.
Reviewed By: MyDeveloperDay
Patch by: timwoj
Differential Revision: https://reviews.llvm.org/D72793
Summary:
The documentation for IndentCaseLabels claimed that the "Switch
statement body is always indented one level more than case labels". This
is technically false for the code block immediately following the label.
Its closing bracket aligns with the start of the label.
If the case label are not indented, it leads to a style where the
closing bracket of the block aligns with the closing bracket of the
switch statement, which can be hard to parse.
This change introduces a new option, IndentCaseBlocks, which when true
treats the block as a scope block (which it technically is).
(Note: regenerated ClangFormatStyleOptions.rst using tools/dump_style.py)
Reviewed By: MyDeveloperDay
Patch By: capn
Tags: #clang-format, #clang
Differential Revision: https://reviews.llvm.org/D72276
Implement support for C++2a requires-expressions.
Re-commit after compilation failure on some platforms due to alignment issues with PointerIntPair.
Differential Revision: https://reviews.llvm.org/D50360
Currently there are 4 different mechanisms for controlling denormal
flushing behavior, and about as many equivalent frontend controls.
- AMDGPU uses the fp32-denormals and fp64-f16-denormals subtarget features
- NVPTX uses the nvptx-f32ftz attribute
- ARM directly uses the denormal-fp-math attribute
- Other targets indirectly use denormal-fp-math in one DAGCombine
- cl-denorms-are-zero has a corresponding denorms-are-zero attribute
AMDGPU wants a distinct control for f32 flushing from f16/f64, and as
far as I can tell the same is true for NVPTX (based on the attribute
name).
Work on consolidating these into the denormal-fp-math attribute, and a
new type specific denormal-fp-math-f32 variant. Only ARM seems to
support the two different flush modes, so this is overkill for the
other use cases. Ideally we would error on the unsupported
positive-zero mode on other targets from somewhere.
Move the logic for selecting the flush mode into the compiler driver,
instead of handling it in cc1. denormal-fp-math/denormal-fp-math-f32
are now both cc1 flags, but denormal-fp-math-f32 is not yet exposed as
a user flag.
-cl-denorms-are-zero, -fcuda-flush-denormals-to-zero and
-fno-cuda-flush-denormals-to-zero will be mapped to
-fp-denormal-math-f32=ieee or preserve-sign rather than the old
attributes.
Stop emitting the denorms-are-zero attribute for the OpenCL flag. It
has no in-tree users. The meaning would also be target dependent, such
as the AMDGPU choice to treat this as only meaning allow flushing of
f32 and not f16 or f64. The naming is also potentially confusing,
since DAZ in other contexts refers to instructions implicitly treating
input denormals as zero, not necessarily flushing output denormals to
zero.
This also does not attempt to change the behavior for the current
attribute. The LangRef now states that the default is ieee behavior,
but this is inaccurate for the current implementation. The clang
handling is slightly hacky to avoid touching the existing
denormal-fp-math uses. Fixing this will be left for a future patch.
AMDGPU is still using the subtarget feature to control the denormal
mode, but the new attribute are now emitted. A future change will
switch this and remove the subtarget features.
A TemplateIdAnnotation represents only a template-id, not a
nested-name-specifier plus a template-id. Don't make a redundant copy of
the CXXScopeSpec and store it on the template-id annotation.
This slightly improves error recovery by more properly handling the case
where we would form an invalid CXXScopeSpec while parsing a typename
specifier, instead of accidentally putting the token stream into a
broken "annot_template_id with a scope specifier, but with no preceding
annot_cxxscope token" state.
This is an alternative to the continous mode that was implemented in
D68351. This mode relies on padding and the ability to mmap a file over
the existing mapping which is generally only available on POSIX systems
and isn't suitable for other platforms.
This change instead introduces the ability to relocate counters at
runtime using a level of indirection. On every counter access, we add a
bias to the counter address. This bias is stored in a symbol that's
provided by the profile runtime and is initially set to zero, meaning no
relocation. The runtime can mmap the profile into memory at abitrary
location, and set bias to the offset between the original and the new
counter location, at which point every subsequent counter access will be
to the new location, which allows updating profile directly akin to the
continous mode.
The advantage of this implementation is that doesn't require any special
OS support. The disadvantage is the extra overhead due to additional
instructions required for each counter access (overhead both in terms of
binary size and performance) plus duplication of counters (i.e. one copy
in the binary itself and another copy that's mmapped).
Differential Revision: https://reviews.llvm.org/D69740
Extend -fxray-instrumentation-bundle to split function-entry and
function-exit into two separate options, so that it is possible to
instrument only function entry or only function exit. For use cases
that only care about one or the other this will save significant overhead
and code size.
Differential Revision: https://reviews.llvm.org/D72890
XRay allows tuning by minimum function size, but also always instruments
functions with loops in them. If the minimum function size is set to a
large value the loop instrumention ends up causing most functions to be
instrumented anyway. This adds a new flag, -fxray-ignore-loops, to disable
the loop detection logic.
Differential Revision: https://reviews.llvm.org/D72873
[this re-applies c0176916a4
with the correct commit message and phabricator link]
This addresses point 1 of PR44213.
https://bugs.llvm.org/show_bug.cgi?id=44213
The DW_AT_LLVM_sysroot attribute is used for Clang module debug info,
to allow LLDB to import a Clang module from source. Currently it is
part of each DW_TAG_module, however, it is the same for all modules in
a compile unit. It is more efficient and less ambiguous to store it
once in the DW_TAG_compile_unit.
This should have no effect on DWARF consumers other than LLDB.
Differential Revision: https://reviews.llvm.org/D71732
Summary:
When compiling with -munwind-tables, the SEH filter funclet needs the uwtable
function attribute, which gets automatically added if we use
SetInternalFunctionAttributes. The filter funclet is internal so this seems
appropriate.
Reviewers: rnk
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72786
This is a purely cosmetic change that is NFC in terms of the binary
output. I bugs me that I called the attribute DW_AT_LLVM_isysroot
since the "i" is an artifact of GCC command line option syntax
(-isysroot is in the category of -i options) and doesn't carry any
useful information otherwise.
This attribute only appears in Clang module debug info.
Differential Revision: https://reviews.llvm.org/D71722
Summary: Currently, an attempt to rewrite source code inside a macro expansion succeeds, but results in empty text, rather than failing with an error. This patch restructures to the code to explicitly validate ranges before attempting to edit them.
Reviewers: gribozavr
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72274
Right now every dataflow algorithm uses its own worklist implementation.
This is a first step to reduce this duplication. Some upcoming
algorithms such as the lifetime analysis is going to use the factored
out implementations.
Differential Revision: https://reviews.llvm.org/D72380
Summary:
tslint and tsc (the TypeScript compiler itself) use comment pragmas of
the style:
// tslint:disable-next-line:foo
// @ts-ignore
These must not be wrapped and must stay on their own line, in isolation.
For tslint, this required adding it to the pragma regexp. The comments
starting with `@` are already left alone, but this change adds test
coverage for them.
Reviewers: krasimir
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72907
Summary:
Revision a75f8d98d7 fixed spacing for operators,
but caused the const and non-const versions to diverge:
```
// With Style.PointerAlignment = FormatStyle::PAS_Left:
struct A {
operator char*() { return ""; }
operator const char *() const { return ""; }
};
```
The code was checking if the type specifier was directly preceded by `operator`.
However there could be comments and `const/volatile` in between.
Reviewers: mprobst
Reviewed By: mprobst
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72911
Summary:
Including `do`, `for`, and `while`, `if`, `else`, `try`, `catch`, in
addition to the previously handled fields. The unit test explicitly uses
methods, but this code path handles both fields and methods.
Reviewers: krasimir
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72827
Partially reverts 0a2be46cfd as it turned
out to cause redundant module rebuilds in multi-process incremental builds.
When a module was getting out of date, all compilation processes started at the
same time were marking it as `ToBuild`. So each process was building the same
module instead of checking if it was built by someone else and using that
result. In addition to the work duplication, contention on the same .pcm file
wasn't making builds faster.
Note that for a single-process build this change would cause redundant module
reads and validations. But reading a module is faster than building it and
multi-process builds are more common than single-process. So I'm willing to
make such a trade-off.
rdar://problem/54395127
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D72860
When LLVM_APPEND_VC_REV=OFF is set, the current git hash is no
longer embedded into binaries (mostly for --version output).
Without it, most binaries need to relink after every single
commit, even if they didn't change otherwise (due to, say,
a documentation-only commit).
LLVM_APPEND_VC_REV is ON by default, so this doesn't change the
default behavior of anything.
With this, all clients of GenerateVersionFromVCS.cmake honor
LLVM_APPEND_VC_REV.
Differential Revision: https://reviews.llvm.org/D72855
Use floating-point instead of integer zero constants to avoid
creating implicit conversions, which currently cause suboptimal
code to be generated with -ffp-exception-behavior=strict.
NFC otherwise.
Summary:
The old pass manager separated speed optimization and size optimization
levels into two unsigned values. Coallescing both in an enum in the new
pass manager may lead to unintentional casts and comparisons.
In particular, taking a look at how the loop unroll passes were constructed
previously, the Os/Oz are now (==new pass manager) treated just like O3,
likely unintentionally.
This change disallows raw comparisons between optimization levels, to
avoid such unintended effects. As an effect, the O{s|z} behavior changes
for loop unrolling and loop unroll and jam, matching O2 rather than O3.
The change also parameterizes the threshold values used for loop
unrolling, primarily to aid testing.
Reviewers: tejohnson, davidxl
Reviewed By: tejohnson
Subscribers: zzheng, ychen, mehdi_amini, hiraditya, steven_wu, dexonsmith, dang, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D72547
$URL$ is an SVN keyword substitution enabled via
`svn propset svn:keywords "URL" tools/clang/lib/Basic/Version.cpp`.
Now that we no longer use SVN, it's no longer being replaced by
anything, and we no longer offer svn exports. So remove the
$URL$-specific logic.
The "cfe" path prefix removal also no longer makes sense now that
we're on git: Both CLANG_REPOSITORY and LLVM_REPOSITORY are usually
set to https://github.com/llvm/llvm-project.git
So remove that too, and remove the "llvm" prefix removal for symmetry.
With the github url, "llvm" _is_ found in the string, but not in
the place the function expected. Nobody noticed since the llvm
repository path is only used if CLANG_REVISION and LLVM_REVISION are
different, which in the git monorepo world they never should be.
(I might remove the "// Support LLVM in a separate repository"
block in a separate commit.)
Differential Revision: https://reviews.llvm.org/D72848
ConceptSpecializationExprs (CSEs) were being created with nullptr
TemplateArgsAsWritten during TemplateTemplateParmDecl canonicalization, and
we were relying on them during profiling which caused sporadic crashes
in test/CXX/.../temp.arg.template/p3-2a.cpp introduced in D44352.
Change profiling of CSEs to instead rely on the actual converted template
arguments and concept named.
Summary:
This change implements the expansion in two parts:
- Add a utility function emitAMDGPUPrintfCall() in LLVM.
- Invoke the above function from Clang CodeGen, when processing a HIP
program for the AMDGPU target.
The printf expansion has undefined behaviour if the format string is
not a compile-time constant. As a sufficient condition, the HIP
ToolChain now emits -Werror=format-nonliteral.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D71365
This needs somewhat careful disambiguation, as C++2a explicit(bool) is a
breaking change. We only enable it in cases where the source construct
could not possibly be anything else.
expanded by the deduced pack.
We recently started also deducing the arity of separately-expanded packs
that are merely mentioned within the pack in question, which is
incorrect.
If current kind of the translation unit is TU_Prefix and it is not
complete, cannot decide what to do with virtual members/table at that
time, need to delay it to later stages.
Flags are clang's default UI is flags.
We can have an env var in addition to that, but in D69825 nobody has yet
mentioned why this needs an env var, so omit it for now. If someone
needs to set the flag via env var, the existing CCC_OVERRIDE_OPTIONS
mechanism works for it (set CCC_OVERRIDE_OPTIONS=+-fno-integrated-cc1
for example).
Also mention the cc1-in-process change in the release notes.
Also spruce up the test a bit so it actually tests something :)
Differential Revision: https://reviews.llvm.org/D72769
This is applied to the vector types defined in <arm_mve.h> for use
with the intrinsics for the ARM MVE vector architecture.
Its purpose is to inhibit lax vector conversions, but only in the
context of overload resolution of the MVE polymorphic intrinsic
functions. This solves an ambiguity problem with polymorphic MVE
intrinsics that take a vector and a scalar argument: the scalar
argument can often have the wrong integer type due to default integer
promotions or unsuffixed literals, and therefore, the type of the
vector argument should be considered trustworthy when resolving MVE
polymorphism.
As part of the same change, I've added the new attribute to the
declarations generated by the MveEmitter Tablegen backend (and
corrected a namespace issue with the other attribute while I was
there).
Reviewers: aaron.ballman, dmgreen
Reviewed By: aaron.ballman
Subscribers: kristof.beyls, JDevlieghere, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72518
Summary:
Previously, the -fdollars-in-identifiers flag allows the '$' symbol to be used
in an identifier but the universal character name equivalent '\u0024' is not
allowed.
This patch changes this, so that \u0024 is valid in identifiers.
Reviewers: rsmith, jordan_rose
Reviewed By: rsmith
Subscribers: dexonsmith, simoncook, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D71758
These driver options perform some checking and delegate to MC options -x86-align-branch* and -x86-branches-within-32B-boundaries.
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D72463
Today the optimization is limited to:
- `[ClassName alloc]`
- `[self alloc]` when within a class method
However it means that when code is written this way:
```
@interface MyObject
- (id)copyWithZone:(NSZone *)zone
{
return [[self.class alloc] _initWith...];
}
@end
```
... then the optimization doesn't kick in and `+[NSObject alloc]` ends
up in IMP caches where it could have been avoided. It turns out that
`+alloc` -> `+[NSObject alloc]` is the most cached SEL/IMP pair in the
entire platform which is rather silly).
There's two theoretical risks allowing this optimization:
1. if the receiver is nil (which it can't be today), but it turns out
that `objc_alloc()`/`objc_alloc_init()` cope with a nil receiver,
2. if the `Clas` type for the receiver is a lie. However, for such a
code to work today (and not fail witn an unrecognized selector
anyway) you'd have to have implemented the `-alloc` **instance
method**.
Fortunately, `objc_alloc()` doesn't assume that the receiver is a
Class, it basically starts with a test that is similar to
`if (receiver->isa->bits & hasDefaultAWZ) { /* fastpath */ }`.
This bit is only set on metaclasses by the runtime, so if an instance
is passed to this function by accident, its isa will fail this test,
and `objc_alloc()` will gracefully fallback to `objc_msgSend()`.
The one thing `objc_alloc()` doesn't support is tagged pointer
instances. None of the tagged pointer classes implement an instance
method called `'alloc'` (actually there's a single class in the
entire Apple codebase that has such a method).
Differential Revision: https://reviews.llvm.org/D71682
Radar-Id: rdar://problem/58058316
Reviewed-By: Akira Hatanaka
Signed-off-by: Pierre Habouzit <phabouzit@apple.com>
list constructor when initializing from {}.
We would previously pick between calling an initializer list constructor
and calling a default constructor unstably in this situation, depending
on whether the inherited default constructor had already been used
elsewhere in the program.
Add support for type-constraints in template type parameters.
Also add support for template type parameters as pack expansions (where the type constraint can now contain an unexpanded parameter pack).
Differential Revision: https://reviews.llvm.org/D44352
Summary:
Before this change, X86_32ABIInfo::classifyArgument would be called
twice on vector arguments to vectorcall functions. This function has
side effects to track GPR register usage, and this would lead to
incorrect GPR usage in some cases. The specific case I noticed is from
running out of XMM registers with mixed FP and vector arguments and no
aggregates of any kind. Consider this prototype:
void __vectorcall vectorcall_indirect_vec(
double xmm0, double xmm1, double xmm2, double xmm3, double xmm4,
__m128 xmm5,
__m128 ecx,
int edx,
__m128 mem);
classifyArgument has no effects when called on a plain FP type, but when
called on a vector type, it modifies FreeRegs to model GPR consumption.
However, this should not happen during the vector call first pass.
I refactored the code to unify vectorcall HVA logic with regcall HVA
logic. The conventions pass HVAs in registers differently (expanded vs.
not expanded), but if they do not fit in registers, they both pass them
indirectly by address.
Reviewers: erichkeane, craig.topper
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72110
Summary:
Before this patch adding a new /D flag when compiling a source file that consumed a PCH with clang-cl would issue a diagnostic and then fail. With the patch, the diagnostic is still issued but the definition is accepted. This matches the msvc behavior. The fuzzy-pch-msvc.c is a clone of the existing fuzzy-pch.c tests with some msvc specific rework.
msvc diagnostic:
warning C4605: '/DBAR=int' specified on current command line, but was not specified when precompiled header was built
Output of the CHECK-BAR test prior to the code change:
<built-in>(1,9): warning: definition of macro 'BAR' does not match definition in precompiled header [-Wclang-cl-pch]
#define BAR int
^
D:\repos\llvm\llvm-project\clang\test\PCH\fuzzy-pch-msvc.c(12,1): error: unknown type name 'BAR'
BAR bar = 17;
^
D:\repos\llvm\llvm-project\clang\test\PCH\fuzzy-pch-msvc.c(23,4): error: BAR was not defined
# error BAR was not defined
^
1 warning and 2 errors generated.
Reviewers: rnk, thakis, hans, zturner
Subscribers: mikerice, aganea, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72405
For IR input files, we currently use LLVM diagnostic handler even the
compilation is from clang. As a result, we are not able to use -Rpass
to get the transformation reports. Some warnings are not handled
properly either: We found many mysterious warnings in our ThinLTO backend
compilations in SamplePGO and CSPGO. An example of the warning:
"warning: net/proto2/public/metadata_lite.h:51:21: 0.02% (1 / 4999)"
This turns out to be a warning by Wmisexpect, which is supposed to be
filtered out by default. But since the filter is in clang's
diagnostic hander, we emit these incomplete warnings from LLVM's
diagnostic handler.
This patch uses clang diagnostic handler for IR input files. We create
a fake backendconsumer just to install the diagnostic handler.
With this change, we will have proper handling of all the warnings and we can
use -Rpass* options in IR input files compilation.
Also note that with is patch, LLVM's diagnostic options, like
"-mllvm -pass-remarks=*", are no longer be able to get optimization remarks.
Differential Revision: https://reviews.llvm.org/D72523
Allow to build PCH's (with -building-pch-with-obj and the extra .o file)
with -fmodules-codegen -fmodules-debuginfo to allow emitting shared code
into the extra .o file, similarly to how it works with modules. A bit of
a misnomer, but the underlying functionality is the same. This saves up
to 20% of build time here.
Differential Revision: https://reviews.llvm.org/D69778
If a header contains 'extern template', then the template should be provided
somewhere by an explicit instantiation, so it is not necessary to generate
a copy. Worse, this can lead to an unresolved symbol, because the codegen's
object file will not actually contain functions from such a template
because of the GVA_AvailableExternally, but the object file for the explicit
instantiation will not contain them either because it will be blocked
by the information provided by the module.
Differential Revision: https://reviews.llvm.org/D69779
There are no special virtual function handlers (like __cxa_pure_virtual)
defined for NVPTX target, so just emit such functions as null pointers
to prevent issues with linking and unresolved references.
Summary:
This patch adds an option to limit debug info by only emitting complete class
type information when its constructor is emitted. This applies to classes
that have nontrivial user defined constructors.
I implemented the option by adding another level to `DebugInfoKind`, and
a flag `-flimit-debug-info-constructor`.
Total object file size on Windows, compiling with RelWithDebInfo:
before: 4,257,448 kb
after: 2,104,963 kb
And on Linux
before: 9,225,140 kb
after: 4,387,464 kb
According to the Windows clang.pdb files, here is a list of types that are no
longer complete with this option enabled: https://reviews.llvm.org/P8182
Reviewers: rnk, dblaikie
Subscribers: aprantl, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72427
Use cast<>/castAs<> instead of dyn_cast<>/getAs<> since the pointers are always dereferenced and cast<>/castAs<> will perform the null assertion for us.
Add checks for some structural invariants when building and mutating
the syntax trees.
Fix a bug failing the invariants after mutations: the parent of nodes
added into the tree was null.
Summary:
Previously, since these aggregates are > 2*XLen, Clang would think they
were being returned indirectly and thus would decrease the number of
available GPRs available by 1. For long argument lists this could lead
to a struct argument incorrectly being passed indirectly.
Reviewers: asb, lenary
Reviewed By: asb, lenary
Subscribers: luismarques, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, lenary, s.egerton, pzheng, sameer.abuasal, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69590
The option will limit debug info by only emitting complete class
type information when its constructor is emitted.
This patch changes comparisons with LimitedDebugInfo to use the new
level instead.
Differential Revision: https://reviews.llvm.org/D72427
We currently treat noexcept(not-convertible-to-bool) as 'none', which
results in the typeloc info being a different size, and causing an
assert later on in the process. In order to make recovery less
destructive, replace this with noexcept(false) and a constructed 'false'
expression.
Bug Report: https://bugs.llvm.org/show_bug.cgi?id=44514
Differential Revision: https://reviews.llvm.org/D72621
Just marking a symbol as weak_odr/linkonce_odr isn't enough for
actually tolerating multiple copies of it at linking on windows,
it has to be made a proper comdat; make it comdat for all platforms
for consistency.
This should hopefully fix
https://bugzilla.mozilla.org/show_bug.cgi?id=1566288.
Differential Revision: https://reviews.llvm.org/D71572
GCC supports the conditional operator on VectorTypes that acts as a
'select' in C++ mode. This patch implements the support. Types are
converted as closely to GCC's behavior as possible, though in a few
places consistency with our existing vector type support was preferred.
Note that this implementation is different from the OpenCL version in a
number of ways, so it unfortunately required a different implementation.
First, the SEMA rules and promotion rules are significantly different.
Secondly, GCC implements COND[i] != 0 ? LHS[i] : RHS[i] (where i is in
the range 0- VectorSize, for each element). In OpenCL, the condition is
COND[i] < 0 ? LHS[i]: RHS[i].
In the process of implementing this, it was also required to make the
expression COND ? LHS : RHS type dependent if COND is type dependent,
since the type is now dependent on the condition. For example:
T ? 1 : 2;
Is not typically type dependent, since the result can be deduced from
the operands. HOWEVER, if T is a VectorType now, it could change this
to a 'select' (basically a swizzle with a non-constant mask) with the 1
and 2 being promoted to vectors themselves.
While this is a change, it is NOT a standards incompatible change. Based
on my (and D. Gregor's, at the time of writing the code) reading of the
standard, the expression is supposed to be type dependent if ANY
sub-expression is type dependent.
Differential Revision: https://reviews.llvm.org/D71463
Built libdispatch with clang interface stubs. Ran into some block
related issues. Basically VarDecl symbols can leak out because I wasn't
checking the case where a VarDecl is contained inside a BlockDecl
(versus a method or function).
This patch checks that a VarDecl is not a child decl of a BlockDecl.
This patch also does something very similar for c++ lambdas as well.
Differential Revision: https://reviews.llvm.org/D71301
With this patch, the clang tool will now call the -cc1 invocation directly inside the same process. Previously, the -cc1 invocation was creating, and waiting for, a new process.
This patch therefore reduces the number of created processes during a build, thus it reduces build times on platforms where process creation can be costly (Windows) and/or impacted by a antivirus.
It also makes debugging a bit easier, as there's no need to attach to the secondary -cc1 process anymore, breakpoints will be hit inside the same process.
Crashes or signaling inside the -cc1 invocation will have the same side-effect as before, and will be reported through the same means.
This behavior can be controlled at compile-time through the CLANG_SPAWN_CC1 cmake flag, which defaults to OFF. Setting it to ON will revert to the previous behavior, where any -cc1 invocation will create/fork a secondary process.
At run-time, it is also possible to tweak the CLANG_SPAWN_CC1 environment variable. Setting it and will override the compile-time setting. A value of 0 calls -cc1 inside the calling process; a value of 1 will create a secondary process, as before.
Differential Revision: https://reviews.llvm.org/D69825
which is the default TLS model for non-PIC objects. This allows large/
many thread local variables or a compact/fast code in an executable.
Specification is same as that of GCC. For example, the code model
option precedes the TLS size option.
TLS access models other than local-exec are not changed. It means
supoort of the large code model is only in the local exec TLS model.
Patch By KAWASHIMA Takahiro (kawashima-fj <t-kawashima@fujitsu.com>)
Reviewers: dmgreen, mstorsjo, t.p.northover, peter.smith, ostannard
Reviewd By: peter.smith
Committed by: peter.smith
Differential Revision: https://reviews.llvm.org/D71688
Summary:
This patch will provide support for auto return type for the C++ member
functions.
This patch includes clang side implementation of this feature.
Patch by: Awanish Pandey <Awanish.Pandey@amd.com>
Reviewers: dblaikie, aprantl, shafik, alok, SouraVX, jini.susan.george
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D70524
Summary:
The analysis for const-ness of local variables required a view generally useful
matchers that are extracted into its own patch.
They are `decompositionDecl` and `forEachArgumentWithParamType`, that works
for calls through function pointers as well.
Reviewers: aaron.ballman
Reviewed By: aaron.ballman
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72505
Use castAs<> instead of getAs<> since the pointer is dereferenced immediately within mangleCallingConvention and castAs will perform the null assertion for us.
No longer generate a diagnostic when a small trivially copyable type is
used without a reference. Before the test looked for a POD type and had no
size restriction. Since the range-based for loop is only available in
C++11 and POD types are trivially copyable in C++11 it's not required to
test for a POD type.
Since copying a large object will be expensive its size has been
restricted. 64 bytes is a common size of a cache line and if the object is
aligned the copy will be cheap. No performance impact testing has been
done.
Differential Revision: https://reviews.llvm.org/D72212
D41910 introduced a recursive visitor to MarkUsedTemplateParameters, but
disregarded the 'Depth' parameter, and had incorrect assertions. This fixes
the visitor and removes the assertions.
All 130+ f_Group flags that take an argument allow it after a '=',
except for fdebug-complation-dir. Add a Joined<> alias so that
it behaves consistently with all the other f_Group flags.
(Keep the old Separate flag for backwards compat.)
type computation, in preparation for P0388R4, which adds another few
cases here.
We now properly handle forming multi-level composite pointer types
involving nested Objective-C pointer types (as is consistent with
including them as part of the notion of 'similar types' on which this
rule is based). We no longer lose non-CVR qualifiers on nested pointer
types.
This fixes a regression introduced in
2b4fa5348e that caused us to emit
shutdown-time destruction for variables with ARC ownership, using
C++-specific functions that don't exist in C implementations.
Follow-up of D72014. It is more appropriate to use a target
feature instead of a SubTypeArch to express the difference.
Reviewed By: #powerpc, jhibbits
Differential Revision: https://reviews.llvm.org/D72433
In the backend, this feature is implemented with the function attribute
"patchable-function-entry". Both the attribute and XRay use
TargetOpcode::PATCHABLE_FUNCTION_ENTER, so the two features are
incompatible.
Reviewed By: ostannard, MaskRay
Differential Revision: https://reviews.llvm.org/D72222
This feature is generic. Make it applicable for AArch64 and X86 because
the backend has only implemented NOP insertion for AArch64 and X86.
Reviewed By: nickdesaulniers, aaron.ballman
Differential Revision: https://reviews.llvm.org/D72221
Summary:
This checker verifies if default placement new is provided with pointers
to sufficient storage capacity.
Noncompliant Code Example:
#include <new>
void f() {
short s;
long *lp = ::new (&s) long;
}
Based on SEI CERT rule MEM54-CPP
https://wiki.sei.cmu.edu/confluence/display/cplusplus/MEM54-CPP.+Provide+placement+new+with+properly+aligned+pointe
This patch does not implement checking of the alignment.
Reviewers: NoQ, xazax.hun
Subscribers: mgorny, whisperity, xazax.hun, baloghadamsoftware, szepet,
rnkovacs, a.sidorin, mikhail.ramalho, donat
Tags: #clang
Differential Revision: https://reviews.llvm.org/D71612
Summary:
Avoid using the `nocf_check` attribute with Control Flow Guard. Instead, use a
new `"guard_nocf"` function attribute to indicate that checks should not be
added on indirect calls within that function. Add support for
`__declspec(guard(nocf))` following the same syntax as MSVC.
Reviewers: rnk, dmajor, pcc, hans, aaron.ballman
Reviewed By: aaron.ballman
Subscribers: aaron.ballman, tomrittervg, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D72167
Update the IRBuilder to generate constrained FP comparisons in
CreateFCmp when IsFPConstrained is true, similar to the other
places in the IRBuilder.
Also, add a new CreateFCmpS to emit signaling FP comparisons,
and use it in clang where comparisons are supposed to be signaling
(currently, only when emitting code for the <, <=, >, >= operators).
Note that there is currently no way to add fast-math flags to a
constrained FP comparison, since this is implemented as an intrinsic
call that returns a boolean type, and FMF are only allowed for calls
returning a floating-point type. However, given the discussion around
https://bugs.llvm.org/show_bug.cgi?id=42179, it seems that FCmp itself
really shouldn't have any FMF either, so this is probably OK.
Reviewed by: craig.topper
Differential Revision: https://reviews.llvm.org/D71467
If a system header provides an (inline) implementation of some of their
function, clang still matches on the function name and generate the appropriate
llvm builtin, e.g. memcpy. This behavior is in line with glibc recommendation «
users may not provide their own version of symbols » but doesn't account for the
fact that glibc itself can provide inline version of some functions.
It is the case for the memcpy function when -D_FORTIFY_SOURCE=1 is on. In that
case an inline version of memcpy calls __memcpy_chk, a function that performs
extra runtime checks. Clang currently ignores the inline version and thus
provides no runtime check.
This code fixes the issue by detecting functions whose name is a builtin name
but also have an inline implementation.
Differential Revision: https://reviews.llvm.org/D71082
down to pass builder in ltobackend.
Currently CodeGenOpts like UnrollLoops/VectorizeLoop/VectorizeSLP in clang
are not passed down to pass builder in ltobackend when new pass manager is
used. This is inconsistent with the behavior when new pass manager is used
and thinlto is not used. Such inconsistency causes slp vectorization pass
not being enabled in ltobackend for O3 + thinlto right now. This patch
fixes that.
Differential Revision: https://reviews.llvm.org/D72386
The language wording change forgot to update overload resolution to rank
implicit conversion sequences based on qualification conversions in
reference bindings. The anticipated resolution for that oversight is
implemented here -- we order candidates based on qualification
conversion, not only on top-level cv-qualifiers, including ranking
reference bindings against non-reference bindings if they differ in
non-top-level qualification conversions.
For OpenCL/C++, this allows reference binding between pointers with
differing (nested) address spaces. This makes the behavior of reference
binding consistent with that of implicit pointer conversions, as is the
purpose of this change, but that pre-existing behavior for pointer
conversions is itself probably not correct. In any case, it's now
consistently the same behavior and implemented in only one place.
This reinstates commit de21704ba9,
reverted in commit d8018233d1, with
workarounds for some overload resolution ordering problems introduced by
CWG2352.
explicit functions that are not candidates.
It's not always obvious that the reason a conversion was not possible is
because the function you wanted to call is 'explicit', so explicitly say
if that's the case.
It would be nice to rank the explicit candidates higher in the
diagnostic if an implicit conversion sequence exists for their
arguments, but unfortunately we can't determine that without potentially
triggering non-immediate-context errors that we're not permitted to
produce.
This change introduces three new builtins (which work on both pointers
and integers) that can be used instead of common bitwise arithmetic:
__builtin_align_up(x, alignment), __builtin_align_down(x, alignment) and
__builtin_is_aligned(x, alignment).
I originally added these builtins to the CHERI fork of LLVM a few years ago
to handle the slightly different C semantics that we use for CHERI [1].
Until recently these builtins (or sequences of other builtins) were
required to generate correct code. I have since made changes to the default
C semantics so that they are no longer strictly necessary (but using them
does generate slightly more efficient code). However, based on our experience
using them in various projects over the past few years, I believe that adding
these builtins to clang would be useful.
These builtins have the following benefit over bit-manipulation and casts
via uintptr_t:
- The named builtins clearly convey the semantics of the operation. While
checking alignment using __builtin_is_aligned(x, 16) versus
((x & 15) == 0) is probably not a huge win in readably, I personally find
__builtin_align_up(x, N) a lot easier to read than (x+(N-1))&~(N-1).
- They preserve the type of the argument (including const qualifiers). When
using casts via uintptr_t, it is easy to cast to the wrong type or strip
qualifiers such as const.
- If the alignment argument is a constant value, clang can check that it is
a power-of-two and within the range of the type. Since the semantics of
these builtins is well defined compared to arbitrary bit-manipulation,
it is possible to add a UBSAN checker that the run-time value is a valid
power-of-two. I intend to add this as a follow-up to this change.
- The builtins avoids int-to-pointer casts both in C and LLVM IR.
In the future (i.e. once most optimizations handle it), we could use the new
llvm.ptrmask intrinsic to avoid the ptrtoint instruction that would normally
be generated.
- They can be used to round up/down to the next aligned value for both
integers and pointers without requiring two separate macros.
- In many projects the alignment operations are already wrapped in macros (e.g.
roundup2 and rounddown2 in FreeBSD), so by replacing the macro implementation
with a builtin call, we get improved diagnostics for many call-sites while
only having to change a few lines.
- Finally, the builtins also emit assume_aligned metadata when used on pointers.
This can improve code generation compared to the uintptr_t casts.
[1] In our CHERI compiler we have compilation mode where all pointers are
implemented as capabilities (essentially unforgeable 128-bit fat pointers).
In our original model, casts from uintptr_t (which is a 128-bit capability)
to an integer value returned the "offset" of the capability (i.e. the
difference between the virtual address and the base of the allocation).
This causes problems for cases such as checking the alignment: for example, the
expression `if ((uintptr_t)ptr & 63) == 0` is generally used to check if the
pointer is aligned to a multiple of 64 bytes. The problem with offsets is that
any pointer to the beginning of an allocation will have an offset of zero, so
this check always succeeds in that case (even if the address is not correctly
aligned). The same issues also exist when aligning up or down. Using the
alignment builtins ensures that the address is used instead of the offset. While
I have since changed the default C semantics to return the address instead of
the offset when casting, this offset compilation mode can still be used by
passing a command-line flag.
Reviewers: rsmith, aaron.ballman, theraven, fhahn, lebedev.ri, nlopes, aqjune
Reviewed By: aaron.ballman, lebedev.ri
Differential Revision: https://reviews.llvm.org/D71499
Summary:
A few of the ARM MVE builtins directly return a structure type. This
causes an assertion failure at code-gen time if you try to assign the
result of the builtin to a variable, because the `RValue` created in
`EmitBuiltinExpr` from the `llvm::Value` produced by codegen is always
made by `RValue::get()`, which creates a non-aggregate `RValue` that
will fail an assertion when `AggExprEmitter::withReturnValueSlot` calls
`Src.getAggregatePointer()`. A similar failure occurs if you try to use
the struct return value directly to extract one field, e.g.
`vld2q(address).val[0]`.
The existing code-gen tests for those MVE builtins pass the returned
structure type directly to the C `return` statement, which apparently
managed to avoid that particular code path, so we didn't notice the
crash.
Now `EmitBuiltinExpr` checks the evaluation kind of the builtin's return
value, and does the necessary handling for aggregate returns. I've added
two extra test cases, both of which crashed before this change.
Reviewers: dmgreen, rjmccall
Reviewed By: rjmccall
Subscribers: kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72271
Architecturally, it's allowed to have MVE-I without an FPU, thus
-mfpu=none should not disable MVE-I, or moves to/from FP-registers.
This patch removes `+/-fpregs` from features unconditionally added to
target feature list, depending on FPU and moves the logic to Clang
driver, where the negative form (`-fpregs`) is conditionally added to
the target features list for the cases of `-mfloat-abi=soft`, or
`-mfpu=none` without either `+mve` or `+mve.fp`. Only the negative
form is added by the driver, the positive one is derived from other
features in the backend.
Differential Revision: https://reviews.llvm.org/D71843
Function trailing requires clauses now parsed, supported in overload resolution and when calling, referencing and taking the address of functions or function templates.
Differential Revision: https://reviews.llvm.org/D43357
`APFLoat::convertFromString` returns `Expected` result, which must be
"checked" if the LLVM_ENABLE_ABI_BREAKING_CHECKS preprocessor flag is
set.
To mark an `Expected` result as "checked" we must consume the `Error`
within.
In many cases, we are only interested in knowing if an error occured,
without the need to examine the error info. This is achieved, easily,
with the `errorToBool()` API.
Summary:
This allows the use of '-target powerpcspe-unknown-linux-gnu' or
'powerpcspe-unknown-freebsd' to be used, instead of
'-target powerpc-unknown-linux-gnu -mspe'.
Reviewed By: dim
Differential Revision: https://reviews.llvm.org/D72014
This matcher matches any node and at the same time executes all its
inner matchers to produce any possbile result bindings.
This is useful when a user wants certain supplementary information
that's not always present along with the main match result.
An empty string literal in an asm label does not make a whole lot of sense. GCC
does not diagnose such a construct, but it also generates code that cannot be
assembled by gas should two symbols have an empty asm label within the same TU.
This does not affect an asm statement with an empty string literal, which is
still a useful construct.
Apple's CPUs are called A7-A13 in official communication, occasionally with
weird suffixes which we probably don't need to care about. This adds each one
and describes its features. It also switches the default CPU to the canonical
name for Cyclone, but leaves legacy support in so that existing bitcode still
compiles.
According to D53384, the default was switched from -fno-PIC to -fPIC to
work around a -fsanitize=leak bug on big-endian.
This gratuitous difference between little-endian and big-endian is
undesired, and not acceptable on powerpc64-unknown-freebsd. If
-fsanitize=leak still has the problem, we should consider defaulting to
-fPIC/-fPIE only when -fsanitize=leak is specified (see SanitizerArgs::requiresPIE())
powerpc64-ibm-aix is unaffected: it still defaults to -fPIC.
powerpc64-linux-musl is unaffected (-fPIE since D39588): it still defaults to -fPIE.
Reviewed By: #powerpc, jhibbits
Differential Revision: https://reviews.llvm.org/D72363
Summary:
Every powerpc64le platform uses elfv2.
For powerpc64, the environments "elfv1" and "elfv2" were added for
FreeBSD ELFv1->ELFv2 migration in D61950. FreeBSD developers have
decided to use OS versions to select ABI, and no one is relying on the
environments.
Also use elfv2 on powerpc64-linux-musl.
Users can always use -mabi=elfv1 and -mabi=elfv2 to override the default
ABI.
Reviewed By: adalava
Differential Revision: https://reviews.llvm.org/D72352
Use canonical decls instead of mangled names in the set of already
emitted decls. This allows to reduce the number of function calls for
getting declarations mangled names and speedup the compilation.
If standalone OpenMP declaration pragma, like declare mapper or declare
reduction, is declared in the class context, it may reference a member
(data or function) in its internal expressions/statements. So, the
parsing of such pragmas must be dalayed just like the parsing of the
member initializers/definitions before the completion of the class
declaration.
It turns out it is useful to be able to define the deref type as void.
In case we have a type erased owner, we want to express that the pointee
can be basically any type. It should not be unnatural to have a void
deref type as we already familiar with "pointers to void".
Differential Revision: https://reviews.llvm.org/D72097
declare simd.
According to the standard, a list-item that appears in a linear clause without the ref modifier must be of integral or pointer type, or must be a reference to an integral or pointer type. Added check that this restriction is applied only to non-ref items.
pack expansion.
Previously, if all parameter / argument pairs for a pack expansion
deduction were non-deduced contexts, we would not deduce the arity of
the pack, and could end up deducing a different arity (leading to
failures during substitution) or defaulting to an arity of 0 (leading to
bad diagnostics about passing the wrong number of arguments to a
variadic function). Instead, we now always deduce the arity for all
involved packs any time we deduce a pack expansion.
This will result in less substitution happening in some cases, which
could avoid non-SFINAEable errors, and should generally improve the
quality of diagnostics when passing initializer lists to variadic
functions.
properties of the protocol it inherits
This fixes a bug where the type string for a @dynamic property of an
@implementation didn't have 'D' in it when the protocol it conforms to
redeclares the property declared in the base protocol.
rdar://problem/45503561
Summary:
Found a bug introduced with BraceWrappingFlags AfterControlStatement MultiLine. This feature conflicts with the existing BeforeCatch and BeforeElse flags.
For example, our team uses BeforeElse.
if (foo ||
bar) {
doSomething();
}
else {
doSomethingElse();
}
If we enable MultiLine (which we'd really love to do) we expect it to work like this:
if (foo ||
bar)
{
doSomething();
}
else {
doSomethingElse();
}
What we actually get is:
if (foo ||
bar)
{
doSomething();
}
else
{
doSomethingElse();
}
Reviewers: MyDeveloperDay, Bouska, mitchell-stellar
Patch by: pastey
Subscribers: Bouska, cfe-commits
Tags: clang
Differential Revision: https://reviews.llvm.org/D71939
Linux' current addLibCxxIncludePaths and addLibStdCxxIncludePaths
are actually almost non-Linux-specific at all, and can be reused
almost as such for all gcc toolchains. Only keep
Android/Freescale/Cray hacks in Linux's version.
Patch by sthibaul (Samuel Thibault)
Differential Revision: https://reviews.llvm.org/D69758
-mpacked-stack is currently not supported with -mbackchain, so this should
result in a compilation error message instead of being silently ignored.
Review: Ulrich Weigand
Before:
class Foo {
@CommandLineFlags
.Add
@Features.foo
public void test() {}
}
Now:
class Foo {
@Features.foo
@CommandLineFlags.Add
public void test() { }
}
See also https://crbug.com/1034115
The OpenMP specification disallows having zero-length array
sections in the depend clause (OpenMP 5.0 2.17.11).
Differential Revision: https://reviews.llvm.org/D71969
Summary:
this allow much better support of codebases like the linux kernel that mix tabs and spaces.
-ftabstop=//Width// allow specifying how large tabs are considered to be.
Reviewers: xbolva00, aaron.ballman, rsmith
Reviewed By: aaron.ballman
Subscribers: mstorsjo, cfe-commits, jyknight, riccibruno, rsmith, nathanchance
Tags: #clang
Differential Revision: https://reviews.llvm.org/D71037
When they are free-standing, e.g. `struct X;` or `struct X {};`.
Although this complicates the common case (of free-standing class
declarations), this ensures the less common case (e.g. `struct X {} a;`)
are handled uniformly and produce similar syntax trees.
Added codegen support for lastprivate conditional. According to the
standard, if when the conditional modifier appears on the clause, if an
assignment to a list item is encountered in the construct then the
original list item is assigned the value that is assigned to the new
list item in the sequentially last iteration or lexically last section
in which such an assignment is encountered.
We look for the assignment operations and check if the left side
references lastprivate conditional variable. Then the next code is
emitted:
if (last_iv_a <= iv) {
last_iv_a = iv;
last_a = lp_a;
}
At the end the implicit barrier is generated to wait for the end of all
threads and then in the check for the last iteration the private copy is
assigned the last value.
if (last_iter) {
lp_a = last_a; // <--- new code
a = lp_a; // <--- store of private value to the original variable.
}
Adds a new ASTMatcher condition called 'hasInitStatement()' that matches if,
switch and range-for statements with an initializer. Reworked clang-tidy
readability-else-after-return to handle variables in the if condition or init
statements in c++17 ifs. Also checks if removing the else would affect object
lifetimes in the else branch.
Fixes PR44364.
There's quite a lot of references to Polly in the LLVM CMake codebase. However
the registration pattern used by Polly could be useful to other external
projects: thanks to that mechanism it would be possible to develop LLVM
extension without touching the LLVM code base.
This patch has two effects:
1. Remove all code specific to Polly in the llvm/clang codebase, replaicing it
with a generic mechanism
2. Provide a generic mechanism to register compiler extensions.
A compiler extension is similar to a pass plugin, with the notable difference
that the compiler extension can be configured to be built dynamically (like
plugins) or statically (like regular passes).
As a result, people willing to add extra passes to clang/opt can do it using a
separate code repo, but still have their pass be linked in clang/opt as built-in
passes.
Differential Revision: https://reviews.llvm.org/D61446
Summary: `getListOfPossibleValues()` formatted incorrectly when there is only one value, emitting something like `expected 'conditional' or in OpenMP clause 'lastprivate'`.
Reviewers: jdoerfert, ABataev
Reviewed By: jdoerfert
Subscribers: guansong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D71884
The Wrange-loop-analyses warns if a copy is made. Suppress this warning when
a temporary is bound to a rvalue reference.
While fixing this issue also found a copy-paste error in test6, which is also
fixed.
Differential Revision: https://reviews.llvm.org/D71806
clang/lib/CodeGen/CodeGenModule performs the -mpie-copy-relocations
check and sets dso_local on applicable global variables. We don't need
to duplicate the work in TargetMachine shouldAssumeDSOLocal.
Verified that -mpie-copy-relocations can still emit PC relative
relocations for external variable accesses.
clang -target x86_64 -fpie -mpie-copy-relocations -c => R_X86_64_PC32
clang -target aarch64 -fpie -mpie-copy-relocations -c => R_AARCH64_ADR_PREL_PG_HI21+R_AARCH64_LDST64_ABS_LO12_NC
This reverts commit b47b35ff51.
This caused failed asserts (Assertion `FIDAndOffset.second >
ColNo && "Column number smaller than file offset?"' failed.)
on a source file with a single line containing
"int main (void) { for( int i = 0; i < 9; i++ ); return 0; }".
We already recognize the __builtin versions of these, might as well
recognize the libcall version.
Differential Revision: https://reviews.llvm.org/D72028