Previoulsy debug-info-for-profiling and pseudo-probe-for-profiling are mutual exclusive because they compete the dwarf discrimnator for callsites on the IR. This changes allows to use the two switches together. The side effect is that callsite discriminators will be taken by pseudo probe, while discriminators for other instructions are still available for AutoFDO use. This is less than ideal, however, it still allows us a chance to smoothly transition from AutoFDO to CSSPGO, by collecting both profiles from a CSSPGO binary.
Reviewed By: wenlei, wmi
Differential Revision: https://reviews.llvm.org/D107876
This patch adjusts the intrinsics definition of
llvm.matrix.column.major.load and llvm.matrix.column.major.store to
allow overloading the type of the stride. The bitwidth of the stride is
used to perform the offset computation.
This fixes a crash when using __builtin_matrix_column_major_load or
__builtin_matrix_column_major_store on 32 bit platforms. The stride argument
of the builtins are defined as `size_t`, which is 32 bits wide on 32 bit
platforms.
Note that we still perform offset computations with 64 bit width on 32
bit platforms for accesses that do not take a user-specified stride.
This can be fixed separately.
Fixes PR51304.
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D107349
This matches the behavior of GCC.
Patch does not change remapping logic itself, so adding one simple smoke test should be enough.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D107393
The declaration for the global new function in C++ is generated in the compiler front-end. When examining exception propagation, we found that this is the largest root throw site propagator requiring unwind code to be generated for callers up the stack. Allowing this to be handled immediately with termination stops upward propagation and leads to significantly less landing pads generated. This in turns leads to a performance and .text size win.
With `-fnew-infallible` this annotates the declaration with `throw()` and `__attribute__((returns_nonnull))`. `throw()` allows the compiler to assume exceptions do not propagate out of new and eliminate it as a root throw site. Note that the definition of global new is user-replaceable so users should ensure that the one used follows these semantics.
Measuring internally, we're seeing at 0.5% CPU win in one of our large internal FB workload. Measuring on clang self-build (cd0a1226b5) we get:
thinlto/
"dwarfehprepare.NumCleanupLandingPadsRemaining": 153494,
"dwarfehprepare.NumNoUnwind": 26309,
thinlto_newinfallible/
"dwarfehprepare.NumCleanupLandingPadsRemaining": 143660,
"dwarfehprepare.NumNoUnwind": 28744,
a 1-143660/153494 = 6.4% reduction in landing pads and a 28744/26309 = 9.3% increase in the number of nounwind functions.
Testing:
ninja check-all
new test case to make sure these attributes are added correctly to global new.
Reviewed By: urnathan
Differential Revision: https://reviews.llvm.org/D105225
Target-dependent constant folding will fold these down to simple
constants (or at least, expressions that don't involve a GEP). We don't
need heroics to try to optimize the form of the expression before that
happens.
Fixes https://bugs.llvm.org/show_bug.cgi?id=51232 .
Differential Revision: https://reviews.llvm.org/D107116
On ELF, an SHT_INIT_ARRAY outside a section group is a GC root. The current
codegen abuses SHT_INIT_ARRAY in a section group to mean a GC root.
On PE/COFF, the dynamic initialization for `__declspec(selectany)` in a comdat
can be garbage collected by `-opt:ref`.
Call `addUsedGlobal` for the two cases to fix the abuse/bug.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D106925
Constructor homing reduces the amount of class type info that is emitted
by emitting conmplete type info for a class only when a constructor for
that class is emitted.
This will mainly reduce the amount of duplicate debug info in object
files. In Chrome enabling ctor homing decreased total build directory sizes
by about 30%.
It's also expected that some class types (such as unused classes)
will no longer be emitted in the debug info. This is fine, since we wouldn't
expect to need these types when debugging.
In some cases (e.g. libc++, https://reviews.llvm.org/D98750), classes
are used without calling the constructor. Since this is technically
undefined behavior, enabling constructor homing should be fine.
However Clang now has an attribute
`__attribute__((standalone_debug))` that can be used on classes to
ignore ctor homing.
Bug: https://bugs.llvm.org/show_bug.cgi?id=46537
Differential Revision: https://reviews.llvm.org/D106084
DIEnumerator stores an APInt as of April 2020, so now we don't need to
truncate the enumerator value to 64 bits. Fixes assertions during IRGen.
Split from D105320, thanks to Matheus Izvekov for the test case and
report.
Differential Revision: https://reviews.llvm.org/D106585
Summary:
The AIX linker will produce errors on unresolved weak symbols. Change the
generated code to not check for the initialization function but just call
it and ensure that it always exists. Also, the AIX atexit routine has a
different name (and signature) so call it correctly. Update the lit tests
to test on AIX appropriately.
Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: hubert.reinterpretcast (Hubert Tong)
Differential Revision: https://reviews.llvm.org/D104420
It's noteworthy that GCC has the same bug here, which is a bit
surprising. Both Clang and GCC's bug is only for function template
arguments that are themselves templates with default template arguments
(f1<t1<int[, missing_default_here]>>). Probably because function name
matching isn't generally necessary - whereas type matching is necessary
for DWARF consumers to associate declarations and definitions across
translation units, so the bug's been addressed there already - but
continued to exist for function templates since it's fairly benign
there.
I came across this while working on a change that could reconstitute
these pretty printed names based on the rest of the DWARF, reducing the
size of the DWARF by not having to encode all the template parameters in
the name string. That reconstitution code can't tell the difference
between a defaulted argument or not, so couldn't create the current
buggy-ish output.
Making the names more consistent between direct and indirect references,
and between function and class templates seems all to the good.
(I fixed the function template version of this a few years back in
9fdd09a4cc - clearly I should've looked
more closely and generalized the code better so it only had to be fixed
once - well, doing that here now)
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D102107
If the instantiation of a member variable makes it possible to
compute a previously undeduced type, we should use that piece of
information.
Fix bug#50590
Differential Revision: https://reviews.llvm.org/D103849
Reapply with fixes for clang tests.
-----
This is a simple enum attribute. Test changes are because enum
attributes are sorted before type attributes, so mustprogress is
now in a different position.
When building the member call to a user conversion function during an
implicit cast, the expression was not being checked for immediate
invocation, so we were never adding the ConstantExpr node to AST.
This would cause the call to the user conversion operator to be emitted
even if it was constantexpr evaluated, and this would even trip an
assert when said user conversion was declared consteval:
`Assertion failed: !cast<FunctionDecl>(GD.getDecl())->isConsteval() && "consteval function should never be emitted", file clang\lib\CodeGen\CodeGenModule.cpp, line 3530`
Fixes PR48855.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D105446
copy/dispose helper functions
We found out that these fake functions would cause clang to crash if the
changes proposed in https://reviews.llvm.org/D98799 were made.
The original patch was reverted in f681fd927e
because debug locations were missing in the body of the block byref
helper functions. This patch fixes the bug by calling CreateArtificial
after the calls to StartFunction.
Differential Revision: https://reviews.llvm.org/D104082
Non-throwing allocators currently will always get null-check code. However, if the non-throwing allocator is explicitly annotated with returns_nonnull the null check should be elided.
Testing:
ninja check-all
added test case correctly elides
Reviewed By: bruno
Differential Revision: https://reviews.llvm.org/D102820
On x86_64 mingw, long doubles are always passed indirectly as
arguments (see an existing case in WinX86_64ABIInfo::classify);
generalize the existing code for reading varargs - any non-aggregate
type that is larger than 64 bits (which would be both long double
in mingw, and __int128) are passed indirectly too.
This makes reading varargs consistent with how they're passed,
fixing interop with both gcc and clang callers, for long double
and __int128.
Differential Revision: https://reviews.llvm.org/D103452
This fixes PR49198: Wrong usage of __dso_handle in user code leads to
a compiler crash.
When Init is an address of the global itself, we need to track it
across RAUW. Otherwise the initializer can be destroyed if the global
is replaced.
Differential Revision: https://reviews.llvm.org/D101156
All fuchsia targets will now use the relative-vtables ABI by default.
Also remove -fexperimental-relative-c++-abi-vtables from test RUNs targeting fuchsia.
Differential Revision: https://reviews.llvm.org/D102374
At the moment, the matrix support in CheckCXXCStyleCast (added in
D101696) breaks function-style constructor calls that take a
single matrix value, because it is treated as matrix cast.
Instead, unify the C++ matrix cast handling by moving the logic to
TryStaticCast and only handle the case where both types are matrix
types. Otherwise, fall back to the generic mis-match detection.
Suggested by @rjmccall
Reviewed By: SaurabhJha
Differential Revision: https://reviews.llvm.org/D103163
This relands commit 13dd65b3a1.
The original commit contained a test, which failed when compiled
for a MACH-O target.
This patch changes the test to run for x86_64-linux instead of
`%itanium_abi_triple`, to avoid having invalid syntax for MACH-O
sections. The patch itself does not care about section attribute
syntax and a x86 backend does not even need to be included in the
build.
Differential Revision: https://reviews.llvm.org/D102693
When a const-qualified object has a section attribute, that
section is set to read-only and clang outputs a LLVM IR constant
for that object. This is incorrect for dynamically initialised
objects.
For example:
int init() { return 15; }
__attribute__((section("SA")))
const int a = init();
a is allocated to a read-only section and is left
unintialised (zero-initialised).
This patch adds checks if an initialiser is a constant expression
and allocates objects to sections as follows:
* const-qualified objects
- no initialiser or constant initialiser: .rodata
- dynamic initializer: .bss
* non const-qualified objects
- no initialiser or dynamic initialiser: .bss
- constant initialiser: .data
(".rodata", ".data", and ".bss" names used just for explanatory
purpose)
Differential Revision: https://reviews.llvm.org/D102693
When -gstrict-dwarf is specified, generate DW_TAG_rvalue_reference_type
at DWARF 4 or above
Reviewed By: dblaikie, aprantl
Differential Revision: https://reviews.llvm.org/D100630
.byte supports string, so if the whole byte list are printable,
we can actually print the string for readability and LIT tests maintainence.
.byte 'H,'e,'l,'l,'o,',,' ,'w,'o,'r,'l,'d
->
.byte "Hello, world"
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D102814
It turns out we have not correctly supported exception spec all along in
Emscripten EH. Emscripten EH supports `throw()` but not `throw` with
types. See https://bugs.llvm.org/show_bug.cgi?id=50396.
Wasm EH also only supports `throw()` but not `throw` with types, and we
have been printing a warning message for the latter. This prints the
same warning message for `throw` with types when Emscripten EH is used,
or more precisely, when Wasm EH is not used. (So this will print the
warning messsage even when `-fno-exceptions` is used but I think that
should be fine. It's cumbersome to do a complilcated option checking in
CGException.cpp and options checkings are mostly done in elsewhere.)
Reviewed By: dschuff, kripken
Differential Revision: https://reviews.llvm.org/D102791
We use `CHECK-LABEL: define` to divide input stream into functions,
this works well on most platforms.
But there are cases that some platforms (eg: AIX) may have different
codegen , especially for global constructor and descructors.
On AIX, the codegen will have two more functions: __dtor_b,
__finalize_b, which will fail the test.
The fix is to use specific function name so that we can safely ignore
those unrelated codegen differences.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102654
I wouldn't recommend writing code like the testcase; a function
parameter isn't atomic, so using an atomic type doesn't really make
sense. But it's valid, so clang shouldn't crash on it.
The code was assuming hasAggregateEvaluationKind(Ty) implies Ty is a
RecordType, which isn't true. Just use isRecordType() instead.
Differential Revision: https://reviews.llvm.org/D102015
I've taken the following steps to add unwinding support from inline assembly:
1) Add a new `unwind` "attribute" (like `sideeffect`) to the asm syntax:
```
invoke void asm sideeffect unwind "call thrower", "~{dirflag},~{fpsr},~{flags}"()
to label %exit unwind label %uexit
```
2.) Add Bitcode writing/reading support + LLVM-IR parsing.
3.) Emit EHLabels around inline assembly lowering (SelectionDAGBuilder + GlobalISel) when `InlineAsm::canThrow` is enabled.
4.) Tweak InstCombineCalls/InlineFunction pass to not mark inline assembly "calls" as nounwind.
5.) Add clang support by introducing a new clobber: "unwind", which lower to the `canThrow` being enabled.
6.) Don't allow unwinding callbr.
Reviewed By: Amanieu
Differential Revision: https://reviews.llvm.org/D95745
The original change was reverted because it was discovered
that clang mishandles thunks, and they receive wrong
attributes for their this/return types - the ones for the function
they will call, not the ones they have.
While i have tried to fix this in https://reviews.llvm.org/D100388
that patch has been up and stuck for a month now,
with little signs of progress.
So while it will be good to solve this for real,
for now we can simply avoid introducing the bug,
by not annotating this/return for thunks.
This reverts commit 6270b3a1ea,
relanding 0aa0458f14.
As it was discovered in post-commit feedback
for 0aa0458f14,
we handle thunks incorrectly, and end up annotating
their this/return with attributes that are valid
for their callees, not for thunks themselves.
While it would be good to fix this properly,
and keep annotating them on thunks,
i've tried doing that in https://reviews.llvm.org/D100388
with little success, and the patch is stuck for a month now.
So for now, as a stopgap measure, subj.
Non-comprehensive list of cases:
* Dumping template arguments;
* Corresponding parameter contains a deduced type;
* Template arguments are for a DeclRefExpr that hadMultipleCandidates()
Type information is added in the form of prefixes (u8, u, U, L),
suffixes (U, L, UL, LL, ULL) or explicit casts to printed integral template
argument, if MSVC codeview mode is disabled.
Differential revision: https://reviews.llvm.org/D77598
Commit 5baea05601 set the CurCodeDecl
because it was needed to pass the assert in CodeGenFunction::EmitLValueForLambdaField,
But this was not right to do as CodeGenFunction::FinishFunction passes it to EmitEndEHSpec
and cause corruption of the EHStack.
Revert the part of the commit that changes the CurCodeDecl, and instead
adjust the assert to check for a null CurCodeDecl.
Differential Revision: https://reviews.llvm.org/D102027
This implements the flag proposed in RFC
http://lists.llvm.org/pipermail/cfe-dev/2020-August/066437.html.
The goal is to add a way to override the default target C++ ABI through a
compiler flag. This makes it easier to test and transition between different
C++ ABIs through compile flags rather than build flags.
In this patch:
- Store -fc++-abi= in a LangOpt. This isn't stored in a CodeGenOpt because
there are instances outside of codegen where Clang needs to know what the
ABI is (particularly through ASTContext::createCXXABI), and we should be
able to override the target default if the flag is provided at that point.
- Expose the existing ABIs in TargetCXXABI as values that can be passed
through this flag.
- Create a .def file for these ABIs to make it easier to check flag values.
- Add an error for diagnosing bad ABI flag values.
Differential Revision: https://reviews.llvm.org/D85802
If a return value is explicitly rounded to 64 bits, an additional zext
instruction is emitted, and in some cases it prevents tail call
optimization.
As discussed in D100225, this rounding is not necessary and can be
disabled.
Differential Revision: https://reviews.llvm.org/D100591
The code example:
```
constexpr const char kEta[] = "Eta";
template <const char*, typename T> class Column {};
using quick = Column<kEta,double>;
void lookup() {
quick c1;
c1.ls();
}
```
emits error: no member named 'ls' in 'Column<&kEta, double>'. The patch fixes
the printed type name by not printing the ampersand for array types.
Differential Revision: https://reviews.llvm.org/D36368
Currently Clang does not add mustprogress to inifinite loops with a
known constant condition, matching C11 behavior. The forward progress
guarantee in C++11 and later should allow us to add mustprogress to any
loop (http://eel.is/c++draft/intro.progress#1).
This allows us to simplify the code dealing with adding mustprogress a
bit.
Reviewed By: aaron.ballman, lebedev.ri
Differential Revision: https://reviews.llvm.org/D96418
The NSS FileCheck variables at the end of the
CodeGenCXX/split-stacks.cpp clang testcase are off by 1, resulting in
the use of an undefined variable (NSS3). One of the CHECK-NOT is also
redundant because _Z8tnosplitIiEiv uses the same attribute as _Z3foov
without split stack. This commit fixes that.
Reviewed By: ChuanqiXu
Differential Revision: https://reviews.llvm.org/D99839
GCC 8 introduced these new pragmas to control loop unrolling. We should support them for compatibility reasons and the implementation itself requires few lines of code, since everything needed is already implemented for #pragma unroll/nounroll.
This is a Clang-only change and depends on the existing "musttail"
support already implemented in LLVM.
The [[clang::musttail]] attribute goes on a return statement, not
a function definition. There are several constraints that the user
must follow when using [[clang::musttail]], and these constraints
are verified by Sema.
Tail calls are supported on regular function calls, calls through a
function pointer, member function calls, and even pointer to member.
Future work would be to throw a warning if a users tries to pass
a pointer or reference to a local variable through a musttail call.
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D99517
ICC permits this, and after some extensive testing it looks like we can
support this with very little trouble. We intentionally don't choose to
do this with attribute-target (despite it likely working as well!)
because GCC does not support that, and introducing said
incompatibility doesn't seem worth it.
The existing Windows Itanium patches for dllimport/export
behaviour w.r.t vtables/rtti can't be adopted for PS4 due to
backwards compatibility reasons (see comments on
https://reviews.llvm.org/D90299).
This commit adds our PS4 scheme for this to Clang.
Differential Revision: https://reviews.llvm.org/D93203
Summary: The tags DW_LANG_C_plus_plus_14 and DW_LANG_C_plus_plus_11, introduced in Dwarf-5, are unexpected in previous versions. Fixing the mismathing doesn't have any drawbacks for any other debuggers, but helps dbx.
Reviewed By: aprantl, shchenz
Differential Revision: https://reviews.llvm.org/D99250
This ensures these types have distinct names if they are distinct types
(eg: if one is an instantiation with a type in one inline namespace, and
another from a type with the same simple name, but in a different inline
namespace).
As it is being noted in D99249, lack of alignment information on `this`
has been preventing LICM from happening.
For some time now, lack of alignment attribute does *not* imply
natural alignment, but an alignment of `1`.
Also, we used to treat dereferenceable as implying alignment,
but we no longer do, so it's a bugfix.
Differential Revision: https://reviews.llvm.org/D99790
Commit f495de43bd forgot two lines when
removing checks for strong and weak equality, resulting in the use of an
undefined FileCheck variable.
Reviewed By: Quuxplusone
Differential Revision: https://reviews.llvm.org/D99838
I think byval/sret and the others are close to being able to rip out
the code to support the missing type case. A lot of this code is
shared with inalloca, so catch this up to the others so that can
happen.
08196e0b2e exposed LowerExpectIntrinsic's
internal implementation detail in the form of
LikelyBranchWeight/UnlikelyBranchWeight options to the outside.
While this isn't incorrect from the results viewpoint,
this is suboptimal from the layering viewpoint,
and causes confusion - should transforms also use those weights,
or should they use something else, D98898?
So go back to status quo by making LikelyBranchWeight/UnlikelyBranchWeight
internal again, and fixing all the code that used it directly,
which currently is only clang codegen, thankfully,
to emit proper @llvm.expect intrinsics instead.
This patch is a second attempt at fixing a link error for MSVC
entry points when calling conventions are specified using a flag.
Calling conventions specified using flags should not be applied to MSVC
entry points. The default calling convention is set in this case. The
default calling convention for MSVC entry points main and wmain is cdecl.
For WinMain, wWinMain and DllMain, the default calling convention is
stdcall on 32 bit Windows.
Explicitly specified calling conventions are applied to MSVC entry points.
For MinGW, the default calling convention for all MSVC entry points is
cdecl.
First attempt: 4cff1b40da
Revert of first attempt: bebfc3b92d
Differential Revision: https://reviews.llvm.org/D97941
This reverts commit 809a1e0ffd.
Mach-O doesn't support dso_local and this change broke XNU because of the use of dso_local.
Differential Revision: https://reviews.llvm.org/D98458
The condition variable is in scope in the loop increment, so we need to
emit the jump destination from wthin the scope of the condition
variable.
For GCC compatibility (and compatibility with real-world 'FOR_EACH'
macros), 'continue' is permitted in a statement expression within the
condition of a for loop, though, so there are two cases here:
* If the for loop has no condition variable, we can emit the jump
destination before emitting the condition.
* If the for loop has a condition variable, we must defer emitting the
jump destination until after emitting the variable. We diagnose a
'continue' appearing in the initializer of the condition variable,
because it would jump past the initializer into the scope of that
variable.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D98816
This was motivated by the fact that constructor type homing (debug info
optimization that we want to turn on by default) drops some libc++ types,
so an attribute would allow us to override constructor homing and emit
them anyway. I'm currently looking into the particular libc++ issue, but
even if we do fix that, this issue might come up elsewhere and it might be
nice to have this.
As I've implemented it now, the attribute isn't specific to the
constructor homing optimization and overrides all of the debug info
optimizations.
Open to discussion about naming, specifics on what the attribute should do, etc.
Differential Revision: https://reviews.llvm.org/D97411
Commit 1b04bdc2f3 added support for
capturing the 'this' pointer in a SEH context (__finally or __except),
But the case in which the 'this' pointer is part of a lambda capture
was not handled properly
Differential Revision: https://reviews.llvm.org/D97687
This patch implements the conditional select operator for
ext_vector_types in C++. It does so by using the same semantics as for
C.
D71463 added support for the conditional select operator for VectorType
in C++. Unfortunately the semantics between ext_vector_type in C are
different to VectorType in C++. Select for ext_vector_type is based on
the MSB of the condition vector, whereas for VectorType it is `!= 0`.
This unfortunately means that the behavior is inconsistent between
ExtVectorType and VectorType, but I think using the C semantics for
ExtVectorType in C++ as well should be less surprising for users.
Reviewed By: erichkeane, aaron.ballman
Differential Revision: https://reviews.llvm.org/D98055
Background:
Wasm EH, while using Windows EH (catchpad/cleanuppad based) IR, uses
Itanium-based libraries and ABIs with some modifications.
`__clang_call_terminate` is a wrapper generated in Clang's Itanium C++
ABI implementation. It contains this code, in C-style pseudocode:
```
void __clang_call_terminate(void *exn) {
__cxa_begin_catch(exn);
std::terminate();
}
```
So this function is a wrapper to call `__cxa_begin_catch` on the
exception pointer before termination.
In Itanium ABI, this function is called when another exception is thrown
while processing an exception. The pointer for this second, violating
exception is passed as the argument of this `__clang_call_terminate`,
which calls `__cxa_begin_catch` with that pointer and calls
`std::terminate` to terminate the program.
The spec (https://libcxxabi.llvm.org/spec.html) for `__cxa_begin_catch`
says,
```
When the personality routine encounters a termination condition, it
will call __cxa_begin_catch() to mark the exception as handled and then
call terminate(), which shall not return to its caller.
```
In wasm EH's Clang implementation, this function is called from
cleanuppads that terminates the program, which we also call terminate
pads. Cleanuppads normally don't access the thrown exception and the
wasm backend converts them to `catch_all` blocks. But because we need
the exception pointer in this cleanuppad, we generate
`wasm.get.exception` intrinsic (which will eventually be lowered to
`catch` instruction) as we do in the catchpads. But because terminate
pads are cleanup pads and should run even when a foreign exception is
thrown, so what we have been doing is:
1. In `WebAssemblyLateEHPrepare::ensureSingleBBTermPads()`, we make sure
terminate pads are in this simple shape:
```
%exn = catch
call @__clang_call_terminate(%exn)
unreachable
```
2. In `WebAssemblyHandleEHTerminatePads` pass at the end of the
pipeline, we attach a `catch_all` to terminate pads, so they will be in
this form:
```
%exn = catch
call @__clang_call_terminate(%exn)
unreachable
catch_all
call @std::terminate()
unreachable
```
In `catch_all` part, we don't have the exception pointer, so we call
`std::terminate()` directly. The reason we ran HandleEHTerminatePads at
the end of the pipeline, separate from LateEHPrepare, was it was
convenient to assume there was only a single `catch` part per `try`
during CFGSort and CFGStackify.
---
Problem:
While it thinks terminate pads could have been possibly split or calls
to `__clang_call_terminate` could have been duplicated,
`WebAssemblyLateEHPrepare::ensureSingleBBTermPads()` assumes terminate
pads contain no more than calls to `__clang_call_terminate` and
`unreachable` instruction. I assumed that because in LLVM very limited
forms of transformations are done to catchpads and cleanuppads to
maintain the scoping structure. But it turned out to be incorrect;
passes can merge cleanuppads into one, including terminate pads, as long
as the new code has a correct scoping structure. One pass that does this
I observed was `SimplifyCFG`, but there can be more. After this
transformation, a single cleanuppad can contain any number of other
instructions with the call to `__clang_call_terminate` and can span many
BBs. It wouldn't be practical to duplicate all these BBs within the
cleanuppad to generate the equivalent `catch_all` blocks, only with
calls to `__clang_call_terminate` replaced by calls to `std::terminate`.
Unless we do more complicated transformation to split those calls to
`__clang_call_terminate` into a separate cleanuppad, it is tricky to
solve.
---
Solution (?):
This CL just disables the generation and use of `__clang_call_terminate`
and calls `std::terminate()` directly in its place.
The possible downside of this approach can be, because the Itanium ABI
intended to "mark" the violating exception handled, we don't do that
anymore. What `__cxa_begin_catch` actually does is increment the
exception's handler count and decrement the uncaught exception count,
which in my opinion do not matter much given that we are about to
terminate the program anyway. Also it does not affect info like stack
traces that can be possibly shown to developers.
And while we use a variant of Itanium EH ABI, we can make some
deviations if we choose to; we are already different in that in the
current version of the EH spec we don't support two-phase unwinding. We
can possibly consider a more complicated transformation later to
reenable this, but I don't think that has high priority.
Changes in this CL contains:
- In Clang, we don't generate a call to `wasm.get.exception()` intrinsic
and `__clang_call_terminate` function in terminate pads anymore; we
simply generate calls to `std::terminate()`, which is the default
implementation of `CGCXXABI::emitTerminateForUnexpectedException`.
- Remove `WebAssembly::ensureSingleBBTermPads() function and
`WebAssemblyHandleEHTerminatePads` pass, because terminate pads are
already `catch_all` now (because they don't need the exception
pointer) and we don't need these transformations anymore.
- Change tests to use `std::terminate` directly. Also removes tests that
tested `LateEHPrepare::ensureSingleBBTermPads` and
`HandleEHTerminatePads` pass.
- Drive-by fix: Add some function attributes to EH intrinsic
declarations
Fixes https://github.com/emscripten-core/emscripten/issues/13582.
Reviewed By: dschuff, tlively
Differential Revision: https://reviews.llvm.org/D97834
Use a WeakTrackingVH to cope with the stmt emission logic that cleans up
unreachable blocks. This invalidates the reference to the deferred
replacement placeholder. Cope with it.
Fixes PR25102 (from 2015!)
`GetAddrOfGlobalTemporary` previously tried to emit the initializer of
a global temporary before updating the global temporary map. Emitting the
initializer could recurse back into `GetAddrOfGlobalTemporary` for the same
temporary, resulting in an infinite recursion.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D97733
Simply make sure that the CodeGenFunction::CXXThisValue and CXXABIThisValue
are correctly initialized to the recovered value.
For lambda capture, we also need to make sure to fill the LambdaCaptureFields
Differential Revision: https://reviews.llvm.org/D97534
For ELF targets, GCC 11 will set SHF_GNU_RETAIN on the section of a
`__attribute__((retain))` function/variable to prevent linker garbage
collection. (See AttrDocs.td for the linker support).
This patch adds `retain` functions/variables to the `llvm.used` list, which has
the desired linker GC semantics. Note: `retain` does not imply `used`,
so an unused function/variable can be dropped by Sema.
Before 'retain' was introduced, previous ELF solutions require inline asm or
linker tricks, e.g. `asm volatile(".reloc 0, R_X86_64_NONE, target");`
(architecture dependent) or define a non-local symbol in the section and use
`ld -u`. There was no elegant source-level solution.
With D97448, `__attribute__((retain))` will set `SHF_GNU_RETAIN` on ELF targets.
Differential Revision: https://reviews.llvm.org/D97447
An global value in the `llvm.used` list does not have GC root semantics on ELF targets.
This will be changed in a subsequent backend patch.
Change some `llvm.used` in the ELF code path to use `llvm.compiler.used` to
prevent undesired GC root semantics.
Change one extern "C" alias (due to `__attribute__((used))` in extern "C") to use `llvm.compiler.used` on all targets.
GNU ld has a rule "`__start_/__stop_` references from a live input section retain the associated C identifier name sections",
which LLD may drop entirely (currently refined to exclude SHF_LINK_ORDER/SHF_GROUP) in a future release (the rule makes it clumsy to GC metadata sections; D96914 added a way to try the potential future behavior).
For `llvm.used` global values defined in a C identifier name section, keep using `llvm.used` so that
the future LLD change will not affect them.
rnk kindly categorized the changes:
```
ObjC/blocks: this wants GC root semantics, since ObjC mainly runs on Mac.
MS C++ ABI stuff: wants GC root semantics, no change
OpenMP: unsure, but GC root semantics probably don't hurt
CodeGenModule: affected in this patch to *not* use GC root semantics so that __attribute__((used)) behavior remains the same on ELF, plus two other minor use cases that don't want GC semantics
Coverage: Probably want GC root semantics
CGExpr.cpp: refers to LTO, wants GC root
CGDeclCXX.cpp: one is MS ABI specific, so yes GC root, one is some other C++ init functionality, which should form GC roots (C++ initializers can have side effects and must run)
CGDecl.cpp: Changed in this patch for __attribute__((used))
```
Differential Revision: https://reviews.llvm.org/D97446
It would be beneficial to allow not_tail_called attribute to be applied to
virtual functions. I don't see any drawback of allowing this.
Differential Revision: https://reviews.llvm.org/D96832
Patch takes advantage of the implicit default behavior to reduce the number of attributes, which in turns reduces compilation time.
Reviewed By: serge-sans-paille
Differential Revision: https://reviews.llvm.org/D97116
Follow-up to fe2dcd89ac.
Update test per review comments, restoring the "D" type to its
original state, and adding new "L" type. (Sorry, this was intended to
be included in the prior commit)
Differential Revision: https://reviews.llvm.org/D96044
Previously, the definition was so-marked, but the declaration was
not. This resulted in LLVM's dwarf emission treating the function as
being external, and incorrectly emitting DW_AT_external.
Differential Revision: https://reviews.llvm.org/D96044
When WPD is enabled, via WholeProgramVTables, emit type metadata for
available_externally vtables. Additionally, add the vtables to the
llvm.compiler.used global so that they are not prematurely eliminated
(before *LTO analysis).
This is needed to avoid devirtualizing calls to a function overriding a
class defined in a header file but with a strong definition in a shared
library. Without type metadata on the available_externally vtables from
the header, the WPD analysis never sees what a derived class is
overriding. Even if the available_externally base class functions are
pure virtual, because shared library definitions are already treated
conservatively (committed patches D91583, D96721, and D96722) we will
not devirtualize, which would be unsafe since the library might contain
overrides that aren't visible to the LTO unit.
An example is std::error_category, which is overridden in LLVM
and causing failures after a self build with WPD enabled, because
libstdc++ contains hidden overrides of the virtual base class methods.
Differential Revision: https://reviews.llvm.org/D96919
This patch generates the `-f[no-]finite-loops` arguments from `CompilerInvocation` (added in D96419), fixing test failures of Clang built with `-DCLANG_ROUND_TRIP_CC1_ARGS=ON`.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D96761
This patch ensures that vector predication and vectorization width
pragmas work together correctly/as expected. Specifically, this patch
fixes the issue that when vectorization_width > 1, the vector
predication behaviour (this would matter if it has NOT been disabled
explicitly by a pragma) was getting ignored, which was incorrect.
The fix here removes the dependence of vector predication on the
vectorization width. The loop metadata corresponding to clang loop
pragma vectorize_predicate is always emitted, if the pragma is
specified, even if vectorization is disabled by vectorize_width(1)
or vectorize(disable) since the option is also used for interleaving
by the LoopVectorize pass.
Reviewed By: dmgreen, Meinersbur
Differential Revision: https://reviews.llvm.org/D94779
This patch adds 2 new options to control when Clang adds `mustprogress`:
1. -ffinite-loops: assume all loops are finite; mustprogress is added
to all loops, regardless of the selected language standard.
2. -fno-finite-loops: assume no loop is finite; mustprogress is not
added to any loop or function. We could add mustprogress to
functions without loops, but we would have to detect that in Clang,
which is probably not worth it.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D96419
class types.
The goal is to provide a way to bypass constructor homing when emitting
class definitions and force class definitions in the debug info.
Not sure about the wording of the attribute, or whether it should be
specific to classes with constructors
The ability to specify alignment was recently added, and it's an
important property which we should ensure is set as expected by
Clang. (Especially before making further changes to Clang's code in
this area.) But, because it's on the end of the lines, the existing
tests all ignore it.
Therefore, update all the tests to also verify the expected alignment
for atomicrmw and cmpxchg. While I was in there, I also updated uses
of 'load atomic' and 'store atomic', and added the memory ordering,
where that was missing.
variable's destruction if it didn't do so during construction.
The standard doesn't give any guidance as to what to do here, but this
approach seems reasonable and conservative, and has been proposed to the
standard committee.
As Itanium ABI[http://itanium-cxx-abi.github.io/cxx-abi/abi.html#once-ctor]
points out:
"The size of the guard variable is 64 bits. The first byte (i.e. the byte at
the address of the full variable) shall contain the value 0 prior to
initialization of the associated variable, and 1 after initialization is complete."
Differential Revision: https://reviews.llvm.org/D95822
doubly-nested implicit CXXConstructExprs.
Ensure that we transform the parameter initializer using
TransformInitializer rather than TransformExpr so that we properly strip
down and rebuild the initialization, including any necessary
CXXBindTemporaryExprs. Otherwise we can end up forgetting to destroy
temporary objects used to construct a constructor parameter.
Normally, Clang will not make dllimport functions available for inlining
if they reference non-imported symbols, as this can lead to confusing
link errors. But if the function is marked always_inline, the user
presumably knows what they're doing and the attribute should be honored.
Differential revision: https://reviews.llvm.org/D95673
with fix to test case and stringrefs.
Currently (for codeview) lambdas have a string like `<lambda_0>` in
their mangled name, and don't have any display name. This change uses the
`<lambda_0>` as the display name, which helps distinguish between lambdas
in -gline-tables-only, since there are no linkage names there.
It also changes how we display lambda names; previously we used
`<unnamed-tag>`; now it will show `<lambda_0>`.
I added a function to the mangling context code to create this string;
for Itanium it just returns an empty string.
Bug: https://bugs.llvm.org/show_bug.cgi?id=48432
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D95187
This reverts 9b21d4b943
Currently (for codeview) lambdas have a string like `<lambda_0>` in
their mangled name, and don't have any display name. This change uses the
`<lambda_0>` as the display name, which helps distinguish between lambdas
in -gline-tables-only, since there are no linkage names there.
It also changes how we display lambda names; previously we used
`<unnamed-tag>`; now it will show `<lambda_0>`.
I added a function to the mangling context code to create this string;
for Itanium it just returns an empty string.
Bug: https://bugs.llvm.org/show_bug.cgi?id=48432
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D95187
The only test that needed change had 'QUAL' as an unused prefix. The
rest of the changes are to simplify the prefix lists.
Differential Revision: https://reviews.llvm.org/D95499
If an initial value is given for a bitfield that does not fit in the
bitfield, the value should be truncated. Constant folding for
expressions did not account for this truncation in the case of union
member functions, despite a warning being emitted. In some contexts,
evaluation of expressions was not enabled unless C++11, ROPI or RWPI
was enabled.
Differential Revision: https://reviews.llvm.org/D93101
The Clang enable_if extension is mangled as an <extended-qualifier>,
which is supposed to contain <template-args>. However, we were
unconditionally emitting X/E around its arguments, neglecting the fact
that <expr-primary> should be emitted directly without the surrounding
X/E.
Differential Revision: https://reviews.llvm.org/D95488
Previously, we were emitting an extraneous X .. E in <template-arg>
around an <expr-primary> if the template argument was constructed from
an expression (rather than an already-evaluated literal value). In
such a case, we would then e.g. emit 'XLi0EE' instead of 'Li0E'.
We had one special-case for DeclRefExpr expressions, in particular, to
omit them the mangled-name without the surrounding X/E. However,
unfortunately, that special case also triggered for ParmVarDecl (a
subtype of VarDecl), and _incorrectly_ emitted 'L_Z .. E' instead of
the proper 'Xfp_E'.
This change causes mangleExpression itself to be responsible for
emitting X/E around non-primary expressions, which removes the
special-case, and corrects both these problems.
Differential Revision: https://reviews.llvm.org/D95487
The two operations have acted differently since Clang 8, but were
unfortunately mangled the same. The new mangling uses new "vendor
extended expression" syntax proposed in
https://github.com/itanium-cxx-abi/cxx-abi/issues/112
GCC had the same mangling problem, https://gcc.gnu.org/PR88115, and
will hopefully be switching to the same mangling as implemented here.
Additionally, fix the mangling of `__uuidof` to use the new extension
syntax, instead of its previous nonstandard special-case.
Adjusts the demangler accordingly.
Differential Revision: https://reviews.llvm.org/D93922
The test ms-lookup-template-base-classes.cpp added in d972d4c749
is failing on some builtbot that don't include x86.
This patch should fix that (following the patterns in the test directory).
In the PPC32 SVR4 ABI, a va_list has copies of registers from the function call.
va_arg looked in the wrong registers for (the pointer representation of) an
object in Objective-C, and for some types in C++. Fix va_arg to look in the
general-purpose registers, not the floating-point registers. Also fix va_arg
for some C++ types, like a member function pointer, that are aggregates for
the ABI.
Anthony Richardby found the problem in Objective-C. Eli Friedman suggested
part of this fix.
Fixes https://bugs.llvm.org/show_bug.cgi?id=47921
Reviewed By: efriedma, nemanjai
Differential Revision: https://reviews.llvm.org/D90329
Combined with 'da98651 - Revert "DR2064:
decltype(E) is only a dependent', this change (5a391d3) caused verifier
errors when building Chromium. See https://crbug.com/1168494#c1 for a
reproducer.
Additionally it reverts changes that were dependent on this one, see
below.
> Following up on PR48517, fix handling of template arguments that refer
> to dependent declarations.
>
> Treat an id-expression that names a local variable in a templated
> function as being instantiation-dependent.
>
> This addresses a language defect whereby a reference to a dependent
> declaration can be formed without any construct being value-dependent.
> Fixing that through value-dependence turns out to be problematic, so
> instead this patch takes the approach (proposed on the core reflector)
> of allowing the use of pointers or references to (but not values of)
> dependent declarations inside value-dependent expressions, and instead
> treating template arguments as dependent if they evaluate to a constant
> involving such dependent declarations.
>
> This ends up affecting a bunch of OpenMP tests, due to OpenMP
> imprecisely handling instantiation-dependent constructs, bailing out
> early instead of processing dependent constructs to the extent possible
> when handling the template.
>
> Previously committed as 8c1f2d15b8, and
> reverted because a dependency commit was reverted.
This reverts commit 5a391d38ac.
It also restores clang/test/SemaCXX/coroutines.cpp to its state before
da986511fb.
Revert "[c++20] P1907R1: Support for generalized non-type template arguments of scalar type."
> Previously committed as 9e08e51a20, and
> reverted because a dependency commit was reverted. This incorporates the
> following follow-on commits that were also reverted:
>
> 7e84aa1b81 by Simon Pilgrim
> ed13d8c667 by me
> 95c7b6cadb by Sam McCall
> 430d5d8429 by Dave Zarzycki
This reverts commit 4b574008ae.
Revert "[msabi] Mangle a template argument referring to array-to-pointer decay"
> [msabi] Mangle a template argument referring to array-to-pointer decay
> applied to an array the same as the array itself.
>
> This follows MS ABI, and corrects a regression from the implementation
> of generalized non-type template parameters, where we "forgot" how to
> mangle this case.
This reverts commit 18e093faf7.
applied to an array the same as the array itself.
This follows MS ABI, and corrects a regression from the implementation
of generalized non-type template parameters, where we "forgot" how to
mangle this case.
if E is merely instantiation-dependent."
This change leaves us unable to distinguish between different function
templates that differ in only instantiation-dependent ways, for example
template<typename T> decltype(int(T())) f();
template<typename T> decltype(int(T(0))) f();
We'll need substantially better support for types that are
instantiation-dependent but not dependent before we can go ahead with
this change.
This reverts commit e3065ce238.
Previously committed as 9e08e51a20, and
reverted because a dependency commit was reverted. This incorporates the
following follow-on commits that were also reverted:
7e84aa1b81 by Simon Pilgrim
ed13d8c667 by me
95c7b6cadb by Sam McCall
430d5d8429 by Dave Zarzycki
the nested-name-specifier when determining whether a qualified type is
instantiation-dependent.
Previously reverted in 25a02c3d1a due to
causing us to reject some code. It turns out that the rejected code was
ill-formed (no diagnostic required).
if E is merely instantiation-dependent.
Previously reverted in 34e72a146111dd986889a0f0ec8767b2ca6b2913;
re-committed with a fix to an issue that caused name mangling to assert.
for function scopes, rather than using the qualified name.
In line-tables-only mode, we used to emit qualified names as the display name for functions when using CodeView.
This patch changes to emitting the parent scopes instead, with forward declarations for class types.
The total object file size ends up being slightly smaller than if we use the full qualified names.
Differential Revision: https://reviews.llvm.org/D94639
There is currently a driver test but no test for its effect on linkageName & pass pipeline.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D94381
Clang generates `wasm.get.exception` and `wasm.get.ehselector`
intrinsics, which respectively return a caught exception value (a
pointer to some C++ exception struct) and a selector (an integer value
that tells which C++ `catch` clause the current exception matches, or
does not match any).
WasmEHPrepare is a pass that does some IR-level preparation before
instruction selection. Previously one of things we did in this pass was
to convert `wasm.get.exception` intrinsic calls to
`wasm.extract.exception` intrinsics. Their semantics were the same
except `wasm.extract.exception` did not have a token argument. We
maintained these two separate intrinsics with the same semantics because
instruction selection couldn't handle token arguments. This
`wasm.extract.exception` intrinsic was later converted to
`extract_exception` instruction in instruction selection, which was a
pseudo instruction to implement `br_on_exn`. Because `br_on_exn` pushed
an extracted value onto the value stack after the `end` instruction of a
`block`, but LLVM does not have a way of modeling that kind of behavior,
so this pseudo instruction was used to pull an extracted value out of
thin air, like this:
```
block $l0
...
br_on_exn $cpp_exception $l0
...
end
extract_exception ;; pushes values onto the stack
```
In the new spec, we don't need this pseudo instruction anymore because
`catch` itself returns a value and we don't have `br_on_exn` anymore. In
the spec `catch` returns multiple values (like `br_on_exn`), but here we
assume it only returns a single i32, which is sufficient to support C++.
So this renames `wasm.get.exception` intrinsic to `wasm.catch`. Because
this CL does not yet contain instruction selection for `wasm.catch`
intrinsic, all `RUN` lines in exception.ll, eh-lsda.ll, and
cfg-stackify-eh.ll, and a single `RUN` line in wasm-eh.cpp (which is an
end-to-end test from C++ source to assembly) fail. So this CL
temporarily disables those `RUN` lines, and for those test files without
any valid remaining `RUN` lines, adds a dummy `RUN` line to make them
pass. These tests will be reenabled in later CLs.
Reviewed By: dschuff, tlively
Differential Revision: https://reviews.llvm.org/D94039
Like @aprantl suggested, modify to use the canonicalized DIFile, if we
don't know the loc info and filename for the compiler generated
functions for example static initialization functions.
Reviewed By: dblaikie, aprantl
Differential Revision: https://reviews.llvm.org/D87147
exception thrown during construction in a new-expression.
Instead, when performing deallocation function lookup for a
new-expression, ignore all destroying operator delete candidates, and
fall back to global operator delete if there is no member operator
delete other than a destroying operator delete.
Use of destroying operator delete only makes sense when there is an
object to destroy, which there isn't in this case. The language wording
doesn't cover this case; this oversight has been reported to WG21, with
the approach in this patch as the proposed fix.
`wasm_rethrow_in_catch` intrinsic and builtin are used in order to
rethrow an exception when the exception is caught but there is no
matching clause within the current `catch`. For example,
```
try {
foo();
} catch (int n) {
...
}
```
If the caught exception does not correspond to C++ `int` type, it should
be rethrown. These intrinsic/builtin were renamed `rethrow_in_catch`
because at the time I thought there would be another intrinsic for C++'s
`throw` keyword, which rethrows an exception. It turned out that `throw`
keyword doesn't require wasm's `rethrow` instruction, so we rename
`rethrow_in_catch` to just `rethrow` here.
Reviewed By: dschuff, tlively
Differential Revision: https://reviews.llvm.org/D94038
This patch adds support for two new variants of the vectorize_width
pragma:
1. vectorize_width(X[, fixed|scalable]) where an optional second
parameter is passed to the vectorize_width pragma, which indicates if
the user wishes to use fixed width or scalable vectorization. For
example the user can now write something like:
#pragma clang loop vectorize_width(4, fixed)
or
#pragma clang loop vectorize_width(4, scalable)
In the absence of a second parameter it is assumed the user wants
fixed width vectorization, in order to maintain compatibility with
existing code.
2. vectorize_width(fixed|scalable) where the width is left unspecified,
but the user hints what type of vectorization they prefer, either
fixed width or scalable.
I have implemented this by making use of the LLVM loop hint attribute:
llvm.loop.vectorize.scalable.enable
Tests were added to
clang/test/CodeGenCXX/pragma-loop.cpp
for both the 'fixed' and 'scalable' optional parameter.
See this thread for context: http://lists.llvm.org/pipermail/cfe-dev/2020-November/067262.html
Differential Revision: https://reviews.llvm.org/D89031
Like the VarDecl that gets its type updated based on an init-list, this
patch corrects the MaterializeTemporaryExpr's type to make sure it isn't
creating an incomplete type, which leads to a handful of CodeGen crashes
(see PR 47636).
Based on @rsmith 's comments on D88236
Differential Revision: https://reviews.llvm.org/D88298
The idea is that the CC1 default for ELF should set dso_local on default
visibility external linkage definitions in the default -mrelocation-model pic
mode (-fpic/-fPIC) to match COFF/Mach-O and make output IR similar.
The refactoring is made available by 2820a2ca3a.
Currently only x86 supports local aliases. We move the decision to the driver.
There are three CC1 states:
* -fsemantic-interposition: make some linkages interposable and make default visibility external linkage definitions dso_preemptable.
* (default): selected if the target supports .Lfoo$local: make default visibility external linkage definitions dso_local
* -fhalf-no-semantic-interposition: if neither option is set or the target does not support .Lfoo$local: like -fno-semantic-interposition but local aliases are not used. So references can be interposed if not optimized out.
Add -fhalf-no-semantic-interposition to a few tests using the half-based semantic interposition behavior.
ELF -cc1 -mrelocation-model pic will default to no semantic interposition plus
setting dso_local on default visibility external linkage definitions, so that
COFF, Mach-O and ELF output will be similar.
This patch makes tests immune to the differences.
For a default visibility external linkage definition, dso_local is set for ELF
-fno-pic/-fpie and COFF and Mach-O. Since default clang -cc1 for ELF is similar
to -fpic ("PIC Level" is not set), this nuance causes unneeded binary format differences.
To make emitted IR similar, ELF -cc1 -fpic will default to -fno-semantic-interposition,
which sets dso_local for default visibility external linkage definitions.
To make this flip smooth and enable future (dso_local as definition default),
this patch replaces (function) `define ` with `define{{.*}} `,
(variable/constant/alias) `= ` with `={{.*}} `, or inserts appropriate `{{.*}} `.
* static relocation model: always
* other relocation models: if isStrongDefinitionForLinker
This will make LLVM IR emitted for COFF/Mach-O and executable ELF similar.
For a definition (of most linkage types), dso_local is set for ELF -fno-pic/-fpie
and COFF, but not for Mach-O. This nuance causes unneeded binary format differences.
This patch replaces (function) `define ` with `define{{.*}} `,
(variable/constant/alias) `= ` with `={{.*}} `, or inserts appropriate `{{.*}} `
if there is an explicit linkage.
* Clang will set dso_local for Mach-O, which is currently implied by TargetMachine.cpp. This will make COFF/Mach-O and executable ELF similar.
* Eventually I hope we can make dso_local the textual LLVM IR default (write explicit "dso_preemptable" when applicable) and -fpic ELF will be similar to everything else. This patch helps move toward that goal.
This patch updates IRBuilder to create insertelement/shufflevector using poison as a placeholder.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D93793
UBSan was using the complete-object align rather than nv alignment
when checking the "this" pointer of a method.
Furthermore, CGF.CXXABIThisAlignment was also being set incorrectly,
due to an incorrectly negated test. The latter doesn't appear to have
had any impact, due to it not really being used anywhere.
Differential Revision: https://reviews.llvm.org/D93072
If the source instruction has !annotation metadata, all instructions
created during combining should also have it. Tell the builder to
add it.
The !annotation system was discussed on llvm-dev as part of
'RFC: Combining Annotation Metadata and Remarks'
(http://lists.llvm.org/pipermail/llvm-dev/2020-November/146393.html)
This patch is based on an earlier patch by Francis Visoiu Mistrih.
Reviewed By: thegameg, lebedev.ri
Differential Revision: https://reviews.llvm.org/D91444
The `assume` attribute is a way to provide additional, arbitrary
information to the optimizer. For now, assumptions are restricted to
strings which will be accumulated for a function and emitted as comma
separated string function attribute. The key of the LLVM-IR function
attribute is `llvm.assume`. Similar to `llvm.assume` and
`__builtin_assume`, the `assume` attribute provides a user defined
assumption to the compiler.
A follow up patch will introduce an LLVM-core API to query the
assumptions attached to a function. We also expect to add more options,
e.g., expression arguments, to the `assume` attribute later on.
The `omp [begin] asssumes` pragma will leverage this attribute and
expose the functionality in the absence of OpenMP.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D91979
For the Itanium ABI, this implements the mangling rule suggested in
https://github.com/itanium-cxx-abi/cxx-abi/issues/47, namely mangling
such template arguments as being cast to the parameter type in the case
where the template name is overloadable. This can cause a mangling
change for rare cases, where
* the template argument declaration is converted from its declared type
to the type of the template parameter, and
* the template parameter either has a deduced type or is a parameter of
a function template.
However, such changes are necessary to avoid mangling collisions. The
ABI changes can be reversed with -fclang-abi-compat=11 or earlier.
Re-commit with a fix for a couple of regressions.
Differential Revision: https://reviews.llvm.org/D91488
This patch enables marshalling of the exception model options while enforcing their mutual exclusivity. The clang driver interface remains the same, this only affects the cc1 command line.
Depends on D93215.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D93216
For the Itanium ABI, this implements the mangling rule suggested in
https://github.com/itanium-cxx-abi/cxx-abi/issues/47, namely mangling
such template arguments as being cast to the parameter type in the case
where the template name is overloadable. This can cause a mangling
change for rare cases, where
* the template argument declaration is converted from its declared type
to the type of the template parameter, and
* the template parameter either has a deduced type or is a parameter of
a function template.
However, such changes are necessary to avoid mangling collisions. The
ABI changes can be reversed with -fclang-abi-compat=11 or earlier.
Re-commit with a fix for the regression introduced last time: don't
expect parameters and arguments to line up inside an <unresolved-name>
mangling.
Differential Revision: https://reviews.llvm.org/D91488
For the Itanium ABI, this implements the mangling rule suggested in
https://github.com/itanium-cxx-abi/cxx-abi/issues/47, namely mangling
such template arguments as being cast to the parameter type in the case
where the template name is overloadable. This can cause a mangling
change for rare cases, where
* the template argument declaration is converted from its declared type
to the type of the template parameter, and
* the template parameter either has a deduced type or is a parameter of
a function template.
However, such changes are necessary to avoid mangling collisions. The
ABI changes can be reversed with -fclang-abi-compat=11 or earlier.
Differential Revision: https://reviews.llvm.org/D91488
I tried to put it in the same place in the pipeline as the legacy PM.
Fixes PR48399.
Reviewed By: asbirlea, nikic
Differential Revision: https://reviews.llvm.org/D93002
Swiftcall does it's own target-independent argument type classification,
since it is not designed to be ABI compatible with anything local on the
target that isn't LLVM-based. This means it never uses inalloca.
However, we have duplicate logic for checking for inalloca parameters
that runs before call argument setup. This logic needs to know ahead of
time if inalloca will be used later, and we can't move the
CGFunctionInfo calculation earlier.
This change gets the calling convention from either the
FunctionProtoType or ObjCMethodDecl, checks if it is swift, and if so
skips the stackbase setup.
Depends on D92883.
Differential Revision: https://reviews.llvm.org/D92944
Sometimes people get minimal crash reports after a UBSAN incident. This change
tags each trap with an integer representing the kind of failure encountered,
which can aid in tracking down the root cause of the problem.
Such fields will likely have offset zero making
__sanitizer_dtor_callback poisoning wrong regions.
E.g. it can poison base class member from derived class constructor.
Differential Revision: https://reviews.llvm.org/D92727
shouldRTTIBeUnique() returns false for iOS64CXXABI, which causes
RTTI objects to be emitted hidden. Update two tests that didn't
expect this to happen for the default triple.
Also rename iOS64CXXABI to AppleARM64CXXABI, since it's used for
arm64-apple-macos triples too.
Part of PR46644.
Differential Revision: https://reviews.llvm.org/D91904
Summary:
AIX uses the existing EH infrastructure in clang and llvm.
The major differences would be
1. AIX do not have CFI instructions.
2. AIX uses a new personality routine, named __xlcxx_personality_v1.
It doesn't use the GCC personality rountine, because the
interoperability is not there yet on AIX.
3. AIX do not use eh_frame sections. Instead, it would use a eh_info
section (compat unwind section) to store the information about
personality routine and LSDA data address.
Reviewed By: daltenty, hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D91455
This patch diagnoses invalid references of global host variables in device,
global, or host device functions.
Differential Revision: https://reviews.llvm.org/D91281
Thanks to D77248, we can bypass the use of stubs altogether and use PLT
relocations if they are available for the target. LLVM and LLD support the
R_AARCH64_PLT32 relocation, so we can also guarantee a static PLT relocation on AArch64.
Not emitting these stubs saves a lot of extra binary size.
Differential Revision: https://reviews.llvm.org/D83812
After D17993, with -fno-delete-null-pointer-checks we add the dereferenceable attribute to the `this` pointer.
We have observed that one internal target which worked before fails even with -fno-delete-null-pointer-checks.
Switching to dereferenceable_or_null fixes the problem.
dereferenceable currently does not always respect NullPointerIsValid and may
imply nonnull and lead to aggressive optimization. The optimization may be
related to `CallBase::isReturnNonNull`, `Argument::hasNonNullAttr`, or
`Value::getPointerDereferenceableBytes`. See D66664 and D66618 for some discussions.
Reviewed By: bkramer, rsmith
Differential Revision: https://reviews.llvm.org/D92297
Similar to Windows Itanium, PS4 is also an Itanium C++ ABI variant
which shares the goal of semantic compatibility with Microsoft C++
code that uses dllimport/export.
This change introduces a new function to determine from the triple
if an environment aims for compatibility with MS C++ code w.r.t to
these attributes and guards the relevant code paths using that
function.
Differential Revision: https://reviews.llvm.org/D90299
Previously we only considered using a substitution for a template-name
after already having mangled its prefix, so we'd produce nonsense
manglings like NS3_S4_IiEE where we should simply produce NS4_IiEE.
This is not ABI-compatible with previous Clang versions, and the old
behavior is restored by -fclang-abi-compat=11.0 or earlier.
Ensure that the DSO Locality of the globals in the IR is derived from
their final visibility when using -fvisibility-from-dllstorageclass.
To accomplish this we reset the DSO locality of globals (before
setting their visibility from their dllstorageclass) at the end of
IRGen in Clang. This removes any effects that visibility options or
annotations may have had on the DSO locality.
The resulting DSO locality of the globals will be pessimistic
w.r.t. to the normal compiler IRGen.
Differential Revision: https://reviews.llvm.org/D91779
Technically 'noexcept' isn't a qualifier, so this should be a separate conversion.
Also make the test a pure frontend test.
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D67112
arguments.
* Adds 'nonnull' and 'dereferenceable(N)' to 'this' pointer arguments
* Gates 'nonnull' on -f(no-)delete-null-pointer-checks
* Introduces this-nonnull.cpp and microsoft-abi-this-nullable.cpp tests to
explicitly test the behavior of this change
* Refactors hundreds of over-constrained clang tests to permit these
attributes, where needed
* Updates Clang12 patch notes mentioning this change
Reviewed-by: rsmith, jdoerfert
Differential Revision: https://reviews.llvm.org/D17993
This patch updates Clang's IRGen to add !annotation nodes with an
"auto-init" annotation to all stores for auto-initialization.
As discussed in 'RFC: Combining Annotation Metadata and Remarks'
(http://lists.llvm.org/pipermail/llvm-dev/2020-November/146393.html)
this allows using optimization remarks to track down where auto-init
code was inserted (and not removed by optimizations).
There are a few cases in the tests where !annotation gets dropped by
optimizations. Those optimizations will be updated in subsequent
patches.
This patch is based on a patch by Francis Visoiu Mistrih.
Reviewed By: thegameg, paquette
Differential Revision: https://reviews.llvm.org/D91417
parameters.
It appears that LLVM isn't able to generate a DW_AT_const_value for a
constant of class type, but if it could, we'd match GCC's debug info in
this case, and in the interim we no longer crash.
See discussion in https://bugs.llvm.org/show_bug.cgi?id=45073 / https://reviews.llvm.org/D66324#2334485
the implementation is known-broken for certain inputs,
the bugreport was up for a significant amount of timer,
and there has been no activity to address it.
Therefore, just completely rip out all of misexpect handling.
I suspect, fixing it requires redesigning the internals of MD_misexpect.
Should anyone commit to fixing the implementation problem,
starting from clean slate may be better anyways.
This reverts commit 7bdad08429,
and some of it's follow-ups, that don't stand on their own.
This RUN line was added as a temporary measure to undo the damage done
by D79655. Everyone's build system should be fine by now.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D91448
For dllexported default constructors with default arguments, we export
default constructor closures which pass in the default args. (See D8331
for a good explanation.)
For templates, that means those default args must be instantiated even
if the function isn't called. That is done by the
InstantiateDefaultCtorDefaultArgs() function, but it wasn't done for
explicit specializations, causing asserts (see bug).
Differential revision: https://reviews.llvm.org/D91089
The Itanium CXX ABI grammer has been extended to support parameterized
vendor extended types [1].
This patch updates Clang's mangling for matrix types to use the new
extension.
[1] b359d28971
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D91253
mangling support for non-type template parameters of class type and
template parameter objects.
The Itanium side of this follows the approach I proposed in
https://github.com/itanium-cxx-abi/cxx-abi/issues/47 on 2020-09-06.
The MSVC side of this was determined empirically by observing MSVC's
output.
Differential Revision: https://reviews.llvm.org/D89998
As described here:
https://devblogs.microsoft.com/oldnewthing/20150220-00/?p=44623
In order to allow Lambdas to be used with traditional Win32 APIs, they
emit a conversion function for (what Raymond Chen claims is all) a
number of the calling conventions. Through experimentation, we
discovered that the list isn't quite 'all'.
This patch implements this by taking the list of conversions that MSVC
emits (across 'all' architectures, I don't see any CCs on ARM), then
emits them if they are supported by the current target.
However, we also add 3 other options (which may be duplicates):
free-function, member-function, and operator() calling conventions. We
do this because we have an extension where we generate both free and
member for these cases so th at people specifying a calling convention
on the lambda will have the expected behavior when specifying one of
those two.
MSVC doesn't seem to permit specifying calling-convention on lambdas,
but we do, so we need to make sure those are emitted as well. We do this
so that clang-only conventions are supported if the user specifies them.
Differential Revision: https://reviews.llvm.org/D90634
Since C++11, the C++ standard has a forward progress guarantee
[intro.progress], so all such functions must have the `mustprogress`
requirement. In addition, from C11 and onwards, loops without a non-zero
constant conditional or no conditional are also required to make
progress (C11 6.8.5p6). This patch implements these attribute deductions
so they can be used by the optimization passes.
Differential Revision: https://reviews.llvm.org/D86841
415f7ee883 had a silly typo introduced when I inlined some
code into a loop from its own function.
Original commit message:
For PlayStation we offer source code compatibility with
Microsoft's dllimport/export annotations; however, our file
format is based on ELF.
To support this we translate from DLL storage class to ELF
visibility at the end of codegen in Clang.
Other toolchains have used similar strategies (e.g. see the
documentation for this ARM toolchain:
https://developer.arm.com/documentation/dui0530/i/migrating-from-rvct-v3-1-to-rvct-v4-0/changes-to-symbol-visibility-between-rvct-v3-1-and-rvct-v4-0)
This patch adds the ability to perform this translation. Options
are provided to support customizing the mapping behaviour.
Differential Revision: https://reviews.llvm.org/D89970
415f7ee883 had LIT test failures on any build where the clang executable
was not called "clang". I have adjusted the LIT CHECKs to remove the
binary name to fix this.
Original commit message:
For PlayStation we offer source code compatibility with
Microsoft's dllimport/export annotations; however, our file
format is based on ELF.
To support this we translate from DLL storage class to ELF
visibility at the end of codegen in Clang.
Other toolchains have used similar strategies (e.g. see the
documentation for this ARM toolchain:
https://developer.arm.com/documentation/dui0530/i/migrating-from-rvct-v3-1-to-rvct-v4-0/changes-to-symbol-visibility-between-rvct-v3-1-and-rvct-v4-0)
This patch adds the ability to perform this translation. Options
are provided to support customizing the mapping behaviour.
Differential Revision: https://reviews.llvm.org/D89970
The attribute has no effect on a do statement since the path of execution
will always include its substatement.
It adds a diagnostic when the attribute is used on an infinite while loop
since the codegen omits the branch here. Since the likelihood attributes
have no effect on a do statement no diagnostic will be issued for
do [[unlikely]] {...} while(0);
Differential Revision: https://reviews.llvm.org/D89899
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32.
To be more consistent everywhere and remove lots of casts from uint64_t
to uint32_t, use i64 for branch_weights.
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D88609
As mentioned in the defect, the lambda static invoker does not follow
the calling convention of the lambda itself, which seems wrong. This
patch ensures that the calling convention of operator() is passed onto
the invoker and conversion-operator type.
This is accomplished by extracting the calling-convention determination
code out into a separate function in order to better reflect the 'thiscall'
work, as well as somewhat better support the future implementation of
https://devblogs.microsoft.com/oldnewthing/20150220-00/?p=44623
For any target (basically just win32) that has a different free and
static function calling convention, this generates BOTH alternatives.
This required some work to get the Windows mangler to work correctly for
this, as well as some tie-breaking for the unary operators.
Differential Revision: https://reviews.llvm.org/D89559
We used to only emit static const data members in CodeView as
S_CONSTANTS when they were used; this patch makes it so they are always emitted.
This changes CodeViewDebug.cpp to find the static const members from the
class debug info instead of creating DIGlobalVariables in the IR
whenever a static const data member is used.
Bug: https://bugs.llvm.org/show_bug.cgi?id=47580
Differential Revision: https://reviews.llvm.org/D89072
This reverts commit 504615353f.
Define the __vector_pair and __vector_quad types that are used to manipulate
the new accumulator registers introduced by MMA on PowerPC. Because these two
types are specific to PowerPC, they are defined in a separate new file so it
will be easier to add other PowerPC specific types if we need to in the future.
Differential Revision: https://reviews.llvm.org/D81508
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32.
To be more consistent everywhere and remove lots of casts from uint64_t
to uint32_t, use i64 for branch_weights.
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D88609
We used to only emit static const data members in CodeView as
S_CONSTANTS when they were used; this patch makes it so they are always emitted.
I changed CodeViewDebug.cpp to find the static const members from the
class debug info instead of creating DIGlobalVariables in the IR
whenever a static const data member is used.
Bug: https://bugs.llvm.org/show_bug.cgi?id=47580
Differential Revision: https://reviews.llvm.org/D89072
This allows using annotation in a much more contexts than it currently has.
especially when annotation with template or constexpr.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D88645
This fixes miscomputation of __builtin_constant_evaluated in the
initializer of a variable that's not usable in constant expressions, but
is readable when constant-folding.
If evaluation of a constant initializer fails, we throw away the
evaluated result instead of keeping it as a non-constant-initializer
value for the variable, because it might not be a correct value.
To avoid regressions for initializers that are foldable but not formally
constant initializers, we now try constant-evaluating some globals in
C++ twice: once to check for a constant initializer (in an mode where
is_constannt_evaluated returns true) and again to determine the runtime
value if the initializer is not a constant initializer.
The name is unfortunate because it is similar to the driver option -ftest-coverage.
It turns out aside from one occurrence in a test, this option is not used.
This implements the likelihood attribute for the switch statement. Based on the
discussion in D85091 and D86559 it only handles the attribute when placed on
the case labels or the default labels.
It also marks the likelihood attribute as feature complete. There are more QoI
patches in the pipeline.
Differential Revision: https://reviews.llvm.org/D89210
initialization a little smarter.
Look through casts that preserve zero-ness when determining if an
initializer is zero, so that we can handle cases like an {0} initializer
whose corresponding field is a type other than 'int'.
References to different declarations of the same entity aren't different
values, so shouldn't have different representations.
Recommit of e6393ee813, most recently
reverted in 9a33f027ac due to a bug caused
by ObjCInterfaceDecls not propagating availability attributes along
their redeclaration chains; that bug was fixed in
e2d4174e9c.
Jeremy Morse discovered an issue with the lit test introduced in D88363. The
test gives different results for Sony's `-O1`.
The test needs to run at `-O1` otherwise the likelihood attribute will be
ignored. Instead of running all `-O1` passes it only runs the lower-expect pass
which is needed to lower `__builtin_expect`.
Differential Revision: https://reviews.llvm.org/D89204
Previously, when clang was compiled with -DLLVM_ENABLE_ASSERTIONS=ON, the added tests were displaying:
inlinable function call in a function with debug info must have a !dbg location
call void @"??1?$c@UB@@@@QEAA@XZ"(%struct.c* @"?f@?1??d@@YAPEAU?$c@UB@@@@XZ@4U2@A")
fatal error: error in backend: Broken module found, compilation aborted!
Stack dump:
0. Program arguments: <f:\svn\buildninja\bin\clang -cc1 -emit-llvm debug-info-no-location.cpp> -gcodeview -debug-info-kind=limited
1. <eof> parser at end of file
2. Per-function optimization
Fixes PR43012
Differential Revision: https://reviews.llvm.org/D66328
Bruno De Fraine discovered some issues with D85091. The branch weights
generated for `logical not` and `ternary conditional` were wrong. The
`logical and` and `logical or` differed from the code generated of
`__builtin_predict`.
Adjusted the generated code for the likelihood to match
`__builtin_predict`. The patch is based on Bruno's suggestions.
Differential Revision: https://reviews.llvm.org/D88363
On some targets, preferred alignment is larger than ABI alignment in some cases. For example,
on AIX we have special power alignment rules which would cause that. Previously, to support
those cases, we added a “PreferredAlignment” field in the `RecordLayout` to store the AIX
special alignment values in “PreferredAlignment” as the community suggested.
However, that patch alone is not enough. There are places in the Clang where `PreferredAlignment`
should have been used instead of ABI-specified alignment. This patch is aimed at fixing those
spots.
Differential Revision: https://reviews.llvm.org/D86790
Add class types to the retained types list to make sure they
don't get dropped if the constructor is optimized out later.
Differential Revision: https://reviews.llvm.org/D88522
This reverts commit 55c4ff91bd.
Issues were introduced as discussed in https://reviews.llvm.org/D88241
where this change made previous bugs in the linker and BitCodeWriter
visible.
Extend -fsanitize=nullability-arg to handle call sites which accept C++
member pointers.
rdar://62476022
Differential Revision: https://reviews.llvm.org/D88336
References to different declarations of the same entity aren't different
values, so shouldn't have different representations.
Recommit of e6393ee813 with fixed handling
for weak declarations. We now look for attributes on the most recent
declaration when determining whether a declaration is weak. (Second
recommit with further fixes for mishandling of weak declarations. Our
behavior here is fundamentally unsound -- see PR47663 -- but this
approach attempts to not make things worse.)
Make the corresponding change that was made for byval in
b7141207a4. Like byval, this requires a
bulk update of the test IR tests to include the type before this can
be mandatory.
PAC/BTI-related codegen in the AArch64 backend is controlled by a set
of LLVM IR function attributes, added to the function by Clang, based
on command-line options and GCC-style function attributes. However,
functions, generated in the LLVM middle end (for example,
asan.module.ctor or __llvm_gcov_write_out) do not get any attributes
and the backend incorrectly does not do any PAC/BTI code generation.
This patch record the default state of PAC/BTI codegen in a set of
LLVM IR module-level attributes, based on command-line options:
* "sign-return-address", with non-zero value means generate code to
sign return addresses (PAC-RET), zero value means disable PAC-RET.
* "sign-return-address-all", with non-zero value means enable PAC-RET
for all functions, zero value means enable PAC-RET only for
functions, which spill LR.
* "sign-return-address-with-bkey", with non-zero value means use B-key
for signing, zero value mean use A-key.
This set of attributes are always added for AArch64 targets (as
opposed, for example, to interpreting a missing attribute as having a
value 0) in order to be able to check for conflicts when combining
module attributed during LTO.
Module-level attributes are overridden by function level attributes.
All the decision making about whether to not to generate PAC and/or
BTI code is factored out into AArch64FunctionInfo, there shouldn't be
any places left, other than AArch64FunctionInfo, which directly
examine PAC/BTI attributes, except AArch64AsmPrinter.cpp, which
is/will-be handled by a separate patch.
Differential Revision: https://reviews.llvm.org/D85649
Passing them directly is likely to be non-conforming, since it usually
involves copying the bytes of the record. For unknown architectures, we
don't know what MSVC does or will do, but we should at least try to
conform as well as we can.
Regardless of the target architecture, we should always use the C rules
(RAA_Default) for records that "canBePassedInRegisters". Those are
trivially copyable things, and things marked with [[trivial_abi]].
This should be NFC, although it changes where the final decision about
x86_32 overaligned records is made. The current x86_32 C rules say that
overaligned things are passed indirectly, so there is no functional
difference.
constructors.
This changes the code to avoid using constructor homing for aggregate
classes and classes with trivial default constructors, instead of trying
to loop through the constructors.
Differential Revision: https://reviews.llvm.org/D87808