Summary: When -strict-dwarf=true is specified, the calling convention info
DW_CC_pass_by_value or DW_CC_pass_by_reference can only be generated at DWARF5.
Reviewed By: shchenz, dblaikie
Differential Revision: https://reviews.llvm.org/D103300
This patch solves an error such as:
incompatible operand types ('vbool4_t' (aka '__rvv_bool4_t') and '__rvv_bool4_t')
when one of the value is a TypedefType of the other value in ?:.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D103603
LLDB is currently not selected in LLVM release testing and thus it
doesnt make its way into prebuilt binaries which build with default
configuration. This patch enables LLDB by default in test-release
script.
Assuming LLDB build by default was disabled back in 2016 LLDB support
for various architectures has a long way since then. It has buildbots
for most architectures and supports a case to be included by default.
Also lldb build can easily be disabled in case some release managers
choose to do so.
Reviewed By: tstellar
Differential Revision: https://reviews.llvm.org/D101864
Don't propagate launch bound related attributes to
address taken functions and their callees. The idea
is to do a traversal over the call graph starting at
address taken functions and erase the attributes
set by previous logic i.e. process().
This two phase approach makes sure that we don't
miss out on deep nested callees from address taken
functions as a function might be called directly as
well as indirectly.
This patch is also reattempt to D94585 as latent issues
are fixed in hasAddressTaken function in the recent
past.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D103138
Before packing LDS globals into a sorted structure, make sure that
their alignment is properly updated based on their size. This will make
sure that the members of sorted structure are properly aligned, and
hence it will further reduce the probability of unaligned LDS access.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D103261
Some platforms (eg: Trusty) are extremelly memory constrained, which
doesn't necessarily work well with some of Scudo's current assumptions.
`Vector` by default (and as such `String` and `ScopedString`) maps a
page, which is a bit of a waste. This CL changes `Vector` to use a
buffer local to the class first, then potentially map more memory if
needed (`ScopedString` currently are all stack based so it would be
stack data). We also want to allow a platform to prevent any dynamic
resizing, so I added a `CanGrow` templated parameter that for now is
always `true` but would be set to `false` on Trusty.
Differential Revision: https://reviews.llvm.org/D103641
Make extended binary the default output format for CSSPGO. This avoids having to pass flag every time when generating profile. It also matches llvm-profdata where binary profile is the default (should we switch to extbinary as default for llvm-profdata?).
We plan to compress name table for context profile, which depends on the built-in compression of extbinary.
Differential Revision: https://reviews.llvm.org/D103650
These started failing on one of our buildbots. I didn't completely root cause the situation and just added the explicit includes that correct the issue.
Reviewed By: rriddle, jpienaar
Differential Revision: https://reviews.llvm.org/D103657
This revision adds assembly state tracking for uses of symbols, allowing for go-to-definition and references support for SymbolRefAttrs.
Differential Revision: https://reviews.llvm.org/D103585
Currently some amdgcn builtins are defined with long int type,
which causes invalid IR on Windows since long int is 32 bit
on Windows whereas these builtins have 64 bit arguments.
long long int type cannot be used since it is 128 bit in OpenCL.
This patch uses 64 bit int type instead of long int to define 64 bit int
arguments or return for amdgcn builtins.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D103563
Some floating point lib calls have ABI attributes that need to be set on
the caller. Found via D103412.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D103415
When a subroutine or function symbol is defined in an INTERFACE
block, it's okay if a symbol of the same name appears in a
scope between the global scope and the scope of the INTERFACE.
Differential Revision: https://reviews.llvm.org/D103580
In a `-DCMAKE_BUILD_TYPE=Release -DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_LINK_LLVM_DYLIB=on -DCLANG_LINK_CLANG_DYLIB=on`
build, libLLVM-13git.so is 2% smaller and libclang-cpp.so is 1% smaller (on top of -Wl,-Bsymbolic-functions).
There may be some small performance improvement as well because GCC
-fPIC suppresses interprocedural optimizations for non-inline
definitions by default.
Note: we cannot add -fno-semantic-interposition for Clang<13. Clang<13's
implementation additionally optimizes global variables, which is incompatible
with unfortunate ELF -fno-pic default: direct access relocations for external
data. If the executable has a -fno-pic object file referencing a global variable
declared in a public header, the direct access relocation will cause a copy
relocation. The executable and libLLVM.so/libclang-cpp.so will disagree on the
address.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D102453
Add some missing error messages, and permit the appearance
of EntityDetails symbols in dummy argument type characterization.
Differential Revision: https://reviews.llvm.org/D103576
When a procedure pointer with no interface is called by a
function reference, complain about the lack.
Differential Revision: https://reviews.llvm.org/D103573
In something like "ASSOCIATE(X=>T(1))", the "T(1)" is parsed
as a Variable because it looks like a function reference or
array reference; if it turns out to be a structure constructor,
which is something we can't know until we're able to attempt
generic interface resolution in semantics, the parse tree needs
to be fixed up by replacing the Variable with an Expr.
The compiler could already do this for putative function references
encapsulated as Exprs, so this patch moves some code around and
adds parser::Selector to the overloads of expression analysis.
Differential Revision: https://reviews.llvm.org/D103572
Building LLVM with the LLVM_ENABLE_MODULES cmake option fails when the modules are being compiled due to missing includes. This is a side effect of some transitive includes that changed recently.
Differential Revision: https://reviews.llvm.org/D103645
This moves the implementations for HandleTagMismatch, __hwasan_tag_mismatch4,
and HwasanAtExit from hwasan_linux.cpp to hwasan.cpp and declares them in hwasan.h.
This way, calls to those functions can be shared with the fuchsia implementation
without duplicating code.
Differential Revision: https://reviews.llvm.org/D103562
The constexpr-capable class evaluate::DynamicType represented
CHARACTER length only with a nullable pointer into the declared
parameters of types in the symbol table, which works fine for
anything with a declaration but turns out to not suffice to
describe the results of the ACHAR() and CHAR() intrinsic
functions. So extend DynamicType to also accommodate known
constant CHARACTER lengths, too; use them for ACHAR & CHAR;
clean up several use sites and fix regressions found in test.
Differential Revision: https://reviews.llvm.org/D103571
A recent change (D99683) to support ThinLTO for HIP caused a regression
when compiling cuda code with -flto=thin -fwhole-program-vtables.
Specifically, we now get an error:
error: invalid argument '-fwhole-program-vtables' only allowed with '-flto'
This error is coming from the device offload cc1 action being set up for
the cuda compile, for which -flto=thin doesn't apply and gets dropped.
This is a regression, but points to a potential issue that was silently
occurring before the patch, details below.
Before D99683, the check for fwhole-program-vtables in the driver looked
like:
if (WholeProgramVTables) {
if (!D.isUsingLTO())
D.Diag(diag::err_drv_argument_only_allowed_with)
<< "-fwhole-program-vtables"
<< "-flto";
CmdArgs.push_back("-fwhole-program-vtables");
}
And D.isUsingLTO() returned true since we have -flto=thin. However,
because the cuda cc1 compile is doing device offloading, which didn't
support any LTO, there was other code that suppressed -flto* options
from being passed to the cc1 invocation. So the cc1 invocation silently
had -fwhole-program-vtables without any -flto*. This seems potentially
problematic, since if we had any virtual calls we would get type test
assume sequences without the corresponding LTO pass that handles them.
However, with the patch, which adds support for device offloading LTO
option -foffload-lto=thin, the code has changed so that we set a bool
IsUsingLTO based on either -flto* or -foffload-lto*, depending on
whether this is the device offloading action. For the device offload
action in our compile, since we don't have -foffload-lto, IsUsingLTO is
false, and the check for LTO with -fwhole-program-vtables now fails.
What we should do is only pass through -fwhole-program-vtables to the
cc1 invocation that has LTO enabled (either the device offload action
with -foffload-lto, or the non-device offload action with -flto), and
otherwise drop the -fwhole-program-vtables for the non-LTO action.
Then we should error only if we have -fwhole-program-vtables without any
-f*lto* options.
Differential Revision: https://reviews.llvm.org/D103579
This builds on D103584. The change eliminates the coupling between unroll heuristic and implementation w.r.t. knowing when the passed in trip count is an exact trip count or a max trip count. In theory the new code is slightly less powerful (since it relies on exact computable trip counts), but in practice, it appears to cover all the same cases. It can also be extended if needed.
The test change shows what appears to be a bug in the existing code around the interaction of peeling and unrolling. The original loop only ran 8 iterations. The previous output had the loop peeled by 2, and then an exact unroll of 8. This meant the loop ran a total of 10 iterations which appears to have been a miscompile.
Differential Revision: https://reviews.llvm.org/D103620
A procedure pointer is allowed to name a specific intrinsic function
from F'2018 table 16.2 as its interface, but not other intrinsic
procedures. Catch this error, and thereby also fix a crash resulting
from a failure later in compilation from failed characteristics;
while here, also catch the similar error with initializers.
Differential Revision: https://reviews.llvm.org/D103570
In this particular example, we had a crash when compiling it
for several architectures. This patch extends the legalization
of extract_subvector to avoid this problem.
Differential Revision: https://reviews.llvm.org/D103344
As a benign extension common to other Fortran compilers,
accept BOZ literals in array constructors w/o explicit
types, treating them as integers.
Differential Revision: https://reviews.llvm.org/D103569
PPC_FP128 determines isZero/isNan/isInf using high-order double value
only. Checking isZero/isNegative might return the isNullValue unexpectedly.
eg:
0xM0000000000000000FFFFFFFFFFFFFFFFF
isZero, but it is not NullValue.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D103634
`__profd_*` variables are referenced by code only when value profiling is
enabled. If disabled (e.g. default -fprofile-instr-generate), the symbols just
waste space on ELF/Mach-O. We change the comdat symbol from `__profd_*` to
`__profc_*` because an internal symbol does not provide deduplication features
on COFF. The choice doesn't matter on ELF.
(In -DLLVM_BUILD_INSTRUMENTED_COVERAGE=on build, there is now no `__profd_*` symbols.)
On Windows this enables further optimization. We are no longer affected by the
link.exe limitation: an external symbol in IMAGE_COMDAT_SELECT_ASSOCIATIVE can
cause duplicate definition error.
https://lists.llvm.org/pipermail/llvm-dev/2021-May/150758.html
We can thus use llvm.compiler.used instead of llvm.used like ELF (D97585).
This avoids many `/INCLUDE:` directives in `.drectve`.
Here is rnk's measurement for Chrome:
```
This reduced object file size of base_unittests.exe, compiled with coverage, optimizations, and gmlt debug info by 10%:
#BEFORE
$ find . -iname '*.obj' | xargs du -b | awk '{ sum += $1 } END { print sum}'
1047758867
$ du -cksh base_unittests.exe
82M base_unittests.exe
82M total
# AFTER
$ find . -iname '*.obj' | xargs du -b | awk '{ sum += $1 } END { print sum}'
937886499
$ du -cksh base_unittests.exe
78M base_unittests.exe
78M total
```
Reviewed By: davidxl, rnk
Differential Revision: https://reviews.llvm.org/D103372
The code for folding calls to the intrinsic function CMPLX was
incorrectly dependent on the number of arguments to distinguish its
two cases (conversion from one kind of complex to another, and
composition of a complex value from real & imaginary parts).
This was wrong since the optional KIND= argument has already been
taken into account by intrinsic processing; instead, the type of
the first argument should decide the issue.
Differential Revision: https://reviews.llvm.org/D103568
One exit is unpredictable, the other has a known trip count. For
one function the predictable exit is the latch exit, for the other
the non-latch exit. Currently they are treated differently.
In error recovery situations, the mappings from source locations
to scopes were failing in a way that tripped some asserts.
Specifically, FindPureProcedureContaining() wasn't coping well
when starting at the global scope. (And since the global scope
no longer has a source range, clean up the Semantics constructor
to avoid confusion.)
Differential Revision: https://reviews.llvm.org/D103567