Before, the CLANG_DEFAULT_LINKER cmake option was a global override for
the linker that shall be used on all toolchains. The linker binary
specified that way may not be available on toolchains with custom
linkers. Eg, the only linker for VE is named 'nld' - any other linker
invalidates the toolchain.
This patch removes the hard override and instead lets the generic
toolchain implementation default to CLANG_DEFAULT_LINKER. Toolchains
can now deviate with a custom linker name or deliberatly default to
CLANG_DEFAULT_LINKER.
Reviewed By: MaskRay, phosek
Differential Revision: https://reviews.llvm.org/D115045
Change C++ header files placement to support multiple LLVM_RUNTIME_TARGETS
build. Also modifies regression test for it.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D114527
Some projects [1,2,3] have flex-generated files besides bison-generated
ones.
Unfortunately, the comment `"/* A lexical scanner generated by flex */"`
generated by the tools is not necessarily at the beginning of the file,
thus we need to quickly skim through the file for this needle string.
Luckily, StringRef can do this operation in an efficient way.
That being said, now the bison comment is not required to be at the very
beginning of the file. This allows us to detect a couple more cases
[4,5,6].
Alternatively, we could say that we only allow whitespace characters
before matching the bison/flex header comment. That would prevent the
(probably) unnecessary string search in the buffer. However, I could not
verify that these tools would actually respect this assumption.
Additionally to this, e.g. the Twin project [1] has other non-whitespace
characters (some preprocessor directives) before the flex-generated
header comment. So the heuristic in the previous paragraph won't work
with that.
Thus, I would advocate the current implementation.
According to my measurement, this patch won't introduce measurable
performance degradation, even though we will do 2 linear scans.
I introduce the ignore-bison-generated-files and
ignore-flex-generated-files to disable skipping these files.
Both of these options are true by default.
[1]: https://github.com/cosmos72/twin/blob/master/server/rcparse_lex.cpp#L7
[2]: 22362cdcf9/sandbox/count-words/lexer.c (L6)
[3]: 11abdf6462/lab1/lex.yy.c (L6)
[4]: 47f5b2cfe2/B_yacc/1/y1.tab.h (L2)
[5]: 71d1bf9b1e/src/VBox/Additions/x11/x11include/xorg-server-1.8.0/parser.h (L2)
[6]: 3f773ceb13/Framework/OpenEars.framework/Versions/A/Headers/jsgf_parser.h (L2)
Reviewed By: xazax.hun
Differential Revision: https://reviews.llvm.org/D114510
When targeting FreeBSD on a Linux host with a copy
of system libc++, Clang prepends /usr/include/c++/v1
to the search paths even with -ffreestanding, and
fails to compile a program with a
single #include <xmmintrin.h>
Dropping the path with -nostdlibinc.
Differential Revision: https://reviews.llvm.org/D114497
Swap AIC and IC neighbouring in pipeline. This looks more natural and even
almost has no effect for now (three slightly touched tests of test-suite). Also
this could be the first step towards merging AIC (or its part) to -O2 pipeline.
After several changes in AIC (like D108091, D108201, D107766, D109515, D109236)
there've been observed several regressions (like PR52078, PR52253, PR52289)
that were fixed in different passes (see D111330, D112721) by extending their
functionality, but these regressions were exposed since changed AIC prevents IC
from making some of early optimizations.
This is common problem and it should be fixed by just moving AIC after IC
which looks more logically by itself: make aggressive instruction combining
only after failed ordinary one.
Fixes PR52289
Reviewed By: spatel, RKSimon
Differential Revision: https://reviews.llvm.org/D113179
The ray_origin, ray_dir and ray_inv_dir arguments should all be vec3 to
match how the hardware instruction works.
Don't change the API of the corresponding OpenCL builtins.
Differential Revision: https://reviews.llvm.org/D115032
Support for builtin setjmp/longjmp was removed by https://reviews.llvm.org/D51487. An
error should be created when compiling C code using __builtin_setjmp or __builtin_longjmp.
Reviewed By: dcederman
Differential Revision: https://reviews.llvm.org/D108901
Building -march=armv6k Linux kernels with -mtp=cp15 fails to
compile:
error: hardware TLS register is not supported for the arm
sub-architecture
@ardb found docs for ARM1176JZF-S (ARMv6K) that reference hard thread
pointer.
Relax our ARMv6 check for cases where we're targeting ARM via -marm (vs
Thumb1 via -mthumb). This more closely matches the KConfig requirements
for where we plan to use these (ie. ARMv6K, ARMv7 (arm or thumb2)).
As @peter.smith mentions:
on armv5 we can write the instruction to read/write to CP15 C13 with
the ThreadID opcode. However on no armv5 implementation will the CP15
C13 have a Thread ID register. The GCC intent seems to be whether the
instruction is encodable rather than check what the CPU supports.
Link: https://github.com/ClangBuiltLinux/linux/issues/1502
Link: https://developer.arm.com/documentation/ddi0301/h/system-control-coprocessor/system-control-processor-registers/c13--thread-and-process-id-registers
Reviewed By: ardb, peter.smith
Differential Revision: https://reviews.llvm.org/D114116
With C++17 the exception specification has been made part of the
function type, and therefore part of mangled type names.
However, it's valid to convert function pointers with an exception
specification to function pointers with the same argument and return
types but without an exception specification, which means that e.g. a
function of type "void () noexcept" can be called through a pointer
of type "void ()". We must therefore consider the two types to be
compatible for CFI purposes.
We can do this by stripping the exception specification before mangling
the type name, which is what this patch does.
Differential Revision: https://reviews.llvm.org/D115015
This fixes a bug in 740057d. There's two ways to describe the issue:
* One caller hasn't yet proven nocapture on the argument. Given that, the inference routine is responsible for bailing out on a potential capture.
* Even if we know the argument is nocapture, the access inference needs to traverse the exact set of users the capture tracking would (or exit conservatively). Even if capture tracking can prove a store is non-capturing (e.g. to a local alloc which doesn't escape), we still need to track the copy of the pointer to see if it's later reloaded and accessed again.
Note that all the test changes except the newly added ones appear to be false negatives. That is, cases where we could prove writeonly, but the current code isn't strong enough. That's why I didn't spot this originally.
This adjusts all the MVE and CDE intrinsics now that v2i1 is a legal
type, to use a <2 x i1> as opposed to emulating the predicate with a
<4 x i1>. The v4i1 workarounds have been removed leaving the natural
v2i1 types, notably in vctp64 which now generates a v2i1 type.
AutoUpgrade code has been added to upgrade old IR, which needs to
convert the old v4i1 to a v2i1 be converting it back and forth to an
integer with arm.mve.v2i and arm.mve.i2v intrinsics. These should be
optimized away in the final assembly.
Differential Revision: https://reviews.llvm.org/D114455
Need to do the analysis of the captured expressions in the clauses.
Previously the compiler ignored them and it may lead to a compiler crash
trying to get the address of the mapped variables.
Differential Revision: https://reviews.llvm.org/D114546
This does similar thing to 6b1341e, but fixes single element 128-bit
float type: `struct { long double x; }`.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D114937
Glibc 2.32 and newer uses these symbol names to support IEEE-754 128-bit
float. GCC transforms name of these builtins to align with Glibc header
behavior.
Since Clang doesn't have all GCC-compatible builtins implemented, this
patch only mutates the implemented part.
Note nexttoward is a special case (no nexttowardf128) so it's also
handled here.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D112401
Currently the last value of linear is calculated as var = init + num_iters * step.
Replaced it with var = var_priv, i.e. original variable gets the value
of the last private copy.
Differential Revision: https://reviews.llvm.org/D105151
Need to postpone anlysis of the ranged for loops till the actual
instantiation to avoid erroneous emission of error messages.
Differential Revision: https://reviews.llvm.org/D114560
This change extends the current logic for inferring readonly and readnone argument attributes to also infer writeonly.
This change is deliberately minimal; there's a couple of areas for follow up.
* I left out all call handling and thus any benefit from the SCC walk. When examining the test changes, I realized the existing code is imprecise, and am going to fix that in it's own revision before adding in the writeonly handling. (Mostly because updating the tests is hard when I, the human, can't figure out whether the result is correct.)
* I left out handling for storing a value (as opposed to storing to a pointer). This should benefit readonly/readnone as well, and applies to a bunch of other instructions. Seemed worth having as a separate review.
Differential Revision: https://reviews.llvm.org/D114963
Need to postpone analysis for addressable lvalue in a depend clause with
iterators, otherwise the incorrect error message is emitted.
Differential Revision: https://reviews.llvm.org/D114653
If clang's output is set to bitcode and LTO is enabled, clang would
unconditionally add the flag to the module. Unfortunately, if the input were a
bitcode or IR file and had the flag set, this would result in two copies of the
flag, which is illegal IR. Guard the setting of the flag by checking whether it
already exists. This follows existing practice for the related "ThinLTO" module
flag.
Differential Revision: https://reviews.llvm.org/D112177
This patch changes the `-fopenmp-target-new-runtime` option which controls if
the new or old device runtime is used to be true by default. Disabling this to
use the old runtime now requires using `-fno-openmp-target-new-runtime`.
Reviewed By: JonChesterfield, tianshilei1992, gregrodgers, ronlieb
Differential Revision: https://reviews.llvm.org/D114890
Add mapping for CUDA address spaces for HIP to SPIR-V
translation. This change allows HIP device code to be
emitted as valid SPIR-V by mapping unqualified pointers
to generic address space and by mapping __device__ and
__shared__ AS to their equivalent AS in SPIR-V
(CrossWorkgroup and Workgroup, respectively).
Cuda's __constant__ AS is handled specially. In HIP
unqualified pointers (aka "flat" pointers) can point to
__constant__ objects. Mapping this AS to ConstantMemory
would produce to illegal address space casts to
generic AS. Therefore, __constant__ AS is mapped to
CrossWorkgroup.
Patch by linjamaki (Henry Linjamäki)!
Differential Revision: https://reviews.llvm.org/D108621
This reverts commit f02c5f3478 and
addresses the issue mentioned in D114619 differently.
Repeating the issue here:
Currently, during symbol simplification we remove the original member
symbol from the equivalence class (`ClassMembers` trait). However, we
keep the reverse link (`ClassMap` trait), in order to be able the query
the related constraints even for the old member. This asymmetry can lead
to a problem when we merge equivalence classes:
```
ClassA: [a, b] // ClassMembers trait,
a->a, b->a // ClassMap trait, a is the representative symbol
```
Now let,s delete `a`:
```
ClassA: [b]
a->a, b->a
```
Let's merge ClassA into the trivial class `c`:
```
ClassA: [c, b]
c->c, b->c, a->a
```
Now, after the merge operation, `c` and `a` are actually in different
equivalence classes, which is inconsistent.
This issue manifests in a test case (added in D103317):
```
void recurring_symbol(int b) {
if (b * b != b)
if ((b * b) * b * b != (b * b) * b)
if (b * b == 1)
}
```
Before the simplification we have these equivalence classes:
```
trivial EQ1: [b * b != b]
trivial EQ2: [(b * b) * b * b != (b * b) * b]
```
During the simplification with `b * b == 1`, EQ1 is merged with `1 != b`
`EQ1: [b * b != b, 1 != b]` and we remove the complex symbol, so
`EQ1: [1 != b]`
Then we start to simplify the only symbol in EQ2:
`(b * b) * b * b != (b * b) * b --> 1 * b * b != 1 * b --> b * b != b`
But `b * b != b` is such a symbol that had been removed previously from
EQ1, thus we reach the above mentioned inconsistency.
This patch addresses the issue by making it impossible to synthesise a
symbol that had been simplified before. We achieve this by simplifying
the given symbol to the absolute simplest form.
Differential Revision: https://reviews.llvm.org/D114887
Many of the SVE ACLE tests have gained entries as follows:
REQUIRES: aarch64-registered-target || arm-registered-target
which can cause test failures when only arm-registered-target is
available because only aarch64-registered-target supports SVE.
In C++23, discarded statements and if consteval statements can nest
arbitrarily. To support that, we keep track of whether the parent of
the current evaluation context is discarded or immediate.
This is done at the construction of an evaluation context
to improve performance.
Fixes https://bugs.llvm.org/show_bug.cgi?id=52231
The CLANG_DEFAULT_LINKER flag overrides the default toolchain linker.
VE strictly requires 'nld' to be the default linker. This causes a test
failure in test/Driver/ve-toolchain.cpp when configured with
CLANG_DEFAULT_LINKER!=ld
Failure in clang-ppc64le-rhel
(https://lab.llvm.org/buildbot/#/builders/57/builds/12628)
Until default linker selection with CLANG_DEFAULT_LINKER!=ld is fixed
proper, we manually specify '-fuse-ld=ld' (ie the toolchain default
linker) in the ve-toolchain tests.
MSVC says this should be 202002L for /std:c++20, and of VS16.11
that's indeed the case (older versions warn that they don't
understand /std:c++20, and then cl.exe defaults to C++14 and
sets _MSVC_LANG to 201402 accordingly).
Differential Revision: https://reviews.llvm.org/D114867
This test which was just introduced in the PACBTI-M frontend
patch (https://reviews.llvm.org/D112421) is currently failing on some
platforms. Removing temporarily.
Handle branch protection option on the commandline as well as a function
attribute. One patch for both mechanisms, as they use the same underlying
parsing mechanism.
These are recorded in a set of LLVM IR module-level attributes like we do for
AArch64 PAC/BTI (see https://reviews.llvm.org/D85649):
- command-line options are "translated" to module-level LLVM IR
attributes (metadata).
- functions have PAC/BTI specific attributes iff the
__attribute__((target("branch-protection=...))) was used in the function
declaration.
- command-line option -mbranch-protection to armclang targeting Arm,
following this grammar:
branch-protection ::= "-mbranch-protection=" <protection>
protection ::= "none" | "standard" | "bti" [ "+" <pac-ret-clause> ]
| <pac-ret-clause> [ "+" "bti"]
pac-ret-clause ::= "pac-ret" [ "+" <pac-ret-option> ]
pac-ret-option ::= "leaf" ["+" "b-key"] | "b-key" ["+" "leaf"]
b-key is simply a placeholder to make it consistent with AArch64's
version. In Arm, however, it triggers a warning informing that b-key is
unsupported and a-key will be selected instead.
- Handle _attribute_((target(("branch-protection=..."))) for AArch32 with the
same grammer as the commandline options.
This patch is part of a series that adds support for the PACBTI-M extension of
the Armv8.1-M architecture, as detailed here:
https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension
The PACBTI-M specification can be found in the Armv8-M Architecture Reference
Manual:
https://developer.arm.com/documentation/ddi0553/latest
The following people contributed to this patch:
- Momchil Velikov
- Victor Campos
- Ties Stuij
Reviewed By: vhscampos
Differential Revision: https://reviews.llvm.org/D112421
The invocation of a unary or binary operator for type-dependent expressions is represented as a CXXOperatorCallExpr. Upon template instantiation, TreeTransform::RebuildCXXOperatorCallExpr checks for the case of an overloaded operator, but not for a (non-ObjC) PseudoObject, and will directly create a UnaryOperator or BinaryOperator.
Generalizing commit 0f99537eca from @akyrtzi to handle non-ObjC pseudo objects (and also handle the case of unary pseudo object inc/dec).
This fixes https://bugs.llvm.org/show_bug.cgi?id=51855
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D111639
Fix for the case when there are no instructions in the entry basic block before the call
to `createSections`
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D114143
This patch changes clang-offload-bundler to use the original file extension for
the device archive member when unbundling archives instead of printing a warning
and defaulting to ".o".
Differential Revision: https://reviews.llvm.org/D114776
Allow toggling of -fnew-infallible so last instance takes precedence
Testing:
ninja check-all
Reviewed By: bruno
Differential Revision: https://reviews.llvm.org/D113523
We've found that when profiling, counts are only generated for the real definition of constructor aliases (C2 in mangled name). However, when compiling the C1 version is present at the callsite and leads to a lack of counts due to this aliasing. This causes us to miss out on inlining an otherwise hot constructor.
-mconstructor-aliases is AFAICT an optimization, so having a disabling flag if wanted seems valuable.
Testing:
ninja check-all
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D114130