Adds initial parsing and sema for the 'adjust_args' clause.
Note that an AST clause is not created as it instead adds its expressions
to the OMPDeclareVariantAttr.
Differential Revision: https://reviews.llvm.org/D99905
During explicit modular build, PCM files are typically specified via the `-fmodule-file=<path>` command-line option. Early during the compilation, Clang uses the `ASTReader` to read their contents and caches the result so that the module isn't loaded implicitly later on. A listener is attached to the `ASTReader` to collect names of the modules read from the PCM files. However, if the PCM has already been loaded previously via PCH:
1. the `ASTReader` doesn't do anything for the second time,
2. the listener is not invoked at all,
3. the module load result is not cached,
4. the compilation fails when attempting to load the module implicitly later on.
This patch solves this problem by attaching the listener to the `ASTReader` for PCH reading as well.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D111560
The builtin __rlwnm is currently constrained to accept only constants
for the shift parameter but the instructions emitted for it have no such
constraint, this patch allows the builtins to accept variable shift.
Reviewed By: NeHuang, amyk
Differential Revision: https://reviews.llvm.org/D111229
If the `assume-controlled-environment` is `true`, we should expect `getenv()`
to succeed, and the result should not be considered tainted.
By default, the option will be `false`.
Reviewed By: NoQ, martong
Differential Revision: https://reviews.llvm.org/D111296
The `getenv()` function might return `NULL` just like any other function.
However, in case of `getenv()` a state-split seems justified since the
programmer should expect the failure of this function.
`secure_getenv(const char *name)` behaves the same way but is not handled
right now.
Note that `std::getenv()` is also not handled.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D111245
This reverts commit 97f0c63783.
As discussed in https://reviews.llvm.org/D110684, it increased the
compile time and the binary size of clang more than 1%. I reverted
this patch first to think about a better way to do it.
GCC 9.1 removed Intel MPX support. Linux kernel removed MPX in 2019.
glibc 2.35 will remove MPX.
Our support is limited: we support assembling of bndmov but not bnd.
Just remove it.
Reviewed By: pengfei, skan
Differential Revision: https://reviews.llvm.org/D111517
Clarify the message provided when the analyzer catches the use of memory
that is allocated with size zero.
Differential Revision: https://reviews.llvm.org/D111655
This is the second part of p0388, dealing with overloads of list
initialization to incomplete array types. It extends the handling
added in D103088 to permit incomplete arrays. We have to record that
the conversion involved an incomplete array, and so (re-add) a bit flag
into the standard conversion sequence object. Comparing such
conversion sequences requires knowing (a) the number of array elements
initialized and (b) whether the initialization is of an incomplete array.
This also updates the web page to indicate p0388 is implemented (there
is no feature macro).
Differential Revision: https://reviews.llvm.org/D103908
This implements the new implicit conversion sequence to an incomplete
(unbounded) array type. It is mostly Richard Smith's work, updated to
trunk, testcases added and a few bugs fixed found in such testing.
It is not a complete implementation of p0388.
Differential Revision: https://reviews.llvm.org/D102645
`[[clang::fallthrough]]` has meaning for the CFG, but all other
StmtAttrs we currently have don't. So omit them, as AttributedStatements
with children cause several issues and there's no benefit in including
them.
Fixes PR52103 and PR49454. See PR52103 for details.
Differential Revision: https://reviews.llvm.org/D111568
To reduce the number of explicit builds of a single module, we can try to squash multiple occurrences of the module with different command-lines (and context hashes) by removing benign command-line options. The greatest contributors to benign differences between command-lines are the header search paths.
In this patch, the lookup cache in `HeaderSearch` is used to identify paths that were actually used when implicitly building the module during scanning. This information is serialized into the unhashed control block of the implicitly-built PCM. The dependency scanner then loads this and may use it to prune the header search paths before computing the context hash of the module and generating the command-line.
We could also prune the header search paths when serializing `HeaderSearchOptions` into the PCM. That way, we could do it only once instead of every load of the PCM file by dependency scanner. However, that would result in a PCM file whose contents don't produce the same context hash as the original build, which is probably highly surprising.
There is an alternative approach to storing extra information into the PCM: wire up preprocessor callbacks to capture the used header search paths on-the-fly during preprocessing of modularized headers (similar to what we currently do for the main source file and textual headers). Right now, that's not compatible with the fact that we do an actual implicit build producing PCM files during dependency scanning. The second run of dependency scanner loads the PCM from the first run, skipping the preprocessing altogether, which would result in different results between runs. We can revisit this approach when we stop building implicitly during dependency scanning.
Depends on D102923.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D102488
For dependency scanning, it would be useful to collect header search paths (provided on command-line via `-I` and friends) that were actually used during preprocessing. This patch adds that feature to `HeaderSearch` along with a new remark that reports such paths as they get used.
Previous version of this patch tried to use the existing `LookupFileCache` to report used paths via `HitIdx`. That doesn't work for `ComputeUserEntryUsage` (which is intended to be called *after* preprocessing), because it indexes used search paths by the file name. This means the values get overwritten when the code contains `#include_next`.
Note that `HeaderSearch` doesn't use `HeaderSearchOptions::UserEntries` directly. Instead, `InitHeaderSearch` pre-processes them (adds platform-specific paths, removes duplicates, removes paths that don't exist) and creates `DirectoryLookup` instances. This means we need a mechanism for translating between those two. It's not possible to go from `DirectoryLookup` back to the original `HeaderSearch`, so `InitHeaderSearch` now tracks the relationships explicitly.
Depends on D111557.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D102923
Add atomic_half types and builtins operating on the types from the
cl_ext_float_atomics extension.
Patch by Haonan Yang.
Differential Revision: https://reviews.llvm.org/D109740
Rename vfredsum and vfwredsum to vfredusum and vfwredusum. Add aliases for vfredsum and vfwredsum.
Reviewed By: luismarques, HsiangKai, khchen, frasercrmck, kito-cheng, craig.topper
Differential Revision: https://reviews.llvm.org/D105690
Current btf_tag is applied to declaration only.
Per discussion in https://reviews.llvm.org/D111199,
we plan to introduce btf_type_tag attribute for types.
So rename btf_tag to btf_decl_tag to make it easily
differentiable from btf_type_tag.
Differential Revision: https://reviews.llvm.org/D111588
Sequel patch to https://reviews.llvm.org/D111293.
Remove call to CodeGenFunction::InitTempAlloca() from OpenMP related
codegen part.
Also remove the metadata `!llvm.access.group` from the updated lit
tests.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D111316
In the original design, we levarage _mt intrinsics to define macros for
_m intrinsics. Such as,
```
__builtin_rvv_vadd_vv_i8m1_mt((vbool8_t)(op0), (vint8m1_t)(op1), (vint8m1_t)(op2), (vint8m1_t)(op3), (size_t)(op4), (size_t)VE_TAIL_AGNOSTIC)
```
However, we could not define generic interface for mask intrinsics any
more due to clang_builtin_alias only accepts clang builtins as its
argument.
In the example,
```
__rvv_overloaded
__attribute__((clang_builtin_alias(__builtin_rvv_vadd_vv_i8m1_mt)))
vint8m1_t vadd(vbool8_t op0, vint8m1_t op1, vint8m1_t op2, vint8m1_t
op3, size_t op4, size_t op5);
```
op5 is the tail policy argument. When users want to use vadd generic
interface for masked vector add, they need to specify tail policy in the
previous design. In this patch, we define _m intrinsics as clang
builtins to solve the problem.
Differential Revision: https://reviews.llvm.org/D110684
"darwin" is ambiguous. When there isn't a better source
of truth (e.g., SDKs), the driver will either interpret it
as "iOS" when cross-compiling to a different architecture,
or "the host" when not. That's now the case on AS Macs.
Update the test to more explicitly test the OS.
aarch64-mac-cpus.c already tests the mac-specific driver logic.
This reland commit 1131b1eb35, which
adds support to __attribute__((availability)) annotation for Fuchsia
platform. This patch also adds '-ffuchsia-api-level' to allow specify
Fuchsia API level from the command line.
Differential Revision: https://reviews.llvm.org/D108592
usage of an abstract class type within itself.
We were missing handling for deduction guides (which would assert),
friend declarations, and variable templates. We were mishandling inline
variables and other variables defined inside the class definition.
These diagnostics should be downgraded to warnings, or perhaps removed
entirely, once we implement P0929R2.
This reverts commit b875343873.
Per discussion in https://reviews.llvm.org/D111199, instead to make
existing btf_tag attribute as a type-or-decl attribute, we will
make existing btf_tag attribute as a decl only attribute, and
introduce btf_type_tag as a type only attribute. This will make
it easy for cases like typedef where an attribute may be applied
as either a type attribute or a decl attribute.
This patch adds support to __attribute__((availability)) annotation for
Fuchsia platform. This patch also adds '-ffuchsia-api-level' to allow
specify Fuchsia API level from the command line.
Differential Revision: https://reviews.llvm.org/D108592
When AnnotateAttr is on a function, AddGlobalAnnotations is only called
in CodeGenModule::EmitGlobalFunctionDefinition which means AnnotateAttr
on function declaration without function body will be ignored.
The patch will move AddGlobalAnnotations to
CodeGenModule::SetFunctionAttributes, so with or without function body,
the AnnotateAttr will get code gen for a function.
It'll help case when AnnotateAttr is on external function, and the
AnnotateAttr will be consumed in IR level.
For example, a pass to collect num of uses for functions with
__attribute((annotate("count_use"))) after optimizations,
As long as there's __attribute((annotate("count_use"))), function with
or without function body should be counted.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D111109
Patch by: python3kgae (Xiang Li)
armv9-a, armv9.1-a and armv9.2-a can be targeted using the -march option
both in ARM and AArch64.
- Armv9-A maps to Armv8.5-A.
- Armv9.1-A maps to Armv8.6-A.
- Armv9.2-A maps to Armv8.7-A.
- The SVE2 extension is enabled by default on these architectures.
- The cryptographic extensions are disabled by default on these
architectures.
The Armv9-A architecture is described in the Arm® Architecture Reference
Manual Supplement Armv9, for Armv9-A architecture profile
(https://developer.arm.com/documentation/ddi0608/latest).
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D109517
As for 128-bit floating points on PowerPC, compiler should have three
machine modes:
- IFmode, always IBM extended double
- KFmode, always IEEE 754R 128-bit floating point
- TFmode, matches the semantics for long double
This commit adds support for IF mode with its complex variant, IC mode.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D109950
The SSE4 header (smmintrin.h) should include SSSE3 (tmmintrin.h) instead
of SSE2 (emmintrin.h).
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D111482
Without this, the combination of `-ast-dump=json` and `-ast-dump-filter FILTER` produces invalid JSON: the first line is a string that says `Dumping $SOME_DECL_NAME: `.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D108441
C++20 and later allow you to pass no argument for the ... parameter in
a variadic macro, whereas earlier language modes and C disallow it.
We no longer diagnose in C++20 and later modes. This fixes PR51609.
In this case, we know statically that we're destroying the most-derived
class, so the vptr must already point to the current class and never
needs to be updated.
fae0dfa implemented the new __ibm128 type, this patch enables its
complex form.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D109948
This patch adds support for the
`__kmpc_get_hardware_num_threads_in_block` function that returns the
number of threads. This was missing in the new runtime and was used by
the AMDGPU plugin which prevented it from using the new runtime. This
patchs also unified the interface for getting the thread numbers in the
frontend.
Originally authored by jdoerfert.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D111475
There are functions where we do not want function instrumentation which is why we have `__attribute__((no_instrument_function))`. Extending this functionality to disable instrumentation for Objective-C methods as well. Objective C methods like `+load` run premain and having instrumentation on them causes runtime errors depending on the implementation of `__cyg_profile_func_enter` etc. functions
Reviewed By: rjmccall, aaron.ballman
Differential Revision: https://reviews.llvm.org/D111286
Distinct lambda expressions are always considered non-equivalent, so two
token-for-token identical function declarations whose signatures involve
lambda-expressions declare distinct functions.
__builtin_assume_aligned's second parameter is size_t, which may be 32 bits.
We can't pass 2^32 when that happens. Update tests accordingly.
Example broken bot due to D111250:
https://lab.llvm.org/buildbot/#/builders/171/builds/4531
Previously if you passed an absolute path to clang, where only part of
the path to the file was remapped, it would result in the file's DIFile
being stored with a duplicate path, for example:
```
!DIFile(filename: "./ios/Sources/bar.c", directory: "./ios/Sources")
```
This change handles absolute paths, specifically in the case they are
remapped to something relative, and uses the dirname for the directory,
and basename for the filename.
This also adds a test verifying this behavior for more standard uses as
well.
Differential Revision: https://reviews.llvm.org/D111352
The following tests are failing due to missing DWARF sections. This patch sets these tests as XFAIL/DISABLED on AIX until a more permanent solution is implemented.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D111336
non-Darwin ObjC runtimes:
- Use the same logic the Darwin runtime does for inferring that a
receiver is non-null and therefore doesn't require null checks.
Previously we weren't skipping these for non-super dispatch.
- Emit a null check when there's a consumed parameter so that we can
destroy the argument if the call doesn't happen. This mostly
involves extracting some common logic from the Darwin-runtime code.
- Generate a zero aggregate by zeroing the same memory that was used
in the method call instead of zeroing separate memory and then
merging them with a phi. This uses less memory and avoids unnecessary
copies.
- Emit zero initialization, and generate zero values in phis, using
the proper zero-value routines instead of assuming that the zero
value of the result type has a bitwise-zero representation.
An archive containing device code object files can be passed to
clang command line for linking. For each given offload target
it creates a device specific archives which is either passed to llvm-link
if the target is amdgpu, or to clang-nvlink-wrapper if the target is
nvptx. -L/-l flags are used to specify these fat archives on the command
line. E.g.
clang++ -fopenmp -fopenmp-targets=nvptx64 main.cpp -L. -lmylib
It currently doesn't support linking an archive directly, like:
clang++ -fopenmp -fopenmp-targets=nvptx64 main.cpp libmylib.a
Linking with x86 offload also does not work.
Reviewed By: ye-luo
Differential Revision: https://reviews.llvm.org/D105191
Original commit message: "
Original commit message: "
Original commit message:"
The current infrastructure in lib/Interpreter has a tool, clang-repl, very
similar to clang-interpreter which also allows incremental compilation.
This patch moves clang-interpreter as a test case and drops it as conditionally
built example as we already have clang-repl in place.
Differential revision: https://reviews.llvm.org/D107049
"
This patch also ignores ppc due to missing weak symbol for __gxx_personality_v0
which may be a feature request for the jit infrastructure. Also, adds a missing
build system dependency to the orc jit.
"
Additionally, this patch defines a custom exception type and thus avoids the
requirement to include header <exception>, making it easier to deploy across
systems without standard location of the c++ headers.
"
This patch also works around PR49692 and finds a way to use llvm::consumeError
in rtti mode.
Differential revision: https://reviews.llvm.org/D107049
At this point it looks like a B extension will never exist. Instead
Zba, Zbb, Zbc, and Zbs are individual extensions being ratified
together as a package. Unknown at this time when or if the other
Zb* extensions will be ratified.
This patch removes references to the B extension. I've updated and
split tests accordingly.
This has been split from D110669 to make review a little easier.
Differential Revision: https://reviews.llvm.org/D111338
This patch adds two flags to be supported for the new runtime. The flags
are `-fopenmp-assume-threads-oversubscription` and
-fopenmp-assume-teams-oversubscription`. These add global values that
can be checked by the work sharing runtime functions to make better
judgements about how to distribute work between the threads.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D111348
When have ObjCInterfaceDecl with the same name in 2 different modules,
hitting the assertion
> Assertion failed: (Index < RL->getFieldCount() && "Ivar is not inside record layout!"),
> function lookupFieldBitOffset, file llvm-project/clang/lib/AST/RecordLayoutBuilder.cpp, line 3434.
on accessing an ivar inside a method. The assertion happens because
ivar belongs to one module while its containing interface belongs to
another module and then we fail to find the ivar inside the containing
interface. We already keep a single ObjCInterfaceDecl definition in
redecleration chain and in this case containing interface was correct.
The issue is with ObjCIvarDecl. IVar decl for IRGen is taken from
ObjCIvarRefExpr that is created in `Sema::BuildIvarRefExpr` using ivar
decl returned from `Sema::LookupIvarInObjCMethod`. And ivar lookup
returns a wrong decl because basically we take the first ObjCIvarDecl
found in `ASTReader::FindExternalVisibleDeclsByName` (called by
`DeclContext::lookup`). And in `ASTReader.Lookups` lookup table for a
wrong module comes first because `ASTReader::finishPendingActions`
processes `PendingUpdateRecords` in reverse order and the first
encountered ObjCIvarDecl will end up the last in `ASTReader.Lookups`.
Fix by merging ObjCIvarDecl from different modules correctly and by
using a canonical one in IRGen.
rdar://82854574
Differential Revision: https://reviews.llvm.org/D110280
Some subprojects like compiler-rt define the `darwin` feature in their
lit config, but clang does not do that, so we need to use the global
`system-darwin` here instead.
Differential Revision: https://reviews.llvm.org/D111267
An archive containing device code object files can be passed to
clang command line for linking. For each given offload target
it creates a device specific archives which is either passed to llvm-link
if the target is amdgpu, or to clang-nvlink-wrapper if the target is
nvptx. -L/-l flags are used to specify these fat archives on the command
line. E.g.
clang++ -fopenmp -fopenmp-targets=nvptx64 main.cpp -L. -lmylib
It currently doesn't support linking an archive directly, like:
clang++ -fopenmp -fopenmp-targets=nvptx64 main.cpp libmylib.a
Linking with x86 offload also does not work.
Reviewed By: ye-luo
Differential Revision: https://reviews.llvm.org/D105191
This reverts c7f16ab3e3 / r109694 - which
suggested this was done to improve consistency with the gdb test suite.
Possible that at the time GCC did not canonicalize integer types, and so
matching types was important for cross-compiler validity, or that it was
only a case of over-constrained test cases that printed out/tested the
exact names of integer types.
In any case neither issue seems to exist today based on my limited
testing - both gdb and lldb canonicalize integer types (in a way that
happens to match Clang's preferred naming, incidentally) and so never
print the original text name produced in the DWARF by GCC or Clang.
This canonicalization appears to be in `integer_types_same_name_p` for
GDB and in `TypeSystemClang::GetBasicTypeEnumeration` for lldb.
(I tested this with one translation unit defining 3 variables - `long`,
`long (*)()`, and `int (*)()`, and another translation unit that had
main, and a function that took `long (*)()` as a parameter - then
compiled them with mismatched compilers (either GCC+Clang, or
Clang+(Clang with this patch applied)) and no matter the combination,
despite the debug info for one CU naming the type "long int" and the
other naming it "long", both debuggers printed out the name as "long"
and were able to correctly perform overload resolution and pass the
`long int (*)()` variable to the `long (*)()` function parameter)
Did find one hiccup, identified by the lldb test suite - that CodeView
was relying on these names to map them to builtin types in that format.
So added some handling for that in LLVM. (these could be split out into
separate patches, but seems small enough to not warrant it - will do
that if there ends up needing any reverti/revisiting)
Differential Revision: https://reviews.llvm.org/D110455
The patch implements header-only support for testure lookups.
The patch has been tested on a source file with all possible combinations of
argument types supported by CUDA headers, compiled and verified that the
generated instructions and their parameters match the code generated by NVCC.
Unfortunately, compiling texture code requires CUDA headers and can't be tested
in clang itself. The test will need to be added to the test-suite later.
While generated code compiles and seems to match NVCC, I do not have any code
that uses textures that I could test correctness of the implementation. Hence
the experimental status.
Differential Revision: https://reviews.llvm.org/D110089
declaration.
Names starting with an underscore are reserved at the global scope, so
cannot be used as the name of an extern "C" symbol in any scope because
such usages conflict with a name at global scope.
Also do not warn on `#define _foo` or `#undef _foo`.
Only global scope names starting with _[a-z] are reserved, not the use
of such an identifier in any other context.
adjustMemberOfForLambdaCaptures.
The problem is happening when user passes lambda function with reference
type in the map clause.
The natural of the problem when processing generateInfoForCapture,
the BasePointer is generated with new load for a lambda variable with
reference type. It is not expected in adjustMemberOfForLambdaCaptures.
One way to fix this is to skipping call to generateInfoForCapture for
map(to:lambda). The map info will be generated later in the call to
generateDefaultMapInfo samiler as firsprivate clase.
This to fix https://bugs.llvm.org/show_bug.cgi?id=52071
Differential Revision:https://reviews.llvm.org/D111115
This is to save memory for Clang compiles.
Measuring building PassBuilder.cpp under /usr/bin/time, max rss goes from 0.93GB to 0.7GB.
This does not turn it by default yet.
I've turned on the option locally and run it over a good amount of files without any issues.
For more background, see
https://lists.llvm.org/pipermail/cfe-dev/2021-September/068930.html.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D111105
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.
This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.
The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.
Updating clang's max allowed alignment will come in a future patch.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D110451
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.
This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.
The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.
Updating clang's max allowed alignment will come in a future patch.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D110451
Clang would reject
#pragma omp for
#pragma omp tile sizes(P)
for (int i = 0; i < 128; ++i) {}
where P is a template parameter, but the loop itself is not
template-dependent. Because P context-dependent, the TransformedStmt
cannot be generated and therefore is nullptr (until the template is
instantiated by TreeTransform). The OMPForDirective would still expect
the a loop is the dependent context and trigger an error.
Fix by introducing a NumGeneratedLoops field to OMPLoopTransformation.
This is used to distinguish the case where no TransformedStmt will be
generated at all (e.g. #pragma omp unroll full) and template
instantiation is needed. In the latter case, delay resolving the
iteration space like when the for-loop itself is template-dependent
until the template instatiation.
A more radical solution would always delay the iteration space analysis
until template instantiation, but would also break many test cases.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D111124
There is an error in the implementation of the logic of reaching the `Unknonw` tristate in CmpOpTable.
```
void cmp_op_table_unknownX2(int x, int y, int z) {
if (x >= y) {
// x >= y [1, 1]
if (x + z < y)
return;
// x + z < y [0, 0]
if (z != 0)
return;
// x < y [0, 0]
clang_analyzer_eval(x > y); // expected-warning{{TRUE}} expected-warning{{FALSE}}
}
}
```
We miss the `FALSE` warning because the false branch is infeasible.
We have to exploit simplification to discover the bug. If we had `x < y`
as the second condition then the analyzer would return the parent state
on the false path and the new constraint would not be part of the State.
But adding `z` to the condition makes both paths feasible.
The root cause of the bug is that we reach the `Unknown` tristate
twice, but in both occasions we reach the same `Op` that is `>=` in the
test case. So, we reached `>=` twice, but we never reached `!=`, thus
querying the `Unknonw2x` column with `getCmpOpStateForUnknownX2` is
wrong.
The solution is to ensure that we reached both **different** `Op`s once.
Differential Revision: https://reviews.llvm.org/D110910
The default wchar type is different on AIX vs. Linux. When this test is run on
AIX, WCHAR_T_TYPE ends up being set to int. This is incorrect as the default
wchar type on AIX is actually unsigned short, and setting the type incorrectly
causes the expected errors to not be found.
This patch sets the type correctly (to unsigned short) for AIX.
Differential Revision: https://reviews.llvm.org/D110428
This simple change addresses a special case of structure/pointer
aliasing that produced different symbolvals, leading to false positives
during analysis.
The reproducer is as simple as this.
```lang=C++
struct s {
int v;
};
void foo(struct s *ps) {
struct s ss = *ps;
clang_analyzer_dump(ss.v); // reg_$1<int Element{SymRegion{reg_$0<struct s *ps>},0 S64b,struct s}.v>
clang_analyzer_dump(ps->v); //reg_$3<int SymRegion{reg_$0<struct s *ps>}.v>
clang_analyzer_eval(ss.v == ps->v); // UNKNOWN
}
```
Acks: Many thanks to @steakhal and @martong for the group debug session.
Reviewed By: steakhal, martong
Differential Revision: https://reviews.llvm.org/D110625
The builtin for vec_orc has support for the following two signatures,
but currently the compiler marks it ambiguous:
vector float vec_orc(vector float, vector float)
vector double vec_orc(vector double, vector double)
This patch implements these two builtins.
Differential revision: https://reviews.llvm.org/D110858
This also removes the need to disable the mandatory inlining phase in
tests.
In a departure from the previous remark, we don't output a 'cost' in
this case, because there's no such thing. We just report that inlining
happened because of the attribute.
Differential Revision: https://reviews.llvm.org/D110891
This patch allows the use of __vector_quad and __vector_pair, PPC MMA builtin
types, on all PowerPC 64-bit compilation units. When these types are
made available the builtins that use them automatically become available
so semantic checking for mma and pair vector memop __builtins is also
expanded to ensure these builtin function call are only allowed on
Power10 and new architectures. All related test cases are updated to
ensure test coverage.
Reviewed By: #powerpc, nemanjai
Differential Revision: https://reviews.llvm.org/D109599
Modify the IfStmt node to suppoort constant evaluated expressions.
Add a new ExpressionEvaluationContext::ImmediateFunctionContext to
keep track of immediate function contexts.
This proved easier/better/probably more efficient than walking the AST
backward as it allows diagnosing nested if consteval statements.
We keep a map from function name to source location so we don't have to
do it via looking up a source location from the AST. However, since
function names can be long, we actually use a hash of the function name
as the key.
Additionally, we can't rely on Clang's printing of function names via
the AST, so we just demangle the name instead.
This is necessary to implement
https://lists.llvm.org/pipermail/cfe-dev/2021-September/068930.html.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D110665
Per the GCC info page:
If the function is declared 'extern', then this definition of the
function is used only for inlining. In no case is the function
compiled as a standalone function, not even if you take its address
explicitly. Such an address becomes an external reference, as if
you had only declared the function, and had not defined it.
Respect that behavior for inline builtins: keep the original definition, and
generate a copy of the declaration suffixed by '.inline' that's only referenced
in direct call.
This fixes holes in c3717b6858.
Differential Revision: https://reviews.llvm.org/D111009
The builtins: `__compare_and_swaplp`, `__fetch_and_addlp`,
` __fetch_and_andlp`, `__fetch_and_orlp`, `__fetch_and_swaplp` are
64 bit only. This patch ensures the compiler produces an error in 32 bit mode.
Reviewed By: #powerpc, nemanjai
Differential Revision: https://reviews.llvm.org/D110824
by Raul Penacoba.
The size of kmp_depend_info and the number of dependencies are computed multiplying the iterator sizes, which not right.
Now size is computed as:
itersize1*numclausedeps1 + itersize2*numclausedeps2 + ... + itersizeN*numclausedepsN
where itersizeX is the size of the iterator and numclausedepsX the number of dependencies in that depend clause.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D111045
This patch fixes the return value of the builtin __builtin_ppc_load2r to
correctly return short instead of int.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D110771
Looks like exceptions are off-by-default with the PS4 triple.
Since adding -fexceptions defeats the purpose of the test change
in 8dfbe9b0a, pass an explicit triple instead.
These builtins produce inefficient code for CPU's prior to Power8
due to vcmpequd being unavailable. The predicate forms can actually
leverage the available vcmpequw along with xxlxor to produce a better
sequence.
The signatures for the PowerPC builtins lharx and
lbarx are incorrect, and causes issues when used in a function
that requires the return of the builtin to be promoted.
This patch fixes these signatures.
Differential revision: https://reviews.llvm.org/D110273
This consists of 3 compressed instructions, c.not, c.neg, and c.zext.w.
I believe these have been picked up by the Zce effort using different
encodings. I don't think it makes sense to keep them in bitmanip. It
will eventually cause a conflict if/when Zce is implemented in llvm.
Differential Revision: https://reviews.llvm.org/D110871
A followup to D110201.
For example, we'd set OptimizationRemarkMissed's Regex to '.*' when
encountering -Rpass. Normally this doesn't actually affect remarks we
emit because in clang::ProcessWarningOptions() we'll separately look at
all -R arguments and turn on/off corresponding diagnostic groups.
However, this is reproducible with -round-trip-args.
Reviewed By: JamesNagurne
Differential Revision: https://reviews.llvm.org/D110673
When clang crashes, it writes a standalone source file and shell script
to reproduce the crash.
The Driver used to set `Mode = CPPMode` in generateCompilationDiagnostics()
to force preprocessing mode. This has the side effect of making
IsCLMode() return false, which in turn meant Clang::AddClangCLArgs()
didn't get called when creating the standalone source file, which meant
the stand-alone file was preprocessed with the gcc driver's defaults
In particular, exceptions default to on with the gcc driver, but to
off with the cl driver. The .sh script did use the original command
line, so in the reproducer for a clang-cl crash, the standalone source
file could contain exception-using code after preprocessing that the
compiler invocation in the shell script would then complain about.
This patch removes the `Mode = CPPMode;` line and instead additionally
checks for `CCGenDiagnostics` in most places that check `CCCIsCPP().
This also matches the strategy Clang::ConstructJob() uses to add
-frewrite-includes for creating the standalone source file for a crash
report.
Fixes PR52007.
Differential Revision: https://reviews.llvm.org/D110783
Try to address Windows flakes from d87bdc272b
by adding "|| true" as suggested in D110276 so the whole test doesn't
fail when Windows thinks it can't remove the binary.
There is a special situation with templates in local classes,
as can be seen in this example with generic lambdas in function scope:
```
template<class T1> void foo() {
(void)[]<class T2>() {
struct S {
void bar() { (void)[]<class T3>(T2) {}; }
};
};
};
template void foo<int>();
```
As a consequence of the resolution of DR1484, bar is instantiated during the
substitution of foo, and in this context we would substitute the lambda within
it with it's own parameters "injected" (turned into arguments).
This can't be properly dealt with for at least a couple of reasons:
* The 'TemplateTypeParm' type itself can only deal with canonical replacement
types, which the injected arguments are not.
* If T3 were constrained in the example above, our (non-conforming) eager
substitution of type constraints would just leave that parameter dangling.
Instead of substituting with injected parameters, this patch just leaves those
inner levels unreplaced.
Since injected arguments appear to be unused within the users of
`getTemplateInstantiationArgs`, this patch just removes that support there and
leaves a couple of asserts in place.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D110727
This patch adds OpenMP assumption attributes to call sites in applicable
regions. Currently this applies the caller's assumption attributes to
any calls contained within it. So, if a call occurs inside an OpenMP
assumes region to a function outside that region, we will assume that
call respects the assumptions. This is primarily useful for inline
assembly calls used heavily in the OpenMP GPU device runtime, which
allows us to then make judgements about what the ASM will do.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D110655
This patch makes sure that the builtins __builtin_ppc_load8r and
__ builtin_ppc_store8r are only available for Power 7 and up.
Currently the builtins seem to produce incorrect code if used for
Power 6 or before.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D110653
{x86_64,aarch64}-unknown-fuchsia and {x86_64,aarch64}-fuchsia should
behave identically as targets, update the test to make sure that's the
case.
Differential Revision: https://reviews.llvm.org/D110687
clang-cl maps /wdNNNN to -Wno-flags for a few warnings that map
cleanly from cl.exe concepts to clang concepts.
This patch adds support for the same numbers to
`#pragma warning(disable : NNNN)`. It also lets
`#pragma warning(push)` and `#pragma warning(pop)` have an effect,
since these are used together with `warning(disable)`.
The optional numeric argument to `warning(push)` is ignored,
as are the other non-`disable` `pragma warning()` arguments.
(Supporting `error` would be easy, but we also don't support
`/we`, and those should probably be added together.)
The motivating example is that a bunch of code (including in LLVM)
uses this idiom to locally disable warnings about calls to deprecated
functions in Windows-only code, and 4996 maps nicely to
-Wno-deprecated-declarations:
#pragma warning(push)
#pragma warning(disable: 4996)
f();
#pragma warning(pop)
Implementation-wise:
- Move `/wd` flag handling from Options.td to actual Driver-level code
- Extract the function mapping cl.exe IDs to warning groups to the
new file clang/lib/Basic/CLWarnings.cpp
- Create a diag::Group enum so that CLWarnings.cpp can refer to
existing groups by ID (and give DllexportExplicitInstantiationDecl
a named group), and add a function to map a diag::Group to the
spelling of it's associated commandline flag
- Call that new function from PragmaWarningHandler
Differential Revision: https://reviews.llvm.org/D110668
This patch is in a series of patches to provide builtins for compatibility with
the XL compiler. This patch implements the software divide builtin as
wrappers for a floating point divide. XL provided these builtins because it
didn't produce software estimates by default at `-Ofast`. When compiled
with `-Ofast` these builtins will produce the software estimate for divide.
Reviewed By: #powerpc, nemanjai
Differential Revision: https://reviews.llvm.org/D106959
With xlc and xlC pragma align(packed) will pack bitfields the same way
as pragma align(bit_packed). xlclang, xlclang++ and clang will
pack bitfields the same way as pragma pack(1). Issue a warning when
source code using pragma align(packed) is used to alert the user it
may not be compatable with xlc/xlC.
Differential Revision: https://reviews.llvm.org/D107506
The instruction has similar semantics to vbpermq but for doublewords.
It was added in Power9 and the ABI documents the builtin.
Differential revision: https://reviews.llvm.org/D107899
Since XLC only ever shipped on PowerPC AIX and Linux, it is not reasonable to
provide the compatibility macros on any target other than those two. This patch
restricts those macros to AIX/Linux.
Differential revision: https://reviews.llvm.org/D110213
With -fpreserve-vec3-type enabled, a cast was not created when
converting from a non-vec3 type to a vec3 type, even though a
conversion to vec3 was performed. This resulted in creation of
invalid store instructions.
Differential Revision: https://reviews.llvm.org/D108470
On AIX, we relied on LTO to merge the csects for profiling data/counter
sections.
AIX binder now get the namedcsect support to support the merging,
so now we can enable PGO without LTO with the new binder.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D110671
We generate symbols like `profc`/`profd` for each function, and put them into csects.
When there are weak functions, we generate weak symbols for the functions as well,
with ELF (and some others), linker (binder) will discard and only keep one copy of the weak symbols.
However, on AIX, the current binder can NOT discard the weak symbols if we put all of them into the same csect,
as binder can NOT discard a subset of a csect.
This creates a unique challenge for using those symbols to calculate some relative offsets.
This patch changed the linkage of `profc`/`profd` symbols to be private, so that all the profc/profd for each weak symbol will be *local* to objects, and all kept in the csect, so we won't have problem. Although only one of the counters will be used, all the pointer in the profd is correct.
The downside is that we won't be able to discard the duplicated counters and profile data,
but those can not be discarded even if we keep the weak linkage,
due to the binder limitation of not discarding a subsect of the csect either .
Reviewed By: Whitney, MaskRay
Differential Revision: https://reviews.llvm.org/D110422
In looking at the disk space used by a ninja check-all, I found that a
few of the largest files were copies of clang and lld made into temp
directories by a couple of tests. These tests were added in D53021 and
D74811. Clean up these copies after usage.
Differential Revision: https://reviews.llvm.org/D110276
To avoid using the AST when emitting diagnostics, split the "dontcall"
attribute into "dontcall-warn" and "dontcall-error", and also add the
frontend attribute value as the LLVM attribute value. This gives us all
the information to report diagnostics we need from within the IR (aside
from access to the original source).
One downside is we directly use LLVM's demangler rather than using the
existing Clang diagnostic pretty printing of symbols.
Previous revisions didn't properly declare the new dependencies.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D110364
Some people downstream are reporting that this test fails. I've been
unable to reproduce, but there is indeed something spooky going on.
Pinning to the new PM suppresses the failure. I'm continuing to
investigate this.
To avoid using the AST when emitting diagnostics, split the "dontcall"
attribute into "dontcall-warn" and "dontcall-error", and also add the
frontend attribute value as the LLVM attribute value. This gives us all
the information to report diagnostics we need from within the IR (aside
from access to the original source).
One downside is we directly use LLVM's demangler rather than using the
existing Clang diagnostic pretty printing of symbols.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D110364
(This is a recommit of 3d6f49a569 that should no longer break validation since
bd379915de).
It is a common practice in glibc header to provide an inline redefinition of an
existing function. It is especially the case for fortified function.
Clang currently has an imperfect approach to the problem, using a combination of
trivially recursive function detection and noinline attribute.
Simplify the logic by suffixing these functions by `.inline` during codegen, so
that they are not recognized as builtin by llvm.
After that patch, clang passes all tests from https://github.com/serge-sans-paille/fortify-test-suite
Differential Revision: https://reviews.llvm.org/D109967
This allows clang to work on Linux distributions like Debian where
<CUDA-PATH>/include may be a symlink to /usr/include. We only need
`cuda_wrappers` to be present before the standard C++ library headers.
The CUDA SDK headers themselves do not need to be found that early.
This addresses https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=995122
mentioned in post-commit comments on D108247
Differential Revision: https://reviews.llvm.org/D110596
The -mmcu= option accepts a generic MCU named "msp430", which sets the
CPU to msp430 and disables hardware multiply support.
The current purpose of accepting this value is to allow -mmcu= to be
used as an alias for -mcpu=, however there are some downsides to doing
this. -mmcu= provides additional features that will interfere
with the expected behavior if the user tries to to use it as an alias
for -mcpu=.
-mmcu=msp430 will conflict with -mhwmult=, since the "msp430" MCU is
defined to have no hardware multiply support, so the user will not be
able to set an explicit hardware multiply version.
-mmcu=msp430 will put "-Tmsp430.ld" on the linker command line, however
TI's support files do not provide a linker script with this name and so
the user would have to explicitly create it.
Differential Revision: https://reviews.llvm.org/D108299
(This relands 59337263ab and makes sure comma operator
diagnostics are suppressed in a SFINAE context.)
While at it, add the diagnosis message "left operand of comma operator has no effect" (used by GCC) for comma operator.
This also makes Clang diagnose in the constant evaluation context which aligns with GCC/MSVC behavior. (https://godbolt.org/z/7zxb8Tx96)
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D103938
I am looking at constant-folding changes that could affect these tests, so
check that it emits the expected global value instead of just checking
that it doesn't crash.
Looking at this test I did not see why MinGW was using a different command
line until I looked at the git history. Add a comment explaining what this
RUN line is actually testing. Also add two more RUN lines to show that
indirectly passed member pointers don't inhibit the optimization.
This patch is in a series of patches to provide builtins for
compatability with the XL compiler. This patch adds builtins for compare
exponent and test data class operations on floating point values.
Reviewed By: #powerpc, lei
Differential Revision: https://reviews.llvm.org/D109437
Require it to be always_inline, to more closely match how _FORITFY_SOURCE
behaves.
This avoids generation of `.inline` suffixed functions - these should always be
inlined.
After significant problems in our downstream with the previous
implementation, the SYCL standard has opted to make using macros/etc to
change kernel-naming-lambdas in any way UB (even passively). As a
result, we are able to just emit the itanium mangling.
However, this DOES require a little work in the CXXABI, as the microsoft
and itanium mangler use different numbering schemes for lambdas. This
patch adds a pair of mangling contexts that use the normal 'itanium'
mangling strategy to fill in the "DeviceManglingNumber" used previously
by CUDA.
Differential Revision: https://reviews.llvm.org/D110281
C++17 permits using 'typename' or 'class' for a template template
parameter, but the error message in the parser only refers to 'class'.
This patch, in C++17 or newer modes, adds "or 'template'" to the
diagnostic.
It is a common practice in glibc header to provide an inline redefinition of an
existing function. It is especially the case for fortified function.
Clang currently has an imperfect approach to the problem, using a combination of
trivially recursive function detection and noinline attribute.
Simplify the logic by suffixing these functions by `.inline` during codegen, so
that they are not recognized as builtin by llvm.
After that patch, clang passes all tests from https://github.com/serge-sans-paille/fortify-test-suite
Differential Revision: https://reviews.llvm.org/D109967
This patch adds the following built-ins:
__builtin_vsx_build_pair
__builtin_mma_build_acc
Reviewed By: #powerpc, nemanjai, lei
Differential Revision: https://reviews.llvm.org/D107647
These values allow, for example, `--target=aarch64` and
`--target=aarch64-linux-gnu` to detect `aarch64-linux-android`. This is
confusing. Users should specify `--target=aarch64-linux-android` to get Android GCC
installation.
Reverts D53463.
Reviewed By: nickdesaulniers, danalbert
Differential Revision: https://reviews.llvm.org/D110379
Thinlink provides an opportunity to propagate function attributes across modules, enabling additional propagation opportunities.
This change propagates (currently default off, turn on with `disable-thinlto-funcattrs=1`) noRecurse and noUnwind based off of function summaries of the prevailing functions in bottom-up call-graph order. Testing on clang self-build:
1. There's a 35-40% increase in noUnwind functions due to the additional propagation opportunities.
2. Throughput is measured at 10-15% increase in thinlink time which itself is 1.5% of E2E link time.
Implementation-wise this adds the following summary function attributes:
1. noUnwind: function is noUnwind
2. mayThrow: function contains a non-call instruction that `Instruction::mayThrow` returns true on (e.g. windows SEH instructions)
3. hasUnknownCall: function contains calls that don't make it into the summary call-graph thus should not be propagated from (e.g. indirect for now, could add no-opt functions as well)
Testing:
Clang self-build passes and 2nd stage build passes check-all
ninja check-all with newly added tests passing
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D36850
This patch adds a new preprocessor extension ``#pragma clang final``
which enables warning on undefinition and re-definition of macros.
The intent of this warning is to extend beyond ``-Wmacro-redefined`` to
warn against any and all alterations to macros that are marked `final`.
This warning is part of the ``-Wpedantic-macros`` diagnostics group.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D108567
HIP currently uses -mlink-builtin-bitcode to link all bitcode libraries, which
changes the linkage of functions to be internal once they are linked in. This
works for common bitcode libraries since these functions are not intended
to be exposed for external callers.
However, the functions in the sanitizer bitcode library is intended to be
called by instructions generated by the sanitizer pass. If their linkage is
changed to internal, their parameters may be altered by optimizations before
the sanitizer pass, which renders them unusable by the sanitizer pass.
To fix this issue, HIP toolchain links the sanitizer bitcode library with
-mlink-bitcode-file, which does not change the linkage.
A struct BitCodeLibraryInfo is introduced in ToolChain as a generic
approach to pass the bitcode library information between ToolChain and Tool.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D110304
This patch adds a new RTL function for worksharing. Currently we use
`__kmpc_for_static_init` for both the `distribute` and `parallel`
portion of the loop clause. This patch replaces the `distribute` portion
with a new runtime call `__kmpc_distribute_static_init`. Currently this
will be used exactly the same way, but will make it easier in the future
to fine-tune the distribute and parallel portion of the loop.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D110429
It appears that this test assumes that the toolchain utilizes the integrated
assembler by default, since the expected output in the CHECKs are
compilation_database.o.
However, this test fails on AIX as AIX does not utilize the integrated assembler.
On AIX, the output instead is of the form /tmp/compilation_database-*.s.
Thus, this patch explicitly adds the -fintegrated-as option to match the
assumption that the integrated assembler is used by default.
Differential Revision: https://reviews.llvm.org/D110431
We used to put the canonical spelling of flags after alias processing
on that line. For clang-cl in particular, that meant that we put flags
on that line that the clang-cl driver doesn't even accept, and the
"Driver args:" line wasn't usable.
Differential Revision: https://reviews.llvm.org/D110458
See PR51872 for the original repro.
This fixes a crash when converting a templated constructor into a deduction
guide, in case any of the template parameters were invalid.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D110460
This reverts commit 03142c5f67.
Breaks check-asan if system ld doesn't support --push-state, even
if lld was built and is used according to lit's output.
See comments on https://reviews.llvm.org/D110128
Based on feedback from Paul Robinson on 38c09ea that the 'mangled' mode
is only useful as an LLVM-developer-internal tool in combination with
llvm-dwarfdump --verify, so demote that to a frontend-only (not driver)
option. The driver support is simply -g{no-,}simple-template-names to
switch on simple template names, without the option to use the mangled
template name scheme there.
- This patch adds in the GOFF mangling support to the LLVM data layout string. A corresponding additional line has been added into the data layout section in the language reference documentation.
- Furthermore, this patch also sets the right data layout string for the z/OS target in the SystemZ backend.
Reviewed By: uweigand, Kai, abhina.sreeskantharajan, MaskRay
Differential Revision: https://reviews.llvm.org/D109362
cxx_dr_status.html
I noticed that these two DRs are currently working correctly, so I
added a pair of lit tests that check the AST (which is most useful for
CWG1779, since 'dependent' is really only observable in an ast dump) to
make sure __func__ works correctly in dependent cases, and in lambda
operator().
Also noticed that CWG1762, mostly an 'example' change, works correctly,
so added a test so that it gets marked 'done' as well.
Additionally, I regenerated cxx_dr_status.html, updating it for Clang
13's release, based on the cwg_status.html from August 12, 2021.
Differential Revision: https://reviews.llvm.org/D109956
This patch adds range checking for some Power10 altivec builtins. Range
checking is done in SemaChecking.
Reviewed By: #powerpc, lei, Conanap
Differential Revision: https://reviews.llvm.org/D109780
Add the tail policy argument to Clang builtins. There
are two policies for tail elements. Tail agnostic means users do not
care about the values in the tail elements and tail undisturbed means
the values in the tail elements need to be kept after the operation. In
order to let users control the tail policy, we add an additional
argument at the end of the argument list.
For unmasked operations, we have no maskedoff and the tail policy is
always tail agnostic. If users want to keep tail elements under unmasked
operations, they could use all one mask in the masked operations to do
it. So, we only add the additional argument for masked operations for
most cases. There are exceptions listed below.
In this patch, we do not handle the following cases to reduce the
complexity of the patch. There could be two separate patches for them.
Use dest argument to control tail policy
vmerge.vvm/vmerge.vxm/vmerge.vim (add _t builtins with additional dest
argument)
vfmerge.vfm (add _t builtins with additional dest argument)
vmv.v.v (add _t builtins with additional dest argument)
vmv.v.x (add _t builtins with additional dest argument)
vmv.v.i (add _t builtins with additional dest argument)
vfmv.v.f (add _t builtins with additional dest argument)
vadc.vvm/vadc.vxm/vadc.vim (add _t builtins with additional dest
argument)
vsbc.vvm/vsbc.vxm (add _t builtins with additional dest argument)
Always has tail argument for masked/unmasked intrinsics
Vector Single-Width Integer Multiply-Add Instructions (add _t and _mt
builtins)
Vector Widening Integer Multiply-Add Instructions (add _t and _mt
builtins)
Vector Single-Width Floating-Point Fused Multiply-Add Instructions (add
_t and _mt builtins)
Vector Widening Floating-Point Fused Multiply-Add Instructions (add _t
and _mt builtins)
Vector Reduction Operations (add _t and _mt builtins)
Vector Slideup Instructions (add _t and _mt builtins)
Vector Slidedown Instructions (add _t and _mt builtins)
Discussion: https://github.com/riscv/rvv-intrinsic-doc/pull/101
Differential Revision: https://reviews.llvm.org/D109322
When statically linking C++ standard library, we shouldn't add -Bdynamic
after including the library on the link line because that might override
user settings like -static and -static-pie. Rather, we should surround
the library with --push-state/--pop-state to make sure that -Bstatic
only applies to C++ standard library and nothing else. This has been
supported since GNU ld 2.25 (2014) so backwards compatibility should
no longer be a concern.
Differential Revision: https://reviews.llvm.org/D110128
The __darn family of builtins are only available on Pwr9,
and only __darn_32 is available on both 64 and 32 bit, while the rest
are only available on 64 bit. The patch adds sema checking
for these builtins and separate the __darn_32's 32 bit
test cases.
Differential revision: https://reviews.llvm.org/D110282
This excludes certain names that can't be rebuilt from the available
DWARF:
* Atomic types - no DWARF differentiating int from atomic int.
* Vector types - enough DWARF (an attribute on the array type) to do
this, but I haven't written the extra code to add the attributes
required for this
* Lambdas - ambiguous with any other unnamed class
* Unnamed classes/enums - would need column info for the type in
addition to file/line number
* noexcept function types - not encoded in DWARF
This matches GCC.
Change the CC1 option to encode the unwind table level (1: needed by exceptions,
2: asynchronous) so that we can support two modes in the future.
This patch adds range checking for some Power10 altivec builtins and
changes the signature of a builtin to match documentation. For `vec_cntm`,
range checking is done via SemaChecking. For `vec_splati_ins`, the second
argument is masked to extract the 0th bit so that we always receive either a `0`
or a `1`.
Reviewed By: lei, amyk
Differential Revision: https://reviews.llvm.org/D109710
Clang regression tests should not break when changes are made to
the LLVM optimizer. This file broke on the 1st attempt at D110170,
so I'm trying to prevent that on another try.
Similar to other files in this directory, we make a compromise and
run -mem2reg to reduce noise by about 1000 lines out of 5000+ CHECK lines.
When statically linking C++ standard library, we shouldn't add -Bdynamic
after including the library on the link line because that might override
user settings like -static and -static-pie. Rather, we should surround
the library with --push-state/--pop-state to make sure that -Bstatic
only applies to C++ standard library and nothing else. This has been
supported since GNU ld 2.25 (2014) so backwards compatibility should
no longer be a concern.
Differential Revision: https://reviews.llvm.org/D110128
While at it, add the diagnosis message "left operand of comma operator has no effect" (used by GCC) for comma operator.
This also makes Clang diagnose in the constant evaluation context which aligns with GCC/MSVC behavior. (https://godbolt.org/z/7zxb8Tx96)
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D103938
This is a follow-up of D110029, which uses bitset to indicate execution mode. This patches makes the changes in the function call.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D110279
Currently, linux kernel has a __user attribute ([1]) defined as
__attribute__((noderef, address_space(__user)))
which is used by sparse tool ([2]) to do some
type checking of pointers to user space memory.
During normal compilation, __user will be defined
to nothing so it won't have an impact on compilation.
The btf_tag attribute, which is motivated by
carrying linux kernel annotations into dwarf/BTF,
is introduced in [3]. We intended to define __user as
__attribute__((btf_tag("user")))
so such information will be encoded in dwarf/BTF
and can be used later by bpf verification or other
tracing tools.
But linux kernel __user attribute is also used during
type conversion which btf_tag doesn't support ([4]) since
such type conversion is only used for compiler analysis
and not encoded in dwarf/btf. Theoretically, it is
possible for clang to understand these tags and
do a sparse-like type checking work. But I would like
to leave that to future work and for now suggest simply
ignore these btf_tag attributes if they are used
as type attributes.
[1] https://github.com/torvalds/linux/blob/master/include/linux/compiler_types.h#L10
[2] https://sparse.docs.kernel.org/en/latest/
[3] https://reviews.llvm.org/D106614
[4] https://github.com/torvalds/linux/blob/master/fs/binfmt_flat.c#L135
Differential Revision: https://reviews.llvm.org/D110116
This is to build the foundation of a new debug info feature to use only
the base name of template as its debug info name (eg: "t1" instead of
the full "t1<int>"). The intent being that a consumer can still retrieve
all that information from the DW_TAG_template_*_parameters.
So gno-simple-template-names is business as usual/previously ("t1<int>")
=simple is the simplified name ("t1")
=mangled is a special mode to communicate the full information, but
also indicate that the name should be able to be simplified. The data
is encoded as "_STNt1|<int>" which will be matched with an
llvm-dwarfdump --verify feature to deconstruct this name, rebuild the
original name, and then try to rebuild the simple name via the DWARF
tags - then compare the latter and the former to ensure that all the
data necessary to fully rebuild the name is present.
Allow multiversioning declarations to match when the actual formal
linkage matches, not just when the storage class is identical.
Additionally, change the ambiguous 'linkage' mismatch to be more
specific and say 'language linkage'.
This applies to -Wunused-but-set-variable and
-Wunused-but-set-parameter.
This addresses bug 51865.
Differential Revision: https://reviews.llvm.org/D109862
This patch is for fixing potential shufflevector-related bugs like D93818.
As D93818, this patch change shufflevector's default placeholder to poison.
To reduce risk, it was divided into several patches, and this patch is for InstCombineVectorOps.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D110230
The execution mode of a kernel is stored in a global variable, whose value means:
- 0 - SPMD mode
- 1 - indicates generic mode
- 2 - SPMD mode execution with generic mode semantics
We are going to add support for SIMD execution mode. It will be come with another
execution mode, such as SIMD-generic mode. As a result, this value-based indicator
is not flexible.
This patch changes to bitset based solution to encode execution mode. Each
position is:
[0] - generic mode
[1] - SPMD mode
[2] - SIMD mode (will be added later)
In this way, `0x1` is generic mode, `0x2` is SPMD mode, and `0x3` is SPMD mode
execution with generic mode semantics. In the future after we add the support for
SIMD mode, `0b1xx` will be in SIMD mode.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D110029
This patch is for fixing potential shufflevector-related bugs like D93818.
As D93818, this patch change shufflevector's default placeholder to poison.
To reduce risk, it was divided into several patches, and this patch is for InstCombineCasts.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D110226
This reverts commit 52832cd917.
The motivating commit 2f6b07316f caused several bots to hit
an infinite loop at stage 2, so that needs to be reverted too
while figuring out how to fix that.
The matrix extension requires the indices for matrix subscript
expression to be valid and it is UB otherwise.
extract/insertelement produce poison if the index is invalid, which
limits the optimizer to not be bale to scalarize load/extract pairs for
example, which causes very suboptimal code to be generated when using
matrix subscript expressions with variable indices for large matrixes.
This patch updates IRGen to emit assumes to for index expression to
convey the information that the index must be valid.
This also adjusts the order in which operations are emitted slightly, so
indices & assumes are added before the load of the matrix value.
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D102478
Using the preferred name creates a mismatch between the textual name of
a type and the DWARF tags describing the parameters as well as possible
inconsistency between DWARF producers (like Clang and GCC, or
older/newer Clang versions, etc).
And always print it.
This makes some LLVM diagnostics match up better with Clang's diagnostics.
Updated some AMDGPU uses of DiagnosticInfoResourceLimit and now we print
better diagnostics for those.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D110204
Previously with -Rpass (and friends) we'd have remarks "enabled", but
without an actual regex.
As seen in the test change to line numbers, this can give us better
diagnostics by properly enabling NeedLocTracking with -Rpass.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D110201
This patch implements support for the type vector bool int128
for arguments on vector comparison builtins listed below,
which would otherwise crash due to ambiguity.
The following builtins are added:
vec_all_eq (vector bool __int128, vector bool __int128)
vec_all_ne (vector bool __int128, vector bool __int128)
vec_any_eq (vector bool __int128, vector bool __int128)
vec_any_ne (vector bool __int128, vector bool __int128)
vec_cmpne(vector bool __int128 a, vector bool __int128 b)
vec_cmpeq(vector bool __int128 a, vector bool __int128 b)
Differential revision: https://reviews.llvm.org/D110084
See PR51862.
The consumers of the Elidable flag in CXXConstructExpr assume that
an elidable construction just goes through a single copy/move construction,
so that the source object is immediately passed as an argument and is the same
type as the parameter itself.
With the implementation of P2266 and after some adjustments to the
implementation of P1825, we started (correctly, as per standard)
allowing more cases where the copy initialization goes through
user defined conversions.
With this patch we stop using this flag in NRVO contexts, to preserve code
that relies on that assumption.
This causes no known functional changes, we just stop firing some asserts
in a cople of included test cases.
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D109800
This improves diagnostic (& important to me, DWARF) accuracy - otherwise
there could be ambiguities between "std::nullptr_t" and some user-defined
type that's /actually/ "nullptr_t" defined in the global namespace.
Differential Revision: https://reviews.llvm.org/D110044
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.
Reviewed By: jdoerfert, jhuber6
Differential Revision: https://reviews.llvm.org/D102107
This patch changes the signature of the load and store vector pair
builtins to match their documentation. The type of the `signed long long`
argument is changed to `signed long`. This patch also changes existing testcases
to match the signature change.
Reviewed By: lei, Conanap
Differential Revision: https://reviews.llvm.org/D109996
RUN line representing C++ for OpenCL 2021 added to the test. This
should have been done as part of earlier commit fb321c2ea2 but
was missed during rebasing.
Differential Revision: https://reviews.llvm.org/D109492
When building in C mode, the VC runtime assumes that it can use pointer
aliasing through `char *` for the parameter to `__va_start`. Relax the
checks further. In theory we could keep the tests strict for non-system
header code, but this takes the less strict approach as the additional
check doesn't particularly end up being too much more helpful for
correctness. The C++ type system is a bit stricter and requires the
explicit cast which we continue to verify.
While at it, add the diagnosis message "left operand of comma operator has no effect" (used by GCC) for comma operator.
This also makes Clang diagnose in the constant evaluation context which aligns with GCC/MSVC behavior. (https://godbolt.org/z/7zxb8Tx96)
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D103938
D109607 results in a regression in llvm-test-suite.
The reason is we didn't check the size of SourceTy, so that we will
return wrong SSE type when SourceTy is overlapped.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D110037
In ValueTracking.cpp we use a function called
computeKnownBitsFromOperator to determine the known bits of a value.
For the vscale intrinsic if the function contains the vscale_range
attribute we can use the maximum and minimum values of vscale to
determine some known zero and one bits. This should help to improve
code quality by allowing certain optimisations to take place.
Tests added here:
Transforms/InstCombine/icmp-vscale.ll
Differential Revision: https://reviews.llvm.org/D109883
Previous changes like D101202 and D104261 have eliminated the special
status that break and continue once had, since now we're making
decisions purely based on the structure of the CFG without regard for
the underlying source code constructs.
This means we don't gain anything from defering handling for these
blocks. Dropping it moves some diagnostics, though arguably into a
better place. We're working around a "quirk" in the CFG that perhaps
wasn't visible before: while loops have an empty "transition block"
where continue statements and the regular loop exit meet, before
continuing to the loop entry. To get a source location for that, we
slightly extend our handling for empty blocks. The source location for
the transition ends up to be the loop entry then, but formally this
isn't a back edge. We pretend it is anyway. (This is safe: we can always
treat edges as back edges, it just means we allow less and don't modify
the lock set. The other way around it wouldn't be safe.)
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D106715
Adds support for a feature macro __opencl_c_3d_image_writes in
C++ for OpenCL 2021 enabling a respective optional core feature
from OpenCL 3.0.
This change aims to achieve compatibility between C++ for OpenCL
2021 and OpenCL 3.0.
Differential Revision: https://reviews.llvm.org/D109328
This patch supports OpenMP 5.0 metadirective features.
It is implemented keeping the OpenMP 5.1 features like dynamic user condition in mind.
A new function, getBestWhenMatchForContext, is defined in llvm/Frontend/OpenMP/OMPContext.h
Currently this function return the index of the when clause with the highest score from the ones applicable in the Context.
But this function is declared with an array which can be used in OpenMP 5.1 implementation to select all the valid when clauses which can be resolved in runtime. Currently this array is set to null by default and its implementation is left for future.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D91944
Previously in D104261 we warned about dropping locks from back edges,
this is the corresponding change for exclusive/shared joins. If we're
entering the loop with an exclusive change, which is then relaxed to a
shared lock before we loop back, we have already analyzed the loop body
with the stronger exclusive lock and thus might have false positives.
There is a minor non-observable change: we modify the exit lock set of a
function, but since that isn't used further it doesn't change anything.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D106713
The new device runtime uses an internal variable to set debugging. This
variable was originally privately linked because every module will have
a copy of it. This caused problems with merging the device bitcode
library because it would get renamed and there was not a way to refer to
an external, private symbol. This changes the symbol to weak_odr so it
can be defined multiply, but will not be renamed.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D109997
This fixes a bug in clang where, when clang sees a switch with a
fallthrough to a default like this:
static void funcA(void) {}
static void funcB(void) {}
int main(int argc, char **argv) {
switch (argc) {
case 0:
funcA();
break;
case 10:
default:
funcB();
break;
}
}
It does not add a proper debug location for that switch case, such as
case 10: above.
Patch by Shubham Rastogi!
Differential Revision: https://reviews.llvm.org/D109940
This patch supports OpenMP 5.0 metadirective features.
It is implemented keeping the OpenMP 5.1 features like dynamic user condition in mind.
A new function, getBestWhenMatchForContext, is defined in llvm/Frontend/OpenMP/OMPContext.h
Currently this function return the index of the when clause with the highest score from the ones applicable in the Context.
But this function is declared with an array which can be used in OpenMP 5.1 implementation to select all the valid when clauses which can be resolved in runtime. Currently this array is set to null by default and its implementation is left for future.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D91944
This patch supports OpenMP 5.0 metadirective features.
It is implemented keeping the OpenMP 5.1 features like dynamic user condition in mind.
A new function, getBestWhenMatchForContext, is defined in llvm/Frontend/OpenMP/OMPContext.h
Currently this function return the index of the when clause with the highest score from the ones applicable in the Context.
But this function is declared with an array which can be used in OpenMP 5.1 implementation to select all the valid when clauses which can be resolved in runtime. Currently this array is set to null by default and its implementation is left for future.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D91944
Use irbuilder as default and remove redundant Clang codegen for masked construct and master construct.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D100874
Windows on armv7 is as alignment tolerant as Linux.
The alignment considerations in the Windows on ARM ABI are documented
at https://docs.microsoft.com/en-us/cpp/build/overview-of-arm-abi-conventions?view=msvc-160#alignment.
The document doesn't explicitly say in which state the OS configures
the SCTLR.A register (and it's not accessible from user space to
inspect), but in practice, unaligned loads/stores do work and seem
to be as fast as aligned loads and stores. (Unaligned strd also does
seem to work, contrary to Linux, but significantly slower, as they're
handled by the kernel - exactly as the document describes.)
Differential Revision: https://reviews.llvm.org/D109960
Re-add -fexperimental-new-pass-manager to
Clang::CodeGen/pgo-sample-thinlto-summary.c for the test to work on
builds that still default to the old pass manager.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D109956
Seemingly, names in anonymous namespaces are ALWAYS given the unique
internal linkage name on windows, and I was not aware of this when I put
the names in my test! Replaced them with a wildcard.
Adds support for a feature macro `__opencl_c_read_write_images` in
C++ for OpenCL 2021 enabling a respective optional core feature
from OpenCL 3.0.
This change aims to achieve compatibility between C++ for OpenCL
2021 and OpenCL 3.0.
Differential Revision: https://reviews.llvm.org/D109307
We previously made all multiversioning resolvers/ifuncs have weak
ODR linkage in IR, since we NEED to emit the whole resolver every time
we see a call, but it is not necessarily the place where all the
definitions live.
HOWEVER, when doing so, we neglected the case where the versions have
internal linkage. This patch ensures we do this, so you don't get weird
behavior with static functions.
Adds support for a feature macro `__opencl_c_pipes` in C++ for
OpenCL 2021 enabling a respective optional core feature from
OpenCL 3.0.
This change aims to achieve compatibility between C++ for OpenCL
2021 and OpenCL 3.0.
Differential Revision: https://reviews.llvm.org/D109306
fae0dfa changed code to check 128-bit float availability, since it
introduced a new 128-bit double type on PowerPC. However, there're other
long float types besides IEEE float128 and PPC double-double requiring
this feature.
Reviewed By: ronlieb
Differential Revision: https://reviews.llvm.org/D109943
D105263 adds support for _Float16 type. It introduced a bug (pr51813) that generates a <4 x half> type instead the default double when passing blank structure by SSE registers.
Although I doubt it may expose a bug somewhere other than D105263, it's good to avoid return half type when no half type in arguments.
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D109607
AIX and z/OS lack Objective-C support, so mark these tests as unsupported for AIX and z/OS.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D109060
This patch supports construct trait set selector by using the existed
declare variant infrastructure inside `OMPContext` and simd selector is
currently not supported. The goal of this patch is to pass the declare variant
test inside sollve test suite.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D109635
Specify the C and C++ standards explicitly for this test. This avoids
failures for drivers that default to older standards.
Differential Revision: https://reviews.llvm.org/D109857
Summary:
Introduce a new frontend flag `-fswift-async-fp={auto|always|never}`
that controls how code generation sets the Swift extended async frame
info bit. There are three possibilities:
* `auto`: which determines how to set the bit based on deployment target, either
statically or dynamically via `swift_async_extendedFramePointerFlags`.
* `always`: default, always set the bit statically, regardless of deployment
target.
* `never`: never set the bit, regardless of deployment target.
Differential Revision: https://reviews.llvm.org/D109451
This patch increases the expected line number for one of the checks so that it doesn't have to be updated for any added/removed lines in the RUN section.
This change is in preparation for the following patch: https://reviews.llvm.org/D109060
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D109541
Remove the previous error and add support for special handling of small
complex types as in PPC64 ELF ABI. As in, generate code to load from
varargs location and pack it in a temp variable, then return a pointer to
the struct.
Reviewed By: sfertile
Differential Revision: https://reviews.llvm.org/D106393
Recently a vulnerability issue is found in the implementation of VLLDM
instruction in the Arm Cortex-M33, Cortex-M35P and Cortex-M55. If the
VLLDM instruction is abandoned due to an exception when it is partially
completed, it is possible for subsequent non-secure handler to access
and modify the partial restored register values. This vulnerability is
identified as CVE-2021-35465.
The mitigation sequence varies between v8-m and v8.1-m as follows:
v8-m.main
---------
mrs r5, control
tst r5, #8 /* CONTROL_S.SFPA */
it ne
.inst.w 0xeeb00a40 /* vmovne s0, s0 */
1:
vlldm sp /* Lazy restore of d0-d16 and FPSCR. */
v8.1-m.main
-----------
vscclrm {vpr} /* Clear VPR. */
vlldm sp /* Lazy restore of d0-d16 and FPSCR. */
More details on
developer.arm.com/support/arm-security-updates/vlldm-instruction-security-vulnerability
Differential Revision: https://reviews.llvm.org/D109157