At this point, `F.ImportLoc` has not been initialized by the `ASTReader` yet and using it leads to an assertion failure.
Introduced in 638c673a8c and 4445135109.
If the `assume-controlled-environment` option is `true`, we should expect `getenv()`
to succeed, and the result should not be considered tainted.
By default, the option will be `false`.
Reviewed By: NoQ, martong
Differential Revision: https://reviews.llvm.org/D111296
The `getenv()` function might return `NULL` just like any other function.
However, in case of `getenv()` a state-split seems justified since the
programmer should expect the failure of this function.
`secure_getenv(const char *name)` behaves the same way but is not handled
right now.
Note that `std::getenv()` is also not handled.
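A minimal sketch of the calling pattern this state split models: the caller checks the result of `getenv()` before using it. The helper name `envOrDefault` is hypothetical and exists only for illustration.

```cpp
#include <cassert>
#include <stdlib.h>
#include <string>

// Hypothetical helper: returns the value of the environment variable `name`,
// or `fallback` when getenv() reports that the variable is not set.
std::string envOrDefault(const char *name, const std::string &fallback) {
  assert(name != nullptr && "variable name must not be null");
  const char *value = getenv(name);
  if (value == nullptr) // the failure branch of the state split
    return fallback;
  return std::string(value);
}
```

Code that dereferences the result without the `nullptr` check is what the failure branch is meant to expose.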
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D111245
This reverts commit 97f0c63783.
As discussed in https://reviews.llvm.org/D110684, it increased the
compile time and the binary size of clang by more than 1%. I am reverting
this patch for now to think about a better way to do it.
GCC 9.1 removed Intel MPX support. Linux kernel removed MPX in 2019.
glibc 2.35 will remove MPX.
Our support is limited: we support assembling of bndmov but not bnd.
Just remove it.
Reviewed By: pengfei, skan
Differential Revision: https://reviews.llvm.org/D111517
CUDA-11 headers rely on these NVCC builtins.
Despite having the `__nv` prefix, those are *not* provided by libdevice.
Differential Revision: https://reviews.llvm.org/D111665
Clarify the message provided when the analyzer catches the use of memory
that is allocated with size zero.
Differential Revision: https://reviews.llvm.org/D111655
CFGBuilder::addStmt() implicitly passes AddStmtChoice::AlwaysAdd
to Visit() already, so this should have no behavior change.
Differential Revision: https://reviews.llvm.org/D111570
Added support of a "--nvlink-path" option in clang-nvlink-wrapper which
takes the path of nvlink binary.
Static Device Library support for OpenMP (D105191) now searches for
the nvlink binary and passes its location via this option. In the absence
of this option, the nvlink binary is searched for in the locations in PATH.
Differential Revision: https://reviews.llvm.org/D111488
This is the second part of p0388, dealing with overloads of list
initialization to incomplete array types. It extends the handling
added in D103088 to permit incomplete arrays. We have to record that
the conversion involved an incomplete array, and so we (re-)add a bit flag
into the standard conversion sequence object. Comparing such
conversion sequences requires knowing (a) the number of array elements
initialized and (b) whether the initialization is of an incomplete array.
This also updates the web page to indicate p0388 is implemented (there
is no feature macro).
Differential Revision: https://reviews.llvm.org/D103908
This implements the new implicit conversion sequence to an incomplete
(unbounded) array type. It is mostly Richard Smith's work, updated to
trunk, with testcases added and a few bugs found in such testing fixed.
It is not a complete implementation of p0388.
Differential Revision: https://reviews.llvm.org/D102645
`[[clang::fallthrough]]` has meaning for the CFG, but all other
StmtAttrs we currently have don't. So omit them, as AttributedStatements
with children cause several issues and there's no benefit in including
them.
Fixes PR52103 and PR49454. See PR52103 for details.
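For illustration only, here is a small switch in which a fallthrough attribute marks an intentional fall-through between cases. The sketch uses the standard C++17 spelling `[[fallthrough]]` rather than `[[clang::fallthrough]]`, and the function name `classify` is hypothetical.

```cpp
// Returns 11 for 0 (case 0 falls through into case 1), 10 for 1, and -1
// for every other value.
constexpr int classify(int n) {
  int score = 0;
  switch (n) {
  case 0:
    score += 1;
    [[fallthrough]]; // attribute on a null statement: fall-through is intended
  case 1:
    score += 10;
    break;
  default:
    score = -1;
    break;
  }
  return score;
}
```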
Differential Revision: https://reviews.llvm.org/D111568
To reduce the number of explicit builds of a single module, we can try to squash multiple occurrences of the module with different command-lines (and context hashes) by removing benign command-line options. The greatest contributors to benign differences between command-lines are the header search paths.
In this patch, the lookup cache in `HeaderSearch` is used to identify paths that were actually used when implicitly building the module during scanning. This information is serialized into the unhashed control block of the implicitly-built PCM. The dependency scanner then loads this and may use it to prune the header search paths before computing the context hash of the module and generating the command-line.
We could also prune the header search paths when serializing `HeaderSearchOptions` into the PCM. That way, we could do it only once instead of every load of the PCM file by dependency scanner. However, that would result in a PCM file whose contents don't produce the same context hash as the original build, which is probably highly surprising.
There is an alternative approach to storing extra information into the PCM: wire up preprocessor callbacks to capture the used header search paths on-the-fly during preprocessing of modularized headers (similar to what we currently do for the main source file and textual headers). Right now, that's not compatible with the fact that we do an actual implicit build producing PCM files during dependency scanning. The second run of dependency scanner loads the PCM from the first run, skipping the preprocessing altogether, which would result in different results between runs. We can revisit this approach when we stop building implicitly during dependency scanning.
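The pruning step described above (dropping search paths that were never used, before the context hash is computed) can be sketched as follows. This is a minimal illustration, not the dependency scanner's code: `pruneSearchPaths` and the `used` bit vector are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: keeps only the search paths whose "used" bit is set,
// preserving their original order.
std::vector<std::string> pruneSearchPaths(const std::vector<std::string> &paths,
                                          const std::vector<bool> &used) {
  assert(paths.size() == used.size() && "one usage bit per search path");
  std::vector<std::string> kept;
  for (std::size_t i = 0; i < paths.size(); ++i)
    if (used[i])
      kept.push_back(paths[i]);
  return kept;
}
```

Two command lines that differ only in an unused path produce the same pruned list, which is what allows the two module variants to be squashed into one.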
Depends on D102923.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D102488
For dependency scanning, it would be useful to collect header search paths (provided on command-line via `-I` and friends) that were actually used during preprocessing. This patch adds that feature to `HeaderSearch` along with a new remark that reports such paths as they get used.
Previous version of this patch tried to use the existing `LookupFileCache` to report used paths via `HitIdx`. That doesn't work for `ComputeUserEntryUsage` (which is intended to be called *after* preprocessing), because it indexes used search paths by the file name. This means the values get overwritten when the code contains `#include_next`.
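A minimal sketch of why a cache indexed by file name loses information, assuming a simplified model in which each lookup hit is a (file name, search path index) pair. The types and function names are hypothetical and do not mirror `HeaderSearch`.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// One successful lookup: the included file name and the index of the search
// path that satisfied it.
using Hit = std::pair<std::string, std::size_t>;

// Keyed by file name: a later hit for the same name overwrites the earlier one.
std::vector<bool> usedViaFileNameCache(const std::vector<Hit> &hits,
                                       std::size_t numPaths) {
  std::map<std::string, std::size_t> cache;
  for (const Hit &hit : hits) {
    assert(hit.second < numPaths && "search path index out of range");
    cache[hit.first] = hit.second;
  }
  std::vector<bool> used(numPaths, false);
  for (const auto &entry : cache)
    used[entry.second] = true;
  return used;
}

// Tracked per search path: every path that produced a hit stays recorded.
std::vector<bool> usedViaPerPathTracking(const std::vector<Hit> &hits,
                                         std::size_t numPaths) {
  std::vector<bool> used(numPaths, false);
  for (const Hit &hit : hits) {
    assert(hit.second < numPaths && "search path index out of range");
    used[hit.second] = true;
  }
  return used;
}
```

With a hit for `stdlib.h` from path 0 followed by a second hit for the same name from path 2 (as `#include_next` would produce), the file-name-keyed version reports only path 2, while per-path tracking reports both.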
Note that `HeaderSearch` doesn't use `HeaderSearchOptions::UserEntries` directly. Instead, `InitHeaderSearch` pre-processes them (adds platform-specific paths, removes duplicates, removes paths that don't exist) and creates `DirectoryLookup` instances. This means we need a mechanism for translating between those two. It's not possible to go from `DirectoryLookup` back to the original `HeaderSearch`, so `InitHeaderSearch` now tracks the relationships explicitly.
Depends on D111557.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D102923
Add atomic_half types and builtins operating on the types from the
cl_ext_float_atomics extension.
Patch by Haonan Yang.
Differential Revision: https://reviews.llvm.org/D109740
This fixes an LLDB build failure where the `ImportLoc` argument is missing: https://lab.llvm.org/buildbot#builders/68/builds/19975
This change also makes it possible to drop `SourceLocation()` in `Preprocessor::getCurrentModule`.
This patch propagates the import `SourceLocation` into `HeaderSearch::lookupModule`. This enables remarks on search path usage (implemented in D102923) to point to the source code that initiated header search.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D111557
Rename vfredsum and vfwredsum to vfredusum and vfwredusum. Add aliases for vfredsum and vfwredsum.
Reviewed By: luismarques, HsiangKai, khchen, frasercrmck, kito-cheng, craig.topper
Differential Revision: https://reviews.llvm.org/D105690
Currently, btf_tag is applied to declarations only.
Per discussion in https://reviews.llvm.org/D111199,
we plan to introduce a btf_type_tag attribute for types.
So rename btf_tag to btf_decl_tag to make it easily
distinguishable from btf_type_tag.
Differential Revision: https://reviews.llvm.org/D111588
Sequel patch to https://reviews.llvm.org/D111293.
Remove call to CodeGenFunction::InitTempAlloca() from OpenMP related
codegen part.
Also remove the metadata `!llvm.access.group` from the updated lit
tests.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D111316
In the original design, we leverage _mt intrinsics to define macros for
_m intrinsics. For example:
```
__builtin_rvv_vadd_vv_i8m1_mt((vbool8_t)(op0), (vint8m1_t)(op1), (vint8m1_t)(op2), (vint8m1_t)(op3), (size_t)(op4), (size_t)VE_TAIL_AGNOSTIC)
```
However, we can no longer define a generic interface for mask intrinsics,
because clang_builtin_alias only accepts clang builtins as its
argument.
In the example,
```
__rvv_overloaded
__attribute__((clang_builtin_alias(__builtin_rvv_vadd_vv_i8m1_mt)))
vint8m1_t vadd(vbool8_t op0, vint8m1_t op1, vint8m1_t op2, vint8m1_t
op3, size_t op4, size_t op5);
```
op5 is the tail policy argument. When users want to use vadd generic
interface for masked vector add, they need to specify tail policy in the
previous design. In this patch, we define _m intrinsics as clang
builtins to solve the problem.
Differential Revision: https://reviews.llvm.org/D110684
"darwin" is ambiguous. When there isn't a better source
of truth (e.g., SDKs), the driver will either interpret it
as "iOS" when cross-compiling to a different architecture,
or "the host" when not. That's now the case on AS Macs.
Update the test to more explicitly test the OS.
aarch64-mac-cpus.c already tests the mac-specific driver logic.
This relands commit 1131b1eb35, which
adds support for the __attribute__((availability)) annotation on the Fuchsia
platform. This patch also adds '-ffuchsia-api-level' to allow specifying the
Fuchsia API level from the command line.
Differential Revision: https://reviews.llvm.org/D108592
usage of an abstract class type within itself.
We were missing handling for deduction guides (which would assert),
friend declarations, and variable templates. We were mishandling inline
variables and other variables defined inside the class definition.
These diagnostics should be downgraded to warnings, or perhaps removed
entirely, once we implement P0929R2.
This reverts commit b875343873.
Per discussion in https://reviews.llvm.org/D111199, instead of making the
existing btf_tag attribute a type-or-decl attribute, we will
make the existing btf_tag attribute a decl-only attribute and
introduce btf_type_tag as a type-only attribute. This makes
it easy to handle cases like typedef, where an attribute may be applied
as either a type attribute or a decl attribute.
This patch adds support for the __attribute__((availability)) annotation on
the Fuchsia platform. This patch also adds '-ffuchsia-api-level' to allow
specifying the Fuchsia API level from the command line.
Differential Revision: https://reviews.llvm.org/D108592
When AnnotateAttr is on a function, AddGlobalAnnotations is only called
in CodeGenModule::EmitGlobalFunctionDefinition, which means AnnotateAttr
on a function declaration without a function body is ignored.
This patch moves AddGlobalAnnotations to
CodeGenModule::SetFunctionAttributes, so the AnnotateAttr gets code
generated for a function with or without a body.
This helps the case where AnnotateAttr is on an external function and the
AnnotateAttr is consumed at the IR level.
For example, consider a pass that collects the number of uses of functions
with __attribute((annotate("count_use"))) after optimizations.
As long as __attribute((annotate("count_use"))) is present, a function
should be counted whether or not it has a body.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D111109
Patch by: python3kgae (Xiang Li)
The IR intrinsics use ImmArg for the policy operand so this needs to be enforced as a constant in the frontend.
Differential Revision: https://reviews.llvm.org/D110779
armv9-a, armv9.1-a and armv9.2-a can be targeted using the -march option
both in ARM and AArch64.
- Armv9-A maps to Armv8.5-A.
- Armv9.1-A maps to Armv8.6-A.
- Armv9.2-A maps to Armv8.7-A.
- The SVE2 extension is enabled by default on these architectures.
- The cryptographic extensions are disabled by default on these
architectures.
The Armv9-A architecture is described in the Arm® Architecture Reference
Manual Supplement Armv9, for Armv9-A architecture profile
(https://developer.arm.com/documentation/ddi0608/latest).
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D109517
As for 128-bit floating points on PowerPC, compiler should have three
machine modes:
- IFmode, always IBM extended double
- KFmode, always IEEE 754R 128-bit floating point
- TFmode, matches the semantics for long double
This commit adds support for IF mode with its complex variant, IC mode.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D109950
The SSE4 header (smmintrin.h) should include SSSE3 (tmmintrin.h) instead
of SSE2 (emmintrin.h).
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D111482
Without this, the combination of `-ast-dump=json` and `-ast-dump-filter FILTER` produces invalid JSON: the first line is a string that says `Dumping $SOME_DECL_NAME: `.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D108441
NOTE: some files are being removed from those files that are clang-formatted
which means some lack of formatting is slipping through the net on reviews
C++20 and later allow you to pass no argument for the ... parameter in
a variadic macro, whereas earlier language modes and C disallow it.
We no longer diagnose in C++20 and later modes. This fixes PR51609.
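A minimal example of the construct in question, assuming it is compiled as C++20: a variadic macro invoked with no argument for its `...` parameter. The macro and function names are hypothetical.

```cpp
// Hypothetical macro: expands to its first argument and ignores the rest.
#define FIRST_ARG(first, ...) first

// Arguments are supplied for the `...` parameter.
constexpr int withExtraArgs() { return FIRST_ARG(7, 8, 9); }

// No argument is passed for the `...` parameter here.
constexpr int withoutExtraArgs() { return FIRST_ARG(7); }
```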
Some of the first-supported-version fields were incorrectly attributed to a later branch.
It wasn't possible to correctly determine the "introduced version" with my naive implementation
using git blame alone (especially if the type had been changed from a bool to an enum).
I saw more things attributed to clang-format 13 than I remembered, and reviewed
those options to determine their introduced version.
Reviewed By: HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D110803
CodeGenFunction::InitTempAlloca() inits the static alloca within the
entry block which may *not* necessarily be correct always.
For example, the current instruction insertion point (pointed by the
instruction builder) could be a program point which is hit multiple
times during the program execution, and it is expected that the static
alloca is initialized every time the program point is hit.
Hence remove CodeGenFunction::InitTempAlloca(), and initialize the
static alloca where the instruction insertion point is at the moment.
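The difference can be illustrated with a plain C++ analogy (this is not codegen code, and both function names are hypothetical): initializing a temporary once at function entry versus initializing it at the program point where it is used, when that point sits inside a loop.

```cpp
// The temporary is initialized once, as if its initialization had been
// placed in the function's entry block.
constexpr int sumWithEntryInit(int iterations) {
  int temp = 0;
  int total = 0;
  for (int i = 0; i < iterations; ++i) {
    temp += 1; // keeps the value left over from the previous iteration
    total += temp;
  }
  return total;
}

// The temporary is initialized at the program point itself, so it is reset
// every time that point is reached.
constexpr int sumWithPointInit(int iterations) {
  int total = 0;
  for (int i = 0; i < iterations; ++i) {
    int temp = 0;
    temp += 1;
    total += temp;
  }
  return total;
}
```

The two versions agree only when the program point is reached once, which is why an entry-block initialization is not always correct.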
This patch, as a starting attempt, removes the calls to
CodeGenFunction::InitTempAlloca() which do not have any side effect on
the lit tests.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D111293
In this case, we know statically that we're destroying the most-derived
class, so the vptr must already point to the current class and never
needs to be updated.
fae0dfa implemented the new __ibm128 type, this patch enables its
complex form.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D109948
Currently, there're multiple float types that can be represented by
__attribute__((mode(xx))). It's parsed, and then a corresponding type is
created if available.
This refactor moves the enum for mode into a global enum class visible
to ASTContext.
Reviewed By: aaron.ballman, erichkeane
Differential Revision: https://reviews.llvm.org/D111391
This patch adds support for the
`__kmpc_get_hardware_num_threads_in_block` function that returns the
number of threads. This was missing in the new runtime and was used by
the AMDGPU plugin which prevented it from using the new runtime. This
patch also unifies the interface for getting the thread numbers in the
frontend.
Originally authored by jdoerfert.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D111475
There are functions where we do not want function instrumentation, which is why we have `__attribute__((no_instrument_function))`. This patch extends that functionality to disable instrumentation for Objective-C methods as well. Objective-C methods like `+load` run before main, and instrumenting them causes runtime errors depending on the implementation of `__cyg_profile_func_enter` and related functions.
Reviewed By: rjmccall, aaron.ballman
Differential Revision: https://reviews.llvm.org/D111286
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.
This allows us to ensure that Support doesn't have includes from MC/*.
Differential Revision: https://reviews.llvm.org/D111454
Unfortunately I've not found a way to exercise this code that doesn't
crash elsewhere yet, due to unrelated bugs in how Sema incorrectly
instantiates lambdas in function template signatures.
Distinct lambda expressions are always considered non-equivalent, so two
token-for-token identical function declarations whose signatures involve
lambda-expressions declare distinct functions.
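The underlying rule can be seen outside of function signatures as well. In this small sketch, two token-for-token identical lambda-expressions still have different closure types. The namespace and variable names are hypothetical.

```cpp
#include <type_traits>

namespace lambda_example {
// Two lambda-expressions with identical tokens.
auto first = [] { return 1; };
auto second = [] { return 1; };

// Each lambda-expression has its own unique closure type.
static_assert(!std::is_same<decltype(first), decltype(second)>::value,
              "identical lambda-expressions still have distinct types");
} // namespace lambda_example
```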
This patch updates the vec_extract builtins to take a signed int as the second
parameter, as defined by the Power Vector Intrinsics Programming Reference.
This patch is NFC and all existing tests pass.
Differential Revision: https://reviews.llvm.org/D110935
__builtin_assume_aligned's second parameter is size_t, which may be 32 bits.
We can't pass 2^32 when that happens. Update tests accordingly.
Example broken bot due to D111250:
https://lab.llvm.org/buildbot/#/builders/171/builds/4531
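The constraint can be checked at compile time. The sketch below uses a hypothetical helper, `fitsInSizeT`, to test whether a value such as 2^32 is representable in `size_t` on the current target.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper: true when `value` can be passed as a size_t argument
// without truncation on the current target.
constexpr bool fitsInSizeT(std::uint64_t value) {
  return value <= std::numeric_limits<std::size_t>::max();
}
```

On a target with a 32-bit `size_t`, `fitsInSizeT(std::uint64_t(1) << 32)` is false, which is the case the updated tests have to avoid.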
It may be possible to avoid relying on accessing many individual class pages,
by instead scanning the class index page at
https://clang.llvm.org/doxygen/classes.html. This updates the script to do so,
and includes updates to `LibASTMatchersReference.html` generated by the
modified script.
Reviewed By: aaron.ballman, sammccall
Differential Revision: https://reviews.llvm.org/D111332
Previously if you passed an absolute path to clang, where only part of
the path to the file was remapped, it would result in the file's DIFile
being stored with a duplicate path, for example:
```
!DIFile(filename: "./ios/Sources/bar.c", directory: "./ios/Sources")
```
This change handles absolute paths, specifically in the case they are
remapped to something relative, and uses the dirname for the directory,
and basename for the filename.
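A minimal sketch of the dirname/basename split, assuming POSIX-style `/` separators. `SplitPath` and `splitAtLastSlash` are hypothetical names, and this is not the code used in Clang.

```cpp
#include <cassert>
#include <string>

struct SplitPath {
  std::string directory;
  std::string filename;
};

// Hypothetical helper: splits at the last '/' so that the directory is not
// repeated inside the file name.
SplitPath splitAtLastSlash(const std::string &path) {
  assert(!path.empty() && "expected a non-empty path");
  const std::string::size_type slash = path.rfind('/');
  if (slash == std::string::npos)
    return SplitPath{"", path}; // no directory component at all
  if (slash == 0)
    return SplitPath{"/", path.substr(1)}; // file directly under the root
  return SplitPath{path.substr(0, slash), path.substr(slash + 1)};
}
```

For `./ios/Sources/bar.c` this yields the directory `./ios/Sources` and the filename `bar.c`, instead of repeating the directory in both fields.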
This also adds a test verifying this behavior for more standard uses as
well.
Differential Revision: https://reviews.llvm.org/D111352
If we only delete lines that are outer block statements (if, while, etc),
clang-format-diff.py can't format the statements inside the block statements.
An example to repro:
1. Delete the if statement at line 118 in llvm/lib/CodeGen/Analysis.cpp.
2. Run `git diff -U0 --no-color HEAD^ | clang/tools/clang-format/clang-format-diff.py -i -p1`
It fails to format the statement after if.
Differential Revision: https://reviews.llvm.org/D111273
The following tests are failing due to missing DWARF sections. This patch sets these tests as XFAIL/DISABLED on AIX until a more permanent solution is implemented.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D111336
non-Darwin ObjC runtimes:
- Use the same logic the Darwin runtime does for inferring that a
receiver is non-null and therefore doesn't require null checks.
Previously we weren't skipping these for non-super dispatch.
- Emit a null check when there's a consumed parameter so that we can
destroy the argument if the call doesn't happen. This mostly
involves extracting some common logic from the Darwin-runtime code.
- Generate a zero aggregate by zeroing the same memory that was used
in the method call instead of zeroing separate memory and then
merging them with a phi. This uses less memory and avoids unnecessary
copies.
- Emit zero initialization, and generate zero values in phis, using
the proper zero-value routines instead of assuming that the zero
value of the result type has a bitwise-zero representation.
An archive containing device code object files can be passed to
clang command line for linking. For each given offload target
it creates a device-specific archive, which is either passed to llvm-link
if the target is amdgpu, or to clang-nvlink-wrapper if the target is
nvptx. -L/-l flags are used to specify these fat archives on the command
line. E.g.
clang++ -fopenmp -fopenmp-targets=nvptx64 main.cpp -L. -lmylib
It currently doesn't support linking an archive directly, like:
clang++ -fopenmp -fopenmp-targets=nvptx64 main.cpp libmylib.a
Linking with x86 offload also does not work.
Reviewed By: ye-luo
Differential Revision: https://reviews.llvm.org/D105191
As discussed in D109948, pre-computing all complex float types is not
necessary and brings extra overhead. This patch removes these defined
types and constructs them in-place when needed.
Reviewed By: teemperor
Differential Revision: https://reviews.llvm.org/D111387
Original commit message: "
Original commit message: "
Original commit message:"
The current infrastructure in lib/Interpreter has a tool, clang-repl, very
similar to clang-interpreter which also allows incremental compilation.
This patch moves clang-interpreter as a test case and drops it as conditionally
built example as we already have clang-repl in place.
Differential revision: https://reviews.llvm.org/D107049
"
This patch also ignores ppc due to missing weak symbol for __gxx_personality_v0
which may be a feature request for the jit infrastructure. Also, adds a missing
build system dependency to the orc jit.
"
Additionally, this patch defines a custom exception type and thus avoids the
requirement to include header <exception>, making it easier to deploy across
systems without standard location of the c++ headers.
"
This patch also works around PR49692 and finds a way to use llvm::consumeError
in rtti mode.
Differential revision: https://reviews.llvm.org/D107049
At this point it looks like a B extension will never exist. Instead
Zba, Zbb, Zbc, and Zbs are individual extensions being ratified
together as a package. Unknown at this time when or if the other
Zb* extensions will be ratified.
This patch removes references to the B extension. I've updated and
split tests accordingly.
This has been split from D110669 to make review a little easier.
Differential Revision: https://reviews.llvm.org/D111338
This patch adds two flags to be supported for the new runtime. The flags
are `-fopenmp-assume-threads-oversubscription` and
`-fopenmp-assume-teams-oversubscription`. These add global values that
can be checked by the work sharing runtime functions to make better
judgements about how to distribute work between the threads.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D111348
When there are ObjCInterfaceDecls with the same name in 2 different modules,
we hit the assertion
> Assertion failed: (Index < RL->getFieldCount() && "Ivar is not inside record layout!"),
> function lookupFieldBitOffset, file llvm-project/clang/lib/AST/RecordLayoutBuilder.cpp, line 3434.
on accessing an ivar inside a method. The assertion happens because
ivar belongs to one module while its containing interface belongs to
another module and then we fail to find the ivar inside the containing
interface. We already keep a single ObjCInterfaceDecl definition in the
redeclaration chain, and in this case the containing interface was correct.
The issue is with ObjCIvarDecl. IVar decl for IRGen is taken from
ObjCIvarRefExpr that is created in `Sema::BuildIvarRefExpr` using ivar
decl returned from `Sema::LookupIvarInObjCMethod`. And ivar lookup
returns a wrong decl because basically we take the first ObjCIvarDecl
found in `ASTReader::FindExternalVisibleDeclsByName` (called by
`DeclContext::lookup`). And in `ASTReader.Lookups` lookup table for a
wrong module comes first because `ASTReader::finishPendingActions`
processes `PendingUpdateRecords` in reverse order and the first
encountered ObjCIvarDecl will end up the last in `ASTReader.Lookups`.
Fix by merging ObjCIvarDecl from different modules correctly and by
using a canonical one in IRGen.
rdar://82854574
Differential Revision: https://reviews.llvm.org/D110280
Some subprojects like compiler-rt define the `darwin` feature in their
lit config, but clang does not do that, so we need to use the global
`system-darwin` here instead.
Differential Revision: https://reviews.llvm.org/D111267
mingw-g++ does not correctly support the full `std::errc` namespace as
worded in the standard[1]. As such, we cannot reliably use all names
therein. This patch changes the use of
`std::errc::state_not_recoverable`, to use portable error codes from the
`llvm::errc` equivalent.
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71444
Reviewed by v.g.vassilev
Differential Revision: https://reviews.llvm.org/D111315
Describe in cxx_status.html the missing parts of the partially
implemented proposals described in cxx_status.html.
Uses <details> blocks so the information appears collapsed
by default.
This patch updates the vec_popcnt builtins to return vector unsigned,
as defined by the Power Vector Intrinsics Programming Reference.
This patch is NFC and all existing tests pass.
Differential Revision: https://reviews.llvm.org/D110934
The code of `ASTImporter::Import(const Attr *)` was repetitive,
it is now simplified. (There is still room for improvement but
probably only after big changes.)
Reviewed By: martong, steakhal
Differential Revision: https://reviews.llvm.org/D110810
To better reflect the meaning of the now-disambiguated {GlobalValue,
GlobalAlias}::getBaseObject after breaking off GlobalIFunc::getResolverFunction
(D109792), the function is renamed to getAliaseeObject.
This reverts c7f16ab3e3 / r109694 - which
suggested this was done to improve consistency with the gdb test suite.
Possible that at the time GCC did not canonicalize integer types, and so
matching types was important for cross-compiler validity, or that it was
only a case of over-constrained test cases that printed out/tested the
exact names of integer types.
In any case neither issue seems to exist today based on my limited
testing - both gdb and lldb canonicalize integer types (in a way that
happens to match Clang's preferred naming, incidentally) and so never
print the original text name produced in the DWARF by GCC or Clang.
This canonicalization appears to be in `integer_types_same_name_p` for
GDB and in `TypeSystemClang::GetBasicTypeEnumeration` for lldb.
(I tested this with one translation unit defining 3 variables - `long`,
`long (*)()`, and `int (*)()`, and another translation unit that had
main, and a function that took `long (*)()` as a parameter - then
compiled them with mismatched compilers (either GCC+Clang, or
Clang+(Clang with this patch applied)) and no matter the combination,
despite the debug info for one CU naming the type "long int" and the
other naming it "long", both debuggers printed out the name as "long"
and were able to correctly perform overload resolution and pass the
`long int (*)()` variable to the `long (*)()` function parameter)
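At the language level, `long` and `long int` are two spellings of the same type, which is why the overload resolution described above works. A small sketch with hypothetical function names:

```cpp
#include <type_traits>

static_assert(std::is_same<long, long int>::value,
              "long and long int name the same type");

constexpr long answer() { return 42; }

// Takes a parameter spelled `long (*)()`.
constexpr long callThrough(long (*fn)()) { return fn(); }

// Passes a variable spelled `long int (*)()` to that parameter.
constexpr long viaLongInt() {
  long int (*fn)() = answer;
  return callThrough(fn);
}
```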
Did find one hiccup, identified by the lldb test suite - that CodeView
was relying on these names to map them to builtin types in that format.
So added some handling for that in LLVM. (these could be split out into
separate patches, but seems small enough to not warrant it - will do
that if any reverting/revisiting ends up being needed)
Differential Revision: https://reviews.llvm.org/D110455
The patch implements header-only support for texture lookups.
The patch has been tested on a source file with all possible combinations of
argument types supported by CUDA headers, compiled and verified that the
generated instructions and their parameters match the code generated by NVCC.
Unfortunately, compiling texture code requires CUDA headers and can't be tested
in clang itself. The test will need to be added to the test-suite later.
While generated code compiles and seems to match NVCC, I do not have any code
that uses textures that I could test correctness of the implementation. Hence
the experimental status.
Differential Revision: https://reviews.llvm.org/D110089
declaration.
Names starting with an underscore are reserved at the global scope, so
cannot be used as the name of an extern "C" symbol in any scope because
such usages conflict with a name at global scope.
Also do not warn on `#define _foo` or `#undef _foo`.
Only global scope names starting with _[a-z] are reserved, not the use
of such an identifier in any other context.
adjustMemberOfForLambdaCaptures.
The problem happens when the user passes a lambda with a reference
type in the map clause.
The nature of the problem is that, when processing generateInfoForCapture,
the BasePointer is generated with a new load for a lambda variable with
reference type. That is not expected in adjustMemberOfForLambdaCaptures.
One way to fix this is to skip the call to generateInfoForCapture for
map(to:lambda). The map info will be generated later in the call to
generateDefaultMapInfo, similar to the firstprivate clause.
This fixes https://bugs.llvm.org/show_bug.cgi?id=52071
Differential Revision: https://reviews.llvm.org/D111115
This is to save memory for Clang compiles.
Measuring building PassBuilder.cpp under /usr/bin/time, max rss goes from 0.93GB to 0.7GB.
This does not turn it on by default yet.
I've turned on the option locally and run it over a good amount of files without any issues.
For more background, see
https://lists.llvm.org/pipermail/cfe-dev/2021-September/068930.html.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D111105
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.
This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.
The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.
Updating clang's max allowed alignment will come in a future patch.
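The bit counts above can be checked with a small sketch. It assumes, as a simplification for illustration, that an alignment is stored as its base-2 logarithm. The helper names are hypothetical and this is not LLVM's encoding code.

```cpp
#include <cstdint>

// Assumption for illustration: a power-of-two alignment is stored as its
// base-2 logarithm.
constexpr unsigned log2OfAlignment(std::uint64_t alignment) {
  unsigned exponent = 0;
  while ((std::uint64_t(1) << exponent) < alignment)
    ++exponent;
  return exponent;
}

// True when `value` is representable in an unsigned field of `bits` bits.
constexpr bool fitsInBits(unsigned value, unsigned bits) {
  return value < (1u << bits);
}

constexpr std::uint64_t OneGB = std::uint64_t(1) << 30;
constexpr std::uint64_t FourGB = std::uint64_t(1) << 32;
```

Under that assumption, the exponent for 1GB (30) fits in 5 bits, the exponent for 4GB (32) needs a sixth bit, and the exponents 0 through 32 account for 33 of the 64 values a 6-bit field can hold.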
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D110451
Clang would reject
#pragma omp for
#pragma omp tile sizes(P)
for (int i = 0; i < 128; ++i) {}
where P is a template parameter, but the loop itself is not
template-dependent. Because P is context-dependent, the TransformedStmt
cannot be generated and is therefore nullptr (until the template is
instantiated by TreeTransform). The OMPForDirective would still expect
a loop in the dependent context and trigger an error.
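For reference, a self-contained form of the snippet above might look like the following. The function name `sumTiled` and the accumulation in the loop body are made up for illustration; only the pragma pair and the loop bounds come from the original. Without OpenMP enabled the pragmas are ignored and the function still computes the same sum.

```cpp
// Hypothetical reproducer: the tile size P is a template parameter, while
// the loop bounds and body do not depend on any template parameter.
template <int P>
int sumTiled() {
  int total = 0;
#pragma omp for
#pragma omp tile sizes(P)
  for (int i = 0; i < 128; ++i)
    total += i;
  return total;
}

// Instantiating the template is the point at which TreeTransform can
// generate the transformed loop.
int sumTiledBy4() { return sumTiled<4>(); }
```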
Fix by introducing a NumGeneratedLoops field to OMPLoopTransformation.
This is used to distinguish the case where no TransformedStmt will be
generated at all (e.g. #pragma omp unroll full) from the case where
template instantiation is needed. In the latter case, delay resolving
the iteration space until template instantiation, as is already done
when the for-loop itself is template-dependent.
A more radical solution would always delay the iteration space analysis
until template instantiation, but would also break many test cases.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D111124
Currently we're limited to 32 bit ints in diagnostics.
With support for 4GB alignments coming soon, we need to report 4GB as the max alignment allowed.
I've tested that this does indeed properly print 2^32.
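To illustrate why a 32-bit diagnostic argument cannot report this limit, here is a small sketch. The helper names are hypothetical and do not correspond to the diagnostics engine's API; the sketch only shows that 2^32 is lost when narrowed to 32 bits.

```cpp
#include <cstdint>
#include <string>

// Hypothetical formatter that narrows its argument to 32 bits first.
std::string formatArg32(uint64_t value) {
  return std::to_string(static_cast<uint32_t>(value));
}

// Hypothetical formatter that keeps the full 64-bit value.
std::string formatArg64(uint64_t value) {
  return std::to_string(value);
}
```

With `uint64_t{1} << 32` as input, the first helper yields "0" and the second yields "4294967296".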
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D111184
There is an error in the logic for reaching the `Unknown` tristate in CmpOpTable.
```
void cmp_op_table_unknownX2(int x, int y, int z) {
  if (x >= y) {
    // x >= y [1, 1]
    if (x + z < y)
      return;
    // x + z < y [0, 0]
    if (z != 0)
      return;
    // x < y [0, 0]
    clang_analyzer_eval(x > y); // expected-warning{{TRUE}} expected-warning{{FALSE}}
  }
}
```
We miss the `FALSE` warning because the false branch is infeasible.
We have to exploit simplification to discover the bug. If we had `x < y`
as the second condition then the analyzer would return the parent state
on the false path and the new constraint would not be part of the State.
But adding `z` to the condition makes both paths feasible.
The root cause of the bug is that we reach the `Unknown` tristate
twice, but on both occasions we reach it for the same `Op`, which is
`>=` in the test case. So, we reached `>=` twice, but we never reached
`!=`, thus querying the `UnknownX2` column with
`getCmpOpStateForUnknownX2` is wrong.
The solution is to ensure that `Unknown` was reached once for each of two **different** `Op`s.
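The condition can be sketched in isolation as below. This is not the checker's code: the enum and the function name are hypothetical, and the input is assumed to be the list of operators for which the table lookup produced `Unknown`.

```cpp
#include <vector>

// Hypothetical stand-in for the comparison operators tracked by the table.
enum class CmpOp { LT, GT, LE, GE, EQ, NE };

// Returns true only if Unknown was reached for at least two different
// operators; seeing the same operator twice does not qualify.
bool reachedUnknownForTwoDistinctOps(const std::vector<CmpOp> &opsYieldingUnknown) {
  bool haveFirst = false;
  CmpOp first = CmpOp::LT;
  for (CmpOp op : opsYieldingUnknown) {
    if (!haveFirst) {
      first = op;
      haveFirst = true;
    } else if (op != first) {
      return true;
    }
  }
  return false;
}
```

In the test case above the sequence is `>=`, `>=`, so the function returns false and the `UnknownX2` column should not be consulted.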
Differential Revision: https://reviews.llvm.org/D110910
Insert OMPLoopTransformationDirective between OMPLoopBasedDirective and the loop transformations OMPTileDirective and OMPUnrollDirective. This simplifies the handling of loop transformations because code no longer needs to distinguish between OMPTileDirective and OMPUnrollDirective.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D111119
It's true that docs.microsoft.com says:
"""The _ReadBarrier, _WriteBarrier, and _ReadWriteBarrier compiler
intrinsics and the MemoryBarrier macro are all deprecated and should not
be used. For inter-thread communication, use mechanisms such as
atomic_thread_fence and std::atomic<T>, which are defined in the C++
Standard Library. For hardware access, use the /volatile:iso compiler
option together with the volatile keyword."""
And these attributes have been here since these builtins were added in
r192860.
However:
- cl.exe does not warn on them even with /Wall
- none of the replacements are useful for C code
- we don't add __attribute__((__deprecated__())) to any other
declarations in intrin.h
- intrin0.h in the MSVC headers declares _ReadWriteBarrier() (but
without the deprecation attribute), so you get inconsistent
deprecation warnings depending on if you include intrin.h or intrin0.h
The motivation is that compiling sqlite.h with clang-cl produces a
deprecation warning for _ReadWriteBarrier(), but compiling it with
cl.exe does not.
Differential Revision: https://reviews.llvm.org/D111232
The default wchar type is different on AIX vs. Linux. When this test is run on
AIX, WCHAR_T_TYPE ends up being set to int. This is incorrect as the default
wchar type on AIX is actually unsigned short, and setting the type incorrectly
causes the expected errors to not be found.
This patch sets the type correctly (to unsigned short) for AIX.
Differential Revision: https://reviews.llvm.org/D110428
As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared.
This simple change addresses a special case of structure/pointer
aliasing that produced different symbolvals, leading to false positives
during analysis.
The reproducer is as simple as this.
```lang=C++
struct s {
  int v;
};
void foo(struct s *ps) {
  struct s ss = *ps;
  clang_analyzer_dump(ss.v); // reg_$1<int Element{SymRegion{reg_$0<struct s *ps>},0 S64b,struct s}.v>
  clang_analyzer_dump(ps->v); // reg_$3<int SymRegion{reg_$0<struct s *ps>}.v>
  clang_analyzer_eval(ss.v == ps->v); // UNKNOWN
}
```
Acks: Many thanks to @steakhal and @martong for the group debug session.
Reviewed By: steakhal, martong
Differential Revision: https://reviews.llvm.org/D110625
The builtin for vec_orc has support for the following two signatures,
but currently the compiler marks it ambiguous:
vector float vec_orc(vector float, vector float)
vector double vec_orc(vector double, vector double)
This patch implements these two builtins.
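As a reminder of what the operation computes, `vec_orc(a, b)` is a bitwise OR of `a` with the complement of `b`. The portable sketch below models a single float lane; the function names are hypothetical, and real code would call the AltiVec builtin on vector types instead.

```cpp
#include <cstdint>
#include <cstring>

// OR-with-complement on raw 32-bit lanes: a | ~b.
uint32_t orcBits(uint32_t a, uint32_t b) { return a | ~b; }

// The same operation on one float lane, applied to its bit pattern.
float orcFloat(float a, float b) {
  uint32_t ua, ub;
  std::memcpy(&ua, &a, sizeof ua);
  std::memcpy(&ub, &b, sizeof ub);
  uint32_t ur = orcBits(ua, ub);
  float r;
  std::memcpy(&r, &ur, sizeof r);
  return r;
}
```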
Differential revision: https://reviews.llvm.org/D110858
This also removes the need to disable the mandatory inlining phase in
tests.
In a departure from the previous remark, we don't output a 'cost' in
this case, because there's no such thing. We just report that inlining
happened because of the attribute.
Differential Revision: https://reviews.llvm.org/D110891