Fixes the issue D118987 by mapping the propagation to the callsite's
LocationContext.
This way we can keep track of the in-flight propagations.
Note that empty propagation sets won't be inserted.
Reviewed By: NoQ, Szelethus
Differential Revision: https://reviews.llvm.org/D119128
Recently we uncovered a serious bug in the `GenericTaintChecker`.
It was already flawed before D116025, but that was the patch that turned
this silent bug into a crash.
It happens if the `GenericTaintChecker` has a rule for a function, which
also has a definition.
char *fgets(char *s, int n, FILE *fp) {
nested_call(); // no parameters!
return (char *)0;
}
// Within some function:
fgets(..., tainted_fd);
When the engine inlines the definition and finds a function call within
that, the `PostCall` event for the call will get triggered sooner than the
`PostCall` for the original function.
This mismatch violates the assumption of the `GenericTaintChecker` which
wants to propagate taint information from the `PreCall` event to the
`PostCall` event, where it can actually bind taint to the return value
**of the same call**.
Let's get back to the example and go through step-by-step.
The `GenericTaintChecker` will see the `PreCall<fgets(..., tainted_fd)>`
event, so it would 'remember' that it needs to taint the return value
and the buffer, from the `PostCall` handler, where it has access to the
return value symbol.
However, the engine will inline fgets and the `nested_call()` gets
evaluated subsequently, which produces an unimportant
`PreCall<nested_call()>`, then a `PostCall<nested_call()>` event, which is
observed by the `GenericTaintChecker`, which will unconditionally mark
tainted the 'remembered' arg indexes, trying to access a non-existing
argument, resulting in a crash.
If it doesn't crash, it will behave completely unintuitively, by marking
completely unrelated memory regions tainted, which is even worse.
The resulting assertion is something like this:
Expr.h: const Expr *CallExpr::getArg(unsigned int) const: Assertion
`Arg < getNumArgs() && "Arg access out of range!"' failed.
The gist of the backtrace:
CallExpr::getArg(unsigned int) const
SimpleFunctionCall::getArgExpr(unsigned int)
CallEvent::getArgSVal(unsigned int) const
GenericTaintChecker::checkPostCall(const CallEvent &, CheckerContext&) const
Prior to D116025, there was a check for the argument count before it
applied taint, however, it still suffered from the same underlying
issue/bug regarding propagation.
This path does not intend to fix the bug, rather start a discussion on
how to fix this.
---
Let me elaborate on how I see this problem.
This pre-call, post-call juggling is just a workaround.
The engine should by itself propagate taint where necessary right where
it invalidates regions.
For the tracked values, which potentially escape, we need to erase the
information we know about them; and this is exactly what is done by
invalidation.
However, in the case of taint, we basically want to approximate from the
opposite side of the spectrum.
We want to preserve taint in most cases, rather than cleansing them.
Now, we basically sanitize all escaping tainted regions implicitly,
since invalidation binds a fresh conjured symbol for the given region,
and that has not been associated with taint.
IMO this is a bad default behavior, we should be more aggressive about
preserving taint if not further spreading taint to the reachable
regions.
We have a couple of options for dealing with it (let's call it //tainting
policy//):
1) Taint only the parameters which were tainted prior to the call.
2) Taint the return value of the call, since it likely depends on the
tainted input - if any arguments were tainted.
3) Taint all escaped regions - (maybe transitively using the cluster
algorithm) - if any arguments were tainted.
4) Not taint anything - this is what we do right now :D
The `ExprEngine` should not deal with taint on its own. It should be done
by a checker, such as the `GenericTaintChecker`.
However, the `Pre`-`PostCall` checker callbacks are not designed for this.
`RegionChanges` would be a much better fit for modeling taint propagation.
What we would need in the `RegionChanges` callback is the `State` prior
invalidation, the `State` after the invalidation, and a `CheckerContext` in
which the checker can create transitions, where it would place `NoteTags`
for the modeled taint propagations and report errors if a taint sink
rule gets violated.
In this callback, we could query from the prior State, if the given
value was tainted; then act and taint if necessary according to the
checker's tainting policy.
By using RegionChanges for this, we would 'fix' the mentioned
propagation bug 'by-design'.
Reviewed By: Szelethus
Differential Revision: https://reviews.llvm.org/D118987
There is different bug types for different types of bugs but the **emitAdditionOverflowbug** seems to use bugtype **BT_NotCSting** but actually it have to use **BT_AdditionOverflow** .
Reviewed By: steakhal
Differential Revision: https://reviews.llvm.org/D119462
This adds support for http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2764.pdf,
which was adopted at the Feb 2022 WG14 meeting. That paper adds
[[noreturn]] and [[_Noreturn]] to the list of supported attributes in
C2x. These attributes have the same semantics as the [[noreturn]]
attribute in C++.
The [[_Noreturn]] attribute was added as a deprecated feature so that
translation units which include <stdnoreturn.h> do not get an error on
use of [[noreturn]] because the macro expands to _Noreturn. Users can
use -Wno-deprecated-attributes to silence the diagnostic.
Use of <stdnotreturn.h> or the noreturn macro were both deprecated.
Users can define the _CLANG_DISABLE_CRT_DEPRECATION_WARNINGS macro to
suppress the deprecation diagnostics coming from the header file.
We have the `clang -cc1` command-line option `-funwind-tables=1|2` and
the codegen option `VALUE_CODEGENOPT(UnwindTables, 2, 0) ///< Unwind
tables (1) or asynchronous unwind tables (2)`. However, this is
encoded in LLVM IR by the presence or the absence of the `uwtable`
attribute, i.e. we lose the information whether to generate want just
some unwind tables or asynchronous unwind tables.
Asynchronous unwind tables take more space in the runtime image, I'd
estimate something like 80-90% more, as the difference is adding
roughly the same number of CFI directives as for prologues, only a bit
simpler (e.g. `.cfi_offset reg, off` vs. `.cfi_restore reg`). Or even
more, if you consider tail duplication of epilogue blocks.
Asynchronous unwind tables could also restrict code generation to
having only a finite number of frame pointer adjustments (an example
of *not* having a finite number of `SP` adjustments is on AArch64 when
untagging the stack (MTE) in some cases the compiler can modify `SP`
in a loop).
Having the CFI precise up to an instruction generally also means one
cannot bundle together CFI instructions once the prologue is done,
they need to be interspersed with ordinary instructions, which means
extra `DW_CFA_advance_loc` commands, further increasing the unwind
tables size.
That is to say, async unwind tables impose a non-negligible overhead,
yet for the most common use cases (like C++ exceptions), they are not
even needed.
This patch extends the `uwtable` attribute with an optional
value:
- `uwtable` (default to `async`)
- `uwtable(sync)`, synchronous unwind tables
- `uwtable(async)`, asynchronous (instruction precise) unwind tables
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D114543
When forming the function type from a declarator, we look for an
overloadable attribute before issuing a diagnostic in C about a
function signature containing only .... When the attribute is present,
we allow such a declaration for compatibility with the overloading
rules in C++. However, we were not looking for the attribute in all of
the places it is legal to write it on a declarator and so we only
accepted the signature in some forms and incorrectly rejected the
signature in others.
We now check for the attribute preceding the declarator instead of only
being applied to the declarator directly.
At import of a member it may require that the record is already set to complete.
(For example 'computeDependence' at create of some Expr nodes.)
The record at this time may not be completely imported, the result of layout
calculations can be incorrect, but at least no crash occurs this way.
A good solution would be if fields of every encountered record are imported
before other members of all records. This is much more difficult to implement.
Differential Revision: https://reviews.llvm.org/D116155
MSVC currently doesn't support 80 bits long double. But ICC does support
it on Windows. Besides, there're also some users asked for this feature.
We can find the discussions from stackoverflow, msdn etc.
Given Clang has already support `-mlong-double-80`, extending it to
support for Windows seems worthwhile.
Reviewed By: rnk, erichkeane
Differential Revision: https://reviews.llvm.org/D115441
Previously D113336 makes RISCVTargetInfo::initFeatureMap return the results
processed by RISCVISAInfo, which only consists of ISA features and misses
non-ISA features like `relax` and `save-restore`.
This patch fixes the problem.
Reviewed By: junparser
Differential Revision: https://reviews.llvm.org/D119541
Fixes https://github.com/llvm/llvm-project/issues/53758.
Braces in loops and in `if` statements with leading (block) comments were formatted according to `BraceWrapping.AfterFunction` and not `AllowShortBlocksOnASingleLine`/`AllowShortLoopsOnASingleLine`/`AllowShortIfStatementsOnASingleLine`.
Previously, the code:
```
while (true) {
f();
}
/*comment*/ while (true) {
f();
}
```
was incorrectly formatted to:
```
while (true) {
f();
}
/*comment*/ while (true) { f(); }
```
when using config:
```
BasedOnStyle: LLVM
BreakBeforeBraces: Custom
BraceWrapping:
AfterFunction: false
AllowShortBlocksOnASingleLine: false
AllowShortLoopsOnASingleLine: false
```
and it was (correctly but by chance) formatted to:
```
while (true) {
f();
}
/*comment*/ while (true) {
f();
}
```
when using enabling brace wrapping after functions:
```
BasedOnStyle: LLVM
BreakBeforeBraces: Custom
BraceWrapping:
AfterFunction: true
AllowShortBlocksOnASingleLine: false
AllowShortLoopsOnASingleLine: false
```
Reviewed By: MyDeveloperDay, HazardyKnusperkeks, owenpan
Differential Revision: https://reviews.llvm.org/D119649
Now that the old device runtime has been deleted there is only a single
target that differs by the triple and the architecture. Simplify the
scheme for identifying the library but directly using the triple.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D119638
Recently we observed high memory pressure caused by clang during some parallel builds.
We discovered that we have several projects that have a large number of #define directives
in their TUs (on the order of millions), which caused huge memory consumption in clang due
to a lot of allocations for MacroInfo. We would like to reduce the memory overhead of
clang for a single #define to reduce the memory overhead for these files, to allow us to
reduce the memory pressure on the system during highly parallel builds. This change achieves
that by removing the SmallVector in MacroInfo and instead storing the tokens in an array
allocated using the bump pointer allocator, after all tokens are lexed.
The added unit test with 1000000 #define directives illustrates the problem. Prior to this
change, on arm64 macOS, clang's PP bump pointer allocator allocated 272007616 bytes, and
used roughly 272 bytes per #define. After this change, clang's PP bump pointer allocator
allocates 120002016 bytes, and uses only roughly 120 bytes per #define.
For an example test file that we have internally with 7.8 million #define directives, this
change produces the following improvement on arm64 macOS: Persistent allocation footprint for
this test case file as it's being compiled to LLVM IR went down 22% from 5.28 GB to 4.07 GB
and the total allocations went down 14% from 8.26 GB to 7.05 GB. Furthermore, this change
reduced the total number of allocations made by the system for this clang invocation from
1454853 to 133663, an order of magnitude improvement.
Differential Revision: https://reviews.llvm.org/D117348
Makes lld-link work in a non-MSVC shell by autodetecting MSVC toolchain. Also
adds support for /winsysroot and a few other switches.
All this is done by refactoring to share code with clang-cl's existing support
for the same.
Differential Revision: https://reviews.llvm.org/D118070
The module flag to indicate use of hostcall is insufficient to catch
all cases where hostcall might be in use by a kernel. This is now
replaced by a function attribute that gets propagated to top-level
kernel functions via their respective call-graph.
If the attribute "amdgpu-no-hostcall-ptr" is absent on a kernel, the
default behaviour is to emit kernel metadata indicating that the
kernel uses the hostcall buffer pointer passed as an implicit
argument.
The attribute may be placed explicitly by the user, or inferred by the
AMDGPU attributor by examining the call-graph. The attribute is
inferred only if the function is not being sanitized, and the
implictarg_ptr does not result in a load of any byte in the hostcall
pointer argument.
Reviewed By: jdoerfert, arsenm, kpyzhov
Differential Revision: https://reviews.llvm.org/D119216
OpenCL C 3.0 __opencl_c_subgroups feature is slightly different
then other equivalent features and extensions (fp64 and 3d image writes):
OpenCL C 3.0 device can support the extension but not the feature.
cl_khr_subgroups requires subgroup independent forward progress.
This patch adjusts the check which is used when translating language
builtins to check either the extension or feature is supported.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D118999
OpenCL C 3.0 introduces optionality to some builtins, in particularly
to those which are conditionally supported with pipe, device enqueue
and generic address space features.
The idea is to conditionally support such builtins depending on the language options
being set for a certain feature. This allows users to define functions with names
of those optional builtins in OpenCL (as such names are not reserved).
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D118605
This will be necessary later when we add support for evaluating logic
expressions such as && and ||.
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.
Reviewed-by: xazax.hun
Differential Revision: https://reviews.llvm.org/D119447
The pointer is always dereferenced, so assert the cast is correct (which it should be as we just created that ScalableVectorType) instead of returning nullptr
Add the atomic overloads for the `global` and `local` address spaces,
which are new in OpenCL 3.0. Ensure the preexisting `generic`
overloads are guarded by the generic address space feature macro.
Ensure a subset of the atomic builtins are guarded by the
`__opencl_c_atomic_order_seq_cst` and `__opencl_c_atomic_scope_device`
feature macros, and enable those macros for SPIR/SPIR-V targets in
`opencl-c-base.h`.
Also guard the `cl_ext_float_atomics` builtins with the atomic order
and scope feature macros.
Differential Revision: https://reviews.llvm.org/D119420
`CallDescriptions` for builtin functions relaxes the match rules
somewhat, so that the `CallDescription` will match for calls that have
some prefix or suffix. This was achieved by doing a `StringRef::contains()`.
However, this is somewhat problematic for builtins that are substrings
of each other.
Consider the following:
`CallDescription{ builtin, "memcpy"}` will match for
`__builtin_wmemcpy()` calls, which is unfortunate.
This patch addresses/works around the issue by checking if the
characters around the function's name are not part of the 'name'
semantically. In other words, to accept a match for `"memcpy"` the call
should not have alphanumeric (`[a-zA-Z]`) characters around the 'match'.
So, `CallDescription{ builtin, "memcpy"}` will not match on:
- `__builtin_wmemcpy: there is a `w` alphanumeric character before the match.
- `__builtin_memcpyFOoBar_inline`: there is a `F` character after the match.
- `__builtin_memcpyX_inline`: there is an `X` character after the match.
But it will still match for:
- `memcpy`: exact match
- `__builtin_memcpy`: there is an _ before the match
- `__builtin_memcpy_inline`: there is an _ after the match
- `memcpy_inline_builtinFooBar`: there is an _ after the match
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D118388
In Clang we can attach TBAA metadata based on the load/store intrinsics
based on the operation's element type.
This also contains changes to InstCombine where the AArch64-specific
intrinsics are transformed into generic LLVM load/store operations,
to ensure that all metadata is transferred to the new instruction.
There will be some further work after this patch to also emit TBAA
metadata for SVE's gather/scatter- and struct load/store intrinsics.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D119319
An impilt used of C++ module without prebuild path may cause crash.
For example:
```
// ./dir1/C.cppm
export module C;
// ./dir2/B.cppm
export module B;
import C;
// ./A.cpp
import B;
import C;
```
When we compile A.cpp without the prebuild path of C.pcm, the compiler
will crash.
```
clang++ -std=c++20 --precompile -c ./dir1/C.cppm -o ./dir1/C.pcm
clang++ -std=c++20 --precompile -fprebuilt-module-path=./dir2 -c
./dir2/B.cppm -o ./dir2/B.pcm
clang++ -std=c++20 -fprebuilt-module-path=./dir2 A.cpp
```
The prebuilt path of module C is cached when import module B, and in the
function HeaderSearch::getCachedModuleFileName, the compiler try to get
the filename by modulemap without check if modulemap exists, and there
is no modulemap in C++ module.
Reviewed By: ChuanqiXu
Differential review: https://reviews.llvm.org/D119426
The superclass method handles a bunch of useful things. For example
it applies flags such as `-fnew-alignment` which doesn't work without
this patch.
Differential Revision: https://reviews.llvm.org/D118573
Due to the way type units work, this would lead to a declaration in a
type unit of a local type in a CU - which is ambiguous. Rather than
trying to resolve that relative to the CU that references the type unit,
let's just not try to simplify these names.
Longer term this should be fixed by not putting the template
instantiation in a type unit to begin with - since it references an
internal linkage type, it can't legitimately be duplicated/in more than
one translation unit, so skip the type unit overhead. (but the right fix
for that is to move type unit management into a DICompositeType flag
(dropping the "identifier" field is not a perfect solution since it
breaks LLVM IR linking decl/def merging during IR linking))
Lambda names aren't entirely canonical (as demonstrated by the
cross-project-test added here) at the moment (we should fix that for a
bunch of reasons) - even if the template referencing them is
non-simplified, other names referencing /that/ template can't be
simplified either because type units might cause a different template to
be picked up that would conflict with the expected name.
(other than for roundtripping precision, it'd be OK to simplify types
that reference types that reference lambdas - but best be consistent
between the roundtrip/verify mode and the actual simplified template
names mode)
The introduction and some examples are on this page:
https://devblogs.microsoft.com/cppblog/announcing-jmc-stepping-in-visual-studio/
The `/JMC` flag enables these instrumentations:
- Insert at the beginning of every function immediately after the prologue with
a call to `void __fastcall __CheckForDebuggerJustMyCode(unsigned char *JMC_flag)`.
The argument for `__CheckForDebuggerJustMyCode` is the address of a boolean
global variable (the global variable is initialized to 1) with the name
convention `__<hash>_<filename>`. All such global variables are placed in
the `.msvcjmc` section.
- The `<hash>` part of `__<hash>_<filename>` has a one-to-one mapping
with a directory path. MSVC uses some unknown hashing function. Here I
used DJB.
- Add a dummy/empty COMDAT function `__JustMyCode_Default`.
- Add `/alternatename:__CheckForDebuggerJustMyCode=__JustMyCode_Default` link
option via ".drectve" section. This is to prevent failure in
case `__CheckForDebuggerJustMyCode` is not provided during linking.
Implementation:
All the instrumentations are implemented in an IR codegen pass. The pass is placed immediately before CodeGenPrepare pass. This is to not interfere with mid-end optimizations and make the instrumentation target-independent (I'm still working on an ELF port in a separate patch).
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D118428
In preparing for module mangling changes I noticed some issues with
the way we check for std::basic_string instantiations and friends.
*) there's a single routine for std::basic_{i,o,io}stream but it is
templatized on the length of the name. Really? just use a
StringRef, rather than clone the entire routine just for
'basic_iostream'.
*) We have a helper routine to check for char type, and call it from
several places. But given all the instantiations are of the form
TPL<char, Other<char> ...> we could just check the first arg is char
and the later templated args are instantiating that same type. A
simpler type comparison.
*) Because basic_string has a third allocator parameter, it is open
coded, which I found a little confusing. But otherwise it's exactly
the same pattern as the iostream ones. Just tell that checker about
whether there's an expected allocator argument.[*]
*) We may as well return in each block of mangleStandardSubstitution
once we determine it is not one of the entities of interest -- it
certainly cannot be one of the other kinds of entities.
FWIW this shaves about 500 bytes off the executable.
[*] I suppose we could also have this routine a tri-value, with one to
indicat 'it is this name, but it's not the one you're looking for', to
avoid later calls trying different names?
Reviewd By: ChuanqiXu
Differential Revision: https://reviews.llvm.org/D119333
Even after D86621 <https://reviews.llvm.org/D86621>, `clang -m32` on
Solaris/sparcv9 doesn't inline atomics with 8-byte operands, unlike `gcc`.
This leads to many link failures in the testsuite (undefined references to
`__atomic_load_8` and `__sync_val_compare_and_swap_8`. Until a proper
codegen fix can be implemented, this patch works around the first of those
by linking with `-latomic`.
Tested on `sparcv9-sun-solaris2.11`.
Differential Revision: https://reviews.llvm.org/D118021
Reduce the amount of repetition in the declarations by leveraging more
TableGen constructs. This is in preparation for adding the OpenCL 3.0
atomics feature optionality.
This introduces the new __ARM_FEATURE_MOPS ACLE feature test macro,
which signals the availability of the new Armv8.8-A/Armv9.3-A
instructions for standardising memcpy, memset and memmove operations.
This patch supersedes the one from https://reviews.llvm.org/D116160.
Differential Revision: https://reviews.llvm.org/D118199
Neither LLDB nor GDB seem to work with DWARF 5 debug information on
Windows right now.
This applies the same change as in
9c62728610 (Default to DWARFv4 on Windows)
to the MinGW driver too.
Differential Revision: https://reviews.llvm.org/D119326
ASTReader
This is a cleanup to reduce the lines of code to handle default template
argument in ASTReader.
Reviewed By: urnathan
Differential Revision: https://reviews.llvm.org/D118437
Ensure CLANG_PLUGIN_SUPPORT is compatible with llvm_add_library.
Fixes an issue noted in D111100.
Differential Revision: https://reviews.llvm.org/D119199
Fixes https://github.com/llvm/llvm-project/issues/53576.
There was an inconsistency in formatting of delete expressions.
Before:
```
delete (void*)a;
delete[](void*) a;
```
After this patch:
```
delete (void*)a;
delete[] (void*)a;
```
Reviewed By: HazardyKnusperkeks, owenpan
Differential Revision: https://reviews.llvm.org/D119117
An error in the tablegen description affects the declarations
provided by `-fdeclare-opencl-builtins` for `atomic_fetch_add` and
`atomic_fetch_sub`.
The atomic argument should be an atomic_half, not an atomic_float.
LRGraph is the key component of the clang pseudo parser, it is a
deterministic handle-finding finite-state machine, which is used to
generated the LR parsing table.
Separate from https://reviews.llvm.org/D118196.
Differential Revision: https://reviews.llvm.org/D119172
This will allow moving the IncludeCleaner library essentials to Clang
and decoupling them from the majority of clangd.
The patch itself just moves the code, it doesn't change existing
functionality.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D119130
clang-cl MSVC required version is 19.20 now. Update the default
-fms-compatibility-version to 19.14.
Differential Revision: https://reviews.llvm.org/D114639
code object version determines ABI, therefore should not be mixed.
This patch emits amdgpu_code_object_version module flag in LLVM IR
based on code object version (default 4).
The amdgpu_code_object_version value is code object version times 100.
LLVM IR with different amdgpu_code_object_version module flag cannot
be linked.
The -cc1 option -mcode-object-version=none is for ROCm device library use
only, which supports multiple ABI.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D119026
1. Remove computeDefaultABIFromArch and add computeDefaultABI in
RISCVISAInfo.
2. Add parseFeatureBits which may used in D118333.
Differential Revision: https://reviews.llvm.org/D119250
The "-fzero-call-used-regs" option tells the compiler to zero out
certain registers before the function returns. It's also available as a
function attribute: zero_call_used_regs.
The two upper categories are:
- "used": Zero out used registers.
- "all": Zero out all registers, whether used or not.
The individual options are:
- "skip": Don't zero out any registers. This is the default.
- "used": Zero out all used registers.
- "used-arg": Zero out used registers that are used for arguments.
- "used-gpr": Zero out used registers that are GPRs.
- "used-gpr-arg": Zero out used GPRs that are used as arguments.
- "all": Zero out all registers.
- "all-arg": Zero out all registers used for arguments.
- "all-gpr": Zero out all GPRs.
- "all-gpr-arg": Zero out all GPRs used for arguments.
This is used to help mitigate Return-Oriented Programming exploits.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D110869
This unifies a couple spots that did it manually by checking the
flag directly.
It does mean that we're now dropping the 5th component, but that's
not used in any of these checks, and to my knowledge it's never been
used in ld64.
When not going through the main Clang->LLVM type cache, we'd
accidentally create multiple different opaque types for a member pointer
type.
This allows us to remove the -verify-type-cache flag now that
check-clang passes with it on. We can do the verification in expensive
builds. Previously microsoft-abi-member-pointers.cpp was failing with
-verify-type-cache.
I suspect that there may be more issues when we have multiple member
pointer types and we clear the cache, but we can leave that for later.
Followup to D118744.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D119215
This introduces the new __ARM_FEATURE_MOPS ACLE feature test macro,
which signals the availability of the new Armv8.8-A/Armv9.3-A
instructions for standardising memcpy, memset and memmove operations.
This patch supersedes the one from https://reviews.llvm.org/D116160.
Differential Revision: https://reviews.llvm.org/D118199
The above change assumed that malloc (and friends) would always
allocate memory to getNewAlign(), even for allocations which have a
smaller size. This is not actually required by spec (a 1-byte
allocation may validly have 1-byte alignment).
Some real-world malloc implementations do not provide this guarantee,
and thus this optimization is breaking programs.
Fixes#53540
This reverts commit c2297544c0.
Differential Revision: https://reviews.llvm.org/D118804
The original warning added in D115501 when pacbti is used with an
incompatible architecture was not exactly correct because it was
not really ignored and can affect codegen.
Therefore reword to say that the pacbti option is incompatible with
the given architecture.
Reviewed By: chill
Differential Revision: https://reviews.llvm.org/D119166
These changes make the Clang parser recognize expression parameter pack
expansion and initializer lists in attribute arguments. Because
expression parameter pack expansion requires additional handling while
creating and instantiating templates, the support for them must be
explicitly supported through the AcceptsExprPack flag.
Handling expression pack expansions may require a delay to when the
arguments of an attribute are correctly populated. To this end,
attributes that are set to accept these - through setting the
AcceptsExprPack flag - will automatically have an additional variadic
expression argument member named DelayedArgs. This member is not
exposed the same way other arguments are but is set through the new
CreateWithDelayedArgs creator function generated for applicable
attributes.
To illustrate how to implement support for expression pack expansion
support, clang::annotate is made to support pack expansions. This is
done by making handleAnnotationAttr delay setting the actual attribute
arguments until after template instantiation if it was unable to
populate the arguments due to dependencies in the parsed expressions.
Implement P2128R6 in C++23 mode.
Unlike GCC's implementation, this doesn't try to recover when a user
meant to use a comma expression.
Because the syntax changes meaning in C++23, the patch is *NOT*
implemented as an extension. Instead, declaring an array with not
exactly 1 parameter is an error in older languages modes. There is an
off-by-default extension warning in C++23 mode.
Unlike the standard, we supports default arguments;
Ie, we assume, based on conversations in WG21, that the proposed
resolution to CWG2507 will be accepted.
We allow arrays OpenMP sections and C++23 multidimensional array to
coexist:
[a , b] multi dimensional array
[a : b] open mp section
[a, b: c] // error
The rest of the patch is relatively straight forward: we take care to
support an arbitrary number of arguments everywhere.
The documentation for the official (downstream) Qualcomm Hexagon Clang
states that -mhvx sets the HVX version to be the same as the CPU version.
The current implementation upstream would use the most recent versioned
-mhvx= flag first (if present), then the CPU version. Change the upstream
behavior to match the documented behavior of the downstream compiler.
Among many FoldingSet users most notable seem to be ASTContext and CodeGenTypes.
The reasons that we spend not-so-tiny amount of time in FoldingSet calls from there, are following:
1. Default FoldingSet capacity for 2^6 items very often is not enough.
For PointerTypes/ElaboratedTypes/ParenTypes it's not unlikely to observe growing it to 256 or 512 items.
FunctionProtoTypes can easily exceed 1k items capacity growing up to 4k or even 8k size.
2. FoldingSetBase::GrowBucketCount cost itself is not very bad (pure reallocations are rather cheap thanks to BumpPtrAllocator).
What matters is high collision rate when lot of items end up in same bucket slowing down FoldingSetBase::FindNodeOrInsertPos and trashing CPU cache
(as items with same hash are organized in intrusive linked list which need to be traversed).
This change address both issues by increasing initial size of FoldingSets used in ASTContext and CodeGenTypes.
Extracted from: https://reviews.llvm.org/D118385
Differential Revision: https://reviews.llvm.org/D118608
Sometimes when I pass the mentioned option I forget about passing the
parameter list for c++ sources.
It would be also useful newcomers to learn about this.
This patch introduces some logic checking common misuses involving
`-analyze-function`.
Reviewed-By: martong
Differential Revision: https://reviews.llvm.org/D118690
Following the discussion on D118229, this marks all pointer-typed
kernel arguments as having ABI alignment, per section 6.3.5 of
the OpenCL spec:
> For arguments to a __kernel function declared to be a pointer to
> a data type, the OpenCL compiler can assume that the pointee is
> always appropriately aligned as required by the data type.
Differential Revision: https://reviews.llvm.org/D118894
D117898 added the generic __builtin_elementwise_add_sat and __builtin_elementwise_sub_sat with the same integer behaviour as the SSE/AVX instructions
This patch removes the __builtin_ia32_padd/psub saturated intrinsics and just uses the generics - the existing tests see no changes:
__m256i test_mm256_adds_epi8(__m256i a, __m256i b) {
// CHECK-LABEL: test_mm256_adds_epi8
// CHECK: call <32 x i8> @llvm.sadd.sat.v32i8(<32 x i8> %{{.*}}, <32 x i8> %{{.*}})
return _mm256_adds_epi8(a, b);
}
D117898 added the generic __builtin_elementwise_add_sat and __builtin_elementwise_sub_sat with the same integer behaviour as the SSE/AVX instructions
This patch removes the __builtin_ia32_padd/psub saturated intrinsics and just uses the generics - the existing tests see no changes:
__m256i test_mm256_adds_epi8(__m256i a, __m256i b) {
// CHECK-LABEL: test_mm256_adds_epi8
// CHECK: call <32 x i8> @llvm.sadd.sat.v32i8(<32 x i8> %{{.*}}, <32 x i8> %{{.*}})
return _mm256_adds_epi8(a, b);
}