The command traits have a member NumArgs for which all the parsing
infrastructure is in place, but no command was setting it to a value
other than 0. By doing so we get warnings when passing an empty
paragraph to \retval (the first argument is the return value, then comes
the description). We also take \xrefitem along for the ride, although as
the documentation states it's unlikely to be used directly.
Reviewed By: gribozavr2
Differential Revision: https://reviews.llvm.org/D125422
Fixes the `FIXME:` related to adding `forEachTemplateArgument` to the
core AST Matchers library.
Reviewed By: aaron.ballman
Differential Revision: http://reviews.llvm.org/D125383
This adds a late Machine Pass to work around a Cortex CPU Erratum
affecting Cortex-A57 and Cortex-A72:
- Cortex-A57 Erratum 1742098
- Cortex-A72 Erratum 1655431
The pass inserts instructions to make the inputs to the fused AES
instruction pairs no longer trigger the erratum. Here the pass errs on
the side of caution, inserting the instructions wherever we cannot prove
that the inputs came from a safe instruction.
The pass is used:
- for Cortex-A57 and Cortex-A72,
- for "generic" cores (which are used when using `-march=`),
- when the user specifies `-mfix-cortex-a57-aes-1742098` or
`mfix-cortex-a72-aes-1655431` in the command-line arguments to clang.
Reviewed By: dmgreen, simon_tatham
Differential Revision: https://reviews.llvm.org/D119720
The goal is support tail and mask policy in RVV builtins.
We focus on IR part first.
If the passthru operand is undef, we use tail agnostic, otherwise
use tail undisturbed.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D125323
BoolAssignment checker is now taint-aware and warns if a tainted value is
assigned.
Original author: steakhal
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D125360
Summary:
Static libraries need to be handled differently from regular inpout
files, namely they are loaded lazily. Previously we used a flag to
indicate a file camm from a static library. This patch simplifies this
by simply keeping a different array that contains the static libraries
so we don't need to parse them out again.
Summary:
The linker wrapper previously had functionality to strip the sections
manually. We don't use this at all because this is much better done by
the linker via the `SHF_EXCLUDE` flag. This patch simply removes the
support for thi sfeature to simplify the code.
If a closing brace is followed by a non-trailing comment, the
newline before the closing brace must also be removed.
Differential Revision: https://reviews.llvm.org/D125451
This allows using any recognized kind of string for any
__attribute__((format)) archetype. Before this change, for instance,
the printf archetype would only accept char pointer types and the
NSString archetype would only accept NSString pointers. This is
more restrictive than necessary as there exist functions to
convert between string types that can be annotated with
__attribute__((format_arg)) to transfer format information.
Reviewed By: ahatanak
Differential Revision: https://reviews.llvm.org/D125254
rdar://89060618
With sufficiently tortured code, it's possible to cause a stack
overflow when parsing declarators. Thus, we now check for resource
exhaustion when recursively parsing declarators so that we can at least
warn the user we're about to crash before we actually crash.
Fixes#51642
Differential Revision: https://reviews.llvm.org/D124915
MSVC expects wchar_t to be defined in stddef.h if /Zc:wchar_t- is specified
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D124026
The re-apply includes fixes to clang tests that were missed in
the original commit.
Original message:
Prior to this patch we would only set to undef the unused arguments of the
external functions. The rationale was that unused arguments of internal
functions wouldn't need to be turned into undef arguments because they
should have been simply eliminated by the time we reach that code.
This is actually not true because there are plenty of cases where we can't
remove unused arguments. For instance, if the internal function is used in
an indirect call, it may not be possible to change the function signature.
Yet, for statically known call-sites we would still like to mark the unused
arguments as undef.
This patch enables the "set undef arguments" optimization on internal
functions when we encounter cases where internal functions cannot be
optimized. I.e., whenever an internal function is marked "live".
Differential Revision: https://reviews.llvm.org/D124699
This diff changes the serialization of the `ORIGINAL_PCH_DIR`
entry in module files to be serialized relative to the module's
`BaseDirectory`. This will allow for the module to be relocatable
across machines.
The path is restored relative to the module's BaseDirectory on
deserialization.
Reviewed By: urnathan
Differential Revision: https://reviews.llvm.org/D124946
This diff changes the serialization of the `SUBMODULE_TOPHEADER`
entry in module files to be serialized relative to the module's
`BaseDirectory`. This matches the behavior of the
`SUBMODULE_HEADER` entry and will allow for the module to be
relocatable across machines.
The path is restored relative to the module's `BaseDirectory` on
deserialization.
Reviewed By: urnathan
Differential Revision: https://reviews.llvm.org/D124938
This diff adds a new frontend flag `-fmodule-file-home-is-cwd`.
The behavior of this flag is similar to
`-fmodule-map-file-home-is-cwd` but does not require the module
map files to be modified to have inputs relative to the cwd.
Instead the output modules will have their `BaseDirectory` set
to the cwd and will try and resolve paths relative to that.
The motiviation for this change is to support relocatable pcm
files that are built on different machines with different paths
without having to alter module map files, which is sometimes not
possible as they are provided by 3rd parties.
Reviewed By: urnathan
Differential Revision: https://reviews.llvm.org/D124874
This PR changes the `SymIntExpr` so the expression that uses a
negative value as `RHS`, for example: `x +/- (-N)`, is modeled as
`x -/+ N` instead.
This avoids producing a very large `RHS` when the symbol is cased to
an unsigned number, and as consequence makes the value more robust in
presence of casts.
Note that this change is not applied if `N` is the lowest negative
value for which negation would not be representable.
Reviewed By: steakhal
Patch By: tomasz-kaminski-sonarsource!
Differential Revision: https://reviews.llvm.org/D124658
This adds an extension warning when using the preprocessor conditionals
in a language mode they're not officially supported in, and an opt-in
warning for compatibility with previous standards.
Fixes#55306
Differential Revision: https://reviews.llvm.org/D125178
Reimplement the RemoveBracesLLVM feature which handles a
single-statement block that would get wrapped.
Fixes#53543.
Differential Revision: https://reviews.llvm.org/D125137
It turns out that the llvm buildbots run the test with
-DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-scei-ps4, which would cause this
test to fail as the test assumed that the default target is Windows. To
fix this, we explicitly set -target for the Windows testcases.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D125425
When Clang generates the path prefix (i.e. the path of the directory
where the file is) when generating FILE, __builtin_FILE(), and
std::source_location, Clang uses the platform-specific path separator
character of the build environment where Clang _itself_ is built. This
leads to inconsistencies in Chrome builds where Clang running on
non-Windows environments uses the forward slash (/) path separator
while Clang running on Windows builds uses the backslash (\) path
separator. To fix this, we add a flag -ffile-reproducible (and its
inverse, -fno-file-reproducible) to have Clang use the target's
platform-specific file separator character.
Additionally, the existing flags -fmacro-prefix-map and
-ffile-prefix-map now both imply -ffile-reproducible. This can be
overriden by setting -fno-file-reproducible.
[0]: https://crbug.com/1310767
Differential revision: https://reviews.llvm.org/D122766
Summary:
We use embedded binaries to extract offloading device code from the host
fatbinary. This uses a binary format whose necessary alignment is
eight bytes. The alignment is included within the ELF section type so
the data extracted from the ELF should always be aligned at that amount.
However, if this file was extraqcted from a static archive, it was being
sent as an offset in the archive file which did not have the same
alignment guaruntees as the ELF file. This was causing errors in the
UB-sanitizer build as it would occasionally try to access a misaligned
address. To fix this, I simply copy the memory directly to a new buffer
which is guarnteed to have worst-case alignment of 16 in the case that
it's not properly aligned.
Adds an intrinsic/builtin that can be used to fine tune scheduler behavior. If
there is a need to have highly optimized codegen and kernel developers have
knowledge of inter-wave runtime behavior which is unknown to the compiler this
builtin can be used to tune scheduling.
This intrinsic creates a barrier between scheduling regions. The immediate
parameter is a mask to determine the types of instructions that should be
prevented from crossing the sched_barrier. In this initial patch, there are only
two variations. A mask of 0 means that no instructions may be scheduled across
the sched_barrier. A mask of 1 means that non-memory, non-side-effect inducing
instructions may cross the sched_barrier.
Note that this intrinsic is only meant to work with the scheduling passes. Any
other transformations that may move code will not be impacted in the ways
described above.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D124700
Update KEYALL to cover KEYCUDA. Introduce KEYMAX and
a generic way to update KEYALL.
Reviewed by: Dan Liew
Differential Revision: https://reviews.llvm.org/D125396
Create dxc_D as alias to option D which Define <macro> to <value> (or 1 if <value> omitted).
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D125338
D87451 added -mignore-xcoff-visibility for AIX targets and made it the default (which mimicked the behaviour of the XL 16.1 compiler on AIX).
However, ignoring hidden visibility has unwanted side effects and some libraries depend on visibility to hide non-ABI facing entities from user headers and
reserve the right to change these implementation details based on this (https://libcxx.llvm.org/DesignDocs/VisibilityMacros.html). This forces us to use
internal linkage fallbacks for these cases on AIX and creates an unwanted divergence in implementations on the plaform.
For these reasons, it's preferable to not add -mignore-xcoff-visibility by default, which is what this patch does.
Reviewed By: DiggerLin
Differential Revision: https://reviews.llvm.org/D125141
In order to do offloading compilation we need to embed files into the
host and create fatbainaries. Clang uses a special binary format to
bundle several files along with their metadata into a single binary
image. This is currently performed using the `-fembed-offload-binary`
option. However this is not very extensibile since it requires changing
the command flag every time we want to add something and makes optional
arguments difficult. This patch introduces a new tool called
`clang-offload-packager` that behaves similarly to CUDA's `fatbinary`.
This tool takes several input files with metadata and embeds it into a
single image that can then be embedded in the host.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D125165
Due to how parseBracedList always stopped on the first closing angle
bracket and was used in parsing angle bracketed expression inside concept
definition, nested brackets inside concepts were parsed incorrectly.
nextToken() call before calling parseBracedList is required because
we were processing opening angle bracket inside parseBracedList second
time leading to incorrect logic after my fix.
Fixes https://github.com/llvm/llvm-project/issues/54943
Fixes https://github.com/llvm/llvm-project/issues/54837
Reviewed By: HazardyKnusperkeks, curdeius
Differential Revision: https://reviews.llvm.org/D123896
This patch adds the necessary code generation to create the wrapper code
that registers all the globals in CUDA. We create the necessary
functions and iterate through the list of
`__start_cuda_offloading_entries` to find which globals must be
registered. This is very similar to the code generation done currently
in Clang for non-rdc builds, but here we are registering a fully linked
fatbinary and finding the globals via the above sections.
With this we should be able to fully support basic RDC / LTO building of CUDA
code.
It's also worth noting that this does not include the necessary PTX to JIT the
image, so to use this support the offloading architecture must match the
system's architecture.
Depends on D123810
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D123812
This patch adds the initial support for wrapping CUDA images. This
requires changing some of the logic for how we bundle images. We now
need to copy the image for all kinds that are active for the
architecture. Then we need to run a separate wrapping job if the Kind is
Cuda. For cuda wrapping we need to use the `fatbinary` program from the
CUDA SDK to bundle all the binaries together. This is then passed to a
new function to perfom the actual module code generation that will be
implemented in a later patch.
Depends on D120273 D123471
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D123810
The changes made in D123460 generalized the code generation for OpenMP's
offloading entries. We can use the same scheme to register globals for
CUDA code. This patch adds the code generation to create these
offloading entries when compiling using the new offloading driver mode.
The offloading entries are simple structs that contain the information
necessary to register the global. The struct used is as follows:
```
Type struct __tgt_offload_entry {
void *addr; // Pointer to the offload entry info.
// (function or global)
char *name; // Name of the function or global.
size_t size; // Size of the entry info (0 if it a function).
int32_t flags;
int32_t reserved;
};
```
Currently CUDA handles RDC code generation by deferring the registration
of globals in the current TU to a callback function containing the
modules ID. Later all the module IDs will be used to register all of the
globals at once. Rather than mimic this, offloading entries allow us to
mimic the way OpenMP registers globals. That is, we create a simple
global struct for each device global to be registered. These are placed
at a special section `cuda_offloading_entires`. Because this section is
a valid C-identifier, the linker will profide a `__start` and `__stop`
pointer that we can use to iterate and register all globals at runtime.
the registration requires a flag variable to indicate which registration
function to use. I have assigned the flags somewhat arbitrarily, but
these use the following values.
Kernel: 0
Variable: 0
Managed: 1
Surface: 2
Texture: 3
Depends on D120272
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D123471
This adds the -Wgnu-line-marker diagnostic flag, grouped under -Wgnu,
to warn about use of the GNU linemarker preprocessor extension.
Fixes#55067
Differential Revision: https://reviews.llvm.org/D124534
This adds support for variable stride with the val, uval, and ref linear
modifiers. Previously only the no modifer type ls<argno> was supported.
val -> Ls<argno>
uval -> Us<argno>
ref -> Rs<argno>
Differential Revision: https://reviews.llvm.org/D125330