These command-line flags are alternates to providing the -x
c++-*-header indicators that we are building a header unit.
Act on fmodule-header= for headers on the c/l:
If we have x.hh -fmodule-header, then we should treat that header
as a header unit input (equivalent to -xc++-header-unit-header x.hh).
Likewise, for fmodule-header={user,system} the source should be now
recognised as a header unit input (since this can affect the job list
that we need).
It's not practical to recognise a header without any suffix so
-fmodule-header=system foo isn't going to happen. Although
-fmodule-header=system foo.hh will work OK. However we can make it
work if the user indicates that the item without a suffix is a valid
header. (so -fmodule-header=system -xc++-header vector)
Differential Revision: https://reviews.llvm.org/D121589
Summary:
The `-fopenmp-target-new-runtime` flag has not been used for awhile. It
was present in a previous release so we shouldn't remove it for
backwards compatibility, but we shouldn't have documentation or a help
message for it.
--overlay-platform-toolchain inserts a whole new toolchain path with
higher priority than system default, which could be achieved by
composing smaller options. We need to figure out alternative solution
and what is missing among these basic options.
In some cases, we need to set alternative toolchain path other than the
default with system (headers, libraries, dynamic linker prefix, ld path,
etc.), e.g., to pick up newer components, but keep sysroot at the same
time (to pick up extra packages).
This change introduces a new option --overlay-platform-toolchain to set
up such alternative toolchain path.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D121992
This reverts commit edb7ba714a.
This changes BLR_BTI to take variable_ops meaning that we can accept
a register or a label. The pattern still expects one argument so we'll
never get more than one. Then later we can check the type of the operand
to choose BL or BLR to emit.
(this is what BLR_RVMARKER does but I missed this detail of it first time around)
Also require NoSLSBLRMitigation which I missed in the first version.
Some implementations of setjmp will end with a br instead of a ret.
This means that the next instruction after a call to setjmp must be
a "bti j" (j for jump) to make this work when branch target identification
is enabled.
The BTI extension was added in armv8.5-a but the bti instruction is in the
hint space. This means we can emit it for any architecture version as long
as branch target enforcement flags are passed.
The starting point for the hint number is 32 then call adds 2, jump adds 4.
Hence "hint #36" for a "bti j" (and "hint #34" for the "bti c" you see
at the start of functions).
The existing Arm command line option -mno-bti-at-return-twice has been
applied to AArch64 as well.
Support is added to SelectionDAG Isel and GlobalIsel. FastIsel will
defer to SelectionDAG.
Based on the change done for M profile Arm in https://reviews.llvm.org/D112427Fixes#48888
Reviewed By: danielkiss
Differential Revision: https://reviews.llvm.org/D121707
AArch32/Armv8A introduced the performance deprecation of certain patterns
of IT instructions. After some debate internal to ARM, this is now being
reverted; i.e. no IT instruction patterns are performance deprecated
anymore, as the perfomance degredation is not significant enough.
This reverts the following:
"ARMv8-A deprecates some uses of the T32 IT instruction. All uses of
IT that apply to instructions other than a single subsequent 16-bit
instruction from a restricted set are deprecated, as are explicit
references to the PC within that single 16-bit instruction. This permits
the non-deprecated forms of IT and subsequent instructions to be treated
as a single 32-bit conditional instruction."
The deprecation no longer applies, but the behaviour may be controlled
by the -arm-restrict-it and -arm-no-restrict-it command-line options,
with the latter being the default. No warnings about complex IT blocks
will be generated.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D118044
Summary:
The name of the AMDGPU device library was changes. Previously it was
called 'libomptarget-amdgcn'. This patch changes fixes the tests to use
the new name of the library and adds a new flag with the same name.
This patch adds more documentation for the OpenMP offloading driver.
This includes a new file that describes the overall pipeline becuase
that was not previously explained in full elsewhere.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D118815
When passing a set of flags to configure defaults for a specific
target (similar to the cmake settings `CLANG_DEFAULT_RTLIB`,
`CLANG_DEFAULT_UNWINDLIB`, `CLANG_DEFAULT_CXX_STDLIB` and
`CLANG_DEFAULT_LINKER`, but without hardcoding them in the binary),
some of the flags may cause warnings (e.g. `-stdlib=` when compiling C
code). Allow requesting selectively ignoring unused arguments among
some of the arguments on the command line, without needing to resort
to `-Qunused-arguments` or `-Wno-unused-command-line-argument`.
Fix up the existing diagnostics.c testcase. It was added in
response to PR12181 to fix handling of
`-Werror=unused-command-line-argument`, but the command line option
in the test (`-fzyzzybalubah`) now triggers "error: unknown argument"
instead of the intended warning. Change it into a linker input
(`-lfoo`) which triggers the intended diagnostic. Extend the
existing test case to check more cases and make sure that it keeps
testing the intended case.
Add testing of the new option to this existing test.
Differential Revision: https://reviews.llvm.org/D116503
Allow toggling of -fnew-infallible so last instance takes precedence
Testing:
ninja check-all
Reviewed By: bruno
Differential Revision: https://reviews.llvm.org/D113523
Trying to update some options that don't at least have an inclusive language version.
This patch adds `objcmt-allowlist-dir-path` as a default alternative.
Reviewed By: akyrtzi
Differential Revision: https://reviews.llvm.org/D112591
Recently a vulnerability issue is found in the implementation of VLLDM
instruction in the Arm Cortex-M33, Cortex-M35P and Cortex-M55. If the
VLLDM instruction is abandoned due to an exception when it is partially
completed, it is possible for subsequent non-secure handler to access
and modify the partial restored register values. This vulnerability is
identified as CVE-2021-35465.
The mitigation sequence varies between v8-m and v8.1-m as follows:
v8-m.main
---------
mrs r5, control
tst r5, #8 /* CONTROL_S.SFPA */
it ne
.inst.w 0xeeb00a40 /* vmovne s0, s0 */
1:
vlldm sp /* Lazy restore of d0-d16 and FPSCR. */
v8.1-m.main
-----------
vscclrm {vpr} /* Clear VPR. */
vlldm sp /* Lazy restore of d0-d16 and FPSCR. */
More details on
developer.arm.com/support/arm-security-updates/vlldm-instruction-security-vulnerability
Differential Revision: https://reviews.llvm.org/D109157
d8faf03807 implemented general-regs-only for X86 by disabling all features
with vector instructions. But the CRC32 instruction in SSE4.2 ISA, which uses
only GPRs, also becomes unavailable. This patch adds a CRC32 feature for this
instruction and allows it to be used with general-regs-only.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D105462
This patch implements Clang support for an original OpenMP extension
we have developed to support OpenACC: the `ompx_hold` map type
modifier. The next patch in this series, D106510, implements OpenMP
runtime support.
Consider the following example:
```
#pragma omp target data map(ompx_hold, tofrom: x) // holds onto mapping of x
{
foo(); // might have map(delete: x)
#pragma omp target map(present, alloc: x) // x is guaranteed to be present
printf("%d\n", x);
}
```
The `ompx_hold` map type modifier above specifies that the `target
data` directive holds onto the mapping for `x` throughout the
associated region regardless of any `target exit data` directives
executed during the call to `foo`. Thus, the presence assertion for
`x` at the enclosed `target` construct cannot fail. (As usual, the
standard OpenMP reference count for `x` must also reach zero before
the data is unmapped.)
Justification for inclusion in Clang and LLVM's OpenMP runtime:
* The `ompx_hold` modifier supports OpenACC functionality (structured
reference count) that cannot be achieved in standard OpenMP, as of
5.1.
* The runtime implementation for `ompx_hold` (next patch) will thus be
used by Flang's OpenACC support.
* The Clang implementation for `ompx_hold` (this patch) as well as the
runtime implementation are required for the Clang OpenACC support
being developed as part of the ECP Clacc project, which translates
OpenACC to OpenMP at the directive AST level. These patches are the
first step in upstreaming OpenACC functionality from Clacc.
* The Clang implementation for `ompx_hold` is also used by the tests
in the runtime implementation. That syntactic support makes the
tests more readable than low-level runtime calls can. Moreover,
upstream Flang and Clang do not yet support OpenACC syntax
sufficiently for writing the tests.
* More generally, the Clang implementation enables a clean separation
of concerns between OpenACC and OpenMP development in LLVM. That
is, LLVM's OpenMP developers can discuss, modify, and debug LLVM's
extended OpenMP implementation and test suite without directly
considering OpenACC's language and execution model, which can be
handled by LLVM's OpenACC developers.
* OpenMP users might find the `ompx_hold` modifier useful, as in the
above example.
See new documentation introduced by this patch in `openmp/docs` for
more detail on the functionality of this extension and its
relationship with OpenACC. For example, it explains how the runtime
must support two reference counts, as specified by OpenACC.
Clang recognizes `ompx_hold` unless `-fno-openmp-extensions`, a new
command-line option introduced by this patch, is specified.
Reviewed By: ABataev, jdoerfert, protze.joachim, grokos
Differential Revision: https://reviews.llvm.org/D106509
The declaration for the global new function in C++ is generated in the compiler front-end. When examining exception propagation, we found that this is the largest root throw site propagator requiring unwind code to be generated for callers up the stack. Allowing this to be handled immediately with termination stops upward propagation and leads to significantly less landing pads generated. This in turns leads to a performance and .text size win.
With `-fnew-infallible` this annotates the declaration with `throw()` and `__attribute__((returns_nonnull))`. `throw()` allows the compiler to assume exceptions do not propagate out of new and eliminate it as a root throw site. Note that the definition of global new is user-replaceable so users should ensure that the one used follows these semantics.
Measuring internally, we're seeing at 0.5% CPU win in one of our large internal FB workload. Measuring on clang self-build (cd0a1226b5) we get:
thinlto/
"dwarfehprepare.NumCleanupLandingPadsRemaining": 153494,
"dwarfehprepare.NumNoUnwind": 26309,
thinlto_newinfallible/
"dwarfehprepare.NumCleanupLandingPadsRemaining": 143660,
"dwarfehprepare.NumNoUnwind": 28744,
a 1-143660/153494 = 6.4% reduction in landing pads and a 28744/26309 = 9.3% increase in the number of nounwind functions.
Testing:
ninja check-all
new test case to make sure these attributes are added correctly to global new.
Reviewed By: urnathan
Differential Revision: https://reviews.llvm.org/D105225
This patch adds the -fminimize-whitespace with the following effects:
* If combined with -E, remove as much non-line-breaking whitespace as
possible.
* If combined with -E -P, removes as much whitespace as possible,
including line-breaks.
The motivation is to reduce the amount of insignificant changes in the
preprocessed output with source files where only whitespace has been
changed (add/remove comments, clang-format, etc.) which is in particular
useful with ccache.
A patch for ccache for using this flag has been proposed to ccache as well:
https://github.com/ccache/ccache/pull/815, which will use
-fnormalize-whitespace when clang-13 has been detected, and additionally
uses -P in "unify_mode". ccache already had a unify_mode in an older
version which was removed because of problems that using the
preprocessor itself does not have (such that the custom tokenizer did
not recognize C++11 raw strings).
This patch slightly reorganizes which part is responsible for adding
newlines that are required for semantics. It is now either
startNewLineIfNeeded() or MoveToLine() but never both; this avoids the
ShouldUpdateCurrentLine workaround and avoids redundant lines being
inserted in some cases. It also fixes a mandatory newline not inserted
after a _Pragma("...") that is expanded into a #pragma.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D104601
Renaming the option is based on discussions in https://reviews.llvm.org/D101122.
It is normally not a good idea to rename driver flags but this flag is
new enough and obscure enough that it is very unlikely to have adopters.
While we're here also drop the `<kind>` metavar. It's not necessary and
is actually inconsistent with the documentation in
`clang/docs/ClangCommandLineReference.rst`.
Differential Revision: https://reviews.llvm.org/D101491
Now that the WebAssembly SIMD specification is finalized and engines are
generally up-to-date, there is no need for a separate target feature for gating
SIMD instructions that engines have not implemented. With this change,
v128.const is now enabled by default with the simd128 target feature.
Differential Revision: https://reviews.llvm.org/D98457
The new `-fsanitize-address-destructor-kind=` option allows control over how module
destructors are emitted by ASan.
The new option is consumed by both the driver and the frontend and is propagated into
codegen options by the frontend.
Both the legacy and new pass manager code have been updated to consume the new option
from the codegen options.
It would be nice if the new utility functions (`AsanDtorKindToString` and
`AsanDtorKindFromString`) could live in LLVM instead of Clang so they could be
consumed by other language frontends. Unfortunately that doesn't work because
the clang driver doesn't link against the LLVM instrumentation library.
rdar://71609176
Differential Revision: https://reviews.llvm.org/D96572
This change implements support for applying profile instrumentation
only to selected files or functions. The implementation uses the
sanitizer special case list format to select which files and functions
to instrument, and relies on the new noprofile IR attribute to exclude
functions from instrumentation.
Differential Revision: https://reviews.llvm.org/D94820
This change implements support for applying profile instrumentation
only to selected files or functions. The implementation uses the
sanitizer special case list format to select which files and functions
to instrument, and relies on the new noprofile IR attribute to exclude
functions from instrumentation.
Differential Revision: https://reviews.llvm.org/D94820
D94700 removed the static library so we no longer need to pass
`-llibomptarget-nvptx` to `nvlink`. Since the bitcode library is the only device
runtime for now, instead of emitting a warning when it is not found, an error
should be raised. We also set a new option `libomptarget-nvptx-bc-path` to let
user choose which bitcode library is being used.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D95161
The help text for `-I` was recently expanded in [1]. The expanded
version focuses on explaining the semantics of `-I` in Clang. We are now
in the process of adding support for `-I` in Flang and this new
description is incompatible with the semantics of `-I` in Flang. This
was brought up in this review:
* https://reviews.llvm.org/D93453
This patch reverts the original change in Options.td. This way the help
text for `-I` remains generic enough so that it applies to both Clang
and Flang.
The expanded description of `-I` from [1] is moved to the
`DocBrief` field for `-I`. This field is prioritised over the help text
when generating ClangCommandLineReference.rst, so the user facing
documentation for Clang retains the expanded description:
* https://clang.llvm.org/docs/ClangCommandLineReference.html
`DocBrief` fields are currently not used in Flang.
As requested in the reviews, the help text and the expanded description
are slightly refined.
[1] Commit: 8dd4e3ceb8
Differential Revision: https://reviews.llvm.org/D94169
PowerPC cores like e200z759n3 [1] using an efpu2 only support single precision
hardware floating point instructions. The single precision instructions efs*
and evfs* are identical to the spe float instructions while efd* and evfd*
instructions trigger a not implemented exception.
This patch introduces a new command line option -mefpu2 which leads to
single-hardware / double-software code generation.
[1] Core reference:
https://www.nxp.com/files-static/32bit/doc/ref_manual/e200z759CRM.pdf
Differential revision: https://reviews.llvm.org/D92935
Added support for the options mabi=vec-extabi and mabi=vec-default which are analogous to qvecnvol and qnovecnvol when using XL on AIX.
The extended Altivec ABI on AIX is enabled using mabi=vec-extabi in clang and vec-extabi in llc.
Reviewed By: Xiangling_L, DiggerLin
Differential Revision: https://reviews.llvm.org/D89684
This patch mainly made the following changes:
1. Support AVX-VNNI instructions;
2. Introduce ExplicitVEXPrefix flag so that vpdpbusd/vpdpbusds/vpdpbusds/vpdpbusds instructions only use vex-encoding when user explicity add {vex} prefix.
Differential Revision: https://reviews.llvm.org/D89105