Search avr-libc path according to avr-gcc installation at first,
then other possible installed pathes.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D107682
The Linux kernel has a macro called IS_ENABLED(), which evaluates to a
constant 1 or 0 based on Kconfig selections, allowing C code to be
unconditionally enabled or disabled at build time. For example:
int foo(struct *a, int b) {
switch (b) {
case 1:
if (a->flag || !IS_ENABLED(CONFIG_64BIT))
return 1;
__attribute__((fallthrough));
case 2:
return 2;
default:
return 3;
}
}
There is an unreachable warning about the fallthrough annotation in the
first case because !IS_ENABLED(CONFIG_64BIT) can be evaluated to 1,
which looks like
return 1;
__attribute__((fallthrough));
to clang.
This type of warning is pointless for the Linux kernel because it does
this trick all over the place due to the sheer number of configuration
options that it has.
Add -Wunreachable-code-fallthrough, enabled under -Wunreachable-code, so
that projects that want to warn on unreachable code get this warning but
projects that do not care about unreachable code can still use
-Wimplicit-fallthrough without having to make changes to their code
base.
Fixes PR51094.
Reviewed By: aaron.ballman, nickdesaulniers
Differential Revision: https://reviews.llvm.org/D107933
@arichardson pointed out in post-commit review for
https://reviews.llvm.org/D95583 (b714f73def) that `-verify` has an
optional argument that works a lot like `FileCheck`'s `-check-prefix`.
Use it to simplify the test for `-fno-implicit-modules-use-lock`!
The patch adds ELF notes into SHT_NOTE sections of ELF offload images
passed to clang-offload-wrapper.
The new notes use a null-terminated "LLVMOMPOFFLOAD" note name.
There are currently three types of notes:
VERSION: a string (not null-terminated) representing the ELF offload
image structure. The current version '1.0' does not put any restrictions
on the structure of the image. If we ever need to come up with a common
structure for ELF offload images (e.g. to be able to analyze the images
in libomptarget in some standard way), then we will introduce new versions.
PRODUCER: a vendor specific name of the producing toolchain.
Upstream LLVM uses "LLVM" (not null-terminated).
PRODUCER_VERSION: a vendor specific version of the producing toolchain.
Upstream LLVM uses LLVM_VERSION_STRING with optional <space> LLVM_REVISION.
All three notes are not mandatory currently.
Differential Revision: https://reviews.llvm.org/D99551
LoopLoadElimination, LoopVersioning and LoopVectorize currently
fetch MemorySSA when construction LoopAccessAnalysis. However,
LoopAccessAnalysis does not actually use MemorySSA and we can pass
nullptr instead.
This saves one MemorySSA calculation in the default pipeline, and
thus improves compile-time.
Differential Revision: https://reviews.llvm.org/D108074
Two standalone LoopRotate passes scheduled using
createFunctionToLoopPassAdaptor() currently enable MemorySSA.
However, while LoopRotate can preserve MemorySSA, it does not use
it, so requiring MemorySSA is unnecessary.
This change doesn't have a practical compile-time impact by itself,
because subsequent passes still request MemorySSA.
Differential Revision: https://reviews.llvm.org/D108073
This is a rather common feedback we get from out leak checkers: bug reports are
really short, and are contain barely any usable information on what the analyzer
did to conclude that a leak actually happened.
This happens because of our bug report minimizing effort. We construct bug
reports by inspecting the ExplodedNodes that lead to the error from the bottom
up (from the error node all the way to the root of the exploded graph), and mark
entities that were the cause of a bug, or have interacted with it as
interesting. In order to make the bug report a bit less verbose, whenever we
find an entire function call (from CallEnter to CallExitEnd) that didn't talk
about any interesting entity, we prune it (click here for more info on bug
report generation). Even if the event to highlight is exactly this lack of
interaction with interesting entities.
D105553 generalized the visitor that creates notes for these cases. This patch
adds a new kind of NoStateChangeVisitor that leaves notes in functions that
took a piece of dynamically allocated memory that later leaked as parameter,
and didn't change its ownership status.
Differential Revision: https://reviews.llvm.org/D105553
Preceding discussion on cfe-dev: https://lists.llvm.org/pipermail/cfe-dev/2021-June/068450.html
NoStoreFuncVisitor is a rather unique visitor. As VisitNode is invoked on most
other visitors, they are looking for the point where something changed -- change
on a value, some checker-specific GDM trait, a new constraint.
NoStoreFuncVisitor, however, looks specifically for functions that *didn't*
write to a MemRegion of interesting. Quoting from its comments:
/// Put a diagnostic on return statement of all inlined functions
/// for which the region of interest \p RegionOfInterest was passed into,
/// but not written inside, and it has caused an undefined read or a null
/// pointer dereference outside.
It so happens that there are a number of other similar properties that are
worth checking. For instance, if some memory leaks, it might be interesting why
a function didn't take ownership of said memory:
void sink(int *P) {} // no notes
void f() {
sink(new int(5)); // note: Memory is allocated
// Well hold on, sink() was supposed to deal with
// that, this must be a false positive...
} // warning: Potential memory leak [cplusplus.NewDeleteLeaks]
In here, the entity of interest isn't a MemRegion, but a symbol. The property
that changed here isn't a change of value, but rather liveness and GDM traits
managed by MalloChecker.
This patch moves some of the logic of NoStoreFuncVisitor to a new abstract
class, NoStateChangeFuncVisitor. This is mostly calculating and caching the
stack frames in which the entity of interest wasn't changed.
Descendants of this interface have to define 3 things:
* What constitutes as a change to an entity (this is done by overriding
wasModifiedBeforeCallExit)
* What the diagnostic message should be (this is done by overriding
maybeEmitNoteFor.*)
* What constitutes as the entity of interest being passed into the function (this
is also done by overriding maybeEmitNoteFor.*)
Differential Revision: https://reviews.llvm.org/D105553
Previously we just used {}, but that doesn't work in situations
like this.
if (1)
_MM_EXTRACT_FLOAT(d, x, n);
else
...
The semicolon would terminate the if.
clang/docs/tool/dump_format_style.py was not run as part of {D99840}
Bring ClangFormatStyleOptions.rst back in line.
Reviewed By: HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D107958
This covers the SSE and AVX/AVX2 headers. AVX512 has a lot more macros
due to rounding mode.
Fixes part of PR51324.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D107843
Some Clang diagnostics could only report OpenCL C version. Because
C++ for OpenCL can be used as an alternative to OpenCL C, the text
for diagnostics should reflect that.
Desrciptions modified for these diagnostics:
`err_opencl_unknown_type_specifier`
`warn_option_invalid_ocl_version`
`err_attribute_requires_opencl_version`
`warn_opencl_attr_deprecated_ignored`
`ext_opencl_ext_vector_type_rgba_selector`
Differential Revision: https://reviews.llvm.org/D107648
This fixes the 'unused linker option: -lm' warning when compiling
program with -c.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D107952
'armv7-pc-win32-macho'
It is incorrect to select the hardware floating point ABI on Mach-O
platforms using the Windows triple if the ABI is "apcs-gnu".
rdar://81810554
Differential Revision: https://reviews.llvm.org/D107939
Add in-source documentation on how CanonicalLoopInfo is intended to be used. In particular, clarify what parts of a CanonicalLoopInfo is considered part of the loop, that those parts must be side-effect free, and that InsertPoints to instructions outside those parts can be expected to be preserved after method calls implementing loop-associated directives.
CanonicalLoopInfo are now invalidated after it does not describe canonical loop anymore and asserts when trying to use it afterwards.
In addition, rename `createXYZWorkshareLoop` to `applyXYZWorkshareLoop` and remove the update location to avoid that the impression that they insert something from scratch at that location where in reality its InsertPoint is ignored. createStaticWorkshareLoop does not return a CanonicalLoopInfo anymore. First, it was not a canonical loop in the clarified sense (containing side-effects in form of calls to the OpenMP runtime). Second, it is ambiguous which of the two possible canonical loops it should actually return. It will not be needed before a feature expected to be introduced in OpenMP 6.0
Also see discussion in D105706.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D107540
Also clang ClangAttrEmitter for -gen-clang-attr-doc-table to be
like all other tablegen: Produce a .inc file with the generated bits
and put the static parts into a regular .cpp file that includes the
.inc file.
A new attribute btf_tag is added. The syntax looks like
__attribute__((btf_tag(<string>)))
Users may tag a particular structure/member/function/func_parameter/variable
declaration with an arbitrary string and the intention is
that this string is passed to dwarf so it is available for
post-compilation analysis. The string will be also passed
to .BTF section if the target is BPF. For each permitted
declaration, multiple btf_tag's are allowed.
For detailed use cases, please see
https://lists.llvm.org/pipermail/llvm-dev/2021-June/151009.html
In case that there exist redeclarations, the btf_tag attributes
will be accumulated along with different declarations, and the
last declaration will contain all attributes.
Differential Revision: https://reviews.llvm.org/D106614
Add -cc1 flags `-fmodules-uses-lock` and `-fno-modules-uses-lock` to
allow the lock manager to be turned off when building implicit modules.
Add `-Rmodule-lock` so that we can see when it's being used.
Differential Revision: https://reviews.llvm.org/D95583
This happens in createInvocationWithCommandLine but only clangd currently passes
ShouldRecoverOnErorrs (sic).
One cause of this (with correct command) is several -arch arguments for mac
multi-arch support.
Fixes https://github.com/clangd/clangd/issues/827
Differential Revision: https://reviews.llvm.org/D107632
This renames `compileModuleAndReadAST`, adding a `BehindLock` suffix,
and refactors it to significantly reduce nesting.
- Split out helpers `compileModuleAndReadASTImpl` and
`readASTAfterCompileModule` which have straight-line code that doesn't
worry about locks.
- Use `break` in the interesting cases of `switch` statements to reduce
nesting.
- Use early `return`s to reduce nesting.
Detangling the compile-and-read logic from the check-for-locks logic
should be a net win for readability, although I also have a side
motivation of making the locks optional in a follow-up.
No functionality change here.
Differential Revision: https://reviews.llvm.org/D95581
Only the bare name is completed, with no args.
For args to be useful we need arg names. These *are* in the tablegen but
not currently emitted in usable form, so left this as future work.
C++11, C2x, GNU, declspec, MS syntax is supported, with the appropriate
spellings of attributes suggested.
`#pragma clang attribute` is supported but not terribly useful as we
only reach completion if parens are balanced (i.e. the line is not truncated)
There's no filtering of which attributes might make sense in this
grammatical context (e.g. attached to a function). In code-completion context
this is hard to do, and will only work in few cases :-(
There's also no filtering by langopts: this is because currently the
only way of checking is to try to produce diagnostics, which requires a
valid ParsedAttr which is hard to get.
This should be fairly simple to fix but requires some tablegen changes
to expose the logic without the side-effect.
Differential Revision: https://reviews.llvm.org/D107696
Add builtin and intrinsic for `__addex`.
This patch is part of a series of patches to provide builtins for
compatibility with the XL compiler.
Reviewed By: stefanp, nemanjai, NeHuang
Differential Revision: https://reviews.llvm.org/D107002
Clang test CodeGen/thinlto-clang-diagnostic-handler-in-be.c fails on
some non x86 targets, e.g. hexagon. Since the test already requires x86
to be available as a target this commit forces the target to x86_64.
Reviewed By: steven_wu
Differential Revision: https://reviews.llvm.org/D107667
This reverts the revert 28c04794df.
The failing MLIR test that caused the revert should be fixed in this
version.
Also includes a PPC test fix previously in 1f87c7c478.
Instead of using scalar size divided by 8 for segment loads, get
the alignment from clang's type system.
Make vleff match for consistency.
Also replace uses of getPointerElementType() which will be removed as part of the OpaquePtr changes.
Reviewed By: HsiangKai
Differential Revision: https://reviews.llvm.org/D106738
Fixes miscompile of calls into ocml. Bug 51445.
The stack variable `double __tmp` is moved to dynamically allocated shared
memory by CGOpenMPRuntimeGPU. This is usually fine, but when the variable
is passed to a function that is explicitly annotated address_space(5) then
allocating the variable off-stack leads to a miscompile in the back end,
which cannot decide to move the variable back to the stack from shared.
This could be fixed by removing the AS(5) annotation from the math library
or by explicitly marking the variables as thread_mem_alloc. The cast to
AS(5) is still a no-op once IR is reached.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D107971
Previoulsy debug-info-for-profiling and pseudo-probe-for-profiling are mutual exclusive because they compete the dwarf discrimnator for callsites on the IR. This changes allows to use the two switches together. The side effect is that callsite discriminators will be taken by pseudo probe, while discriminators for other instructions are still available for AutoFDO use. This is less than ideal, however, it still allows us a chance to smoothly transition from AutoFDO to CSSPGO, by collecting both profiles from a CSSPGO binary.
Reviewed By: wenlei, wmi
Differential Revision: https://reviews.llvm.org/D107876
Introducing a plugin API and a simple HelloWorld Plugin example.
This patch adds the `-load` and `-plugin` flags to frontend driver and
the code around using custom frontend actions from within a plugin
shared library object.
It also adds to the Driver-help test to check the help option with the
updated driver flags.
Additionally, the patch creates a plugin-example test to check the
HelloWorld plugin example runs correctly. As part of this, a new CMake
flag (`FLANG_BUILD_EXAMPLES`) is added to allow the example to be built
and for the test to run.
This Plugin API has only been tested on Linux.
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D106137
The existing logic for per-target libc++ include directories only
seem to exist for the Gnu and Fuchsia drivers, added in
ea12d779bc / D89013.
This is less generic than the corresponding case in the Gnu driver,
but matches the existing level of genericity in the MinGW driver
(and others too).
Differential Revision: https://reviews.llvm.org/D107893
This patch adjusts the intrinsics definition of
llvm.matrix.column.major.load and llvm.matrix.column.major.store to
allow overloading the type of the stride. The bitwidth of the stride is
used to perform the offset computation.
This fixes a crash when using __builtin_matrix_column_major_load or
__builtin_matrix_column_major_store on 32 bit platforms. The stride argument
of the builtins are defined as `size_t`, which is 32 bits wide on 32 bit
platforms.
Note that we still perform offset computations with 64 bit width on 32
bit platforms for accesses that do not take a user-specified stride.
This can be fixed separately.
Fixes PR51304.
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D107349
After
9da70ab3d4
we saw a few regressions around trailing attribute definitions and in
typedefs (examples in the added test cases). There's some tension
distinguishing K&R definitions from attributes at the parser level,
where we have to decide if we need to put the type of the K&R definition
on a new unwrapped line before we have access to the rest of the line,
so we're scanning backwards and looking for a pattern like f(a, b). But
this type of pattern could also be an attribute macro, or the whole
declaration could be a typedef itself. I updated the code to check for a
typedef at the beginning of the line and to not consider raw identifiers
as possible first K&R declaration (but treated as an attribute macro
instead). This is not 100% correct heuristic, but I think it should be
reasonably good in practice, where we'll:
* likely be in some very C-ish code when using K&R style (e.g., stuff
that uses `struct name a;` instead of `name a;`
* likely be in some very C++-ish code when using attributes
* unlikely mix up the two in the same declaration.
Ideally, we should only decide to add the unwrapped line before the K&R
declaration after we've scanned the rest of the line an noticed the
variable declarations and the semicolon, but the way the parser is
organized I don't see a good way to do this in the current parser, which
only has good context for the previously visited tokens. I also tried
not emitting an unwrapped line there and trying to resolve the situation
later in the token annotator and the continuation indenter, and that
approach seems promising, but I couldn't make it to work without
messing up a bunch of other cases in unit tests.
Reviewed By: MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D107950
A follow-up to
f6bc614546
where we handle the case where the semicolon is followed by a trailing
comment.
Reviewed By: MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D107907
We do not want to define __PRIVILEGED__. There is no use case for the
definition and gcc does not define it. This patch removes that definition.
Reviewed By: lei, NeHuang
Differential Revision: https://reviews.llvm.org/D107461
Temporary files created by the offloading device toolchain are not removed
after compilation when using a two-step compilation. The offload-bundler uses a
different filename for the device binary than the `.o` file present in the
Job's input list. This is not listed as a temporary file so it is never
removed. This patch explicitly adds the device binary as a temporary file to
consume it. This fixes PR50336.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D107668
When none of the translation units in the binary have been instrumented
we shouldn't need to link the profile runtime. However, because we pass
-u__llvm_profile_runtime on Linux and Fuchsia, the runtime would still
be pulled in and incur some overhead. On Fuchsia which uses runtime
counter relocation, it also means that we cannot reference the bias
variable unconditionally.
This change modifies the InstrProfiling pass to pull in the profile
runtime only when needed by declaring the __llvm_profile_runtime symbol
in the translation unit only when needed. For now we restrict this only
for Fuchsia, but this can be later expanded to other platforms. This
approach was already used prior to 9a041a7522, but we changed it
to always generate the __llvm_profile_runtime due to a TAPI limitation,
but that limitation may no longer apply, and it certainly doesn't apply
on platforms like Fuchsia.
Differential Revision: https://reviews.llvm.org/D98061
Some files still contained the old University of Illinois Open Source
Licence header. This patch replaces that with the Apache 2 with LLVM
Exception licence.
Differential Revision: https://reviews.llvm.org/D107528
This change follows up on a FIXME submitted with D105974. This change simply let's the reference case fall through to return a concrete 'true'
instead of a nonloc pointer of appropriate length set to NULL.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D107720
Clang diagnostics refer to identifier names in quotes.
This patch makes inline remarks conform to the convention.
New behavior:
```
% clang -O2 -Rpass=inline -Rpass-missed=inline -S a.c
a.c:4:25: remark: 'foo' inlined into 'bar' with (cost=-30, threshold=337) at callsite bar:0:25; [-Rpass=inline]
int bar(int a) { return foo(a); }
^
```
Reviewed By: hoy
Differential Revision: https://reviews.llvm.org/D107791
%%%
This patch defines the macro __HOS_AIX__ when the target is AIX and without any dependency on the host. The macro indicates that the host is AIX. Defining the macro will help minimize porting pain for existing code compiled with xlc/xlC. xlC never shipped cross-compiling support, so the difference is not observable anyway.
%%%
This is a follow up to the discussion in https://reviews.llvm.org/D107242.
Reviewed By: cebowleratibm, joerg
Differential Revision: https://reviews.llvm.org/D107825
Summary: Move the test case to existing test file. Remove test file as duplicated. The file was mistakenly added due to concerns of a hidden bug (see https://reviews.llvm.org/D104381). After it turned out, that the bug was already fixed with another revision (https://reviews.llvm.org/D85817) and corresponding test was added as well, we can remove this file.
Differential Revision: https://reviews.llvm.org/D106152
This patch implements paper P0692R1 from the C++20 standard. Disable usual access checking rules to template argument names in a declaration of partial specializations, explicit instantiation or explicit specialization (C++20 13.7.5/10, 13.9.1/6).
Fixes: https://llvm.org/PR37424
This patch also implements option *A* from this paper P0692R1 from the C++20 standard.
This patch follows the @rsmith suggestion from D78404.
Reviewed By: krisb
Differential Revision: https://reviews.llvm.org/D92024
Explicitely set x86_64-linux-gnu as a target for asan-use-callbacks
clang test since some target do not support -fsanitize=address (e.g.
i386-pc-openbsd). Also remove redundant -fsanitize=address and move
-emit-llvm right after -S.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D107633
Before this patch, CXXCtorInitializers that don't typecheck get discarded in
most cases. In particular:
- typos that can't be corrected don't turn into RecoveryExpr. The full expr
disappears instead, and without an init expr we discard the node.
- initializers that fail initialization (e.g. constructor overload resolution)
are discarded too.
This patch addresses both these issues (a bit clunkily and repetitively, for
member/base/delegating initializers)
It does not preserve any AST nodes when the member/base can't be resolved or
other problems of that nature. That breaks invariants of CXXCtorInitializer
itself, and we don't have a "weak" RecoveryCtorInitializer like we do for Expr.
I believe the changes to diagnostics in existing tests are improvements.
(We're able to do some analysis on the non-broken parts of the initializer)
Differential Revision: https://reviews.llvm.org/D101641
Code added by D106688 has a problem. It passes the option -bxyz to the system linker as -b xyz xyz (duplication of the string 'xyz' is incorrect). This patch fixes that oversight.
Reviewed by: hubert.reinterpretcast, jsji
Differential Revision: https://reviews.llvm.org/D107786
This patch allows target specific addr space in target builtins for HIP. It inserts implicit addr
space cast for non-generic pointer to generic pointer in general, and inserts implicit addr
space cast for generic to non-generic for target builtin arguments only.
It is NFC for non-HIP languages.
Differential Revision: https://reviews.llvm.org/D102405
Xcode uses `#pragma mark -` to draw a divider in the outline view
and `#pragma mark Note` to add `Note` in the outline view. For more
information, see https://nshipster.com/pragma/.
Since the LSP spec doesn't contain dividers for the symbol outline,
instead we treat `#pragma mark -` as a group with children - the
decls that come after it, implicitly terminating when the symbol's
parent ends.
The following code:
```
@implementation MyClass
- (id)init {}
- (int)foo;
@end
```
Would give an outline like
```
MyClass
> Overrides
> init
> Public Accessors
> foo
```
Differential Revision: https://reviews.llvm.org/D105904
- They need to be preserved even if there's no reference within the
device code as the host code may need to initialize them based on the
application logic.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D107718
Linking `libclang.so` is currently broken on Solaris:
ld: fatal: option --version-script requires option -z gnu-version-script-compat to be specified
While Solaris `ld` supports a considerable subset of `--version-script`,
there are some elements of the syntax that aren't.
The fix is equivalent to D78510 <https://reviews.llvm.org/D78510>.
Additionally, use of C-style comments is a GNU extension
that can easily be avoided by using `#` as comment character, which is
supported by GNU `ld`, `gold`, and `lld`.
Tested on `amd64-pc-solaris2.11`, `sparcv9-sun-solaris2.11`,
`x86_64-pc-linux-gnu`.
Differential Revision: https://reviews.llvm.org/D107559
We were using an OpaqueValueExpr allocated on the stack to store
the size of a VLA. Because the VLASizeMap in CodegenFunction
uses the address of the expression to avoid recomputing VLAs,
we were accidentally reusing an earlier llvm::Value. This led to
invalid LLVM IR.
This is a temporary solution until VLASizeMap can be pushed and popped
based on the context.
Differential Revision: https://reviews.llvm.org/D107666
After D94315 we add the `NoInline` attribute to the outlined function to handle
data environments in the OpenMP if clause. This conflicted with the `AlwaysInline`
attribute added to the outlined function. for better performance in D106799.
The data environments should ideally not require NoInline, but for now this
fixes PR51349.
Reviewed By: mikerice
Differential Revision: https://reviews.llvm.org/D107649
The diagnostic texts for warning on attributes that don't appear on the
initial declaration is generally useful. We'd like to re-use it in
D106030, but first let's combine two that already are very similar so we
may re-use it a third time in that commit.
Also, fix a few places that were using notePreviousDefinition to point
to declarations, to instead use diag::note_previous_declaration.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D107613
Attempt to enable MemCpyOpt unconditionally in D104801 uncovered the fact that
there are users that do not expect LLVM to materialize `memset` intrinsic.
While other passes can do that, too, MemCpyOpt triggers it more frequently and
breaks sanitizers and some downstream users.
For now introduce a flag to force-enable the flag and opt-in only CUDA
compilation with NVPTX back-end.
Differential Revision: https://reviews.llvm.org/D106401
This reverts commit 5181be344a.
Break libcxx type_traits header which uses aligned storage with
alignments greater than 4096. Reverting untill we can fix the header.
%%%
This patch defines __HOS_AIX__ macro for AIX in case of a cross compiler implementation.
%%%
Tested with SPEC.
Reviewed By: cebowleratibm
Differential Revision: https://reviews.llvm.org/D107242
%%%
This patch defines the macro __THW_PPC__ for AIX.
%%%
Tested with SPEC.
Reviewed By: cebowleratibm
Differential Revision: https://reviews.llvm.org/D107243
%%%
This patch defines the macro __THW_BIG_ENDIAN__ for AIX.
%%%
Tested with SPEC.
Reviewed By: cebowleratibm
Differential Revision: https://reviews.llvm.org/D107241
D31709 added an assertion was added to `FullSourceLoc::hasManager()` that ensured a valid `SourceLocation` is always paired with a `SourceManager`, and missing `SourceManager` is always paired with an invalid `SourceLocation`.
This appears to be incorrect, since clients never cared about constructing `FullSourceLoc` to uphold that invariant, or always checking `isValid()` before calling `hasManager()`.
The assertion started failing when serializing diagnostics pointing into an explicit module. Explicit modules don't have valid `SourceLocation` for the `import` statement, since they are "imported" from the command-line argument `-fmodule-name=x.pcm`.
This patch removes the assertion, since `FullSourceLoc` was never intended to uphold any kind of invariants between the validity of `SourceLocation` and presence of `SourceManager`.
Reviewed By: arphaman
Differential Revision: https://reviews.llvm.org/D106862
This introduces a new flag ignored-reference-qualifiers for the
existing "'A' qualifier on reference type B has no effect" diagnostic,
as a child of ignored-qualifiers.
Rationale:
This particular diagnostic is enabled by default, but other parts of
ignored-qualifiers are not. Anecdotally, a user may encounter this
diagnostic in the wild, and, seeing it to be valuable, might try to
raise it to error with -Werror=ignored-qualifiers, whereupon the other
diagnostics the flag covers will also be raised, to the user's surprise
and confusion. By splitting this diagnostic out into a separate flag,
and marking it as a child of ignored-qualifiers, we allow the user more
granular control of the diagnostics they care about, while maintaining
backwards compatibility with existing build scripts.
This change provides a way to conveniently declare types that have
address space qualifiers removed.
Since OpenCL adds address spaces implicitly even when they are not
specified in source, it is useful to allow deriving address space
unqualified types.
Fixes llvm.org/PR45326
Differential Revision: https://reviews.llvm.org/D106785
As we are trying to reach parity between opencl-c.h and
-fdeclare-opencl-builtins, ensure the documentation mentions that new
builtins should be added to both.
Reviewed by: Anastasia Stulova
This is recommit of the patch 16ff91ebcc,
reverted in 0c28a7c990 because it had
an error in call of getFastMathFlags (base type should be FPMathOperator
but not Instruction). The original commit message is duplicated below:
Clang has builtin function '__builtin_isnan', which implements C
library function 'isnan'. This function now is implemented entirely in
clang codegen, which expands the function into set of IR operations.
There are three mechanisms by which the expansion can be made.
* The most common mechanism is using an unordered comparison made by
instruction 'fcmp uno'. This simple solution is target-independent
and works well in most cases. It however is not suitable if floating
point exceptions are tracked. Corresponding IEEE 754 operation and C
function must never raise FP exception, even if the argument is a
signaling NaN. Compare instructions usually does not have such
property, they raise 'invalid' exception in such case. So this
mechanism is unsuitable when exception behavior is strict. In
particular it could result in unexpected trapping if argument is SNaN.
* Another solution was implemented in https://reviews.llvm.org/D95948.
It is used in the cases when raising FP exceptions by 'isnan' is not
allowed. This solution implements 'isnan' using integer operations.
It solves the problem of exceptions, but offers one solution for all
targets, however some can do the check in more efficient way.
* Solution implemented by https://reviews.llvm.org/D96568 introduced a
hook 'clang::TargetCodeGenInfo::testFPKind', which injects target
specific code into IR. Now only SystemZ implements this hook and it
generates a call to target specific intrinsic function.
Although these mechanisms allow to implement 'isnan' with enough
efficiency, expanding 'isnan' in clang has drawbacks:
* The operation 'isnan' is hidden behind generic integer operations or
target-specific intrinsics. It complicates analysis and can prevent
some optimizations.
* IR can be created by tools other than clang, in this case treatment
of 'isnan' has to be duplicated in that tool.
Another issue with the current implementation of 'isnan' comes from the
use of options '-ffast-math' or '-fno-honor-nans'. If such option is
specified, 'fcmp uno' may be optimized to 'false'. It is valid
optimization in general, but it results in 'isnan' always returning
'false'. For example, in some libc++ implementations the following code
returns 'false':
std::isnan(std::numeric_limits<float>::quiet_NaN())
The options '-ffast-math' and '-fno-honor-nans' imply that FP operation
operands are never NaNs. This assumption however should not be applied
to the functions that check FP number properties, including 'isnan'. If
such function returns expected result instead of actually making
checks, it becomes useless in many cases. The option '-ffast-math' is
often used for performance critical code, as it can speed up execution
by the expense of manual treatment of corner cases. If 'isnan' returns
assumed result, a user cannot use it in the manual treatment of NaNs
and has to invent replacements, like making the check using integer
operations. There is a discussion in https://reviews.llvm.org/D18513#387418,
which also expresses the opinion, that limitations imposed by
'-ffast-math' should be applied only to 'math' functions but not to
'tests'.
To overcome these drawbacks, this change introduces a new IR intrinsic
function 'llvm.isnan', which realizes the check as specified by IEEE-754
and C standards in target-agnostic way. During IR transformations it
does not undergo undesirable optimizations. It reaches instruction
selection, where is lowered in target-dependent way. The lowering can
vary depending on options like '-ffast-math' or '-ffp-model' so the
resulting code satisfies requested semantics.
Differential Revision: https://reviews.llvm.org/D104854
On AVR, '.ctors' is used, not '.init_array'. Make this the default
unless specifically overridden by driver argument.
This matches gcc, and it matches the behavior in (e.g.) the NetBSD
driver (for certain OS variants).
Reviewed by: MaskRay
Differential Revision: https://reviews.llvm.org/D107610
`__alignof__(x)` always returns `ABIAlign` if the "x" is marked `__attribute__((aligned()))`. However, the "aligned" attribute should only increase the alignment of a struct, or struct member, unless it's used together with the "packed" attribute, or used as a part of a typedef, in which case, the "aligned" attribute can both increase and decrease alignment.
Reviewed By: sfertile
Differential Revision: https://reviews.llvm.org/D107598
GCC supports multiple forms of -falign-loops=.
-falign-loops= is currently ignored in Clang.
This patch implements the simplest but the most useful form where N is a
power of 2.
The underlying implementation uses a `llvm::TargetOptions` option for now.
Bitcode generation ignores this option.
Differential Revision: https://reviews.llvm.org/D106701
Asm is a gnu extension for C, so at present -fopenmp -std=c99
and similar fail to compile on nvptx, bug 51344
Changing to `__asm__` or `__asm` works for openmp, all three appear to work
for cuda. Suggesting `__asm__` here as `__asm` is used by MSVC with different
syntax, so this should make for better error diagnostics if the header is
passed to a compiler other than clang.
Reviewed By: tra, emankov
Differential Revision: https://reviews.llvm.org/D107492
The root problem is a null pointer is accessed during the call to
checkOpenMPLoop, because loop up bound expr is an error expression
due to error diagnostic was emit early.
To fix this, in setLCDeclAndLB, setUB and setStep instead return false,
return true when LB, UB or Step contains Error, so that the checking is
stopped in checkOpenMPLoop.
Differential Revision: https://reviews.llvm.org/D107385
On AIX an aligned attribute cannot decrease the alignment of a variable
when placed on a variable declaration of vector type.
Differential Revision: https://reviews.llvm.org/D107522
LLVMLineEditor library is part of the LLVM dylib. Move it into
LLVM_LINK_COMPONENTS to avoid duplicate linking when dylib is being
used. This fixes building standalone clang against installed LLVM
without static libraries.
Differential Revision: https://reviews.llvm.org/D107558
Limit the maximum alignment for attribute aligned to 4096 to match
the limit of the .align pseudo op in the system assembler.
Differential Revision: https://reviews.llvm.org/D107497
Clang diagnostics should not start with a capital letter or use
trailing punctuation (https://clang.llvm.org/docs/InternalsManual.html#the-format-string),
but quite a few driver diagnostics were not following this advice. This
corrects the grammar and punctuation to improve consistency, but does
not change the circumstances under which the diagnostics are produced.
For boolean options, e.g. `-fxor-operator`/`-fno-xor-operator`, we ought
to be using TableGen multi-classes. This way, we only have to write one
definition to have both forms auto-generated. This patch refactors all
of Flang's boolean options to use two new multi-classes:
`OptInFC1FFOption` and `OptOutFC1FFOption`. These multi-classes are
based on `OptInFFOption`/`OptOutFFOption`, respectively. I've also
simplified the processing of the updated options in
CompilerInvocation.cpp.
With the new approach, "empty" help text (i.e. no `HelpText`) is now
replaced with an empty string (i.e. HelpText<"">). When running
flang-new --help, that's considered as non-empty help messages, which is
then printed (that's controlled by `printHelp` from
llvm/lib/Option/OptTable.cpp). This means that with this patch,
flang-new --help will start printing e.g. -fno-backslash, even though
there is no actual help text to print for this option (apart from the
empty string ""). Tests are updated accordingly.
Note that with this patch, both `-fxor-operator` and `-fno-xor-operator`
(and other boolean options refactored here) remain available in
`flang-new` and `flang-new -fc1`. In this respect, nothing changes. In a
forthcoming patch, I will refine this so that `flang-new -fc1` only
accepts `-ffoo` (`OptInFC1FFOption`) or `-fno-foo` (`OptOutCC1FFOption`).
For clarity, `OptInFFOption`/`OptOutFFOption` are renamed as
`OptInCC1FFOption`/`OptOutCC1FFOption`, respectively. Otherwise, this is
an NFC from Clang's perspective.
Differential Revision: https://reviews.llvm.org/D105881
Builtin definitions with pointer arguments were duplicated to provide
overloads differing in the pointer argument's address space.
Reduce this duplication by capturing the definitions in multiclasses.
This still results in the same number of builtins in the generated
tables, but the description is more concise now.
Differential Revision: https://reviews.llvm.org/D107151
Implement target builtins for gfx90a including fadd64, fadd32, add2h,
max and min on various global, flat and ds address spaces for which
intrinsics are implemented.
Differential Revision: https://reviews.llvm.org/D106909
This matches the behavior of GCC.
Patch does not change remapping logic itself, so adding one simple smoke test should be enough.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D107393
For fixed SVE types, predicates are represented using vectors of i8,
where as for scalable types they are represented using vectors of i1. We
can avoid going through memory for casts between these by bitcasting the
i1 scalable vectors to/from a scalable i8 vector of matching size, which
can then use the existing vector insert/extract logic.
Differential Revision: https://reviews.llvm.org/D106860
Zero-width bitfields on AIX pad out to the natral alignment boundary but
do not change the containing records alignment.
Differential Revision: https://reviews.llvm.org/D106900
The lit tests for `clang-scan-deps` invoke the tool without going through the substitution system. While the test runner correctly picks up the `clang-scan-deps` binary from the build directory, it doesn't print its absolute path. When copying the invocations when reproducing test failures, this can result in `command not found: clang-scan-deps` errors or worse yet: pick up the system `clang-scan-deps`. This patch adds new local `%clang-scan-deps` substitution.
Reviewed By: lxfind, dblaikie
Differential Revision: https://reviews.llvm.org/D107155
For some use-cases, it might be useful to be able to turn off modules for C++ in `-cc1`. (The feature is implied by `-std=C++20`.)
This patch exposes the `-fno-cxx-modules` option in `-cc1`.
Reviewed By: arphaman
Differential Revision: https://reviews.llvm.org/D106864
Clang has builtin function '__builtin_isnan', which implements C
library function 'isnan'. This function now is implemented entirely in
clang codegen, which expands the function into set of IR operations.
There are three mechanisms by which the expansion can be made.
* The most common mechanism is using an unordered comparison made by
instruction 'fcmp uno'. This simple solution is target-independent
and works well in most cases. It however is not suitable if floating
point exceptions are tracked. Corresponding IEEE 754 operation and C
function must never raise FP exception, even if the argument is a
signaling NaN. Compare instructions usually does not have such
property, they raise 'invalid' exception in such case. So this
mechanism is unsuitable when exception behavior is strict. In
particular it could result in unexpected trapping if argument is SNaN.
* Another solution was implemented in https://reviews.llvm.org/D95948.
It is used in the cases when raising FP exceptions by 'isnan' is not
allowed. This solution implements 'isnan' using integer operations.
It solves the problem of exceptions, but offers one solution for all
targets, however some can do the check in more efficient way.
* Solution implemented by https://reviews.llvm.org/D96568 introduced a
hook 'clang::TargetCodeGenInfo::testFPKind', which injects target
specific code into IR. Now only SystemZ implements this hook and it
generates a call to target specific intrinsic function.
Although these mechanisms allow to implement 'isnan' with enough
efficiency, expanding 'isnan' in clang has drawbacks:
* The operation 'isnan' is hidden behind generic integer operations or
target-specific intrinsics. It complicates analysis and can prevent
some optimizations.
* IR can be created by tools other than clang, in this case treatment
of 'isnan' has to be duplicated in that tool.
Another issue with the current implementation of 'isnan' comes from the
use of options '-ffast-math' or '-fno-honor-nans'. If such option is
specified, 'fcmp uno' may be optimized to 'false'. It is valid
optimization in general, but it results in 'isnan' always returning
'false'. For example, in some libc++ implementations the following code
returns 'false':
std::isnan(std::numeric_limits<float>::quiet_NaN())
The options '-ffast-math' and '-fno-honor-nans' imply that FP operation
operands are never NaNs. This assumption however should not be applied
to the functions that check FP number properties, including 'isnan'. If
such function returns expected result instead of actually making
checks, it becomes useless in many cases. The option '-ffast-math' is
often used for performance critical code, as it can speed up execution
by the expense of manual treatment of corner cases. If 'isnan' returns
assumed result, a user cannot use it in the manual treatment of NaNs
and has to invent replacements, like making the check using integer
operations. There is a discussion in https://reviews.llvm.org/D18513#387418,
which also expresses the opinion, that limitations imposed by
'-ffast-math' should be applied only to 'math' functions but not to
'tests'.
To overcome these drawbacks, this change introduces a new IR intrinsic
function 'llvm.isnan', which realizes the check as specified by IEEE-754
and C standards in target-agnostic way. During IR transformations it
does not undergo undesirable optimizations. It reaches instruction
selection, where is lowered in target-dependent way. The lowering can
vary depending on options like '-ffast-math' or '-ffp-model' so the
resulting code satisfies requested semantics.
Differential Revision: https://reviews.llvm.org/D104854
See PR48656.
The implementation of the template instantiation of requires expressions
was incorrectly trying to get the expression from an 'ExprRequirement'
before checking if it was an error state.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D107399
See PR47174.
When canonicalizing nested name specifiers of the type kind,
the prefix for 'DependentTemplateSpecialization' types was being
dropped, leading to malformed types which would cause failures
when rebuilding template names.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D107311
where should not.
Currently we are using QTy->isIncompleteType(&ND) to check incomplete
type. But before doing that, need to instantiate for a class template
specialization or a class member of a class template specialization,
or an array with known size of such..., so that we know it is really
incomplete type.
To fix this using RequireCompleteType instead.
The new test is added into "test/OpenMP/target_update_messages.cpp"
The different of using RequireCompleteType is when emit incomplete type,
an additional note is also emitted to point to where incomplete type
is declared. Because this change, many tests are needed to be fixed
by adding additional note.
This is to fix https://bugs.llvm.org/show_bug.cgi?id=50508
Differential Revision: https://reviews.llvm.org/D107200
https://reviews.llvm.org/D105964 updated the detection of function
definitions. It had the unfortunate effect to start marking object
definitions with attribute-like macros as function definitions.
This addresses this issue.
Reviewed By: owenpan
Differential Revision: https://reviews.llvm.org/D107269
This patch implements P2092
Simple requirements in requirement body shall not start with requires.
A warning was already in place so we just turn this warning into an error.
In addition, we add tests to make sure typename is optional in
requirement-parameter-list as per the same paper.
Previously we would show an error, but keep the member, and also the
CXXRrecordDecl, valid. This could lead to crashes when attempting to
access the record layout or size.
Differential Revision: https://reviews.llvm.org/D105478
The patch https://reviews.llvm.org/D105964 (58494c856a)
updated detection of function declaration names. It had the unfortunate
consequence that it started breaking between `function` and the function
name in some cases in JavaScript code.
This patch addresses this.
Reviewed By: MyDeveloperDay, owenpan
Differential Revision: https://reviews.llvm.org/D107267
The declaration for the global new function in C++ is generated in the compiler front-end. When examining exception propagation, we found that this is the largest root throw site propagator requiring unwind code to be generated for callers up the stack. Allowing this to be handled immediately with termination stops upward propagation and leads to significantly less landing pads generated. This in turns leads to a performance and .text size win.
With `-fnew-infallible` this annotates the declaration with `throw()` and `__attribute__((returns_nonnull))`. `throw()` allows the compiler to assume exceptions do not propagate out of new and eliminate it as a root throw site. Note that the definition of global new is user-replaceable so users should ensure that the one used follows these semantics.
Measuring internally, we're seeing at 0.5% CPU win in one of our large internal FB workload. Measuring on clang self-build (cd0a1226b5) we get:
thinlto/
"dwarfehprepare.NumCleanupLandingPadsRemaining": 153494,
"dwarfehprepare.NumNoUnwind": 26309,
thinlto_newinfallible/
"dwarfehprepare.NumCleanupLandingPadsRemaining": 143660,
"dwarfehprepare.NumNoUnwind": 28744,
a 1-143660/153494 = 6.4% reduction in landing pads and a 28744/26309 = 9.3% increase in the number of nounwind functions.
Testing:
ninja check-all
new test case to make sure these attributes are added correctly to global new.
Reviewed By: urnathan
Differential Revision: https://reviews.llvm.org/D105225
Add more checks, info on -fno-sanitize=..., and reference to 5/2021 UBSan Oracle blog.
Authored By: DianeMeirowitz
Reviewed By: hctim
Differential Revision: https://reviews.llvm.org/D106908
The new -mtargetos= option is a replacement for the existing, OS-specific options
like -miphoneos-version-min=. This allows us to introduce support for new darwin OSes
easier as they won't require the use of a new option. The older options will be
deprecated and the use of the new option will be encouraged instead.
Differential Revision: https://reviews.llvm.org/D106316
In some cases, when the execution path of the diagnostic
goes back and forth, arrows can overlap and create a mess.
Dimming arrows that are not relevant at the moment, solves this issue.
They are still visible, but don't draw too much attention.
Differential Revision: https://reviews.llvm.org/D92928
This commit adds a very first version of this feature.
It is off by default and has to be turned on by checking the
corresponding box. For this reason, HTML reports still keep
control notes (aka grey bubbles).
Further on, we plan on attaching arrows to events and having all arrows
not related to a currently selected event barely visible. This will
help with reports where control flow goes back and forth (eg in loops).
Right now, it can get pretty crammed with all the arrows.
Differential Revision: https://reviews.llvm.org/D92639
With this patch, OpenMP on AMDGCN will use the math functions
provided by ROCm ocml library. Linking device code to the ocml will be
done in the next patch.
Reviewed By: JonChesterfield, jdoerfert, scchan
Differential Revision: https://reviews.llvm.org/D104904
For some reason, Microsoft declares _m_prefetch to take a const void*,
but _m_prefetchw to take a /volatile/ const void*.
Do the same for compatibility.
Differential revision: https://reviews.llvm.org/D106790
Definition of `__cpp_threadsafe_static_init` macro is controlled by
language option Opts.ThreadsafeStatics. This patch sets language
option to false by default in OpenCL mode, resulting in macro
`__cpp_threadsafe_static_init` being undefined. Default value can be
overridden using command line option -fthreadsafe-statics.
Change is supposed to address portability because not all OpenCL
vendors support thread safe implementation of static initialization.
Fixes llvm.org/PR48012
Differential Revision: https://reviews.llvm.org/D107163
The -fms-extensions converts __pragma (and _Pragma) into a #pragma that
has to occur at the beginning of a line and end with a newline. This
patch ensures that the newline after the #pragma is added even if
Token::isAtStartOfLine() indicated that we should not start a newline.
Committing relying post-commit review since the change is small, some
downstream uses might be blocked without this fix, and to make clear the
decision of the new -fminimize-whitespace feature (fix on main, revert
on clang-13.x branch) suggested by @aaron.ballman in D104601.
Differential Revision: https://reviews.llvm.org/D107183
Use `clang_target_link_libraries` to avoid duplicate libraries when
the same symbol is provided both by a static library and a larger
dylib, fixing linking with win32 dylibs. This fixes errors like
these:
ld.lld: error: duplicate symbol: llvm::createStringError(std::__1::error_code, char const*)
>>> defined at libLLVMSupport.a(Error.cpp.obj)
>>> defined at libLLVM-14git.dll
This matches how other clang tools declare their dependencies.
Differential Revision: https://reviews.llvm.org/D107231
Currently, the default alignment is much larger than the actual size of
the vector in memory. Fix this to use a sane default.
For SVE, temporarily remove lowering of load/store operations for
predicates with less than 16 elements. The layout the backend was
assuming for SVE predicates with less than 16 elements doesn't agree
with the frontend. More work probably needs to be done here.
This change is, strictly speaking, not backwards-compatible at the
bitcode level. But probably nobody is actually depending on that; i1
vectors in memory are rare, and the code that does use them probably
ends up forcing the alignment to something sane anyway. If we think
this is a concern, I can restrict this to scalable vectors for now
(where it's actually causing issues for me at the moment).
Differential Revision: https://reviews.llvm.org/D88994
Target-dependent constant folding will fold these down to simple
constants (or at least, expressions that don't involve a GEP). We don't
need heroics to try to optimize the form of the expression before that
happens.
Fixes https://bugs.llvm.org/show_bug.cgi?id=51232 .
Differential Revision: https://reviews.llvm.org/D107116
In LLVM IR terms the ACLE type 'data512_t' is essentially an aggregate
type { [8 x i64] }. When emitting code for inline assembly operands,
clang tries to scalarize aggregate types to an integer of the equivalent
length, otherwise it passes them by-reference. This patch adds a target
hook to tell whether a given inline assembly operand is scalarizable
so that clang can emit code to pass/return it by-value.
Differential Revision: https://reviews.llvm.org/D94098
Rename the current -E option to "-E -Xflang -fno-reformat".
Add a new Parsing::EmitPreprocessedSource() routine to convert the
cooked character stream output of the prescanner back to something
more closely resembling output from a traditional preprocessor;
call this new routine when -E appears.
The new -E output is suitable for use as fixed form Fortran source to
compilation by (one hopes) any Fortran compiler. If the original
top-level source file had been free form source, the output will be
suitable for use as free form source as well; otherwise there may be
diagnostics about missing spaces if they were indeed absent in the
original fixed form source.
Unless the -P option appears, #line directives are interspersed
with the output (but be advised, f18 will ignore these if presented
with them in a later compilation).
An effort has been made to preserve original alphabetic character case
and source indentation.
Add -P and -fno-reformat to the new drivers.
Tweak test options to avoid confusion with prior -E output; use
-fno-reformat where needed, but prefer to keep -E, sometimes
in concert with -P, on most, updating expected results accordingly.
Differential Revision: https://reviews.llvm.org/D106727
The builtins vec_xl_len_r and vec_xst_len_r actually use the
wrong side of the vector on big endian Power9 systems. We never
spotted this before because there was no such thing as a big
endian distro that supported Power9. Now we have AIX and the
elements are in the wrong part of the vector. This just fixes
it so the elements are loaded to and stored from the right
side of the vector.
Change `CountersPtr` in `__profd_` to a label difference, which is a link-time
constant. On ELF, when linking a shared object, this requires that `__profc_` is
either private or linkonce/linkonce_odr hidden. On COFF, we need D104564 so that
`.quad a-b` (64-bit label difference) can lower to a 32-bit PC-relative relocation.
```
# ELF: R_X86_64_PC64 (PC-relative)
.quad .L__profc_foo-.L__profd_foo
# Mach-O: a pair of 8-byte X86_64_RELOC_UNSIGNED and X86_64_RELOC_SUBTRACTOR
.quad l___profc_foo-l___profd_foo
# COFF: we actually use IMAGE_REL_AMD64_REL32/IMAGE_REL_ARM64_REL32 so
# the high 32-bit value is zero even if .L__profc_foo < .L__profd_foo
# As compensation, we truncate CountersDelta in the header so that
# __llvm_profile_merge_from_buffer and llvm-profdata reader keep working.
.quad .L__profc_foo-.L__profd_foo
```
(Note: link.exe sorts `.lprfc` before `.lprfd` even if the object writer
has `.lprfd` before `.lprfc`, so we cannot work around by reordering
`.lprfc` and `.lprfd`.)
With this change, a stage 2 (`-DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_BUILD_INSTRUMENTED=IR`)
`ld -pie` linked clang is 1.74% smaller due to fewer R_X86_64_RELATIVE relocations.
```
% readelf -r pie | awk '$3~/R.*/{s[$3]++} END {for (k in s) print k, s[k]}'
R_X86_64_JUMP_SLO 331
R_X86_64_TPOFF64 2
R_X86_64_RELATIVE 476059 # was: 607712
R_X86_64_64 2616
R_X86_64_GLOB_DAT 31
```
The absolute function address (used by llvm-profdata to collect indirect call
targets) can be converted to relative as well, but is not done in this patch.
Differential Revision: https://reviews.llvm.org/D104556
Don't try to run the non-integrated assembler; just verify that the
invocations look like what we expect. Do verify that the integrated
assembler handles warnings as expected.
This patch will re-enable the patch posted under https://reviews.llvm.org/D106688 originally which was reverted due to buildbreak that was caused by mismatched diagnostic message arguments.
Reviewed By: Zarko Todorovski
Differential Revision: https://reviews.llvm.org/D107105
'pipe' keyword is introduced in OpenCL C 2.0: so do checks for OpenCL C version while
parsing and then later on check for language options to construct actual pipe. This feature
requires support of __opencl_c_generic_address_space, so diagnostics for that is provided as well.
This is the same patch as in D106748 but with a tiny fix in checking of diagnostic messages.
Also added tests when program scope global variables are not supported.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D107154
Commit 83df122 (r368334) added 'REQUIRES: linux' to this test, but
because triples are not respected by REQUIRES, that meant it was
invariably Unsupported. The correct keyword would be 'system-linux'
(checking the host rather than the target).
Because the test was always skipped, commit 0cfd9e5 (r375439) did not
notice that the test modification was incorrect.
This patch corrects the REQUIRES clause and fixes the incorrect
previous patch.
Found after implementing https://reviews.llvm.org/D107162
With this patch, OpenMP on AMDGCN will use the math functions
provided by ROCm ocml library. Linking device code to the ocml will be
done in the next patch.
Reviewed By: JonChesterfield, jdoerfert, scchan
Differential Revision: https://reviews.llvm.org/D104904
Under the -faltivec-src-compat=gcc option, AltiVec vector initialization should
be treated as if they were compiled with gcc - which is, to emit an error when
the vectors are initialized in the parenthesized or non-parenthesized manner.
This patch implements this behaviour.
Differential Revision: https://reviews.llvm.org/D106410
In a post-commit message to https://reviews.llvm.org/D102343
@MaskRay pointed out syntax errors in one of the test cases. This
patch fixes those problems, I had forgotten the colon after the CHECK- strings.
Math libraries are linked only when -lm is specified. This is because
host system could be missing rocm-device-libs.
Reviewed By: JonChesterfield, yaxunl
Differential Revision: https://reviews.llvm.org/D105981
Renamed language standard from openclcpp to openclcpp10.
Added new std values i.e. '-cl-std=clc++1.0' and
'-cl-std=CLC++1.0'.
Patch by Topotuna (Justas Janickas)!
Differential Revision: https://reviews.llvm.org/D106266
Pulled out the OptimizationLevel class from PassBuilder in order to be able to access it from within the PassManager and avoid include conflicts.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D107025
CL 2.0 introduced atomics and generic address space so there were
only one set of APIs for doing atomics, however since CL 3.0
makes generic address space optional, there has to be new sets
of atomic interfaces to handle that cases.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D106778
'pipe' keyword is introduced in OpenCL C 2.0: so do checks for OpenCL C version while
parsing and then later on check for language options to construct actual pipe. This feature
requires support of __opencl_c_generic_address_space, so diagnostics for that is provided as well.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D106748
This feature requires support of __opencl_c_images, so diagnostics for that is provided as well.
Also, ensure that cl_khr_3d_image_writes feature macro is set to the same value.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D106260
Parse the -b option in the driver and pass it to the linker if the target OS is AIX. This will establish compatibility with the other AIX compilers.
Reviewed By: Zarko Todorovski
Differential Revision: https://reviews.llvm.org/D106688
This patch adds `#pragma clang deprecated` to enable deprecation of
preprocessor macros.
The macro must be defined before `#pragma clang deprecated`. When
deprecating a macro a custom message may be optionally provided.
Warnings are emitted at the use site of a deprecated macro, and can be
controlled via the `-Wdeprecated` warning group.
This patch takes some rough inspiration and a few lines of code from
https://reviews.llvm.org/D67935.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D106732
@kpn pointed out that the global variable initialization functions didn't
have the "strictfp" metadata set correctly, and @rjmccall said that there
was buggy code in SetFPModel and StartFunction, this patch is to solve
those problems. When Sema creates a FunctionDecl, it sets the
FunctionDeclBits.UsesFPIntrin to "true" if the lexical FP settings
(i.e. a combination of command line options and #pragma float_control
settings) correspond to ConstrainedFP mode. That bit is used when CodeGen
starts codegen for a llvm function, and it translates into the
"strictfp" function attribute. See bugs.llvm.org/show_bug.cgi?id=44571
Reviewed By: Aaron Ballman
Differential Revision: https://reviews.llvm.org/D102343
Summary:
Set the TargetCPUName for AIX to default to pwr7, removing the setting
of it based on the major/minor of the OS version, which previously
set it to pwr4 for AIX 7.1 and earlier. The old code would also set it to
pwr4 when the OS version was not specified and with the change, it will
default it to pwr7 in all cases.
Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By:hubert.reinterpretcast (Hubert Tong)
Differential Revision: https://reviews.llvm.org/D107063
This function is marked with CINDEX_LINKAGE, but was never added to the
export list / linker script.
Reviewed By: jrtc27
Differential Revision: https://reviews.llvm.org/D106974
The implementation of -fminimize-whitespace (D104601) revised the logic
when to emit newlines. There was no case to handle when more than
8 lines were skippped in -P (DisableLineMarkers) mode and instead fell
through the case intended for -fminimize-whitespace, i.e. emit nothing.
This patch will emit one newline in this case.
The newline logic is slightly reorganized. The `-P -fminimize-whitespace`
case is handled explicitly and emitting at least one newline is the new
fallback case. The choice between emitting a line marker or up to
7 empty lines is now a choice only with enabled line markers. The up to
8 newlines likely are fewer characters than a line directive, but
in -P mode this had the paradoxic effect that it would print up to
7 empty lines, but none at all if more than 8 lines had to be skipped.
Now with DisableLineMarkers, we don't consider printing empty lines
(just start a new line) which matches gcc's behavior.
The line-directive-output-mincol.c test is replaced with a more
comprehensive test skip-empty-lines.c also testing the more than
8 skipped lines behaviour with all flag combinations.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D106924
When substitution failed on the first constrained template argument (but
only the first), we would assert / crash. Checking for failure was only
being performed from the second constraint on.
This changes it so the checking is performed in that case,
and the code is also now simplified a little bit to hopefully
avoid this confusion.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D106907
This cleanup patch refactors a bunch of functional duplicates of
getDecltypeForParenthesizedExpr into a common implementation.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: aaronpuchert
Differential Revision: https://reviews.llvm.org/D100713
On ELF, an SHT_INIT_ARRAY outside a section group is a GC root. The current
codegen abuses SHT_INIT_ARRAY in a section group to mean a GC root.
On PE/COFF, the dynamic initialization for `__declspec(selectany)` in a comdat
can be garbage collected by `-opt:ref`.
Call `addUsedGlobal` for the two cases to fix the abuse/bug.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D106925
The Clang interpreter's bytecode uses a packed stream of bytes
representation, but also wants to have some opcodes take pointers as
arguments, which are currently embedded in the bytecode directly.
However, CHERI, and thus Arm's upcoming experimental Morello prototype,
provide spatial memory safety for C/C++ by implementing language-level
(and sub-language-level) pointers as capabilities, which track bounds,
permissions and validity in hardware. This uses tagged memory with a
single tag bit at every capability-aligned address, and so storing
pointers to unaligned addresses results in the tag being stripped,
leading to a tag fault when the pointer is ultimately dereferenced at a
later point.
In order to support a stricter C/C++ implementation like CHERI, we no
longer store pointers directly in the bytecode, instead storing them in
a table and embedding the index in the bytecode.
Reviewed By: nand
Differential Revision: https://reviews.llvm.org/D97606
ClassTemplateSpecializationDecl not within a ClassTemplateDecl
represents an explicit instatiation of a template and so should be
handled as if it were a normal CXXRecordDecl. Unfortunately, having an
equivalent for FunctionTemplateDecl remains a TODO in ASTDumper's
VisitFunctionTemplateDecl, with all the explicit instantiations just
being emitted inside the FunctionTemplateDecl along with all the other
specializations, meaning we can't easily support explicit function
instantiations in update_cc_test_checks.
Reviewed By: arichardson
Differential Revision: https://reviews.llvm.org/D106243
The Intel compiler ICC supports the option "-fp-model=(source|double|extended)"
which causes the compiler to use a wider type for intermediate floating point
calculations. Also supported is a way to embed this effect in the source
program with #pragma float_control(source|double|extended).
This patch extends pragma float_control syntax, and also adds support
for a new floating point option "-ffp-eval-method=(source|double|extended)".
source: intermediate results use source precision
double: intermediate results use double precision
extended: intermediate results use extended precision
Reviewed By: Aaron Ballman
Differential Revision: https://reviews.llvm.org/D93769
Currently, we prohibit this pragma from appearing within a language
linkage specification, but this is useful functionality that is
supported by MSVC (which is where we inherited this feature from).
This patch allows you to use the pragma within an extern "C" {} (etc)
block.
Previously, with AllowShortEnumsOnASingleLine disabled, enums that would have otherwise fit on a single line would always put the opening brace on its own line.
This patch ensures that these enums will only put the brace on its own line if the existing attachment rules indicate that it should.
Reviewed By: HazardyKnusperkeks, curdeius
Differential Revision: https://reviews.llvm.org/D99840
The device runtime contains several calls to __kmpc_get_hardware_num_threads_in_block
and __kmpc_get_hardware_num_blocks. If the thread_limit and the num_teams are constant,
these calls can be folded to the constant value.
In commit D106033 we have the optimization phase. This commit adds the attributes to
the outlined function for the grid size. the two attributes are `omp_target_num_teams` and
`omp_target_thread_limit`. These values are added as long as they are constant.
Two functions are created `getNumThreadsExprForTargetDirective` and
`getNumTeamsExprForTargetDirective`. The original functions `emitNumTeamsForTargetDirective`
and `emitNumThreadsForTargetDirective` identify the expresion and emit the code.
However, for the Device version of the outlined function, we cannot emit anything.
Therefore, this is a first attempt to separate emision of code from deduction of the
values.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D106298
There are some platform that might not have version script support,
don't try to use version script on those.
Reviewed By: MaskRay, hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D106914
With the old PM, the stub for __hwasan_generate_tag is still generated
in the IR, but never called.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D106858
Remove overriding MinGlobalAlign to 0 for z/OS target to be consistent with SystemZ.
Reviewed By: abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D106890
Change the ffp-model=precise to enables -ffp-contract=on (previously
-ffp-model=precise enabled -ffp-contract=fast). This is a follow-up
to Andy Kaylor's comments in the llvm-dev discussion "Floating Point
semantic modes". From the same email thread, I put Andy's distillation
of floating point options and floating point modes into UsersManual.rst
Also fixes bugs.llvm.org/show_bug.cgi?id=50222
I had to revert this a few times because of failures on the x86-64
buildbot but I think we finally have that fixed by LNT/79f2b03c51.
Reviewed By: rjmccall, andrew.kaylor
Differential Revision: https://reviews.llvm.org/D74436
Replace the clang builtins and LLVM intrinsics for the SIMD extmul instructions
with normal codegen patterns.
Differential Revision: https://reviews.llvm.org/D106724
Redefines NULL as nullptr instead of ((void*)0)
in C++ for OpenCL.
Such internal representation of NULL provides
compatibility with C++11 and later language
standards.
Patch by Topotuna (Justas Janickas)!
Differential Revision: https://reviews.llvm.org/D105987
> `#pragma clang include_instead(<header>)` is a pragma that can be used
> by system headers (and only system headers) to indicate to a tool that
> the file containing said pragma is an implementation-detail header and
> should not be directly included by user code.
>
> The library alternative is very messy code that can be seen in the first
> diff of D106124, and we'd rather avoid that with something more
> universal.
>
> This patch takes the first step by warning a user when they include a
> detail header in their code, and suggests alternative headers that the
> user should include instead. Future work will involve adding a fixit to
> automate the process, as well as cleaning up modules diagnostics to not
> suggest said detail headers. Other tools, such as clangd can also take
> advantage of this pragma to add the correct user headers.
>
> Differential Revision: https://reviews.llvm.org/D106394
This caused compiler crashes in Chromium builds involving PCH and an include
directive with macro expansion, when Token::getLiteralData() returned null. See
the code review for details.
This reverts commit e8a64e5491.
We have the basic infrastructure in place. We can recover from simple errors
(recovering from errors in template instantiations is not yet supported). It
looks like we are in a reasonably functional state for llvm13.
Differential revision: https://reviews.llvm.org/D106813
I don't know how well this works with clang-cl, but people want to try
it out, and I think we want to make it work, so exposing the flags seems
reasonable.
Differential revision: https://reviews.llvm.org/D106791
When `-fno-integrated-as` is passed to the Clang driver (or set by default by a specific toolchain), it will construct an assembler job in addition to the cc1 job. Similarly, the `-fembed-bitcode` driver flag will create additional cc1 job that reads LLVM IR file.
The Clang tooling library only cares about the job that reads a source file. Instead of relying on the fact that the client injected `-fsyntax-only` to the driver invocation to get a single `-cc1` invocation that reads the source file, this patch filters out such jobs from `Compilation` automatically and ignores the rest.
This fixes a test failure in `ClangScanDeps/headerwithname.cpp` and `ClangScanDeps/headerwithnamefollowedbyinclude.cpp` on AIX reported here: https://reviews.llvm.org/D103461#2841918 and `clang-scan-deps` failures with `-fembed-bitcode`.
Depends on D106788.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D105695
This patch exposes `InputInfo` in `Job` instead of plain filenames. This is useful in a follow-up patch that uses this to recognize `-cc1` commands interesting for Clang tooling.
Depends on D106787.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D106788